Introduce selves and interest in topic. Please, please, please ask questions. No question is too basic (although some might be too complex for me, but I’ll do my best). I don’t want to see on the evaluation form at the end of the day, “Pretty good, but it didn’t answer my question.”
Metadata Workshop-Maastricht - November 6, 2008 - Presentation Transcript
What’s New with AACR2? And the Rest of the Bibliographic Universe, Too
Amy Benson
6 November 2008
Overview
Terms and definitions
What do all the acronyms mean?
Categories of metadata schemes and tools
How do they relate to each other?
Uses and functions
What do you do with them?
It’s all about me
Which ones will affect my work?
Metadata Standards
Metadata format standards
XML
Metadata element sets
MARC, MODS, ONIX, DC, EAD, TEI, CSDGM
Metadata content and value standards
AACR, RDA, DACS, CCO
FRBR
Access
OAI, Next generation catalogs
Metadata - What is it?
Data about data
Information about any aspect of a resource - size, location, topic, origin, use, audience, creator, quality, access rights, reviews… the list is endless
An aid to the discovery, identification, assessment, and management of described entities
Captured or created
Types of Metadata
Descriptive/Discovery
What is it?
How can I find it?
How can I get to it?
Structural
What files comprise it?
Which file is page one?
Administrative
What do I need to know to manage it?
Can I use it?
How was it created?
What needs to be preserved?
Metadata - Who needs it?
Impact of metadata on collection access
Improves service to users
Provides the means for resource discovery, grouping, filtering, matching user needs
Drives functionality/capabilities
Keyword searching works only for resources that are text-based - excludes photographs, data sets, objects, maps, audio, video…
Google Image Labeler
Metadata decisions should reflect project goals
Interoperability
Interoperability allows different computer systems, networks, and software to work together
Usually achieved by following standards
Allows different systems to make use of same data
Generally, an increase in specialization results in a decrease in interoperability
Important feature of metadata in today’s world
Interoperability
Can increase awareness and use of collections
Reduces geographic and domain-specific isolation of collections
Likely to assist / promote the longevity of data and collections
One-stop access to the universe of online resources
eXtensible Markup Language (XML)
Developed by the World Wide Web Consortium (W3C)
Open, international standard
A structure for storing and tagging information, without prescribing how the information is displayed or used
Platform independent
A way to use the same data for many different purposes
Facilitates the sharing of data across institutions and projects
XML - Background
Markup languages
Label/tag information
Structure documents
TOC, Chapters, Index, etc.
Help computers “understand” the data
Use tags, similar to HTML
<title>Gone with the wind</title>
XML allows for coding of hierarchical relationships – often necessary for complex documents, 3-D objects, archives, etc.
XML is extensible – an important feature that allows tags to be created by users or a community of users
XML defines the syntax, but not the data elements that make up an XML document
XML - Elements Example
list (book+)
book (title, author+, date+, year, comment, code)
title <value>
author (aulast, aufirst)
aulast <value>
aufirst <value>
date (day, month)
day <value>
month <value>
year <value>
comment <value>
code <value>
XML Record Example
<book>
<title>Weaving the Web</title>
<author>
<aulast>Berners-Lee,</aulast>
<aufirst>Tim</aufirst>
</author>
<date>
<day>6</day>
<month>January</month>
</date>
<year>2002</year>
<comment>Interesting topic, but not too well written.</comment>
<code>nonfiction</code>
</book>
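A minimal sketch of how software might read a record like the one above, using Python's standard xml.etree.ElementTree module. The record here is an abridged copy, and the variable names are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample record (assumption: trimmed for brevity).
RECORD = """
<book>
  <title>Weaving the Web</title>
  <author>
    <aulast>Berners-Lee</aulast>
    <aufirst>Tim</aufirst>
  </author>
  <year>2002</year>
  <code>nonfiction</code>
</book>
"""

# Parse the text into a tree; <book> becomes the root element.
book = ET.fromstring(RECORD.strip())

# Child elements are addressed by tag name, following the hierarchy.
title = book.findtext("title")
authors = [
    f"{a.findtext('aufirst')} {a.findtext('aulast')}"
    for a in book.findall("author")  # <author> may repeat
]
year = int(book.findtext("year"))

print(title, authors, year)
```

Because the tags label each piece of data, the program can pick out the title or the author's surname without guessing at the layout of the text.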
XML – DTDs and Schemas
Tags, definitions, and requirements are set and adhered to by a community of users
MARC XML, RecipeML, EAD
Two methods of defining specific XML implementations
DTDs (Document Type Definition)
Schemas
Lay out the logical structure of the data
Establish rules about which elements a document may have, which are required, which can repeat, etc.
Establish a root element, parent and child elements, and where data can be placed within hierarchy
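The kind of rules a DTD or schema expresses (which elements are required, which may repeat) can be sketched in a few lines of Python. This is a simplified stand-in, not a real DTD or XML Schema validator: the rule table and the check function are hypothetical, and unlike a DTD the sketch ignores element order.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model for <book>:
# child name -> (minimum occurrences, maximum occurrences or None for unlimited)
BOOK_RULES = {
    "title": (1, 1),       # required, not repeatable
    "author": (1, None),   # required, repeatable
    "year": (1, 1),
    "comment": (0, 1),     # optional
}

def check(element, rules):
    """Return a list of problems; an empty list means the element passes."""
    counts = {}
    for child in element:
        counts[child.tag] = counts.get(child.tag, 0) + 1
    problems = []
    for name, (low, high) in rules.items():
        n = counts.get(name, 0)
        if n < low:
            problems.append(f"<{name}> is required")
        if high is not None and n > high:
            problems.append(f"<{name}> may not repeat")
    for name in counts:
        if name not in rules:
            problems.append(f"<{name}> is not allowed here")
    return problems

good = ET.fromstring(
    "<book><title>T</title><author/><author/><year>2002</year></book>"
)
bad = ET.fromstring("<book><title>A</title><title>B</title><isbn/></book>")

print(check(good, BOOK_RULES))
print(check(bad, BOOK_RULES))
```

In practice a community publishes its DTD or schema once, and every participant validates against that shared definition rather than writing checks by hand.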
RecipeML
XML – Ways to use XML
XML-encoded data can be reused in multiple contexts
Because XML is easy to parse, software can transform it in countless ways, allowing:
Easy migration paths
Alternative displays
On-the-fly response to user needs
XML prescribes the structure of a document or record, but not its content or display
Transform XML for display via style sheets (XSL) and transformations (XSLT)
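To illustrate the idea of one XML source feeding several displays, here is a rough sketch in plain Python. It is a swapped-in technique: real projects would normally write an XSLT stylesheet and run it with an XSLT processor, which Python's standard library does not include. The function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

SOURCE = "<book><title>Weaving the Web</title><year>2002</year></book>"

def to_html(book):
    """Render the record as an HTML fragment for a web page."""
    title = escape(book.findtext("title"))
    year = escape(book.findtext("year"))
    return f"<p><cite>{title}</cite> ({year})</p>"

def to_citation(book):
    """Render the same record as a plain-text citation."""
    return f"{book.findtext('title')}. {book.findtext('year')}."

book = ET.fromstring(SOURCE)
html_view = to_html(book)
text_view = to_citation(book)

print(html_view)
print(text_view)
```

The stored record never changes; only the transformation applied to it does, which is the separation of content from display that the slide describes.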
XML File
XML File Transformation via XSL and XSLT
MARC
MAchine-Readable Cataloging (MARC)
Standard used to exchange, use, and interpret bibliographic information in libraries
Long, established, successful history
Large quantity of MARC data exists
Weak on rights information, etc.
Low extensibility
Highly interoperable within the library community, but not beyond
MARC Format
Leader / fixed field
Coded values
Tags / fields
Numeric labels for specific data elements
Indicators
Additional information about content in the field
Subfields
Segment data in fields into smaller units
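The parts of a MARC field listed above can be modelled as a small data structure. The sketch below parses a display-style line (the kind often shown in cataloging documentation), not the binary MARC transmission format. The Field class, the parse_field function, and the use of '#' for a blank indicator are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Field:
    tag: str                 # numeric label, e.g. "245"
    ind1: str = " "          # first indicator
    ind2: str = " "          # second indicator
    subfields: list = field(default_factory=list)  # (code, value) pairs

def parse_field(line):
    """Parse a line such as '245 10 $a Title / $c Author.'

    Assumes a three-character tag, one space, exactly two indicator
    characters ('#' standing for blank), and '$' used only as the
    subfield delimiter.
    """
    tag, indicators, rest = line[:3], line[4:6], line[6:]
    ind1, ind2 = (" " if c == "#" else c for c in indicators)
    subfields = []
    for chunk in rest.split("$")[1:]:
        # First character is the subfield code; the remainder is its data.
        subfields.append((chunk[0], chunk[1:].strip()))
    return Field(tag, ind1, ind2, subfields)

title = parse_field("245 10 $a Weaving the Web / $c Tim Berners-Lee.")
subject = parse_field("650 #0 $a World Wide Web.")

print(title)
print(subject)
```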
MARC
Basic Tag Groups
0XX Control information, numbers, codes
Example: 020 for ISBN
1XX Main entry
Example: 100 for personal name
2XX Titles, edition, imprint (in general, the title, statement of responsibility, edition, and publication information)
Example: 245 for title
3XX Physical description, etc.
Example: 300 for extent
4XX Series statements (as shown in the book)
Example: 490 for untraced series, or traced differently
MARC
Basic Tag Groups
5XX Notes
Example: 520 for summary note
6XX Subject added entries
Example: 650 for topical subject heading
7XX Added entries other than subject or series
Example: 700 for added entry, personal name
8XX Series added entries (other authoritative forms)
Example: 830 for series added entries in title form
9XX Locally-defined uses
Example: 949 for barcode numbers
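Since the first digit of a tag determines its group, the two tables above reduce to a simple lookup. A small sketch, with shortened group labels and a hypothetical tag_group function:

```python
# Group labels taken from the two "Basic Tag Groups" slides (shortened).
TAG_GROUPS = {
    "0": "Control information, numbers, codes",
    "1": "Main entry",
    "2": "Titles, edition, imprint",
    "3": "Physical description, etc.",
    "4": "Series statements",
    "5": "Notes",
    "6": "Subject added entries",
    "7": "Added entries other than subject or series",
    "8": "Series added entries",
    "9": "Locally-defined uses",
}

def tag_group(tag):
    """Return the basic group for a three-digit MARC tag."""
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError(f"not a three-digit MARC tag: {tag!r}")
    return TAG_GROUPS[tag[0]]

for example in ("020", "245", "650", "949"):
    print(example, "->", tag_group(example))
```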
MARC Record
MARC XML
Future of MARC
Can it survive?
MARC XML developed by the Library of Congress (LC)
Allows representation of a complete MARC record in XML
LC has developed a schema, stylesheets, tools, and crosswalks
Will support new transformations for new uses of MARC data and into other standards such as MODS, EAD, ONIX, DC
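A rough sketch of what representing MARC fields in XML can look like, built with Python's standard library. It uses the MARCXML namespace and the record, datafield, and subfield element names; the leader and control fields that a complete record needs are left out for brevity, and the helper names are assumptions.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)

def q(name):
    """Qualify an element name with the MARCXML namespace."""
    return f"{{{MARC_NS}}}{name}"

def datafield(parent, tag, ind1, ind2, subfields):
    """Append one data field with its indicators and subfields."""
    df = ET.SubElement(parent, q("datafield"),
                       {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        sf = ET.SubElement(df, q("subfield"), {"code": code})
        sf.text = value
    return df

# Incomplete record: leader and control fields omitted in this sketch.
record = ET.Element(q("record"))
datafield(record, "100", "1", " ", [("a", "Berners-Lee, Tim.")])
datafield(record, "245", "1", "0",
          [("a", "Weaving the Web /"), ("c", "Tim Berners-Lee.")])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Tags, indicators, and subfield codes survive as attributes, so the XML form carries the same information as the traditional record and can be fed to ordinary XML tools.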
MARCXML Example
Metadata Object Description Schema (MODS)
Set of 20 bibliographic elements - a subset of the MARC 21 Format for Bibliographic Data
XML-based standard
Alternative to MARC
Can be used for conversion of existing MARC records or to create new resource description records
Useful for library applications that want to go beyond the OPAC
MODS Elements
TitleInfo
Name
TypeOfResource
Genre
PublicationInfo
Language
PhysicalDescription
Abstract
TableOfContents
TargetAudience
Note
Cartographics
Subject
Classification
RelatedItem
Identifier
Location
AccessCondition
Extension
RecordInfo
MODS Elements
Elements can have sub-elements and attributes which provide refining detail for the element
Elements and sub-elements are repeatable, except in certain cases
Elements display in any order
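The points about sub-elements, attributes, and repeatability can be seen in a tiny record. This sketch assumes the MODS version 3 namespace and its lower-camel-case element names (titleInfo, name, namePart, typeOfResource, subject, topic), which differ slightly in spelling from the capitalized list on the slide; the sample values are illustrative.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

def m(name):
    """Qualify an element name with the MODS namespace."""
    return f"{{{MODS_NS}}}{name}"

mods = ET.Element(m("mods"))

# An element with a sub-element.
title_info = ET.SubElement(mods, m("titleInfo"))
ET.SubElement(title_info, m("title")).text = "Weaving the Web"

# An element refined by an attribute.
name = ET.SubElement(mods, m("name"), {"type": "personal"})
ET.SubElement(name, m("namePart")).text = "Berners-Lee, Tim"

ET.SubElement(mods, m("typeOfResource")).text = "text"

# A repeated element: one <subject> per topic.
for term in ("World Wide Web", "Internet"):
    subject = ET.SubElement(mods, m("subject"))
    ET.SubElement(subject, m("topic")).text = term

mods_xml = ET.tostring(mods, encoding="unicode")
print(mods_xml)
```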
More on MODS
http://www.loc.gov/standards/mods/
MODS Example
MODS Editor at Brown University
ONline Information eXchange (ONIX)
ONIX is the international standard for representing and communicating book industry product information in electronic form
Developed and maintained by EDItEUR and other international groups
XML-based
Focused on e-commerce of books
Synchronize the widely varying formats of major book wholesalers and retailers – interoperability
The need for richer book data online to improve sales
May appear in future library applications
ONIX
Other Metadata Standards
Encoded Archival Description (EAD)
Electronic Finding Aids
Data Documentation Initiative (DDI)
Data sets
Content Standard for Digital Geospatial Metadata (CSDGM)
Primary standard for geospatial metadata
Visual Resources Association (VRA) Core
Works of visual culture and the images that document them
Crosswalks
Crosswalks map an element from one scheme to its closest equivalent in another scheme
Convert data from one format to another - one that is potentially more widely accessible
Support cross-domain searching and interoperability of data
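At its simplest, a crosswalk is a lookup table from elements in one scheme to their closest equivalents in another. The mapping below is a small illustrative assumption, not the Library of Congress MARC to Dublin Core crosswalk, and the function name is hypothetical.

```python
# Illustrative mapping only (assumption): MARC tag -> Dublin Core element.
MARC_TO_DC = {
    "020": "identifier",
    "100": "creator",
    "245": "title",
    "520": "description",
    "650": "subject",
    "700": "contributor",
}

def crosswalk(marc_fields):
    """Convert (tag, value) pairs to a DC-style dict of lists.

    Returns the converted record plus the tags that had no equivalent,
    since a crosswalk to a simpler scheme usually loses some data.
    """
    dc = {}
    unmapped = []
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            unmapped.append(tag)
        else:
            dc.setdefault(element, []).append(value)
    return dc, unmapped

marc = [
    ("100", "Berners-Lee, Tim"),
    ("245", "Weaving the Web"),
    ("300", "xi, 226 p."),        # hypothetical extent statement
    ("650", "World Wide Web"),
    ("650", "Internet"),
]
dc_record, lost = crosswalk(marc)
print(dc_record)
print("No equivalent for:", lost)
```

The list of unmapped tags makes the trade-off visible: moving to a more widely shared format tends to cost some of the detail of the richer one.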
Dublin Core (DC)
A method of describing resources intended to facilitate the discovery of electronic resources
Designed to allow simple description of resources by non-catalogers as well as specialists
National and International standard
ANSI/NISO standard Z39.85-2001
ISO standard 15836
Includes 15 “core” elements
Dublin Core Elements
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
Dublin Core
All elements optional and repeatable
Authority control not required
Simple and Qualified DC
Simple
Less Rich
Lowest common denominator
Qualified
More precise
Less interoperable
Extensible and flexible
“Container” agnostic
Dublin Core Examples
Generic
Title=“The sound of music”
HTML
<meta name="DC.Title" content="The Sound of Music">
XML
<metadata> <dc:title>The Sound of Music</dc:title> </metadata>
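The “container agnostic” idea can be sketched by writing the same Dublin Core statements into two containers. This is an illustration under assumptions: the function names are made up, the plain metadata wrapper element follows the slide, and the creator values are sample data added to show a repeated element.

```python
import xml.etree.ElementTree as ET
from html import escape

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Element/value pairs; any element may be omitted or repeated.
record = [
    ("title", "The Sound of Music"),
    ("creator", "Rodgers, Richard"),
    ("creator", "Hammerstein, Oscar"),
]

def as_html_meta(pairs):
    """One <meta> tag per statement, for the head of an HTML page."""
    return [
        f'<meta name="DC.{name.capitalize()}" content="{escape(value)}">'
        for name, value in pairs
    ]

def as_xml(pairs):
    """The same statements as namespaced XML elements."""
    root = ET.Element("metadata")
    for name, value in pairs:
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")

meta_tags = as_html_meta(record)
xml_text = as_xml(record)
print("\n".join(meta_tags))
print(xml_text)
```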
Systems for Metadata
Metadata has to be stored and maintained somewhere
Digital content management systems
Databases
Software tools
Administration
Access
DC Record in OCLC Connexion
C/W MARS Digital Treasures Hosted Repository Using CONTENTdm
Content Standards
AACR (Anglo-American Cataloguing Rules)
“The rules cover the description of, and the provision of access points for, all library materials commonly collected at the present time.”
The current text is the 2nd ed, 2002 Revision (with 2003, 2004, and 2005 updates)
The Joint Steering Committee for Revision of AACR (JSC) is working on a new code, “RDA: Resource Description and Access” scheduled for publication in 2009
RDA: Why?
Seen by many as an opportunity to simplify the code and establish it as a content standard for resource description for libraries and beyond
Intended to support the objectives of resource discovery and user tasks based on the FRBR model
Provide more consistency, less redundancy within the rules
Planned as Web-based product, but will also be available in print (somehow at some point)
RDA: Goals
Flexible framework for describing all types of resources – analog and digital
Simplify rules
Create data that is readily adaptable to new and emerging database structures
Encourage use beyond the library community
Create data that is compatible with existing records in online library catalogs
Generate records that contain data that is relevant and important to users
International Scope
Designed to be a multi-national content standard
Developed for use in English language communities, but can be used in other language communities
Independent of the format used to communicate information
Compatible with other standards for resource description and retrieval
RDA and FRBR
Functional Requirements for Bibliographic Records
A new view of the bibliographic universe
FRBR is part of the conceptual foundation for RDA, and RDA makes use of FRBR terminology
Result of a study undertaken from 1992-1997 by a group of experts and consultants under IFLA
A conceptual model that establishes entities in relationship within and among 3 basic categories
Works, Persons, Subjects
RDA will highlight FRBR relationships
FRBR Group 1 Entities
The FRBR model divides Group 1 (bibliographic entities) into four levels of representation - the building blocks of the FRBR model
Work
Expression
Manifestation
Item
FRBR Group 1 Entities
Work
A distinct intellectual or artistic creation, in the abstract
Expression
The intellectual or artistic realization of a work by an illustrator, translator, performer
Manifestation
The physical embodiment of an expression of a work - a published edition
Item
A single example of a manifestation (copy)
A Work
“A Work is an abstract entity; there is no single material object one can point to as the work. We recognize the Work through individual realizations, or Expressions of the Work, but the Work itself only exists in the commonality of content between and among the various Expressions of the work.” – FRBR Final Report
Abstract concept
What we mean when we say we’ve read A Tale of Two Cities, or Pride and Prejudice
An Expression
“The intellectual realization of a Work” in some form – FRBR Final Report
Abstract concept
Forms
Revisions, updates, abridgements, enlargements, translations, annotations, critical editions, etc.
A Manifestation
“The physical embodiment of an Expression of a Work” – FRBR Final Report
Set of objects – still somewhat abstract
Manuscripts, Books, Periodicals, Maps, Posters, Films, CD-ROMs, etc.
Manifestations of a Work take different forms
Example: E-text of Pride and Prejudice from Project Gutenberg versus the Oxford Illustrated edition
Level at which we traditionally catalog library materials
An Item
“A single exemplar of a Manifestation” – FRBR Final Report
One specific concrete physical object
Designates a copy, or the circulation level of a bibliographic entity
Items may vary where the variations are a result of actions external to the intent of the producer of the Manifestation
Houghton Library’s first edition of Gone with the Wind previously owned by Thomas Wolfe
Top two levels: abstract intellectual/artistic content
Lower two levels: physical recording of content
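The four Group 1 levels nest naturally, which a few Python dataclasses can make concrete. This is only a sketch of the hierarchy: the class attributes are assumptions chosen for illustration, and the item's location is hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Item:                      # one concrete copy
    location: str
    note: str = ""

@dataclass
class Manifestation:             # a published edition or release
    publisher: str
    form: str
    items: list = field(default_factory=list)

@dataclass
class Expression:                # a realization, e.g. a text or translation
    language: str
    manifestations: list = field(default_factory=list)

@dataclass
class Work:                      # the abstract creation
    title: str
    creator: str
    expressions: list = field(default_factory=list)

    def all_items(self):
        """Walk down the hierarchy to every physical or digital copy."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]

etext = Manifestation("Project Gutenberg", "e-text")
illustrated = Manifestation("Oxford University Press", "book")
illustrated.items.append(Item("Main stacks", "hypothetical copy"))

english = Expression("English", [etext, illustrated])
work = Work("Pride and Prejudice", "Austen, Jane", [english])

print(len(english.manifestations), "manifestations,",
      len(work.all_items()), "item")
```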
Specific Example
Groups 2 and 3
Group 2: Actor Entities
Persons or corporate bodies “responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group”
Group 3: Subject Entities
What a Work may be about including
Groups 1 & 2 (Other Works, People, Corporate Bodies)
Concepts
Objects
Events
Places
Advantages of FRBR
Brings better logic and organization to the catalog
OPAC becomes simpler to navigate and understand
Easier to see all available Expressions/ Manifestations of a single Work
Easier to find desired resource when search results are grouped and related in meaningful ways
Display bibliographic entities within the context of related, established entities
Better understanding of relationships among related Works or Expressions
Advantages of FRBR
Ability to group individual records for Works to facilitate navigation and selection
Enable ILL holds at different levels depending on patron needs
Work, Expression, Manifestation, Item
Create catalog records once at the Work level, often including subject headings and classification, then use and expand on those records for Expressions/Manifestations
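Grouping records under a Work can be approximated by computing a normalized key from author and title. The sketch below is a deliberately crude assumption for illustration; it is not OCLC's or any vendor's algorithm, and the sample records are made up around the titles mentioned earlier.

```python
import re
from collections import defaultdict

def work_key(record):
    """Crude Work key: normalized author plus normalized title."""
    title = re.sub(r"[^a-z0-9 ]", "", record["title"].lower())
    title = re.sub(r"^(a|an|the) ", "", title)      # drop a leading article
    author = re.sub(r"[^a-z ]", "", record["author"].lower())
    return (author.strip(), " ".join(title.split()))

# Hypothetical manifestation-level records.
records = [
    {"id": 1, "author": "Austen, Jane", "title": "Pride and Prejudice"},
    {"id": 2, "author": "Austen, Jane", "title": "Pride and prejudice."},
    {"id": 3, "author": "Dickens, Charles", "title": "A Tale of Two Cities"},
    {"id": 4, "author": "Dickens, Charles", "title": "Tale of two cities"},
]

groups = defaultdict(list)
for record in records:
    groups[work_key(record)].append(record["id"])

for key, ids in groups.items():
    print(key, "->", ids)
```

A display built on these groups could show one line per Work and let the user expand it to see the individual editions.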
Traditional OPAC Display
OCLC WorldCat Analysis
Sample of 996 records (Manifestations)
Used computer algorithms to identify Works from within the group
78% have only a single Manifestation
99% of Works in WorldCat have 7 or fewer Manifestations
1% of the whole benefits the most from FRBRization
Functional Requirements for Authority Data (FRAD)
Provide a clearly defined, structured frame of reference for relating the data that are recorded in authority records to the needs of the users of those records
Deals with entities related to authority data
Expose the relationships between persons (or personas), names, and access points
Assist in an assessment of the potential for international sharing and use of authority data both within the library sector and beyond
Virtual International Authority File
Current draft dated April 1, 2007
RDA: Quick Overview
Designed for the digital environment
Description and access of all digital (and analog) resources
Usable by libraries and other metadata communities
Working with RDA
Product developed as a subscription-based, XML-driven, online web system
Will include sample workflows, a core element set, customized views, links to rules, local notes, saved search profiles
Take the cataloger through the various data elements to be included in the resource description
Describe the purpose and scope of each element
Where to look for that element
How to record it
Build cataloger’s judgement
RDA and MARC
RDA establishes a clear line of separation between the recording of data and the presentation of data
Content vs. display
ISBD punctuation will be one option in an appendix
AACR2 and MARC are separate standards
RDA will remain a separate standard
RDA assists with creating the content of the bibliographic record
MARC21 is one possible schema for encoding records created using RDA
RDA will be able to be used with any metadata standard such as MODS or Dublin Core
RDA: Impact on Libraries and Systems
The Joint Steering Committee is striving to minimize need for retrospective adjustments to pre-RDA records
RDA instructions are designed to be independent of the format, medium, or system used to store or communicate the data
Integrated library systems (ILSs) may implement RDA in different ways
Intended to be adaptable to newly-emerging database structures
RDA and FRBR would be better served by a relational database structure
RDA Timeline
Joint Steering Committee
http://www.collectionscanada.gc.ca/jsc/
Began as AACR3 in 2004
Renamed RDA in 2005
Reorganized in 2007 to follow FRBR
Scheduled for publication in 2009
Scheduled for implementation by the national libraries in 2010, pending evaluation
RDA Timeline: LC Plans
Oct. 2007 – CoP (Committee of Principals) issues statement on joint implementation of RDA
Jan. 2008 – report of the LC Working Group on the Future of Bibliographic Control released
http://www.loc.gov/bibliographic-future/
Recommends that LC “suspend work on RDA” until business case (return on investment) analyzed
May 2008 – LC, NLM, and NAL issue joint statement announcing their intention to evaluate RDA jointly to assist with implementation decision
June 2008 – LC’s official response to the LCWG report
RDA Timeline: Implementation
November 2008?? – first full draft of content to be released in online product for comment
Mid-January 2009 – comment period closes
Early March 2009 – JSC and CoP meet in Chicago. JSC finalizes review of comments received
Third quarter calendar 2009 – RDA is released
Last quarter calendar 2009–early 2010 – CoP national libraries evaluate RDA prior to implementation
RDA: Evaluation
LC, NLM, NAL plus 10-20 others
Selected libraries (PCC libraries, including small libraries in NACO funnel projects)
Library school
Archives
Non-MARC users
OCLC, Ex Libris, and other vendors
Criteria for evaluation under development
Usability, Technical, Financial criteria
Other Content Standards
International Standard Bibliographic Description (ISBD)
A family of standards to regularize the form and content of bibliographic descriptions
Available for different material types: monographs, computer files, etc.
Designed to promote record sharing and exchange
Other Content Standards
Book Industry Standards And Communications (BISAC)
Metadata Committee developed a Best Practices document
Intended as a response to the question, “I’ve downloaded the ONIX documentation. Now what?”
Its overriding purpose is to detail what data should be supplied and how it should be supplied
Other Content Standards
Describing Archives: A Content Standard (DACS)
Designed to facilitate consistent, appropriate, and self-explanatory description of archival materials and creators of archival materials
Replaces Archives, Personal Papers, and Manuscripts (APPM)
Other Content Standards
Cataloging Cultural Objects (CCO)
A guide to describing cultural works and their images
Provides guidelines for selecting, ordering, and formatting data used to populate catalog records
Designed to promote good descriptive cataloging, shared documentation, and enhanced end-user access
A project of the Visual Resources Association
Access: Open Archives Initiative (OAI)
A tool that supports interoperability among multiple databases
OAI goal: coarse-granularity resource discovery
Supports cross-database searching
Aggregates metadata from multiple community-specific repositories
Data providers expose (make available) the metadata for their collections
Service providers harvest the exposed metadata from data providers and aggregate it
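The division of labor between data providers and service providers can be sketched with two small classes. Everything here is a hypothetical in-memory model (class names, repository names, and records are assumptions); a real service provider would fetch the metadata over the network.

```python
class DataProvider:
    """A repository that exposes the metadata for its collection."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def expose(self):
        return list(self._records)

class ServiceProvider:
    """Harvests exposed metadata and offers one search over all of it."""

    def __init__(self):
        self.index = []

    def harvest(self, provider):
        for record in provider.expose():
            # Remember where each record came from.
            self.index.append({**record, "source": provider.name})

    def search(self, word):
        word = word.lower()
        return [r for r in self.index if word in r["title"].lower()]

library = DataProvider("library", [{"title": "The Sound of Music"}])
archive = DataProvider("archive", [{"title": "Music manuscripts"},
                                   {"title": "Town maps"}])

service = ServiceProvider()
for provider in (library, archive):
    service.harvest(provider)

hits = service.search("music")
print([(h["title"], h["source"]) for h in hits])
```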
OAI
OAI Protocol for Metadata Harvesting
Metadata content must be encoded in XML and have a corresponding XML schema for validation
Metadata must be supplied in unqualified Dublin Core format, at least
Other metadata formats are optional, but recommended
Metadata may optionally include a link to the actual content / resource
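A minimal sketch of one harvesting step: building a ListRecords request for unqualified Dublin Core (metadata prefix oai_dc) and reading titles out of a response. No network call is made. The base URL is a placeholder, and the response is a hand-written, abridged sample (a real response carries further elements such as the response date and request), so treat both as assumptions.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL a harvester would request from a data provider."""
    query = urlencode({"verb": "ListRecords",
                       "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"

# Abridged, hand-written sample response (assumption).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>The Sound of Music</dc:title>
          <dc:identifier>http://example.org/items/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def titles(response_xml):
    """Extract every dc:title from the harvested records."""
    root = ET.fromstring(response_xml)
    path = ".//oai:record/oai:metadata/oai_dc:dc/dc:title"
    return [t.text for t in root.iterfind(path, NS)]

print(list_records_url("http://example.org/oai"))
print(titles(SAMPLE_RESPONSE))
```

The dc:identifier in the sample plays the role of the optional link back to the actual resource.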
OAI Infrastructure
[Diagram labels: repositories, DC records, harvester, service provider]
OAI Infrastructure
[Diagram labels: user, search, harvested repository, original repository]
Book Details from U. of Chicago
Boston College
VuFind – Integrating Data Sources
OCLC’s WorldCat Local
OCLC’s Fiction Finder
Social Data
Libraries have some very useful data
When made available in standardized formats the data can be used in new ways
Wall of Books
iGoogle widget
Embrace data from users
Seek outside sources of information to bring in that might enhance the user experience
Wall of Books created by AADL Patron
xISBN
A Web Service that takes as input one ISBN and returns a list of other ISBNs of associated intellectual works – other expressions and manifestations
Search results on one specific ISBN can be misleading
Results intended for use by computer systems to generate new, more complete searches such as in an OPAC
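How a catalog might use such a service can be sketched with a local stand-in. The lookup table, the placeholder identifiers (which are not real ISBNs), and both function names are hypothetical; the actual web service's request and response formats are not shown here.

```python
# Hypothetical stand-in for the service: sets of identifiers that belong
# to the same work. The strings are placeholders, not real ISBNs.
WORK_SETS = [
    {"isbn-hardcover", "isbn-paperback", "isbn-audiobook"},
    {"isbn-other-title"},
]

def related_isbns(isbn):
    """Return the other identifiers associated with the same work."""
    for work_set in WORK_SETS:
        if isbn in work_set:
            return sorted(work_set - {isbn})
    return []

def expanded_query(isbn):
    """Build a broader catalog query covering every related edition."""
    return " OR ".join([isbn] + related_isbns(isbn))

print(related_isbns("isbn-paperback"))
print(expanded_query("isbn-paperback"))
```

Searching on the expanded query avoids the misleading "not held" result when the library owns a different edition of the same work.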
xISBN Web Service Result
Library Lookup
Metadata’s Ideal Profile
Metadata Characteristics
Standards-based
Consistent
Descriptive
Sharable
Contextual
Modular
Adjustable
Portable
Questions?
Amy Benson
Librarian/Archivist for Digital Initiatives
Schlesinger Library
Radcliffe Institute for Advanced Study
Harvard University