Introduce selves and interest in topic. Please, please, please ask questions. No question is too basic (although some might be too complex for me, but I’ll do my best). I don’t want to see on the evaluation form at the end of the day, “Pretty good, but it didn’t answer my question.”
Amy Benson, 5 November 2008
What’s New with AACR2? And the Rest of the Bibliographic Universe, Too.
Overview
Terms and definitions
What do all the acronyms mean?
Categories of metadata schemes and tools
How do they relate to each other?
Uses and functions
What do you do with them?
It’s all about me
Which ones will affect my work?
Metadata Standards
Metadata format standards
XML
Metadata element sets
MARC, MODS, ONIX, DC, EAD, TEI, CSDGM
Metadata content and value standards
AACR, RDA, DACS, CCO
FRBR
Access
METS, Z39.50, OAI, Next generation catalogs
Metadata - What is it?
Data about data
Information about any aspect of a resource - size, location, topic, origin, use, audience, creator, quality, access rights, reviews… the list is endless
An aid to the discovery, identification, assessment, and management of described entities
Captured or created
Types of Metadata
Descriptive/Discovery
What is it?
How can I find it?
How can I get to it?
Structural
What files comprise it?
Which file is page one?
Administrative
What do I need to know to manage it?
Can I use it?
How was it created?
What needs to be preserved?
Metadata - Who needs it?
Impact of metadata on collection access
Improves service to users
Provides the means for resource discovery, grouping, filtering, matching user needs
Drives functionality/capabilities
Keyword searching works only for resources that are text-based - excludes photographs, data sets, objects, maps, audio, video…
Google Image Labeler
Metadata decisions should reflect project goals
Interoperability
Interoperability allows different computer systems, networks, and software to work together
Usually achieved by following standards
Allows different systems to make use of same data
Generally, an increase in specialization results in a decrease in interoperability
Important feature of metadata in today’s world
Interoperability
Can increase awareness and use of collections
Reduces geographic and domain-specific isolation of collections
Likely to assist / promote the longevity of data and collections
One-stop access to the universe of online resources
eXtensible Markup Language (XML)
Developed by the World Wide Web Consortium (W3C)
Open, international standard
A structure for storing and tagging information, without prescribing how the information is displayed or used
Platform independent
A way to use the same data for many different purposes
Facilitates the sharing of data across institutions and projects
XML - Background
Markup languages
Label/tag information
Structure documents
TOC, Chapters, Index, etc.
Help computers “understand” the data
Use tags, similar to HTML
<title>Gone with the wind</title>
XML allows for coding of hierarchical relationships – often necessary for complex documents, 3-D objects, archives, etc.
XML is extensible – an important feature that allows tags to be created by users or a community of users
XML defines the syntax, but not the data elements that make up an XML document
XML - Elements Example
list (book+)
book (title, author+, date+, year, comment, code)
title <value>
author (aulast, aufirst)
aulast <value>
aufirst <value>
date (day, month)
day <value>
month <value>
year <value>
comment <value>
code <value>
XML Record Example
<book>
<title>Weaving the Web</title>
<author>
<aulast>Berners-Lee</aulast>
<aufirst>Tim</aufirst>
</author>
<date>
<day>6</day>
<month>January</month>
</date>
<year>2002</year>
<comment>Interesting topic, but not too well written.</comment>
<code>nonfiction</code>
</book>
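Because the record is well-formed XML, any standard parser can read it back. The following is a minimal sketch using Python's standard library, not part of the original slides; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# A record with the same structure as the slide example.
RECORD = """<book>
  <title>Weaving the Web</title>
  <author>
    <aulast>Berners-Lee</aulast>
    <aufirst>Tim</aufirst>
  </author>
  <date>
    <day>6</day>
    <month>January</month>
  </date>
  <year>2002</year>
  <comment>Interesting topic, but not too well written.</comment>
  <code>nonfiction</code>
</book>"""

book = ET.fromstring(RECORD)

# Pull individual values out by tag name.
title = book.findtext("title")
authors = [
    a.findtext("aufirst") + " " + a.findtext("aulast")
    for a in book.findall("author")  # <author> may repeat (author+)
]
year = int(book.findtext("year"))

print(title, authors, year)
```

The same parsed record could feed a catalog display, a citation, or a conversion to another format, which is the "same data, many purposes" point made earlier.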
XML – DTDs and Schemas
Tags, definitions, and requirements are set and adhered to by a community of users
MARC XML, RecipeML, EAD
Two methods of defining specific XML implementations
DTDs (Document Type Definition)
Schemas
Lay out the logical structure of the data
Establish rules about which elements a document may have, which are required, which can repeat, etc.
Establish a root element, parent and child elements, and where data can be placed within hierarchy
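To make the idea of "rules about which elements a document may have" concrete, here is a rough sketch that checks a record against the content models from the Elements Example slide. It is an assumption-laden stand-in, not a real DTD or schema validator: the models are hand-translated into regular expressions over child tag names, and the function name `is_valid` is made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content models from the "Elements Example" slide, rewritten as regular
# expressions over space-terminated child tag names ("+" = one or more).
CONTENT_MODELS = {
    "book": r"title (author )+(date )+year comment code ",
    "author": r"aulast aufirst ",
    "date": r"day month ",
}

def is_valid(element):
    """Return True if the element and all its descendants follow the models."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Elements without a model (title, year, ...) hold text only.
        return len(element) == 0
    child_tags = "".join(child.tag + " " for child in element)
    if re.fullmatch(model, child_tags) is None:
        return False
    return all(is_valid(child) for child in element)

good = ET.fromstring(
    "<book><title>T</title>"
    "<author><aulast>A</aulast><aufirst>B</aufirst></author>"
    "<date><day>6</day><month>January</month></date>"
    "<year>2002</year><comment>c</comment><code>nonfiction</code></book>"
)
# Missing the required author, date, comment, and code elements.
bad = ET.fromstring("<book><title>T</title><year>2002</year></book>")

print(is_valid(good), is_valid(bad))
```

In practice a DTD or XML Schema file plus a validating parser does this job; the sketch only shows what kind of rule is being enforced.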
RecipeML
XML – Ways to use XML
XML-encoded data can be repurposed, that is, reused in multiple contexts
Because XML is easy to parse, software can transform it in countless ways, allowing:
Easy migration paths
Alternative displays
On-the-fly response to user needs
Transform XML for display via style sheets (XSL) and transformations (XSLT)
XSL - XSLT
XML prescribes the structure of a document or record, but not its content or display
eXtensible Stylesheet Language (XSL) is used to display the XML in user-friendly ways
Different stylesheets render the data in different ways
Similar to Cascading Style Sheets (CSS) used with (X)HTML
XSL Transformations (XSLT) is an XML-based language for transforming XML documents
Is most often used to:
Transform XML to HTML for delivery to standard web clients
Transform XML from one set of XML tags to another
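The first use, XML to HTML, can be illustrated without an XSLT processor. The sketch below is a plain Python stand-in for what a stylesheet would do (Python's standard library has no XSLT engine); the function name `book_to_html` and the chosen HTML layout are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

def book_to_html(book_xml):
    """Render a <book> record as one HTML paragraph (stand-in for a stylesheet)."""
    book = ET.fromstring(book_xml)
    title = escape(book.findtext("title", default=""))
    names = []
    for author in book.findall("author"):
        first = author.findtext("aufirst", default="")
        last = author.findtext("aulast", default="")
        names.append(escape((first + " " + last).strip()))
    return "<p><cite>" + title + "</cite> by " + ", ".join(names) + "</p>"

html_out = book_to_html(
    "<book><title>Weaving the Web</title>"
    "<author><aulast>Berners-Lee</aulast><aufirst>Tim</aufirst></author></book>"
)
print(html_out)
```

A second function with a different layout would play the role of a second stylesheet: same data, different display.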
XML File
XML File Transformation via XSL and XSLT
MARC
MAchine-Readable Cataloging (MARC)
Standard used to exchange, use, and interpret bibliographic information in libraries
Long-established, successful history
Large quantity of MARC data exists
Weak on rights information, etc.
Low extensibility
Highly interoperable within the library community, but not beyond
MARC
Basic tag groups
0XX Control information, numbers, codes
Example: 020 for ISBN
1XX Main entry
Example: 100 for personal name
2XX Titles, edition, imprint (in general: the title, statement of responsibility, edition, and publication information)
Example: 245 for title
3XX Physical description, etc.
Example: 300 for extent
4XX Series statements (as shown in the book)
Example: 490 for untraced series, or traced differently
MARC
Basic tag groups
5XX Notes
Example: 520 for summary note
6XX Subject added entries
Example: 650 for topical subject heading
7XX Added entries other than subject or series
Example: 700 for added entry, personal name
8XX Series added entries (other authoritative forms)
Example: 830 for series added entries in title form
9XX Locally-defined uses
Example: 949 for barcode numbers
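Since the first digit of a MARC tag determines its group, the table above reduces to a simple lookup. This is an illustrative sketch only; the dictionary and the function name `tag_group` are not part of any MARC tool.

```python
# First digit of a three-digit MARC tag -> basic tag group (from the slides).
TAG_GROUPS = {
    "0": "Control information, numbers, codes",
    "1": "Main entry",
    "2": "Titles, edition, imprint",
    "3": "Physical description, etc.",
    "4": "Series statements",
    "5": "Notes",
    "6": "Subject added entries",
    "7": "Added entries other than subject or series",
    "8": "Series added entries",
    "9": "Locally-defined uses",
}

def tag_group(tag):
    """Return the basic tag group for a three-digit MARC tag such as '245'."""
    if len(tag) != 3 or not tag.isdigit():
        raise ValueError("MARC tags are three digits, got %r" % tag)
    return TAG_GROUPS[tag[0]]

print(tag_group("245"))
print(tag_group("650"))
```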
MARC Record
MARC XML
Future of MARC
Can it survive?
MARC XML developed by the Library of Congress (LC)
Allows representation of a complete MARC record in XML
LC has developed a schema, stylesheets, tools, and crosswalks
Will support new transformations for new uses of MARC data and into other standards such as MODS, EAD, ONIX, DC
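As a rough illustration of what "a complete MARC record in XML" makes possible, the sketch below reads two fields from a tiny MARCXML record with a general-purpose XML parser. The namespace and element names (`record`, `datafield`, `subfield`) are those of the LC MARCXML schema; the sample record itself is made up and heavily trimmed, and the helper name `subfield` is an assumption.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Made-up, heavily trimmed sample record for illustration.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Berners-Lee, Tim.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Weaving the Web</subfield>
  </datafield>
</record>"""

def subfield(record, tag, code):
    """Return the first matching subfield value, or None if absent."""
    path = "marc:datafield[@tag='%s']/marc:subfield[@code='%s']" % (tag, code)
    return record.findtext(path, namespaces=NS)

record = ET.fromstring(SAMPLE)
print(subfield(record, "245", "a"))
print(subfield(record, "100", "a"))
```

No MARC-specific software is involved here, which is the practical argument for MARCXML.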
MARCXML Example
Metadata Object Description Schema (MODS)
Set of 20 bibliographic elements - a subset of the MARC 21 Format for Bibliographic Data
XML-based standard
Alternative to MARC
Can be used for conversion of existing MARC records or to create new resource description records
Useful for library applications that want to go beyond the OPAC
MODS Elements
TitleInfo
Name
TypeOfResource
Genre
PublicationInfo
Language
PhysicalDescription
Abstract
TableOfContents
TargetAudience
Note
Cartographics
Subject
Classification
RelatedItem
Identifier
Location
AccessCondition
Extension
RecordInfo
MODS Elements
Elements can have sub-elements and attributes which provide refining detail for the element
Elements and sub-elements are repeatable, except in certain cases
Elements display in any order
More on MODS
http://www.loc.gov/standards/mods/
MODS Example
MODS Editor at Brown University
ONline Information eXchange (ONIX)
ONIX is the international standard for representing and communicating book industry product information in electronic form
Developed and maintained by EDItEUR and other international groups
XML-based
Focused on e-commerce of books
Synchronize the widely varying formats of major book wholesalers and retailers – interoperability
The need for richer book data online to improve sales
May appear in future library applications
ONIX
Other Metadata Standards
Encoded Archival Description (EAD)
Electronic Finding Aids
Text Encoding Initiative (TEI)
Electronic texts
Content Standard for Digital Geospatial Metadata (CSDGM)
Primary standard for geospatial metadata
Visual Resources Association (VRA) Core
Works of visual culture and the images that document them
Crosswalks
Crosswalks map an element from one scheme to its closest equivalent in another scheme
Convert data from one format to another - one that is potentially more widely accessible
Support cross-domain searching and interoperability of data
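A crosswalk is, at its simplest, a mapping table applied to each field. The sketch below uses a deliberately simplified, illustrative MARC-to-Dublin-Core mapping; it is not the official Library of Congress crosswalk, and the names `MARC_TO_DC` and `crosswalk` are made up for this example.

```python
# Simplified, illustrative mapping only; a real crosswalk works at the
# subfield level and covers far more tags.
MARC_TO_DC = {
    "020": "identifier",
    "100": "creator",
    "245": "title",
    "520": "description",
    "650": "subject",
    "700": "contributor",
}

def crosswalk(marc_fields):
    """Convert (tag, value) pairs into a dict of DC element -> list of values."""
    dc = {}
    for tag, value in marc_fields:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # no equivalent in this map, so the data is dropped
        dc.setdefault(element, []).append(value)
    return dc

dc_record = crosswalk([
    ("245", "Weaving the Web"),
    ("100", "Berners-Lee, Tim"),
    ("300", "xi, 226 p."),  # physical description: unmapped here
])
print(dc_record)
```

The dropped 300 field shows the usual cost of a crosswalk to a simpler scheme: closest equivalents only, so some detail is lost.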
Dublin Core (DC)
A method of describing resources intended to facilitate the discovery of electronic resources
Designed to allow simple description of resources by non-catalogers as well as specialists
National and International standard
ANSI/NISO standard Z39.85-2001
ISO standard 15836
Includes 15 “core” elements
Dublin Core Elements
Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights
Dublin Core
All elements optional and repeatable
Authority control not required
Simple and Qualified DC
Simple
Less Rich
Lowest common denominator
Qualified
More precise
Less interoperable
Extensible and flexible
“Container” agnostic
Dublin Core Examples
Generic
Title=“The sound of music”
HTML
<meta name="DC.Title" content="The sound of music">
XML
<metadata> <dc:title>The Sound of Music</dc:title> </metadata>
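The XML form can also be generated rather than typed. This is a minimal sketch, assuming a bare `metadata` wrapper element as in the example above; the namespace URI is the standard Dublin Core elements namespace, while the function name `dc_record_xml` and the sample values are illustrative.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual "dc:" prefix

def dc_record_xml(fields):
    """Build a simple DC record from (element, value) pairs.

    Every element is optional and repeatable, so a list of pairs is used
    instead of a dict.
    """
    root = ET.Element("metadata")
    for name, value in fields:
        child = ET.SubElement(root, "{%s}%s" % (DC_NS, name))
        child.text = value
    return ET.tostring(root, encoding="unicode")

xml_out = dc_record_xml([
    ("title", "The Sound of Music"),
    ("language", "en"),
])
print(xml_out)
```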
Systems for Metadata
Metadata has to be stored and maintained somewhere
Digital content management systems
Databases
Software tools
Administration
Access
DC Record in OCLC Connexion
C/W MARS Digital Treasures Hosted Repository Using CONTENTdm
Content Standards
AACR (Anglo-American Cataloguing Rules)
“The rules cover the description of, and the provision of access points for, all library materials commonly collected at the present time.”
The current text is the 2nd ed, 2002 Revision (with 2003, 2004, and 2005 updates)
The Joint Steering Committee for Revision of AACR (JSC) is working on a new code, “RDA: Resource Description and Access” scheduled for publication in 2009
RDA: Goals
Flexible framework for describing all types of resources – analog and digital
Simplify rules
Create data that is readily adaptable to new and emerging database structures
Encourage use beyond the library community
Create data that is compatible with existing records in online library catalogs
Generate records that contain data that is relevant and important to users
RDA: Why?
Seen by many as an opportunity to simplify the code and establish it as a content standard for resource description for libraries and beyond
Intended to support the objectives of resource discovery and user tasks based on the FRBR model
Provide more consistency, less redundancy within the rules
Planned as Web-based product, but will also be available in print (somehow at some point)
International Scope
Designed to be a multi-national content standard
Developed for use in English language communities, but can be used in other language communities
Independent of the format used to communicate information
Compatible with other standards for resource description and retrieval
RDA and FRBR
FRBR is part of the conceptual foundation for RDA
RDA will include FRBR terminology
RDA will use the FRBR user tasks as the basis for defining a set of mandatory data elements
RDA will highlight relationships between Works, Expressions, Manifestations, and Items as well as among persons, corporate bodies, and families that play some role with respect to the resource being described
FRBR
Functional Requirements for Bibliographic Records
A new view of the bibliographic universe
Result of a study undertaken from 1992-1997 by a group of experts and consultants under IFLA
A conceptual model that establishes entities in relationship within and among 3 basic categories
Works, Persons, Subjects
Goal is to present bibliographic information in ways that better meet the needs of the end user
FRBR – Improve Navigation
Consolidate display of multiple versions
Enable users to navigate result sets to their level of interest
Deliver the most appropriate bibliographic records to users
Enhance browsing and discovery by taking advantage of relationships between topics, subjects, authors, locations …
Group 1: Entities
The FRBR model divides Group 1 (bibliographic entities) into four levels of representation - the building blocks of the FRBR model
Work
Expression
Manifestation
Item
Group 1: Entities
Work
A distinct intellectual or artistic creation, in the abstract
Expression
The intellectual or artistic realization of a work by an illustrator, translator, performer
Manifestation
The physical embodiment of an expression of a work - a published edition
Item
A single example of a manifestation (copy)
A Work
“A Work is an abstract entity; there is no single material object one can point to as the work. We recognize the Work through individual realizations, or Expressions of the Work, but the Work itself only exists in the commonality of content between and among the various Expressions of the work.” – FRBR Final Report, p. 16
Abstract concept
What we mean when we say we’ve read Moby Dick, or Pride and Prejudice
An Expression
“The intellectual realization of a Work” in some form – FRBR Final Report, p. 18
Abstract concept
Forms
Revisions, updates, abridgements, enlargements, translations, annotations, critical editions, etc.
A Manifestation
“The physical embodiment of an Expression of a Work” – FRBR Final Report, p. 20
Set of objects – still somewhat abstract
Manuscripts, Books, Periodicals, Maps, Posters, Films, CD-ROMs, etc.
Manifestations of a Work take different forms
Example: E-text of Pride and Prejudice from Project Gutenberg versus the Oxford Illustrated edition
Level at which we traditionally catalog library materials
An Item
“A single exemplar of a Manifestation” – FRBR Final Report, p. 22
One specific concrete physical object
Designates a copy, or the circulation level of a bibliographic entity
Items may vary where the variations are a result of actions external to the intent of the producer of the Manifestation
Houghton Library’s first edition of Gone with the Wind previously owned by Thomas Wolfe
[Figure: FRBR entity levels, shown as rows for Work, Expression, and Manifestation. Labels: Family of Works; The Novel; The Movie; Orig. Text; Transl.; Critical Edition; Orig. Version; Paper; PDF; HTML.]
Relationships among the Bibliographic Entities
FRBR specifies particular relationships between classes of Group 1 entities:
a Work is realized through one or more Expressions
each of which is embodied in one or more Manifestations
each of which is exemplified by one or more Items
Top two levels: abstract intellectual/artistic content
Lower two levels: physical recording of content
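The chain of "one or more" relationships maps naturally onto nested lists. The sketch below is one possible way to model the four Group 1 levels in code, not how any particular catalog implements FRBR; the class layout, the `all_items` helper, and the copy descriptions are assumptions, while the two Manifestations come from the Pride and Prejudice example earlier.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Item:
    description: str  # one concrete copy

@dataclass
class Manifestation:
    label: str  # a published edition or format
    items: List[Item] = field(default_factory=list)

@dataclass
class Expression:
    label: str  # e.g. original text, a translation
    manifestations: List[Manifestation] = field(default_factory=list)

@dataclass
class Work:
    title: str  # the abstract creation
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self):
        """Walk Work -> Expression -> Manifestation -> Item."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]

work = Work("Pride and Prejudice", [
    Expression("Original English text", [
        Manifestation("Project Gutenberg e-text", [Item("hypothetical file copy")]),
        Manifestation("Oxford Illustrated edition", [Item("hypothetical shelf copy")]),
    ]),
])
print(len(work.all_items()))
```

Grouping search results by `Work` and expanding downward only on request is the navigation pattern FRBR is meant to enable.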
Specific Example
Group 2: Actor Entities
Persons or corporate bodies “responsible for the intellectual or artistic content, the physical production and dissemination, or the custodianship of the entities in the first group” – IFLA Final Report, p. 13
[Figure: Group 2 relationships. A Work is created by, an Expression is realized by, a Manifestation is produced by, and an Item is owned by a Person or Corporate Body.]
Group 3: Subject Entities
What a Work may be about
Subjects including
Groups 1 & 2 (Other Works, People, Corporate Bodies)
Concepts
Objects
Events
Places
[Figure: Group 3 relationships. A Work has as subject any of: Work, Expression, Manifestation, Item; Person, Corporate Body; Concept, Object, Event, Place.]
Advantages of FRBR
Ability to group individual records for Works to facilitate navigation and selection
Enable ILL holds at different levels depending on patron needs
Work, Expression, Manifestation, Item
Create catalog records once at the Work level, often including subject headings and classification, then use and expand on those records for Expressions/Manifestations
Advantages of FRBR
Better logic and organization to catalog
OPAC becomes simpler to navigate and understand
Easier to see all available Expressions/ Manifestations of a single Work
Easier to find desired resource when search results are grouped and related in meaningful ways
Display bibliographic entities within the context of related, established entities
Better understanding of relationships among related Works or Expressions
Traditional OPAC Display
OCLC WorldCat Analysis
Sample of 996 records (Manifestations)
Used computer algorithms to identify Works from within the group
78% have only a single Manifestation
99% of Works in WorldCat have 7 or fewer Manifestations
1% of the whole benefits the most from FRBRization
FRAD
Functional Requirements for Authority Data
Provide a clearly defined, structured frame of reference for relating the data that are recorded in authority records to the needs of the users of those records
Deals with entities related to authority data
Expose the relationships between persons (or personas), names, and access points
Assist in an assessment of the potential for international sharing and use of authority data both within the library sector and beyond
Virtual International Authority File
Current draft dated April 1, 2007
RDA: Quick Overview
Designed for the digital environment
Description and access of all digital (and analog) resources
Usable by libraries and other metadata communities
Working with RDA
Product developed as a subscription-based, XML-driven, online web system
Will include sample workflows, a core element set, customized views, links to rules, local notes, saved search profiles
Take the cataloger through the various data elements to be included in the resource description
Describe the purpose and scope of each element
Where to look for that element
How to record it
Build cataloger’s judgement
RDA and MARC
RDA establishes a clear line of separation between the recording of data and the presentation of data
Content vs. display
ISBD punctuation will be one option in an appendix
AACR2 and MARC are separate standards
RDA will remain a separate standard
RDA assists with creating the content of the bibliographic record
MARC21 is one possible schema for encoding records created using RDA
RDA will be able to be used with any metadata standard such as MODS or Dublin Core
RDA: Impact on Libraries and Systems
The Joint Steering Committee is striving to minimize need for retrospective adjustments to pre-RDA records
RDA instructions are designed to be independent of the format, medium, or system used to store or communicate the data
Integrated library systems (ILSs) may implement RDA in different ways
Intended to be adaptable to newly-emerging database structures
RDA and FRBR would be better optimized in a relational database structure
RDA Timeline
Joint Steering Committee
http://www.collectionscanada.gc.ca/jsc/
Began as AACR3 in 2004
Renamed RDA in 2005
Reorganized in 2007 to follow FRBR
Scheduled for publication in 2009
Scheduled for implementation by the national libraries in 2010, pending evaluation
RDA Timeline – LC Plans
Oct. 2007 – CoP (Committee of Principals) issues statement on joint implementation of RDA
Jan. 2008 – report of the LC Working Group on the Future of Bibliographic Control released
http://www.loc.gov/bibliographic-future/
Recommends that LC “suspend work on RDA” until business case (return on investment) analyzed
May 2008 – LC, NLM, and NAL issue joint statement announcing their intention to evaluate RDA jointly to assist with implementation decision
June 2008 – LC’s official response to the LCWG report
RDA Timeline – Implementation
November 2008?? – first full draft of content to be released in online product for comment
Mid-January 2009 – comment period closes
Early March 2009 – JSC and CoP meet in Chicago. JSC finalizes review of comments received
Third quarter calendar 2009 – RDA is released
Last quarter calendar 2009–early 2010 – CoP national libraries evaluate RDA prior to implementation
RDA: Evaluation
LC, NLM, NAL plus 10-20 others
Selected libraries (PCC libraries, including small libraries in NACO funnel projects)
Library school
Archives
Non-MARC users
OCLC, Ex Libris, and other vendors
Criteria for evaluation under development
Usability, Technical, Financial criteria
RDA: Evaluation
Usability of RDA data and the online product
Ease of use for catalogers and users
Capability of library systems to accommodate RDA records
Applicability to a broad range of material
Technical considerations
Co-existence of RDA and AACR records
Functionality of RDA online product
System developments required prior to implementation
Financial considerations
Training, documentation
Time/cost of creating RDA records compared with AACR
Record conversion costs
Other Content Standards
International Standard Bibliographic Description (ISBD)
A family of standards to regularize the form and content of bibliographic descriptions
Available for different material types: monographs, computer files, etc.
Designed to promote record sharing and exchange
Other Content Standards
Book Industry Standards And Communications (BISAC)
Metadata Committee developed a Best Practices document
Intended as a response to the question, “I’ve downloaded the ONIX documentation. Now what?”
Its overriding purpose is to detail what data should be supplied and how it should be supplied
Other Content Standards
Describing Archives: A Content Standard (DACS)
Designed to facilitate consistent, appropriate, and self-explanatory description of archival materials and creators of archival materials
Replaces Archives, Personal Papers, and Manuscripts (APPM)
Other Content Standards
Cataloging Cultural Objects (CCO)
A guide to describing cultural works and their images
Provides guidelines for selecting, ordering, and formatting data used to populate catalog records
Designed to promote good descriptive cataloging, shared documentation, and enhanced end-user access
A project of the Visual Resources Association
Documentation: Guidelines and Best Practices
CDP Dublin Core Metadata Best Practices
Developed by the Collaborative Digitization Program
Guidelines for creating metadata records for digitized cultural heritage resources
Element set based on Dublin Core
Input Guidelines for Contributor Element
Access
METS
Z39.50
OAI
Next generation catalogs
METS
Metadata Encoding & Transmission Standard
A system for packaging metadata necessary for both the management of digital library objects within a repository and the exchange of such objects between repositories, or between repositories and their users
Used for: Digital collection repositories
Developed by the Digital Library Federation (DLF) and Library of Congress (LC)
METS
METS can be understood as a binder that unites metadata about a particular resource
A METS record includes seven parts:
Header
Descriptive metadata
Administrative metadata
File groups
Structural map
Structural links
Behavior section
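To show the "binder" shape, the sketch below builds an empty METS wrapper with one child per top-level section. The namespace and the element names (`metsHdr`, `dmdSec`, `amdSec`, `fileSec`, `structMap`, `structLink`, `behaviorSec`) are those defined by the METS schema; the result is a bare skeleton and would not validate as-is, since required attributes and content are omitted. The function name `mets_skeleton` is illustrative.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

# Top-level METS sections, in schema order.
SECTIONS = [
    "metsHdr",      # header
    "dmdSec",       # descriptive metadata
    "amdSec",       # administrative metadata
    "fileSec",      # file groups
    "structMap",    # structural map
    "structLink",   # structural links
    "behaviorSec",  # behaviors
]

def mets_skeleton():
    """Return an empty <mets:mets> element with one child per section."""
    root = ET.Element("{%s}mets" % METS_NS)
    for name in SECTIONS:
        ET.SubElement(root, "{%s}%s" % (METS_NS, name))
    return root

skeleton = mets_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

In a real record the descriptive section would typically wrap or point to a MODS, Dublin Core, or MARCXML description, which is how METS ties the earlier standards together.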
METS Schema
Z39.50
Z39.50 is a search and retrieval protocol, maintained by LC, capable of operating over TCP/IP
Negotiates queries with multiple, separate databases
LC is also pursuing work on SRU, a standard search protocol for Internet search queries and CQL, a standard query syntax for representing queries, both based on Z39.50 semantics
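An SRU search is an ordinary web request whose `query` parameter carries a CQL expression. The sketch below only builds such a URL and sends nothing over the network; the parameter names follow SRU 1.1, while the server address and the function name `sru_search_url` are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)

# Hypothetical server address, for illustration only.
url = sru_search_url("http://sru.example.org/catalog", 'dc.title = "weaving the web"')
print(url)

# Decoding the URL again shows the CQL query survived the encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```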
Open Archives Initiative (OAI)
A tool that supports interoperability among multiple databases
OAI goal: coarse-granularity resource discovery
Supports cross-database searching
Aggregates metadata from multiple community-specific repositories
Data providers expose (make available) the metadata for their collections
Service providers harvest the exposed metadata from data providers and aggregate it
OAI
OAI Protocol for Metadata Harvesting
Metadata content must be encoded in XML and have a corresponding XML schema for validation
At a minimum, metadata must be supplied in unqualified Dublin Core format
Other metadata formats are optional, but recommended
Metadata may optionally include a link to the actual content / resource
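The harvesting step can be sketched in two parts: forming a `ListRecords` request and reading Dublin Core titles out of the XML that comes back. The verb, the `oai_dc` metadata prefix, and the namespaces are those of OAI-PMH 2.0; the repository address, the trimmed sample response, and the function names are assumptions, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query

# Made-up response, trimmed to the parts used below.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>The Sound of Music</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def harvested_titles(response_xml):
    """Return every dc:title found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [title.text for title in root.iterfind(".//dc:title", NS)]

request_url = list_records_url("http://repo.example.org/oai")
titles = harvested_titles(SAMPLE_RESPONSE)
print(request_url)
print(titles)
```

A service provider would repeat this against many data providers and index the combined results.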
[Figure: OAI infrastructure. Several repositories expose Dublin Core (DC) metadata to a harvester run by a service provider.]
[Figure: OAI infrastructure from the user's side. Labels: user; search; harvested repository; original repository.]
Book Details from U. of Chicago
Boston College
VuFind – Integrating Data Sources
OCLC’s WorldCat.org
OCLC’s WorldCat Local
OCLC’s Fiction Finder
Other OCLC Products
WorldCat Cataloging Partners
Connexion
PromptCat
Bibliographic Notification
Acquisitions List
Social Data
Libraries have some very useful data
When made available in standardized formats the data can be used in new ways
Wall of Books
iGoogle widget
Embrace data from users
Seek outside sources of information to bring in that might enhance the user experience
Wall of Books created by AADL Patron
iGoogle with Top 10 Books at the Library
xISBN
A Web Service that takes as input one ISBN and returns a list of other ISBNs of associated intellectual works – other expressions and manifestations
Search results on one specific ISBN can be misleading
Results intended for use by computer systems to generate new, more complete searches such as in an OPAC
xISBN Web Service Result
Library Lookup
Book Burro
From Amazon.com to the BPL
Metadata’s Ideal Profile
Metadata Characteristics
Standards-based
Consistent
Descriptive
Sharable
Contextual
Modular
Adjustable
Portable
Questions?
Amy Benson
Librarian/Archivist for Digital Initiatives
Schlesinger Library, Radcliffe Institute for Advanced Study, Harvard University
[email_address]