The document discusses different types of metadata schemas used for digital collections, including Dublin Core (DC), Qualified Dublin Core (QDC), MARC, MARCXML, MODS, VRA Core, CDWA Lite, GEM, LOM, TEI, and EAD. It provides information on the purpose, content standards, limitations, and best uses of each schema. The document is intended as a workshop on metadata for digital collections.
2. 3/6/07 INCOLSA Workshop 2
Many definitions of metadata
“Data about data”
“Structured information about an information
resource of any media type or format.”
(Caplan)
“Any data used to aid the identification,
description and location of networked
electronic resources.” (IFLA)
…
3. 3/6/07 INCOLSA Workshop 3
Refining a definition
Other characteristics
Structure
Control
Origin
Machine-generated
Human-generated
In practice, the term often covers data and
meta-metadata
4. 3/6/07 INCOLSA Workshop 4
Some uses of metadata
By information specialists
Describing non-traditional materials
Cataloging Web sites
Navigating digital objects
Managing digital objects over the long term
Managing corporate assets
By novices
Preparing Web sites for search engines
Describing Eprints
Managing personal CD collections
5. 3/6/07 INCOLSA Workshop 5
Metadata and cataloging
Depends on what you mean by:
metadata, and
cataloging!
But, in general:
Metadata is broader in scope than cataloging
Much metadata creation takes place outside of libraries
Good metadata practitioners use fundamental
cataloging principles in non-MARC environments
Metadata created for many different types of materials
Metadata is NOT only for Internet resources!
6. 3/6/07 INCOLSA Workshop 6
Metadata in digital library projects
Searching
Browsing
Display for users
Interoperability
Management of digital objects
Preservation
Navigation
7. 3/6/07 INCOLSA Workshop 7
Some types of metadata
Type                     Use
Descriptive metadata     Searching; Browsing; Display; Interoperability
Technical metadata       Interoperability; Digital object management; Preservation
Preservation metadata    Interoperability; Preservation
Rights metadata          Interoperability; Digital object management
Structural metadata      Navigation
9. 3/6/07 INCOLSA Workshop 9
Creating descriptive metadata
Digital library content management systems
CONTENTdm
Ex Libris DigiTool
Greenstone
Library catalogs
Spreadsheets & databases
XML
10. 3/6/07 INCOLSA Workshop 10
Creating other types of metadata
Technical
Stored in content management system
Stored in separate Excel spreadsheet
Structural
Created and stored in content management system
METS XML
GIS
Using specialized software
Content markup
In XML
11. 3/6/07 INCOLSA Workshop 11
Descriptive metadata
Purpose
Description
Discovery
Some common general schemas
Dublin Core (unqualified and qualified)
MARC
MARCXML
MODS
LOTS of domain-specific schemas
12. 3/6/07 INCOLSA Workshop 12
Simple Dublin Core (DC)
15-element set
National and international standard
2001: Released as ANSI/NISO Z39.85
2003: Released as ISO 15836
Maintained by the Dublin Core Metadata
Initiative (DCMI)
Other players
DC Usage Board
DCMI Communities
DCMI Task Groups
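As a rough illustration of the 15-element set, the sketch below builds a simple Dublin Core record in Python. The sample values and the wrapper element name `record` are made up for this example; only the element names and the namespace URI come from the standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 elements of simple (unqualified) Dublin Core.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def build_dc_record(values):
    """Build an XML record from a dict mapping DC element name -> list of values.

    No element is required and every element is repeatable, so any element
    may be absent or may appear several times.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for name, items in values.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a simple DC element: {name}")
        for item in items:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = item
    return record


# Made-up sample values, for illustration only.
sample = build_dc_record({
    "title": ["Sample sheet music title"],
    "creator": ["Doe, Jane", "Roe, Richard"],
    "date": ["1905"],
})
print(ET.tostring(sample, encoding="unicode"))
```

Real records are usually wrapped in a container defined by the exchange format in use, so the `record` element here is only a stand-in.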
13. 3/6/07 INCOLSA Workshop 13
DCMI mission
The mission of DCMI is to make it easier to
find resources using the Internet through the
following activities:
Developing metadata standards for discovery
across domains,
Defining frameworks for the interoperation of
metadata sets, and,
Facilitating the development of community- or
disciplinary-specific metadata sets that are
consistent with items 1 and 2
14. 3/6/07 INCOLSA Workshop 14
DC Principles
Original principles
“Core” across all knowledge domains
No element required
All elements repeatable
1:1 principle
DC Abstract Model
“A reference against which particular DC encoding
guidelines can be compared” model
Two schools of thought on its development
Clarifies model underlying the metadata standard
Overly complicates a standard intended to be simple
15. 3/6/07 INCOLSA Workshop 15
Content/value standards for DC
None required
Some elements recommend a content
or value standard as a best practice
Relation
Source
Subject
Type
Coverage
Date
Format
Language
Identifier
16. 3/6/07 INCOLSA Workshop 16
Some limitations of DC
Can’t indicate a main title vs. other
subordinate titles
No method for specifying creator roles
W3CDTF format can’t indicate date ranges or
uncertainty
Can’t by itself provide robust record
relationships
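The date limitation can be seen with a rough shape check for W3CDTF values. This is a sketch only: the function name `is_w3cdtf` is made up for the example, and the pattern checks the form of the string, not whether the month or day is a real calendar value.

```python
import re

# W3CDTF allows a year, year-month, full date, or full date with time and zone.
W3CDTF = re.compile(
    r"\d{4}"
    r"(-\d{2}"
    r"(-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?"
    r")?)?"
)


def is_w3cdtf(value):
    """Return True if the whole string has the shape of a W3CDTF date."""
    return W3CDTF.fullmatch(value) is not None


for candidate in ["1905", "1905-06-17", "1900/1910", "ca. 1905", "1905?"]:
    print(candidate, is_w3cdtf(candidate))
```

Single dates pass, while a range such as `1900/1910` or an uncertain date such as `ca. 1905` does not fit the format and has to be recorded some other way.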
17. 3/6/07 INCOLSA Workshop 17
Good times to use DC
Cross-collection searching
Cross-domain discovery
Metadata sharing
Describing some types of simple resources
Metadata creation by novices
19. 3/6/07 INCOLSA Workshop 19
Qualified Dublin Core (QDC)
Adds some increased specificity to
Unqualified Dublin Core
Same governance structure as DC
Same encodings as DC
Same content/value standards as DC
Listed in DCMI Terms
Additional principles
Extensibility
Dumb-down principle
20. 3/6/07 INCOLSA Workshop 20
Types of DC qualifiers
Additional elements
Element refinements
Encoding schemes
Vocabulary encoding schemes
Syntax encoding schemes
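The dumb-down principle says that a qualified value should still make sense when the qualifier is dropped. A minimal sketch in Python, assuming records are held as (term, value) pairs; the function name `dumb_down` and the sample values are made up, and the mapping covers only some of the element refinements.

```python
# Some element refinements and the simple DC element each one refines.
REFINEMENT_PARENT = {
    "alternative": "title",
    "created": "date",
    "issued": "date",
    "modified": "date",
    "abstract": "description",
    "tableOfContents": "description",
    "extent": "format",
    "medium": "format",
    "spatial": "coverage",
    "temporal": "coverage",
    "isPartOf": "relation",
}


def dumb_down(qualified):
    """Reduce (term, value) pairs to simple DC by replacing each refinement
    with its parent element. Terms not in the mapping pass through unchanged,
    which is a simplification made for this sketch."""
    return [(REFINEMENT_PARENT.get(term, term), value) for term, value in qualified]


qualified_record = [
    ("title", "Sample title"),
    ("alternative", "Another title for the same item"),
    ("created", "1905"),
    ("spatial", "Indiana"),
]
print(dumb_down(qualified_record))
```

After dumbing down, the main and alternative titles are both plain `title` values, which is the loss of specificity that sharing in simple DC implies.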
21. 3/6/07 INCOLSA Workshop 21
DC qualifier status
Recommended
Conforming
Obsolete
Registered
22. 3/6/07 INCOLSA Workshop 22
Limitations of QDC
Widely misunderstood
No method for specifying creator roles
W3CDTF format can’t indicate date ranges or
uncertainty
Split across 3 XML schemas
No encoding in XML (yet) officially endorsed
by DCMI
23. 3/6/07 INCOLSA Workshop 23
Best times to use QDC
More specificity needed than simple DC, but
not a fundamentally different approach to
description
Want to share DC with others, but need a few
extensions for your local environment
Describing some types of simple resources
Metadata creation by novices
25. 3/6/07 INCOLSA Workshop 25
MAchine Readable Cataloging
(MARC)
Format for the records in library catalogs
Used for library metadata since 1960s
Adopted as national standard in 1971
Adopted as international standard in 1973
Maintained by:
Network Development and MARC Standards
Office at the Library of Congress
Standards and the Support Office at the
National Library of Canada
26. 3/6/07 INCOLSA Workshop 26
More about MARC
Actually a family of MARC standards
throughout the world
U.S. & Canada use MARC21
Structured as a binary interchange format
ANSI/NISO Z39.2
ISO 2709
Field names
Numeric fields
Alphabetic subfields
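The numeric field and alphabetic subfield structure can be modelled loosely in Python. This is an illustrative data structure, not the binary interchange format itself; the class name `DataField` and the sample title are made up.

```python
from dataclasses import dataclass, field


@dataclass
class DataField:
    """One MARC variable data field: a numeric tag, two indicators,
    and a list of (subfield code, value) pairs."""
    tag: str
    ind1: str = " "
    ind2: str = " "
    subfields: list = field(default_factory=list)

    def display(self):
        """Format the field in the common '$a value' display style."""
        parts = " ".join(f"${code} {value}" for code, value in self.subfields)
        return f"{self.tag} {self.ind1}{self.ind2} {parts}"


# Field 245 is the title statement; the values are invented.
title = DataField("245", "1", "0", [
    ("a", "Sample title :"),
    ("b", "a subtitle /"),
    ("c", "Jane Doe."),
])
print(title.display())
```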
27. 3/6/07 INCOLSA Workshop 27
Content/value standards for MARC
None required by the format itself
But US record creation practice relies
heavily on:
AACR2r
ISBD
LCNAF
LCSH
28. 3/6/07 INCOLSA Workshop 28
Limitations of MARC
Use of all its potential is time-consuming
OPACs don’t make full use of all possible
data
OPACs virtually the only systems to use
MARC data
Requires highly-trained staff to create
Local practice differs greatly
29. 3/6/07 INCOLSA Workshop 29
Good times to use MARC
Integration with other records in OPAC
Resources are like those traditionally found in
library catalogs
Maximum compatibility with other libraries is
needed
Have expert catalogers for metadata creation
31. 3/6/07 INCOLSA Workshop 31
MARC in XML (MARCXML)
Copies the exact structure of MARC21 in an
XML syntax
Numeric fields
Alphabetic subfields
Implicit assumption that content/value
standards are the same as in MARC
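To show how MARCXML carries the same tags, indicators, and subfield codes in XML, here is a small Python sketch. The helper name `marcxml_datafield` and the sample values are assumptions for illustration; the namespace and the `datafield` and `subfield` element names are those of the MARCXML schema.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"


def marcxml_datafield(tag, ind1, ind2, subfields):
    """Build one MARCXML datafield element from a tag, two indicators,
    and a list of (subfield code, value) pairs."""
    ET.register_namespace("marc", MARC_NS)
    datafield = ET.Element(
        f"{{{MARC_NS}}}datafield",
        {"tag": tag, "ind1": ind1, "ind2": ind2},
    )
    for code, value in subfields:
        subfield = ET.SubElement(datafield, f"{{{MARC_NS}}}subfield", {"code": code})
        subfield.text = value
    return datafield


# Invented title statement, for illustration only.
element = marcxml_datafield("245", "1", "0", [
    ("a", "Sample title :"),
    ("b", "a subtitle /"),
    ("c", "Jane Doe."),
])
print(ET.tostring(element, encoding="unicode"))
```

Even this single field shows the verbosity of the syntax: every subfield needs its own element and attribute.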
32. 3/6/07 INCOLSA Workshop 32
Limitations of MARCXML
Not appropriate for direct data entry
Extremely verbose syntax
Full content validation requires tools external
to XML Schema conformance
33. 3/6/07 INCOLSA Workshop 33
Best times to use MARCXML
As a transition format between a MARC
record and another XML-encoded metadata
format
Materials lend themselves to library-type
description
Need more robustness than DC offers
Want XML representation to store within
larger digital object but need lossless
conversion to MARC
35. 3/6/07 INCOLSA Workshop 35
Metadata Object Description
Schema (MODS)
Developed and managed by the Library of
Congress Network Development and MARC
Standards Office
For encoding bibliographic information
Influenced by MARC, but not equivalent
Usable for any format of materials
First released for trial use June 2002
MODS 3.2 released late 2006
36. 3/6/07 INCOLSA Workshop 36
MODS differences from MARC
MODS is “MARC-like” but intended to be
simpler
Textual tag names
Encoded in XML
Some specific changes
Some regrouping of elements
Removes some elements
Adds some elements
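For comparison, the sketch below builds a very small MODS record with textual tag names, including a creator role. The function name `build_mods` and the sample values are made up; the element names and namespace are from MODS version 3.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"


def q(name):
    """Return a namespace-qualified MODS element name."""
    return f"{{{MODS_NS}}}{name}"


def build_mods(title, creator, role):
    """Build a minimal MODS record with one title and one personal name with a role."""
    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(q("mods"))
    title_info = ET.SubElement(mods, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = title
    name = ET.SubElement(mods, q("name"), {"type": "personal"})
    ET.SubElement(name, q("namePart")).text = creator
    role_element = ET.SubElement(name, q("role"))
    ET.SubElement(role_element, q("roleTerm"), {"type": "text"}).text = role
    return mods


# Invented values, for illustration only.
record = build_mods("Sample title", "Doe, Jane", "composer")
print(ET.tostring(record, encoding="unicode"))
```

A tag such as `titleInfo` is readable without a lookup table, unlike a numeric MARC tag.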
37. 3/6/07 INCOLSA Workshop 37
Content/value standards for MODS
Many elements indicate a given content/value
standard should be used
Generally follows MARC/AACR2/ISBD
conventions
But not all enforced by the MODS XML
schema
Authority attribute available on many
elements
38. 3/6/07 INCOLSA Workshop 38
Limitations of MODS
No lossless round-trip conversion from and to
MARC
Still largely implemented by library community
only
Some semantics of MARC lost
39. 3/6/07 INCOLSA Workshop 39
Good times to use MODS
Materials lend themselves to library-type
description
Want to reach both library and non-library
audiences
Need more robustness than DC offers
Want XML representation to store within
larger digital object
41. 3/6/07 INCOLSA Workshop 41
Visual Resources Association
(VRA) Core
From Visual Resources Association
Separates Work from Image
Library focus
Inspiration from Dublin Core
Version 3.0 released in 2002
Version 4.0 currently in Beta
42. 3/6/07 INCOLSA Workshop 42
Categories for the Description of
Works of Art (CDWA) Lite
Reduced version of the Categories for the
Description of Works of Art (512 categories)
From J. Paul Getty Trust
Museum focus
Conceived for record sharing
43. 3/6/07 INCOLSA Workshop 43
Structure standards for learning
materials
Gateway to Educational Materials (GEM)
From the U.S. Department of Education
Based on Qualified Dublin Core
Adds elements for instructional level, instructional method,
etc.
“GEM's goal is to improve the organization and accessibility
of the substantial collections of materials that are already
available on various federal, state, university, non-profit, and
commercial Internet sites.”*
IEEE Learning Object Metadata (LOM)
Elements for technical and descriptive metadata about
learning resources
* From <http://www.thegateway.org/about/documentation/schemas>
44. 3/6/07 INCOLSA Workshop 44
Text Encoding Initiative (TEI)
TEI in Libraries
For encoding full texts of documents
Literary texts
Letters
…etc.
Requires specialized search engine
Delivery requires specialized software or
offline conversion to HTML
45. 3/6/07 INCOLSA Workshop 45
Encoded Archival Description
(EAD)
Maintained by the Society of American
Archivists EAD Working Group
Markup language for archival finding aids
Designed to accommodate multi-level
description
Requires specialized search engine
Delivery requires specialized software or
offline conversion to HTML
EAD 1.0 released in 1998
EAD2002 finalized in December 2002
46. 3/6/07 INCOLSA Workshop 46
Levels of control
Data structure standards (e.g., MARC)
Data content standards (e.g., AACR2r)
Encoding schemes
Vocabulary
Syntax
High-level models (e.g., FRBR)
Very few metadata standards include a
counterpart to the AACR “chief source of
information”
47. 3/6/07 INCOLSA Workshop 47
Some data content standards
Anglo-American Cataloging Rules, 2nd
edition (AACR2)
Scheduled to be replaced by RDA in 2009
Describing Archives: A Content Standard
(DACS)
Replaces APPM
Cataloging Cultural Objects (CCO)
First content standard explicitly designed for
these materials
51. 3/6/07 INCOLSA Workshop 51
Functional Requirements for
Bibliographic Records (FRBR) model
WORK is realized through EXPRESSION
EXPRESSION is embodied in MANIFESTATION
MANIFESTATION is exemplified by ITEM
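The four FRBR entities and the relationships between them can be sketched as nested Python data classes. The attribute names (`title`, `form`, `publication`, `call_number`) and the sample values are assumptions made for this example; the FRBR model itself does not prescribe them.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    call_number: str  # a single physical or digital copy


@dataclass
class Manifestation:
    publication: str  # a particular published embodiment
    items: list = field(default_factory=list)  # is exemplified by


@dataclass
class Expression:
    form: str  # e.g. a particular text, translation, or arrangement
    manifestations: list = field(default_factory=list)  # is embodied in


@dataclass
class Work:
    title: str
    expressions: list = field(default_factory=list)  # is realized through


def count_items(work):
    """Count every item reachable from a work through the three relationships."""
    return sum(
        len(manifestation.items)
        for expression in work.expressions
        for manifestation in expression.manifestations
    )


work = Work("Sample work", [
    Expression("original text", [
        Manifestation("1905 edition", [Item("COPY-1"), Item("COPY-2")]),
        Manifestation("1950 reprint", [Item("COPY-3")]),
    ]),
])
print(count_items(work))
```

For unique materials, a work often has one expression, one manifestation, and one item, so the chain can usually be collapsed in practice.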
52. 3/6/07 INCOLSA Workshop 52
Using FRBR principles in
metadata creation
Don’t need to take the model literally
For unique materials, much simplification is
possible
Make sure you know how your practices
conform to the high-level model
Be consistent in these practices
53. 3/6/07 INCOLSA Workshop 53
How do I pick standards? (1)
Institution
Nature of holding institution
Resources available for metadata creation
What others in the community are doing
Capabilities of your delivery software
The standard
Purpose
Structure
Context
History
54. 3/6/07 INCOLSA Workshop 54
How do I pick standards? (2)
Materials
Genre
Format
Likely audiences
What metadata already exists for these materials
Project goals
Robustness needed for the given materials and users
Describing multiple versions
Mechanisms for providing relationships between records
Plan for interoperability, including repeatability of elements
More information on handout
55. 3/6/07 INCOLSA Workshop 55
Assessing materials for ease of
metadata creation
Number of items?
Homogeneity of items?
Foreign language?
Published or unpublished?
Specialist needed?
How much information is known?
Any existing metadata?
56. 3/6/07 INCOLSA Workshop 56
Assessing currently existing metadata
Machine-readable?
Divided into fields?
What format?
What content standards?
Complete?
57. 3/6/07 INCOLSA Workshop 57
Assessing software capabilities
Are there templates for standard metadata
formats?
Can you add/remove fields to a template?
Can you create new templates?
Can you add additional clarifying information
without creating a separate field?
Personal vs. corporate names
Subject vocabulary used
Is there an XML export? Does it produce valid
records?
58. 3/6/07 INCOLSA Workshop 58
Case studies in choosing standards
Describe your institution
Describe one collection you’d like to digitize
Describe your technical infrastructure
60. 3/6/07 INCOLSA Workshop 60
Technical metadata
For recording technical aspects of digital objects
For long-term maintenance of data
Migration
Emulation
Much can be generated automatically, but not all
Some examples:
NISO Z39.87: Data Dictionary – Technical Metadata for Digi
& MIX
Schema for Technical Metadata for Text
Forthcoming standard for audio from the Audio
Engineering Society
LC VMD draft schema for technical metadata for video
files
61. 3/6/07 INCOLSA Workshop 61
Image technical metadata
Might include:
Color space
Bit depth
Byte order
Compression scheme
Camera settings
Operator name
62. 3/6/07 INCOLSA Workshop 62
Text technical metadata
Might include:
Character set
Byte order
Font/script
Language
63. 3/6/07 INCOLSA Workshop 63
Audio technical metadata
Might include:
Byte order
Checksum
Sample rate
Duration
Number of channels
64. 3/6/07 INCOLSA Workshop 64
Video technical metadata
Might include:
Bits per sample
Calibration information
Sample format
Signal format
65. 3/6/07 INCOLSA Workshop 65
Preservation metadata
The set of everything you need to know to
preserve digital objects over the long term
Information that supports and documents the
digital preservation process
Includes technical metadata but also other
elements
Covers elements such as checksums,
creation environment, and change history
PREMIS is the prevailing model
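Checksums are one piece of preservation metadata that is easy to generate automatically. A minimal sketch in Python, assuming SHA-256 as the algorithm (the choice is arbitrary for this example) and using made-up function names:

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Compute a checksum for a file, reading it in chunks so that
    large files do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, recorded, algorithm="sha256"):
    """Compare the current checksum of a file with a previously recorded one."""
    return file_checksum(path, algorithm) == recorded.lower()
```

Recording the checksum when a file is created and re-running `fixity_ok` later is one way to detect that a file has changed or been damaged.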
66. 3/6/07 INCOLSA Workshop 66
Rights metadata
Machine- or human-readable indications of
rights information for a resource
Can be used to determine if a user can
access a resource
Can indicate rights holder of a resource for
payment purposes
Some current schemas
METS rights
XrML
ODRL
67. 3/6/07 INCOLSA Workshop 67
Structural metadata
For creating a logical structure between
digital objects
Multiple copies/versions of same item
Multiple pages within item
Multiple sizes of each page
Meaningful groups of content
Often handled transparently by a delivery
system
METS is the current primary standard
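As a sketch of how pages and multiple sizes of each page might be tied together, the Python below builds a small METS structural map. The function name, the label, and the file identifiers are made up; in a full METS document each `FILEID` would point to a file entry elsewhere in the same document, which is not built here.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"


def build_struct_map(label, page_file_ids):
    """Build a physical structMap: one div for the item and one nested div
    per page, with a file pointer for each version of that page."""
    ET.register_namespace("mets", METS_NS)
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", {"TYPE": "physical"})
    item = ET.SubElement(
        struct_map, f"{{{METS_NS}}}div", {"TYPE": "item", "LABEL": label}
    )
    for order, file_ids in enumerate(page_file_ids, start=1):
        page = ET.SubElement(
            item, f"{{{METS_NS}}}div", {"TYPE": "page", "ORDER": str(order)}
        )
        for file_id in file_ids:
            ET.SubElement(page, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return struct_map


# Two pages, each available as a thumbnail and a full-size image (hypothetical IDs).
struct_map = build_struct_map("Sample item", [
    ["page1-thumb", "page1-full"],
    ["page2-thumb", "page2-full"],
])
print(ET.tostring(struct_map, encoding="unicode"))
```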
68. 3/6/07 INCOLSA Workshop 68
Why you should care about these
standards
You will migrate from your current system to
another, probably in the next few years
File formats become obsolete
We have too many interesting collections to
have to re-do work we’ve already done
Standards promote interoperability
69. 3/6/07 INCOLSA Workshop 69
Building “Good digital
collections”*
Interoperable – with the important goal
of cross-collection searching
Persistent – reliably accessible
Re-usable – repositories of digital
objects that can be used for multiple
purposes
*Institute of Museum and Library Services. A Framework of Guidance for Building Good Digital
Collections. Washington, D.C.: Institute of Museum and Library Services, November 2001.
http://www.niso.org/framework/Framework2.html
70. 3/6/07 INCOLSA Workshop 70
Building “Good digital
collections”
Interoperable – with the important goal of
cross-collection searching
Persistent – reliably accessible
Re-usable – repositories of digital objects that
can be used for multiple purposes
Good metadata promotes good digital
collections.
71. 3/6/07 INCOLSA Workshop 71
Sharing your metadata
Harvesting
Collects metadata, processes it, and stores it locally to
respond to user queries
Open Archives Initiative Protocol for Metadata
Harvesting
Federated searching
Transmits user queries to multiple destinations in real
time
ILS vendors currently offering these products
Protocols used
Z39.50
SRU
72. 3/6/07 INCOLSA Workshop 72
OAI Protocol Structure
Intentionally designed to be simple
Data providers
Have metadata they want to share
“Expose” their metadata to be harvested
Service providers
Harvest metadata from data providers
Provide searching of harvested metadata
from multiple sources
Can also provide other value-added services
73. 3/6/07 INCOLSA Workshop 73
Data Providers
Set up a server that responds to harvesting
requests
Required to expose metadata in simple
Dublin Core (DC) format
Can also expose metadata in any other
format expressible with an XML schema
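A harvesting request is an ordinary HTTP GET carrying a `verb` parameter. The sketch below shows how a harvester might build such a request. The base URL is a hypothetical placeholder, not a real repository; the `ListRecords` and `Identify` verbs and the `oai_dc` metadata prefix come from the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical data provider address; substitute a real base URL.
BASE_URL = "http://repository.example.edu/oai"


def build_request(verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return BASE_URL + "?" + urlencode(params)


# Ask for all records in simple Dublin Core, the one format every
# data provider is required to support.
print(build_request("ListRecords", metadataPrefix="oai_dc"))

# Ask the data provider to describe itself.
print(build_request("Identify"))
```

Nothing is sent over the network here; the function only assembles the URL a harvester would fetch.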
74. 3/6/07 INCOLSA Workshop 74
Service Providers
Harvest and store metadata
Generally provide search/browse access to
this metadata
Can be general or domain-specific
Can choose to collect metadata in formats
other than DC
Generally link out to holding institutions for
access to digital content
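As a rough illustration of the harvest-and-store step, the sketch below turns one simple Dublin Core record into a dictionary that a search index could use, and picks out the identifier used to link back to the holding institution. The sample record and the function name `to_index_entry` are invented for this example; only the two namespace URIs are real.

```python
import xml.etree.ElementTree as ET

# Invented sample, shaped like the metadata section of a harvested record.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Field notes, Pine River</dc:title>
  <dc:subject>Fishes</dc:subject>
  <dc:subject>Michigan</dc:subject>
  <dc:identifier>http://repository.example.edu/item/42</dc:identifier>
</oai_dc:dc>"""


def to_index_entry(xml_text):
    """Collect every DC element into a dict of field name -> list of values."""
    root = ET.fromstring(xml_text)
    entry = {}
    for element in root:
        # Tags look like "{namespace}title"; keep only the local name.
        name = element.tag.split("}", 1)[1]
        entry.setdefault(name, []).append((element.text or "").strip())
    return entry


entry = to_index_entry(SAMPLE)
# The service provider searches on the stored fields, then sends the
# user to the holding institution through the identifier.
link = entry["identifier"][0]
print(entry["subject"], link)
```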
75. 3/6/07 INCOLSA Workshop 75
Advantages for Libraries
Any existing rules for description can be
used
Can share metadata without sacrificing local
granularity
Location of unique materials by many users
Domain-specific service providers
Middle ground between Google and OCLC
One of a suite of tools to provide users with
access to all of your materials
76. 3/6/07 INCOLSA Workshop 76
Why share metadata?
Benefits to users
One-stop searching
Aggregation of subject-specific resources
Benefits to institutions
Increased exposure for collections
Broader user base
Bringing together of distributed collections
Don’t expect users will know about your
collection and remember to visit it.
77. 3/6/07 INCOLSA Workshop 77
Why share metadata with OAI?
“Low barrier” protocol
Shares metadata only, not content,
simplifying rights issues
Same effort on your part to share with one or
a hundred service providers (basically)
Wide adoption in the cultural heritage sector
Quickly eclipsed older methods such as
Z39.50
78. 3/6/07 INCOLSA Workshop 78
Three possible architectures
[Diagram: three ways to expose metadata to an OAI harvester]
Digital asset management system containing a metadata creation module and an OAI data provider module, with a transformation step
Metadata creation system feeding a stand-alone OAI data provider, with a transformation step
Metadata creation module feeding a Static Repository Gateway, with a transformation step
Formats shown: DC, QDC, MODS, MARCXML
79. 3/6/07 INCOLSA Workshop 79
Basic metadata sharing workflow
Create metadata, thinking about shareability
Determine format(s) you wish to share your metadata
in
Transform records into versions appropriate for
sharing via OAI
Validate transformed metadata
Load transformed metadata into OAI data provider
Test with OAI Repository Explorer
Communicate with service providers
See what your metadata looks like once a service
provider harvests it
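The validation step can start with very basic checks. The sketch below is only a pre-flight check, not schema validation: it confirms a record is well-formed XML and that a few elements carry values. The `REQUIRED` list is a local policy assumed for this example, not a rule from OAI or Dublin Core, and `check_record` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Assumption: a local policy chosen for this example.
REQUIRED = ["title", "identifier"]


def check_record(xml_text):
    """Return a list of problems; an empty list means the record passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return ["not well-formed: " + str(error)]
    problems = []
    for name in REQUIRED:
        values = [e.text for e in root.iter("{%s}%s" % (DC_NS, name))]
        # An element that is present but blank counts as missing.
        if not any(v and v.strip() for v in values):
            problems.append("missing or empty dc:" + name)
    return problems


record = (
    '<record xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Field notes, Pine River</dc:title>"
    "</record>"
)
print(check_record(record))
```

Full validation against the XML schema for the chosen format would need a schema-aware tool; this check only catches the most obvious failures before loading.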
80. 3/6/07 INCOLSA Workshop 80
Preparing your metadata for sharing
Map to common formats; also called
“crosswalking”
To create “views” of metadata for specific
purposes
Mapping from robust format to more general
format is common
Mapping from general format to more robust
format is ineffective
81. 3/6/07 INCOLSA Workshop 81
Crosswalks (1)
For transforming between metadata formats
Usually refers to transforming between
content standards rather than structure
standards, but not always
Mapping from more robust format to less
robust format effective; mapping from simpler
format to more robust format less so
Good practice to create and store most
robust metadata format possible, then create
other views for specific needs
82. 3/6/07 INCOLSA Workshop 82
Crosswalks (2)
Can be in many formats
Logical sets of rules [example]
Actual code [example]
Often need to tweak a generic crosswalk for a
specific implementation
Accommodating local practice
Adding institution-specific information
Adding context not available locally
83. 3/6/07 INCOLSA Workshop 83
Types of mapping logic
Mapping the complete contents of one field to
another
Splitting multiple values in a single local field
into multiple fields in the target schema
Translating anomalous local practices into a
more generally useful value
Splitting data in one field into two or more
fields
Transforming data values
Boilerplate values to include in output
schema
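Several of these mapping types can be sketched in a few lines. The local field names (`Title`, `Topics`, `Date`, `Kind`), the local code `pic`, and the publisher string are all invented for illustration; `StillImage` and `Text` are terms from the DCMI Type Vocabulary.

```python
import re

# Hypothetical local record; field names and values are invented.
local_record = {
    "Title": "Field notes, Pine River",
    "Topics": "Fishes; Rivers; Michigan",
    "Date": "05/18/1926",
    "Kind": "pic",
}

# Local shorthand translated into DCMI Type Vocabulary terms.
TYPE_MAP = {"pic": "StillImage", "doc": "Text"}


def crosswalk(record):
    """Map a local record to simple Dublin Core (field -> list of values)."""
    out = {}
    # Map the complete contents of one field to another.
    out["title"] = [record["Title"]]
    # Split multiple values in one local field into repeated target fields.
    out["subject"] = [v.strip() for v in record["Topics"].split(";") if v.strip()]
    # Transform a data value: MM/DD/YYYY becomes YYYY-MM-DD.
    month, day, year = re.fullmatch(
        r"(\d{2})/(\d{2})/(\d{4})", record["Date"]
    ).groups()
    out["date"] = ["%s-%s-%s" % (year, month, day)]
    # Translate an anomalous local practice into a more generally useful value.
    out["type"] = [TYPE_MAP.get(record["Kind"], record["Kind"])]
    # Boilerplate value included in every output record.
    out["publisher"] = ["Example University Library"]
    return out


print(crosswalk(local_record))
```

Keeping each rule on its own line makes it easier to tweak a generic crosswalk for one collection without disturbing the rest.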
84. 3/6/07 INCOLSA Workshop 84
Metadata as a view of the resource
There is no monolithic, one-size-fits-all
metadata record
Metadata for the same thing is different
depending on use and audience
Harry Potter as represented by…
a public library
an online bookstore
a fan site
85. 3/6/07 INCOLSA Workshop 85
Choice of vocabularies as a
view
Names
LCNAF: Michelangelo Buonarroti, 1475-
1564
ULAN: Buonarroti, Michelangelo
Places
LCSH: Jakarta (Indonesia)
TGN: Jakarta
Subjects
LCSH: Neo-impressionism (Art)
AAT: Pointillism
86. 3/6/07 INCOLSA Workshop 86
Finding the right balance
Metadata providers know the materials
Document encoding schemes and controlled
vocabularies
Document practices
Ensure record validity
Aggregators have the processing power
Format conversion
Reconcile known vocabularies
Normalize data
Batch metadata enhancement
87. 3/6/07 INCOLSA Workshop 87
What does this record describe?
identifier: http://name.university.edu/IC-FISH3IC-X0802]1004_112
publisher: Museum of Zoology, Fish Field Notes
format: jpeg
rights: These pages may be freely searched and displayed.
Permission must be received for subsequent distribution in
print or electronically.
type: image
subject: 1926-05-18; 1926; 0812; 18; Trib. to Sixteen Cr. Trib. Pine
River, Manistee R.; JAM26-460; 05; 1926/05/18; R10W;
S26; S27; T21N
language: UND
source: Michigan 1926 Metzelaar, 1926--1926;
description: Flora and Fauna of the Great Lakes Region
Example courtesy of Sarah Shreeves, University of Illinois at Urbana-Champaign
89. 3/6/07 INCOLSA Workshop 89
Shareable metadata defined
Metadata for aggregation with records from other
institutions
Promotes search interoperability - “the ability to
perform a search over diverse sets of metadata
records and obtain meaningful results” (Priscilla
Caplan)
Is human understandable outside of its local
context
Is useful outside of its local context
Preferably is machine processable
90. 3/6/07 INCOLSA Workshop 90
6 Cs and lots of Ss of shareable
metadata
Content
Consistency
Coherence
Context
Communication
Conformance
Metadata standards
Vocabulary and encoding standards
Descriptive content standards
Technical standards
91. 3/6/07 INCOLSA Workshop 91
Content
Choose appropriate vocabularies
Choose appropriate granularity
Make it obvious what to display
Make it obvious what to index
Exclude unnecessary “filler”
Make it clear what links point to
92. 3/6/07 INCOLSA Workshop 92
Consistency
Records in a set should all reflect the same
practice
Fields used
Vocabularies
Syntax encoding schemes
Allows aggregators to apply same
enhancement logic to an entire group of
records
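One simple way to spot inconsistency before sharing is to count which fields each record in a set actually uses. This is a minimal sketch with invented sample records and hypothetical function names; it checks field usage only, not vocabularies or syntax encoding schemes.

```python
from collections import Counter


def field_usage(records):
    """Count how many records in the set use each field."""
    usage = Counter()
    for record in records:
        usage.update(record.keys())
    return usage


def inconsistent_fields(records):
    """Return the fields that appear in some records but not in all."""
    usage = field_usage(records)
    return sorted(name for name, count in usage.items() if count != len(records))


# Invented sample set.
records = [
    {"title": "A", "date": "1926", "subject": "Fishes"},
    {"title": "B", "date": "1927"},
    {"title": "C", "date": "1928", "creator": "Unknown"},
]
print(inconsistent_fields(records))
```

A field flagged here is not necessarily an error, but it is a prompt to check whether the set reflects one practice or several.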
93. 3/6/07 INCOLSA Workshop 93
Coherence
Record should be self-explanatory
Values must appear in appropriate elements
Repeat fields instead of “packing” to explicitly
indicate where one value ends and another
begins
94. 3/6/07 INCOLSA Workshop 94
Context
Include information not used locally
Exclude information only used locally
Current safe assumptions
Users discover material through shared
record
User then delivered to your environment for
full context
Context driven by intended use
95. 3/6/07 INCOLSA Workshop 95
Communication
Method for creating shared records
Vocabularies and content standards used in
shared records
Record updating practices and schedules
Accrual practices and schedules
Existence of analytical or supplementary
materials
Provenance of materials
96. 3/6/07 INCOLSA Workshop 96
Conformance to Standards
Metadata standards (and not just DC)
Vocabulary and encoding standards
Descriptive content standards (AACR2, CCO,
DACS)
Technical standards (XML, Character
encoding, etc)
97. 3/6/07 INCOLSA Workshop 97
Before you share…
Check your metadata
Appropriate view?
Consistent?
Context provided?
Does the aggregator have what they need?
Documented?
Can a stranger tell you what the record
describes?
98. 3/6/07 INCOLSA Workshop 98
The reality of sharing metadata
We can no longer afford to only think about our local
users
Creating shareable metadata will require more work
on your part
Creating shareable metadata will require our vendors
to support (more) standards
Creating shareable metadata is no longer an option,
it’s a requirement
Indiana is moving toward a portal of Indiana-related
digital content – you should be planning for this now
99. 3/6/07 INCOLSA Workshop 99
Putting it all into practice
Develop written documentation
Develop a quality control workflow for
metadata creation
Share your findings with others
Get better with every new online collection
100. 3/6/07 INCOLSA Workshop 100
Further information
jenlrile@indiana.edu
These presentation slides:
<http://www.dlib.indiana.edu/~jenlrile/presentations/incolsa2007/incolsa.ppt>
Metadata librarians listserv:
<http://metadatalibrarians.monarchos.com>
Priscilla Caplan: Metadata Fundamentals for
all Librarians, 2003
Editor's Notes
Extensibility: via Application Profiles and local qualifiers. Local qualifiers maybe not kosher but there are no metadata police. Usually.
Recommended: Elements, Element Refinements, and DCMI-maintained Vocabulary Terms (e.g., member terms of the DCMI Type Vocabulary) useful for resource discovery across domains.
Conforming: Elements, Element Refinements and Application Profiles may be assigned a status of conforming. Elements and Element Refinements assigned a status of conforming are those for which an implementation community has a demonstrated need and which conform to the grammar of Elements and Element Refinements, though without necessarily meeting the stricter criteria of usefulness across domains or usefulness for resource discovery.
Obsolete: For Elements and Element Refinements that have been superseded, deprecated, or rendered obsolete. Such terms will remain in the registry for use in interpreting legacy metadata.
Registered: Used for Vocabulary Encoding Schemes and language translations for which the DCMI provides information but not necessarily a specific recommendation.
Can use as much or as little authority control as you want. CVs not required – use if you think they’re important for material. Can use collection-level description instead of item-level description if you want. Shared metadata only for discovery purposes – not necessarily complete description. Complete description is done locally. Domain-specific service providers can be for library interests, or merge library materials with those held in archives, museums, etc.