Presentation given by Aileen O'Carroll, Policy Manager at the Digital Repository of Ireland, in the Digital Humanities Active Learning Space, University College Cork, as part of a day-long DRI training session on 'Preparing Digital Collections'. This presentation introduces the concept of metadata, along with metadata standards, methods and controlled vocabularies. It follows earlier versions of the presentation given by DRI staff at other events in 2015.
8. DRI Presents: Introduction to Metadata
Why is metadata important?
• Tells you what a digital object is, or what it contains.
• Search and Discoverability
• Sharing / Re-use
What makes good metadata?
October 9, 2013
Emily Grace Flanagan, 3 month birthday
At 22 Ashfield Rd, Ranelagh
(Technical metadata in camera EXIF data)
What is a Metadata Standard?
• Set of fields for describing an object
• Common: Dublin Core, EAD (archives), MARC and MODS (libraries)
Seeing Standards: A Visualization of the Metadata Universe, Jenn Riley & Devin Becker,
http://jennriley.com/metadatamap/
Simple Dublin Core Metadata Element Set
1. Title
2. Creator
3. Subject
4. Description
5. Publisher
6. Contributor
7. Date
8. Type
9. Format
10. Identifier
11. Source
12. Language
13. Relation
14. Coverage
15. Rights
Simple Dublin Core Metadata Element Set
1. Title = Introduction to Metadata
2. Creator = Aileen O’Carroll
3. Subject
4. Description
5. Publisher
6. Contributor
7. Date = October 2016
8. Type
9. Format = PowerPoint
10. Identifier
11. Source
12. Language = English
13. Relation
14. Coverage
15. Rights
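As a rough illustration, a partially filled record like the one above could be held in a simple key-value structure. The sketch below is in Python; the names `DC_ELEMENTS`, `record`, `filled_elements` and `missing_elements` are hypothetical and not part of any DRI tool.

```python
# The fifteen elements of Simple Dublin Core, in the order listed on the slide.
DC_ELEMENTS = [
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
]

# Values from the example record; elements left blank are simply absent.
record = {
    "Title": "Introduction to Metadata",
    "Creator": "Aileen O'Carroll",
    "Date": "October 2016",
    "Format": "PowerPoint",
    "Language": "English",
}


def filled_elements(rec):
    """Return the elements that have a value, in Dublin Core order."""
    return [name for name in DC_ELEMENTS if rec.get(name)]


def missing_elements(rec):
    """Return the elements that are still empty, in Dublin Core order."""
    return [name for name in DC_ELEMENTS if not rec.get(name)]


print("Filled:", filled_elements(record))
print("Still empty:", missing_elements(record))
```

Listing the empty elements is one simple way to see how much richer a record could become.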
1. Title = Ulysses
2. Creator = James Joyce
3. Subject = Stream of consciousness; Modern novel; Turn of century Dublin; Book covers
4. Description = "Traces the character Leopold Bloom as he walks around Dublin on 16 June, 1904" or "Scan of first edition, hard cover"
5. Publisher = Shakespeare and Company
6. Contributor
7. Date = 1922
Controlled Vocabularies
• A standardised set of terms that are accepted, defined, and managed (agreed on by a community)
• A way to enable consistency in metadata, to facilitate accurate search and retrieval
• Tend to be domain/discipline specific
• Find the one that best fits your collection
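To make the consistency point concrete, here is a minimal Python sketch of mapping free-text entries onto a single preferred term. The variant list and the function `to_preferred` are made up for illustration; a real controlled vocabulary is maintained by its community and is far larger.

```python
# Hypothetical lookup table: several ways people might type a name,
# all mapped to one preferred (controlled) form.
VARIANT_TO_PREFERRED = {
    "james joyce": "Joyce, James, 1882-1941",
    "joyce, james": "Joyce, James, 1882-1941",
    "j. joyce": "Joyce, James, 1882-1941",
}


def to_preferred(term):
    """Return the preferred form of a term, or None if it is not in the vocabulary."""
    key = " ".join(term.lower().split())  # ignore case and extra spaces
    return VARIANT_TO_PREFERRED.get(key)


for entered in ["James Joyce", "joyce,  james", "J. Joyce", "Jim Joyce"]:
    preferred = to_preferred(entered)
    if preferred is None:
        print(f"{entered!r}: not in vocabulary, needs review")
    else:
        print(f"{entered!r} -> {preferred!r}")
```

Because every accepted variant ends up as the same string, a search for the preferred term finds all of the records.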
1. Title = Ulysses
2. Creator = James Joyce
3. Subject = Turn of century Dublin
4. Description = "Traces the character Leopold Bloom as he walks around Dublin on 16 June, 1904" or "Scan of first edition, hard cover"
5. Publisher = Shakespeare and Company
6. Contributor
7. Date = 1922
Controlled vocabulary forms:
Subject = Dublin—History—20th century
Creator = Joyce, James, 1882-1941
In Sum
• Metadata is data about data (description of a file)
• Good metadata = rich & consistent metadata
• Good metadata = discoverability
• Metadata standards improve consistency and interoperability (play well with others!)
• Vocabularies aid in creating consistent and meaningful metadata
Define Metadata
Identify benefits of using standards-compliant metadata
The Four Books of Sentences (Libri Quattuor Sententiarum) is a book of theology written by Peter Lombard in the 12th century.
Peter Lombard used extensive margin notes for citations, and this is considered by some to be the direct antecedent of modern scholarly footnotes.
Proofreading shorthand
Typesetting Markup
DRI is a trusted digital repository for Humanities and Social Sciences Data in Ireland, launched June 2015
Provides preservation and access to digital collections
Born digital and digitised collections including maps, photographs, letters, audio-visual, sound, books, oral histories and paintings.
Number of collections from different depositors
Key challenge – how to describe those collections, how to describe those objects, and how to ensure that people will be able to search across collections from different depositors and find objects relevant to their interests.
Technical metadata – hardware, software, file formats, resolution, size
Preservation metadata – provenance, authenticity, preservation actions, responsibility (e.g. PREMIS)
Structural metadata – physical/logical structure of digital resources (e.g. METS)
Descriptive metadata – describes the digital resource; catalogue records/finding aids
Metadata: Data With a Purpose
Metadata is...
...constructed... (Metadata is wholly artificial, created by human beings.)
... for a purpose ... (There is no universal metadata. For metadata to be useful it has to serve a purpose.)
... to facilitate an activity... (There's something that you do with metadata.)
Karen Coyle, librarian
http://www.kcoyle.net/meta_purpose.html
Why use standard metadata?
Using standardised descriptive metadata means adhering to the best practices in your domain.
Standardised metadata allows you to control how records are described within your organisation too.
Enforcing standards allows greater searchability of your records.
Metadata sharing and interoperability are only possible when a standard is used.
Quality metadata enables analysis, manipulation and “value-added services”.
Why use standard metadata?
Dublin Core, EAD, MARC, MODS
The first example of non-cataloging metadata is Dublin Core. Dublin Core grew out of a meeting in 1995 in Dublin, Ohio, home of OCLC, which has been the key sponsor of the DC initiative. Its purpose is to provide a very simple set of metadata so that people, even those with no formal training in that activity, can describe Web-based information resources quickly and easily. It has fifteen core elements (thus the name Dublin "Core"). These simple elements can be further defined to create more detailed metadata, but the core elements have found wide use on the Web and elsewhere, as we'll see further on.
All of the fields in Dublin Core are optional, and all are repeatable.
You can create a very minimal record with just a date, a description and a format, i.e. it is fast and easy to do.
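The "optional and repeatable" point can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is only an illustration: the wrapper element `record`, the helper `build_record` and the field values are assumptions chosen for the example, while the namespace URI is the one published for the fifteen Dublin Core elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the fifteen Simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build an XML record from (element_name, value) pairs; names may repeat."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# A very minimal record: only three of the fifteen elements are used.
minimal = build_record([
    ("date", "October 2016"),
    ("description", "Introduces the concept of metadata"),
    ("format", "PowerPoint"),
])

# The same element can appear more than once.
repeated = build_record([
    ("title", "Ulysses"),
    ("subject", "Stream of consciousness"),
    ("subject", "Modern novel"),
])

print(ET.tostring(minimal, encoding="unicode"))
print(ET.tostring(repeated, encoding="unicode"))
```

Nothing in the sketch forces any particular element to be present, which mirrors how permissive Simple Dublin Core is.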
MODS: 20 top-level elements
EAD
Producing metadata
A handwritten or typewritten listing or finding aid
Can be easily read and understood
Can be accessible in physical or digital medium
Can be free-text searched
Machine readable metadata
In a format that can be understood by computers
Structured representation of information
Described using particular standards (e.g. XML, HTML, RDF)
Allows processing, exchange and analysis
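A small Python sketch of what "processing, exchange and analysis" can look like once records are structured. The records reuse values from the earlier examples, the helper `find_by_field` is hypothetical, and JSON is used here only for brevity; it is not one of the formats named above.

```python
import json

# Two structured records; each field is labelled, so a program can tell
# a date apart from a title without guessing.
records = [
    {"Title": "Ulysses", "Creator": "James Joyce", "Date": "1922",
     "Publisher": "Shakespeare and Company"},
    {"Title": "Introduction to Metadata", "Creator": "Aileen O'Carroll",
     "Date": "October 2016", "Format": "PowerPoint"},
]


def find_by_field(recs, field, value):
    """Processing: return the records whose given field equals the value."""
    return [rec for rec in recs if rec.get(field) == value]


# Analysis: which titles are dated 1922?
titles_1922 = [rec["Title"] for rec in find_by_field(records, "Date", "1922")]
print(titles_1922)

# Exchange: serialise the records to text and read them back unchanged.
exchanged = json.loads(json.dumps(records))
print(exchanged == records)
```

A handwritten finding aid can be read by a person, but it cannot be filtered or exchanged in this way without first being transcribed into a structure.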
“Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.
It is defined by the W3C's XML 1.0 Specification and by several other related specifications, all of which are free open standards.”
An XML document consists of a set of elements, which can be nested within each other.
All elements must have an opening and closing tag of the form <tagname>… </tagname>
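The nesting and tag-matching rules can be tried out with Python's standard `xml.etree.ElementTree` parser. In this sketch the element names `record`, `title` and `creator` are made up for illustration and do not follow any particular metadata schema; `is_well_formed` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Nested elements, each with a matching opening and closing tag.
well_formed = "<record><title>Ulysses</title><creator>James Joyce</creator></record>"

# Here <title> is opened but never closed, so the document is not well-formed.
not_well_formed = "<record><title>Ulysses</record>"


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(well_formed)
for child in root:  # the elements nested inside <record>
    print(child.tag, "=", child.text)

print(is_well_formed(well_formed))
print(is_well_formed(not_well_formed))
```

A parser rejecting the second string is what makes XML dependable for machines: a record either follows the rules or is refused outright.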