6.
XML standard for encoding finding aids
I. Basics - What is EAD?
XML (eXtensible Markup Language):
a set of rules for structuring data via markup
7.
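Tag (the markup in angle brackets, here <unitdate ...> and </unitdate>):
<unitdate era="ce">2011</unitdate>
Attribute (a name="value" pair inside the start tag, here era="ce"):
<unitdate era="ce">2011</unitdate>
Element (start tag, content, and end tag together):
<unitdate era="ce">2011</unitdate>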
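The three terms can also be seen by loading the same snippet into a parser. This is a minimal sketch using Python's standard-library ElementTree; Python is only a convenient illustration here, not a tool the workshop relies on.

```python
import xml.etree.ElementTree as ET

# The element from the slide, with straight quotes around the attribute value.
element = ET.fromstring('<unitdate era="ce">2011</unitdate>')

print(element.tag)     # tag name: unitdate
print(element.attrib)  # attributes: {'era': 'ce'}
print(element.text)    # element content: 2011
```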
8. Elements and attributes defined by a
Document Type Definition (DTD) or a
Schema
<bioghist> <bionote>
10. XML standard for encoding finding aids
Defined set of containers for descriptive data
EAD : DACS = MARC : AACR2
11. XML standard for encoding finding aids
A description of records that gives the
repository physical and intellectual control over
the materials and that assists users to gain
access to and understand the materials (SAA)
Describing Archives: A Content Standard (DACS)
12. What is EAD?
XML standard for encoding finding aids
I. Basics
13. What is EAD?
EAD encoding is not a substitute for
sound archival description!
15. EAD Finding Aid Structure
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ead SYSTEM "ead.dtd">
<?xml-stylesheet type="text/xsl" href="lbi2010.xsl"?>
II. Finding Aid
16. EAD Finding Aid Structure
<ead>
<eadheader>Information about repository and
finding aid</eadheader>
<archdesc>Description of archival
materials</archdesc>
</ead>
17. Common Tags
• Structural and content tags
<eadheader>Many other tags</eadheader>
<date>July 4, 1776</date>
18. Common Tags <eadheader>
• Finding aid author
<filedesc><titlestmt>
<author>Processed by Stanislav Pejša.</author>
</titlestmt></filedesc>
19. Common Tags <archdesc>
• Biographical information
<bioghist><p>Joseph Roth was one of the most prominent
Austrian writers of the first half of the 20th
century.</p></bioghist>
• Controlled vocabulary
<controlaccess>
<geogname encodinganalog="651$a" source="lcsh"
authfilenumber="n 79040121">Austria</geogname>
</controlaccess>
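A minimal sketch of how a program might read a controlled access point like the one above. It assumes an encodinganalog value of the form "651$a" means MARC field 651, subfield a; the function name access_points is a hypothetical helper, not part of any EAD toolkit.

```python
import xml.etree.ElementTree as ET

CONTROLACCESS = """
<controlaccess>
  <geogname encodinganalog="651$a" source="lcsh"
            authfilenumber="n 79040121">Austria</geogname>
</controlaccess>
"""

def access_points(controlaccess):
    """Yield one dict per heading, with the MARC mapping split out."""
    for heading in controlaccess:
        analog = heading.get("encodinganalog", "")
        # Assumption: "651$a" reads as field 651, subfield a.
        field, _, subfield = analog.partition("$")
        yield {
            "element": heading.tag,
            "term": " ".join((heading.text or "").split()),
            "source": heading.get("source"),
            "authfilenumber": heading.get("authfilenumber"),
            "marc_field": field,
            "marc_subfield": subfield,
        }

points = list(access_points(ET.fromstring(CONTROLACCESS)))
for point in points:
    print(point)
```

Because the term, its source vocabulary, and its authority file number are separate pieces of data, each can be checked or exported on its own.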
20. Common Tags <archdesc>
• Description of Subordinate Components
<dsc>
<c01 level="series">
<c02>Folder 1
<c03>Item 1</c03>
<c03>Item 2</c03>
</c02>
<c02>Folder 2</c02>
</c01>
</dsc>
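The nesting of numbered components can be walked recursively. This is a minimal sketch in Python's ElementTree, assuming the snippet is closed with </dsc> so that it parses; outline is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET

DSC = """
<dsc>
  <c01 level="series">
    <c02>Folder 1
      <c03>Item 1</c03>
      <c03>Item 2</c03>
    </c02>
    <c02>Folder 2</c02>
  </c01>
</dsc>
"""

def outline(node, depth=0):
    """Return (depth, tag, label) for every numbered component under node."""
    rows = []
    for child in node:
        if re.fullmatch(r"c\d\d", child.tag):
            # Use the component's own text, or its level attribute if it has none.
            label = (child.text or "").strip() or child.get("level", "")
            rows.append((depth, child.tag, label))
            rows.extend(outline(child, depth + 1))
    return rows

rows = outline(ET.fromstring(DSC))
for depth, tag, label in rows:
    print("  " * depth + f"{tag}: {label}")
```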
21. Common Tags <archdesc>
• Description of Subordinate Components
A Component <c> provides information about the content,
context, and extent of a subordinate body of materials.
Each <c> element identifies an intellectually logical section
of the described materials. The physical filing
separations between components do not always
coincide with the intellectual separations.
From EAD Tag library <http://www.loc.gov/ead/tglib/elements/c.html>
22. Common Tags <archdesc>
• Description of Subordinate Components
<dsc>
<c01 level="series">
<did>
<unittitle id="serII">Series II: Addenda</unittitle>
<unitdate normal="1985/1996">1985-1996</unitdate>
</did>
<c02>Subordinate elements, such as folders</c02>
</c01>
</dsc>
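The normal attribute holds a machine-readable form of the date that software can use without parsing the display text. A minimal sketch, assuming the attribute is present and uses the start/end form shown above; year_range is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

DID = """
<did>
  <unittitle id="serII">Series II: Addenda</unittitle>
  <unitdate normal="1985/1996">1985-1996</unitdate>
</did>
"""

def year_range(unitdate):
    """Split a normal="start/end" value into two integer years."""
    normal = unitdate.get("normal", "")
    start, _, end = normal.partition("/")
    # A single date has no slash; treat it as a one-year range.
    end = end or start
    return int(start[:4]), int(end[:4])

did = ET.fromstring(DID)
start, end = year_range(did.find("unitdate"))
print(did.findtext("unittitle"), start, end)
```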
23. Common Tags <archdesc>
• Description of Subordinate Components
<c02>
<did>
<container type="box">2</container>
<container type="folder">1</container>
<unittitle>Articles</unittitle>
<unitdate>1985-1994</unitdate>
</did>
</c02>
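Because box, folder, title, and date each sit in their own element, they can be pulled out individually. A minimal sketch in Python's ElementTree; describe is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

C02 = """
<c02>
  <did>
    <container type="box">2</container>
    <container type="folder">1</container>
    <unittitle>Articles</unittitle>
    <unitdate>1985-1994</unitdate>
  </did>
</c02>
"""

def describe(component):
    """Collect the container numbers, title, and date from a component's <did>."""
    did = component.find("did")
    containers = {c.get("type"): c.text for c in did.findall("container")}
    return {
        "box": containers.get("box"),
        "folder": containers.get("folder"),
        "title": did.findtext("unittitle"),
        "date": did.findtext("unitdate"),
    }

record = describe(ET.fromstring(C02))
print(record)
```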
24. Common Tags <archdesc>
• Digital Archival Object (<dao>)
<c02>
<did> […]
<unittitle>Articles</unittitle>
</did>
<dao
href="http://www.archive.org/stream/josephroth_07_reel07#page/n218/mode/1up"
actuate="onrequest" linktype="simple" show="new"/>
</c02>
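A minimal sketch of attaching a <dao> to a component programmatically rather than typing it in an editor. It assumes the line break inside the href on the slide is only a wrap, so the URL is joined into one string.

```python
import xml.etree.ElementTree as ET

# A folder-level component with only its title, as on the slide.
c02 = ET.fromstring("<c02><did><unittitle>Articles</unittitle></did></c02>")

# Append an empty <dao> element after <did>; all of its data lives in attributes.
dao = ET.SubElement(c02, "dao", {
    "href": "http://www.archive.org/stream/josephroth_07_reel07#page/n218/mode/1up",
    "actuate": "onrequest",
    "linktype": "simple",
    "show": "new",
})

print(ET.tostring(c02, encoding="unicode"))
```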
48. Other Uses
• Integration with other standards (e.g. EAC-CPF)
• Open Archives Initiative – Protocol for Metadata
Harvesting (OAI-PMH)
III. Implementation: Using EAD
49. Other Uses
• EAD consortia
• Metadata for digitized collections
• Faceted searching
• Bulk updates
50. Why Use EAD?
• EAD is an internationally adopted standard
• EAD paves the path to a structured data future
Combs et al., 2010: Over, Under, Around, and Through: Getting Around Barriers to
EAD Implementation
51. The Future of EAD
• Alpha release of new schema, documentation,
and migration tools, August 2012
• Public presentations (SAA Annual Meeting,
webinars, etc.), August 2012
• Beta release of schema, documentation, and
migration tools, January 15, 2013
• New version of EAD released with tag library and
migration tools, July 1, 2013
2012-03-19 email to EAD listserv from Technical Subcommittee for EAD
60. Exercise How To
IV. Exercises
1. Make the change in the XML
2. Hit the red arrow to transform the XML to
HTML
3. Examine the HTML in the browser
61. Processing the
Joseph Roth Addendum
You are a processing
archivist at the Leo Baeck
Institute. You have been
asked to process an
addendum to the Joseph
Roth Collection, and to
update the EAD finding
aid accordingly.
Austrian writer Joseph Roth (1894-1939)
64. The head archivist tells you that there is an error in
the biographical information. Roth’s mother’s
first name is Maria, not Mario.
Fix this typo.
Exercise 2:
Biographical Information
66. Looking at the existing controlled access points,
you realize that the subject term for Roth’s
birthplace, “Brody, Galicia” is incorrect. The
proper LC term is “Brody (Ukraine)”.
Correct the term.
Exercise 3a:
Geographic Information
68. Add the LC authority file number for “Brody
(Ukraine)”.
Exercise 3b:
Geographic Information
69. Go to LC authorities: http://id.loc.gov
Search for Brody (Ukraine)
<ead><archdesc><controlaccess>
<geogname encodinganalog="651bb0$a"
role="subject" source="lcsh"
authfilenumber="n88212572">Brody (Ukraine)</geogname>
Exercise 3b:
Geographic Information
71. The addendum you are given is one folder,
consisting of material in Polish from a 2002
conference about Roth.
Add this folder to Series II: Addenda, and update
the rest of the finding aid accordingly.
Exercise 4:
Adding a New Folder
73. What needs to be added?
Where in the finding aid?
Exercise 4a:
Adding the Folder
78. Find the existing language information, and see if
you can understand the format. Add Polish to
the list of languages, at both the series and the
collection levels.
Exercise 4c:
Updating the Language
81. Add one sentence to the Series II scope note
reflecting the additional folder.
Exercise 4d:
Updating the Series II Scope Note
82. <ead><archdesc><dsc><c01
level="series"><scopecontent><p>This series
consists of material that was added to the
collection after the inventory was drafted and
the bulk of the collection organized. […] Also
included are materials from a 2002 conference in
Poland.</p></scopecontent>
Exercise 4d:
Updating the Series II Scope Note
83. Link to the digitized version of the material in the
additional folder using this link:
http://bit.ly/x7944b
Exercise 5:
Adding a link to the digital object
85. The head archivist has asked you to print out
copies of your EAD finding aid for the reading
room. Create a print-friendly HTML file.
Exercise 6:
Creating a Print-Friendly File
86. Find a stylesheet and save it in your EAD folder.
(We’ve done this for you – thanks Syracuse!)
Change the stylesheet declaration:
<?xml-stylesheet type="text/xsl" href="eadprint-su.xsl"?>
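A minimal sketch of making the same change with a script instead of by hand. It assumes the top of the finding aid looks like the three declaration lines shown on slide 15; set_stylesheet is a hypothetical helper name, and the regular expression only handles a double-quoted href.

```python
import re

HEADER = '''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ead SYSTEM "ead.dtd">
<?xml-stylesheet type="text/xsl" href="lbi2010.xsl"?>
'''

def set_stylesheet(xml_text, new_href):
    """Point the first xml-stylesheet declaration at new_href."""
    pattern = r'(<\?xml-stylesheet\b[^?]*?\bhref=")[^"]*(")'
    return re.sub(pattern,
                  lambda m: m.group(1) + new_href + m.group(2),
                  xml_text, count=1)

updated = set_stylesheet(HEADER, "eadprint-su.xsl")
print(updated)
```

Only the declaration changes; the finding aid's data stays exactly the same, which is why one EAD file can feed both the screen and print versions.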
Exercise 6:
Creating a Print-Friendly File
88. The head librarian has asked you to supply a MARC
record for your archival collection. Generate a
MARCXML record from this EAD.
Exercise 7:
Generating a MARC Record
89. Find an appropriate stylesheet.
(We’ve done this for you)
Set up a new transformation scenario.
Exercise 7:
Generating a MARC Record
http://www.flickr.com/photos/carowallis1/2314716161/sizes/m/in/photostream/
Will be available on slideshare – many links on images and in text in the later portion of the presentation
Familiar with HTML? XML is similar (tags, aka markup), but it describes data structure, not display.
XML (eXtensible Markup Language): set of rules for structuring data via markup
DTD and schema define the buckets; the list of tags in the tag library (we’ll see later) is defined here.
Move to schema is coming; more flexible; not something you need to know right away
http://www.flickr.com/photos/linneberg/4481309196/sizes/m/in/photostream/
http://www.flickr.com/photos/johnkay/3539126525/sizes/m/in/photostream/
Note that it is hierarchical – nested. Parent elements apply to child elements.
Encoding standards are rules for defining buckets; content standards are rules for the information inside
XML, EAD, and MARC are ways to structure your data; they are not the same as the descriptive data itself (the finding aid, the catalog record, etc.).
An EAD-encoded finding aid is split into information about the institution and the finding aid (meta-metadata) and information about the materials (the finding aid itself).
id.loc.gov
<p> to structure text
So-called “empty element” – all the data is within the tag
Looking at the real thing
Extremely unlikely you will be asked to type it all out by hand. Templates, programs, guidance.
Software is free (like kittens, not like beer)
Designed by archivists: interface is intuitive
Manages most common archival processes
Designed for metadata standards
Output – HTML, EAD
Built on a database (MySQL)
Web-based, but still need MySQL backend
EAD import/export
SAA Archon webinar
Sandbox on archon website: <http://www.archon.org/sandbox.php>
Going to be combined with AT (Archivists' Toolkit)
Basic, powerful XML editor. You can safely ignore about 95% of the buttons and drop-downs, but it will do things like suggest valid tags and attributes, close tags, and validate as you go. This is what we use.
Notetab Pro
Text editor
In conjunction with free downloads from EAD Cookbook
Free, once installed reasonably friendly
https://code.google.com/p/eaditor/
More complex but powerful tool – works on native XML, not a database (like AT/Archon). For the pro implementor.
A simple text editor – OK for simple tinkering; hard to actually use.
http://www.loc.gov/ead/tglib/element_index.html
http://www2.archivists.org/standards
XSLT (Extensible Stylesheet Language Transformations) is a declarative, XML-based language used for the transformation of XML documents.
Here, the EAD tag processinfo is converted into HTML.
Hard to predict, but the data are structured so you can be flexible.
http://www.oclc.org/research/publications/library/2010/2010-04.pdf
We’ll be logically consistent, but in the real world there are more things to correct and consider.