Archives hub ead 2010_extended

Lisa Jeskins and Bethan Ruddock
Archives Hub
Mimas

By the end of today’s session we will have
given you an introduction to:
• what interoperability means
• what XML is, what it does and why it is important
• EAD structure and syntax
• EAD and hierarchies
• UK Archives Discovery Network (UKAD)

 the ability of two or more systems or
components to exchange information and to
use the information that has been exchanged
(IEEE Standard Computer Dictionary)

 the ability to exchange/share data
 integration of information resources presented in
different formats
 within a domain or across domains
 advantages of cross-searching
 XML facilitates interoperability

 Data exchange standards such as:
◦ Z39.50
◦ SRU

 user can easily search across and retrieve
resources from a wealth of systems
 moving beyond individual websites for
individual resources (silo approach)

 http://www.ukoln.ac.uk/interop-focus/
◦ to explore, publicise and mobilise the benefits and
practice of effective interoperability across diverse
information sectors

 Extensible Markup Language
 XML is a grammatical system for creating languages:
◦ a meta-language
 Use XML to design your own markup language,
consisting of meaningful tags that describe the data
they contain
 Create a language for describing…anything

 XML does not do anything itself. It is pure
information wrapped in XML tags
 You must use other means to send, receive or
display the data
[Diagram: XML is used by XML technologies to create a detailed description to view in a browser, a summary entry to view in a browser, and a PDF for print]

 XML is not about content, though there might be
certain restrictions on content
 XML is essentially about structure
 Creating a consistent structure via XML tagging enables
content to be easily identified (by machines) and used
flexibly

<title> Alice in Wonderland </title>
*XML allows you to define your tags*
<book>Alice in Wonderland</book>
<filmtitle>Alice in Wonderland</filmtitle>
<tag> content </tag>

 Attributes are simple name/value pairs
associated with an element
<tag attribute_name="attribute_value">content</tag>
<language>English</language>
<language langcode="eng">English</language>
<date normal="2004">20 Sept 2004</date>

<tag attribute_name="attribute_value">content</tag>
<tree>hornbeam</tree>
<tree type="deciduous">hornbeam</tree>
<date normal="2004">20 May 2004</date>
<date>20 May 2004</date>
This is an XML element

<trees>
<tree type="deciduous">
<species>oak</species>
<fruit>acorn</fruit>
</tree>
<tree type="coniferous">
<species>pine</species>
<fruit>pine cone</fruit>
</tree>
</trees>
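
As a rough illustration of why consistent tagging makes content easy for machines to pick out, the sketch below reads the trees example with Python's standard `xml.etree.ElementTree` module. The function name `describe_trees` is made up for this example and is not part of any Archives Hub tool.

```python
import xml.etree.ElementTree as ET

# The trees example from the slide, with straight quotes around attribute values.
TREES_XML = """
<trees>
  <tree type="deciduous">
    <species>oak</species>
    <fruit>acorn</fruit>
  </tree>
  <tree type="coniferous">
    <species>pine</species>
    <fruit>pine cone</fruit>
  </tree>
</trees>
"""

def describe_trees(xml_text):
    """Return one (type, species, fruit) tuple per <tree> element."""
    root = ET.fromstring(xml_text)
    rows = []
    for tree in root.findall("tree"):
        # get() reads an attribute; findtext() reads the text of a child element.
        rows.append((tree.get("type"), tree.findtext("species"), tree.findtext("fruit")))
    return rows

for row in describe_trees(TREES_XML):
    print(row)
```

The attribute (`type`) and the child elements (`species`, `fruit`) are read in different ways, which mirrors the distinction between attributes and elements made on the slides.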

<catalog>
<cd>
<title>OK Computer</title>
<artist type="band">Radiohead</artist>
<genre>pop</genre>
<year>1997</year>
</cd>
<cd>
<title>Stanley Road</title>
<artist type="solo">Paul Weller</artist>
<genre>pop</genre>
<year>1995</year>
</cd>
</catalog>
<title>Stanley Road</title>
<artist>Paul Weller</artist>
<type>solo</type>
<genre>pop</genre>
<year>1995</year>

Alice in Wonderland
Lewis Carroll
1 volume
hardback

Title Alice in Wonderland
Author Lewis Carroll
Extent 1 volume
Format hardback

<books>
<title>Alice in Wonderland</title>
<author>Lewis Carroll</author>
<extent>1 volume</extent>
<format>hardback</format>
</books>

 a root element is required
<catalog>
…..all your tags and content…
</catalog>
 closing tags are required
 case matters

• elements must be properly nested
Correct:
<physdesc>
<extent>10 boxes</extent>
</physdesc>
Incorrect (the tags overlap):
<physdesc>
<extent>10 boxes</physdesc>
</extent>

• attribute values must be enclosed in quotation marks,
e.g. langcode="fre"
• element names must obey some basic rules
◦ e.g. cannot start with numbers or punctuation characters,
cannot contain spaces
◦ e.g. <cd name> or <?name> would be incorrect
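
One way to see these rules in action is to hand small snippets to an XML parser and see which ones it rejects. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` is an assumption made for this example. It checks well-formedness only, not validity against a DTD or Schema.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

good = "<physdesc><extent>10 boxes</extent></physdesc>"
bad_nesting = "<physdesc><extent>10 boxes</physdesc></extent>"  # tags overlap
bad_case = "<Extent>10 boxes</extent>"                          # case matters
unquoted = "<language langcode=fre>French</language>"           # value not quoted
bad_name = "<cd name>OK Computer</cd name>"                     # space in element name

for label, text in [("good", good), ("bad_nesting", bad_nesting),
                    ("bad_case", bad_case), ("unquoted", unquoted),
                    ("bad_name", bad_name)]:
    print(label, is_well_formed(text))
```

Only the first snippet should be accepted; each of the others breaks one of the rules listed above.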

Look at the following recipe for
Chocolate Brownies. How
would you use XML to mark this up?
(I’m reliably informed the recipe
works!)

Chocolate Brownies
• 375g butter
• 375g dark chocolate
• 1 tablespoon vanilla extract
• 6 eggs
• 500g sugar
• 225g plain flour
• Preheat the oven to 180°C, 350°F or gas mark 4. Grease a swiss roll tin or
oblong baking dish. Melt the chocolate and butter in a bowl over a
saucepan of hot water. Add the vanilla and set the mixture aside until it is
lukewarm.
• Whisk the eggs and sugar into the mixture. Sift in the flour and baking
powder and fold gently until the mixture is just combined. Pour into the
greased tin and bake for 20 to 30 minutes until the brownie is cooked
around the edges, but still soft in the middle.
• Cool and cut into squares.
• Makes 48 brownies

<recipe>
<title>Chocolate Brownies</title>
<ingredients>
<item>375g butter</item>
<item>375g dark chocolate</item>
<item>1 tablespoon vanilla extract</item>
<item>6 eggs</item>
<item>500g sugar</item>
<item>225g plain flour</item>
</ingredients>
<method>
Preheat the oven to <temp>180°C, 350°F or gas mark 4</temp>. Grease a swiss roll tin or oblong
baking dish. Melt the chocolate and butter in a bowl over a saucepan of hot water. Add the vanilla
and set the mixture aside until it is lukewarm. Whisk the eggs and sugar into the mixture.
Sift in the flour and baking powder and fold gently until the mixture is just combined. Pour into
the greased tin and bake for <bakingtime>20 to 30 minutes</bakingtime> until the brownie is
cooked around the edges, but still soft in the middle.
Cool and cut into squares.
</method>
<serving>Makes 48 brownies</serving>
</recipe>
Possible XML
markup for recipe

<ingredient>375 g butter</ingredient>
Or
<ingredient>
<item>375 g butter</item>
</ingredient>
Or
<ingredient>
<type>butter</type>
<quantity>375 g</quantity>
</ingredient>

http://www.archiveshub.ac.uk/temp/recipe.xml

 Valid XML: rules specify elements and attributes
used and how used
 Valid XML provides consistency and facilitates the
exchange of data
 Valid XML is important for displaying, processing and
exchanging XML in a wider environment

 A Document Type Definition or Schema defines the
building blocks of an XML document
 It specifies elements and attributes and defines how
they can be used
 People can agree to use a common DTD/Schema for
interchanging data

<?xml version="1.0" encoding="UTF-16"?>
<!ELEMENT recipe (title, intro?, ingredients+, method, serving*)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT intro (#PCDATA)>
<!ELEMENT ingredients (item+)>
<!ELEMENT item (#PCDATA)>
<!ELEMENT method (p+)>
<!ELEMENT p (#PCDATA | temp | bakingtime)*>
<!ELEMENT temp (#PCDATA)>
<!ELEMENT bakingtime (#PCDATA)>
<!ELEMENT serving (#PCDATA)>

 Schemas perform the same task as DTDs
 Schemas use XML syntax
 Schemas support complex data types
 Easier to describe allowable content
 One XML document can point to more than one
schema

<?xml version="1.0"?>
<note
xmlns="http://www.w3schools.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Rachel</to>
<from>John</from>
<heading>Reminder</heading>
<body>Don't forget the concert!</body>
</note>

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com" elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

[Diagram: XML file → DTD or Schema → Valid XML; the example record "Blue Elephant Papers" is shown alongside a Browse List]

 Use XML technologies – for displaying, retrieving,
transforming, manipulating
 XSLT – Extensible Stylesheet Language for
Transformations
 Many technologies available to manipulate XML
documents

• transformation involves reading an XML file and an XSLT file
into a processor, which can then generate some output,
typically HTML
[Diagram: XML + XSLT → processor → HTML output]
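
The same idea (XML in, HTML out) can be sketched without XSLT. The example below is not XSLT: Python's standard library has no XSLT processor, so a short script stands in for the stylesheet and the processor. The function name `catalog_to_html` and the choice of an HTML list are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# A trimmed copy of the CD catalog example used earlier in the slides.
CATALOG_XML = """<catalog>
  <cd><title>OK Computer</title><artist type="band">Radiohead</artist><year>1997</year></cd>
  <cd><title>Stanley Road</title><artist type="solo">Paul Weller</artist><year>1995</year></cd>
</catalog>"""

def catalog_to_html(xml_text):
    """Turn each <cd> into an HTML list item (a stand-in for an XSLT template)."""
    root = ET.fromstring(xml_text)
    items = []
    for cd in root.findall("cd"):
        # escape() keeps characters such as & and < safe in the HTML output.
        title = escape(cd.findtext("title", default=""))
        artist = escape(cd.findtext("artist", default=""))
        year = escape(cd.findtext("year", default=""))
        items.append(f"<li><em>{title}</em> by {artist} ({year})</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"

print(catalog_to_html(CATALOG_XML))
```

The XML itself is untouched; a different script (or stylesheet) could produce a summary list or a print layout from the same file.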

 HTML is ONLY for display, typically in a Web browser
 HTML tags do not describe the content
 HTML cannot easily be extracted by machines for
different purposes
 XML tags can be specified by anyone; HTML tags are
prescribed

HTML: <h1> Papers of Peter Rowe </h1>
XML: <title> Papers of Peter Rowe </title>
HTML: 21 May 2004 
XML: <date> 21 May 2004 </date>

 International standard, supported by the W3C
 It is open, licence free and platform neutral
 It is human and machine readable
 XML documents are text documents

 XML does not determine the presentation of
the data
◦ use stylesheets to present XML data
◦ with proprietary systems content is inextricably bound up
with format
 Hierarchical structure – good for archive
descriptions!

 XML is the main basis for defining data
exchange languages
 Meaningful tags facilitate extraction – data
can be manipulated as required

 All publicly funded bodies should use XML for
data exchange (e-GIF)
 XML has been widely adopted commercially
as well as in the public sector

 XML is:
◦ simple
◦ flexible
◦ great for data exchange
 XML must be:
◦ well-formed
◦ valid
 DTDs and Schemas:
◦ to create valid XML
◦ provide tags, attributes and rules
 XML requires other XML technologies
◦ e.g. stylesheets can transform XML for display

 EAD = Encoded Archival Description
 EAD is XML for finding aids
 A data structure standard – not a content standard
 A structure that allows finding aids to be indexed,
searched, retrieved and navigated
 Compatible with ISAD(G)

EAD is:
 Flexible enough to deal with all types of finding aids:
single or multi-level, long or short, lists or calendars
etc.
 Used to create new finding aids as well as converting
old ones to standardised form
 Used to share data between systems

 EAD is maintained and developed by an
international working group
 Develops and publishes documentation and
tools: tag library, guidelines, EAD Cookbook,
websites

<ead> EAD root element
<eadheader> EAD file information wrapper
</eadheader>
<archdesc> Finding aid wrapper
<did></did> Core collection information wrapper
</archdesc>
</ead>

<archdesc>
<eadheader>
<did>
sub-fonds descriptions

<eadheader>                            EAD file information
<eadid>                                Identifier
<filedesc><titlestmt><titleproper>     Title
<profiledesc>                          Creation
<revisiondesc>                         Revision

Within <archdesc> there are elements for:
 Description
 Presentation
 Hierarchy

<archdesc>          Archival description
<did>               Descriptive information
<scopecontent>      Scope and Content
<bioghist>          Biographical/Admin. History
<arrangement>       Arrangement
<controlaccess>     Access points

<did>             Descriptive information
<unitid>          Reference
<unittitle>       Title
<unitdate>        Covering dates
<origination>     Creator(s)
<repository>      Repository
<physdesc>        Physical description
<extent>          Extent
<genreform>       Form
<physfacet>       Physical Facet
<physloc>         Location
<container>       Container type
<abstract>        Brief description

<archdesc level="fonds">
<did>
<unitid>GB 0001 Foster</unitid>
<unittitle>Papers of Dr Foster</unittitle>
<unitdate normal = "1820-1833">1820-1833</unitdate>
<repository>University of Gloucestershire</repository>
<physdesc>
<extent>1 box</extent>
<physfacet>Four folders of letters, 230 folios</physfacet>
</physdesc>
<langmaterial><language langcode="eng">English</language>
</langmaterial>
<origination>Dr Foster</origination>
</did>
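
To show how a machine might use this structure, the sketch below pulls the core fields out of the <did> example with Python's standard `xml.etree.ElementTree`. The fragment is closed with </archdesc> so that it parses on its own; the function name `summarise_did` and the dictionary keys are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The Dr Foster example, closed off so that it is well-formed on its own.
EAD_FRAGMENT = """<archdesc level="fonds">
  <did>
    <unitid>GB 0001 Foster</unitid>
    <unittitle>Papers of Dr Foster</unittitle>
    <unitdate normal="1820-1833">1820-1833</unitdate>
    <repository>University of Gloucestershire</repository>
    <physdesc>
      <extent>1 box</extent>
      <physfacet>Four folders of letters, 230 folios</physfacet>
    </physdesc>
    <langmaterial><language langcode="eng">English</language></langmaterial>
    <origination>Dr Foster</origination>
  </did>
</archdesc>"""

def summarise_did(xml_text):
    """Collect the core descriptive fields from an <archdesc> fragment."""
    archdesc = ET.fromstring(xml_text)
    did = archdesc.find("did")
    language = did.find("langmaterial/language")
    return {
        "level": archdesc.get("level"),
        "reference": did.findtext("unitid"),
        "title": did.findtext("unittitle"),
        "dates": did.findtext("unitdate"),
        "extent": did.findtext("physdesc/extent"),
        "creator": did.findtext("origination"),
        "language_code": language.get("langcode") if language is not None else None,
    }

for field, value in summarise_did(EAD_FRAGMENT).items():
    print(field, ":", value)
```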

<acqinfo>            Acquisition information
<custodhist>         Custodial history
<appraisal>          Appraisal and selection
<processinfo>        Process Information
<accruals>           Accruals information
<altformavail>       Copies
<accessrestrict>     Access restrictions
<userestrict>        User restrictions
<prefercite>         Citation information

<bibliography>          Publication note
<fileplan>              Classification scheme
<otherfindaid>          Other finding aids
<relatedmaterial>       Related material
<separatedmaterial>     Separated material
<index>                 Keywords

<controlaccess>     Controlled access headings
<name>              Names (general)
<corpname>          Corporate body name
<persname>          Personal name
<famname>           Family name
<geogname>          Place name
<occupation>        Occupations
<function>          Functions (administrative)
<genreform>         Genre and Form
<subject>           Subject

<head>                                   Headings
<p>; <lb>                                Layout
<emph>; <blockquote>                     Italics and quotes
<list><item>; <chronlist><chronitem>     Lists
<ref>; <ptr>; <dao>                      References, pointers and links to digital objects

NB: EAD is NOT about the presentation
of your finding aids, but about their
syntax. Separate software will take care
of the display of the information.

ISAD(G) (v.2)                                   EAD 2002
3.1.1 Reference code(s)                         <unitid> countrycode and repositorycode attributes
3.1.2 Title                                     <unittitle>
3.1.3 Dates of creation                         <unitdate>
3.1.4 Level of description                      <archdesc> and <c> level attribute
3.1.5 Extent of the unit                        <physdesc>, <extent>
3.2.1 Name of creator                           <origination>
3.2.2 Administrative/Biographical history       <bioghist>
3.2.3 Custodial history                         <custodhist>
3.2.4 Immediate source of acquisition           <acqinfo>
3.3.1 Scope and content                         <scopecontent>
3.3.2 Appraisal, destruction and scheduling     <appraisal>
ISAD(G) (v.2)                                   EAD 2002
3.3.3 Accruals                                  <accruals>
3.3.4 System of arrangement                     <arrangement>
3.4.1 Access conditions                         <accessrestrict>
3.4.2 Copyright/Reproduction                    <userestrict>
3.4.3 Language of material                      <langmaterial>
3.4.4 Physical characteristics                  <phystech>
3.4.5 Finding aids                              <otherfindaid>
3.5.1 Location of originals                     <originalsloc>
3.5.2 Existence of copies                       <altformavail>
3.5.3 Related units of description              <relatedmaterial> and <separatedmaterial>
3.5.4 Publication note                          <bibliography>
3.6.1 Note                                      <odd>

 EAD version 1 DTD
 EAD 2002 DTD
 EAD 2002 Schema
 Available from http://www.loc.gov/ead/
 Human-readable version: EAD Tag Library (Society of
American Archivists)

 Library of Congress Official EAD site:
http://www.loc.gov/ead/
 Tag Library: http://www.loc.gov/ead/tglib/index.html
 EAD Roundtable Help Pages:
http://www.archivists.org/saagroups/ead/

ISAD(G) states that to be a conformant archival
description a finding aid must:
 Be hierarchical
◦ Description from the general to the specific
◦ Information relevant to the level of description
◦ Linking of descriptions (logical sequence)
◦ Non-repetition of information
 Contain a minimum set of data elements

 Recommended elements for lower level
descriptions:
◦ reference code
◦ title
◦ date(s)
◦ extent of the unit of description
◦ level of description

ISAD(G) levels:     EAD levels:
• Fonds             <archdesc>
• Sub-fonds         <dsc><c01>
• Series            <c02>
• Sub-series        <c03>
• File              <c04>
• Item              <c05>

<ead>…
  <archdesc>
    [collection level description here]
    <dsc>
      <c01>[series] description 1
        <c02>[file] description 1</c02>
        <c02>[file] description 2
          <c03>[item] 1</c03>
          <c03>[item] 2</c03>
        </c02>
      </c01>
      <c01>[series] description 2....
    </dsc>
  </archdesc>
</ead>
[Diagram: a c01 component containing c02 components, which in turn contain c03 components]

<c01 level = "subfonds">
<did>
<unitid>GB 0324 MS 54</unitid>
<unittitle>Correspondence files</unittitle>
<unitdate>1920-1945</unitdate>
<physdesc><extent>4 files</extent></physdesc>
</did>
<scopecontent>…</scopecontent>
<c02 level = "series">
<did>…</did>
<scopecontent>…</scopecontent>
</c02>
</c01>

• EAD supports two ways of representing levels: unnumbered <c>
elements and numbered ones (<c01> to <c12>)
• <c> is used in A2A, <c0*> on the Hub
• Slightly easier to use <c0*>, as the numbers give you
more of an idea of the level you are working at

<dsc type="combined">
  <c level="series">
    <did> <unitid>Series 1</unitid>
      <unittitle>Correspondence</unittitle> </did>
    <scopecontent>[...]</scopecontent>
    <c level="subseries">
      <did> <unitid>Subseries 1.1</unitid>
        <unittitle>Outgoing Correspondence</unittitle> </did>
      <c level="file"> <did> <unittitle>AbbingerAldrich</unittitle> </did>
      </c>
    </c>
  </c>
</dsc>
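
With unnumbered <c> elements the depth is not visible in the tag name, but software can work it out from the nesting. The sketch below walks a <dsc> and reports the depth of each component, treating <c> and the numbered forms <c01> to <c12> alike. It uses a trimmed copy of the example above; the function name `outline` and the regular expression are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's example; the file-level component keeps only its level.
DSC_XML = """<dsc type="combined">
  <c level="series">
    <did><unitid>Series 1</unitid><unittitle>Correspondence</unittitle></did>
    <c level="subseries">
      <did><unitid>Subseries 1.1</unitid><unittitle>Outgoing Correspondence</unittitle></did>
      <c level="file"/>
    </c>
  </c>
</dsc>"""

# Matches the unnumbered <c> and the numbered <c01> ... <c12> component tags.
COMPONENT_TAG = re.compile(r"c(0[1-9]|1[0-2])?")

def outline(parent, depth=1):
    """Return (depth, level attribute, unitid) for every nested component."""
    rows = []
    for child in parent:
        if COMPONENT_TAG.fullmatch(child.tag):
            rows.append((depth, child.get("level"), child.findtext("did/unitid")))
            rows.extend(outline(child, depth + 1))
    return rows

for depth, level, unitid in outline(ET.fromstring(DSC_XML)):
    print("  " * (depth - 1) + f"{level} ({unitid})")
```

Because the depth comes from the nesting and not from the tag name, the same function handles a finding aid written with <c01>, <c02> and so on.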

 XML is a meta-language for creating mark-up
languages
 XML files require other technologies for display,
processing, etc.
 For archive finding aids EAD is the DTD/Schema to
use

 It is XML, which is an international standard
 It is a simple and effective way of structuring content
and providing meaning
 Machines can manipulate the content in all sorts of
ways
 It is a great format to store finding-aids

 Effective cross-searching requires:
◦ Interoperability
 which requires
◦ Common standards

• UKAD: http://www.ukad.org/
• To promote the opening up of data and to offer capacity for
cross-searching across the UK archive networks and
online repository catalogues
• To lead and support resource discovery through the promotion of
relevant national and international standards
• To support the development and use of name authorities

 To advocate for the reduction of cataloguing
backlogs and the retro-conversion of hard-copy
catalogues
 To promote access to digitized and digital archives
via cross-searching resource discovery systems.
 To work with other domains and potential funders to
promote archive discovery

 Fairly loose structure
 Meetings about twice a year
 Forum for discussion, sharing, connecting and collaborating
 Creating a framework for activities (matrix)
◦ International/national/regional
◦ Meeting UKAD objectives, e.g. open up data; standards-based resource
discovery; retro-conversion

 Not many UK archives currently using EAD as a storage format
 EAD will increasingly be used as an export format from
proprietary database systems like CALM, for use in XML-based
gateways such as Aim25 and the Archives Hub
 New software becoming available all the time, which makes it
easier to create, search and display XML – much of this is
open source and often free

 Differences in how EAD is used
 Encourages interoperability but still requires work to
ensure seamless cross-searching
 EAD is flexible and includes a large number of tags
which has advantages and disadvantages

 XML is an international standard for sharing
information
 EAD is the XML language for archival finding aids
 EAD is not a content standard
 Use ISAD(G) for content guidelines and thesauri or
authority files for index terms

• You have used the Archives Hub’s EAD editor to
create EAD records
• XML editors, such as XMetaL or XMLSpy, can
help with validating and with selecting tags and
attributes
 EAD will become increasingly important
