1. 1. EAD Revision
2. EAC-CPF: an introduction
Timothy Ryan Mendenhall
Leo Baeck Institute
2012 March 28
2. EAD Revision
Timetable:
Currently: analyzing comments
submitted during open comment
period
December 2012: draft schema for
revision and comment
August 2013: release of new schema
3. EAD Revision
What to expect:
Migration plan
Interoperability:
• better support for the semantics of
relationships (cf. EAC-CPF, RDA)
Interchange:
• data interchange trumps presentation
• promote uniform and predictable use to
enable better interchange of data.
4. EAD Revision: the details. . .
Schema only -- DTD will be
deprecated
Simplification:
Reduced number of tags
Deprecate presentation-oriented tags
like <emph>, <head>, <table>
DTD → Schema
5. EAD Revision: the details. . .
Simplification:
Simplified header
Simplified hierarchical structure
• <c01>, <c02> etc merge into
undifferentiated <c> tags
• Wrapper and structural tags like <dsc>
might be deprecated
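A rough illustration of the component merge. It assumes the revised schema keeps the `level` attribute and the `<did>` wrapper as they are in EAD 2002; the titles are invented. EAD 2002 already permits unnumbered `<c>`, so the second form is valid today:

```xml
<!-- Numbered components: nesting depth is repeated in the tag name -->
<c01 level="series">
  <did><unittitle>Correspondence</unittitle></did>
  <c02 level="file">
    <did><unittitle>Family letters</unittitle></did>
  </c02>
</c01>

<!-- Unnumbered components: depth is carried by nesting alone -->
<c level="series">
  <did><unittitle>Correspondence</unittitle></did>
  <c level="file">
    <did><unittitle>Family letters</unittitle></did>
  </c>
</c>
```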
6. EAD Revision: the details. . .
Make EAD more database-friendly:
Less mixed content, more tagged data
More specific, granular tags: e.g. forenames
and surnames
More flexibility for normalizing dates (multiple
dates, ranges of dates, etc. Cf. EAC, RDA)
Geo-tagging
“Profiles” of tag sets for different types of
repositories
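To make "more tagged data" concrete, here is a sketch that contrasts a single-string name heading with a granular encoding. The granular form borrows EAC-CPF element names; whether the revised EAD adopts these exact names is an assumption, and the person and dates are invented:

```xml
<!-- One string: software has to parse it to find the surname or the dates -->
<persname>Example, Anna, 1901-1975</persname>

<!-- Granular: each piece of data has its own tag (EAC-CPF element names) -->
<nameEntry>
  <part localType="surname">Example</part>
  <part localType="forename">Anna</part>
</nameEntry>
<existDates>
  <dateRange>
    <fromDate standardDate="1901">1901</fromDate>
    <toDate standardDate="1975">1975</toDate>
  </dateRange>
</existDates>
```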
7. EAD Revision: the details. . .
Extend potential for language
qualifications:
<geogname language="ger">Köln (Deutschland)</geogname>
and/or
<geogname language="eng">Cologne (Germany)</geogname>
8. Data-centered model: Goals
Improve machine-readability of
finding aids
Aid in the sharing of finding aid
data across platforms, CMSs,
languages, countries, and different
aggregators
Move away from the document
model: finding aid as a fluid,
malleable record, not a fixed
document
9. Effect on CJH
Likely minimal – migration paths will
be made available
Conversion from EAD-DTD to EAD-Schema
Creation of task force?
Resources, stylesheets available
Creation of new EAD templates
New possibilities!
11. Basics
EAC-CPF: Encoded Archival Context
– Corporate Bodies, Persons,
Families
XML vocabulary
Based on ISAAR(CPF): international standard
related to ISAD(G)
Adopted by SAA in 2011: standard for
archival authority data
12. Features
Parallels many RDA changes
Increased granularity of data
• E.g. life dates split into birth and death
dates
Emphasis on relations
• With other resources
• With other corporate bodies, persons,
families
• With functions
13. Features
Compatibility with existing authority
data (LCSH, etc.)
Wrapper elements allow wholesale
inclusion of outside metadata, e.g.
authority MARC-XML
Great flexibility for alternate names,
variant forms, local implementations
14. Features
Accommodates 4 different types of
“entities”
Single identity
Multiple identity
• Many in one (single EAC-CPF instance)
• One in many (multiple instances)
Alternative sets (i.e. variant records)
15. Why EAC?
Part of broader move towards
semantic web, linked open data (LOD)
Better end-user experience
Improves capacity for faceted
searching
More intuitive web interfaces
Standardization of authority data
Sharing of authority data
Eventually – saves time
17. Basic structure
Like EAD (and MARC), divided into
control and descriptive sections:
<eac-cpf>
<control> […] </control>
<cpfDescription>
[ALTERNATE
<multipleIdentities><cpfDescription> . . .]
</cpfDescription>
</eac-cpf>
18. Basic structure : Control
Administrative data about the record
itself
Required elements:
• recordId
• maintenanceAgency
• maintenanceStatus
• maintenanceHistory
• languageDeclaration
• sources
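A minimal sketch of a control section containing the six elements listed above. The identifier, agency, cataloger, and source are invented placeholders, and the element order follows the 2010 schema as best recalled, so validate before relying on it:

```xml
<control>
  <recordId>example-0001</recordId>
  <maintenanceStatus>new</maintenanceStatus>
  <maintenanceAgency>
    <agencyName>Example Archive</agencyName>
  </maintenanceAgency>
  <languageDeclaration>
    <language languageCode="eng">English</language>
    <script scriptCode="Latn">Latin</script>
  </languageDeclaration>
  <maintenanceHistory>
    <maintenanceEvent>
      <eventType>created</eventType>
      <eventDateTime standardDateTime="2012-03-28">2012 March 28</eventDateTime>
      <agentType>human</agentType>
      <agent>Example Cataloger</agent>
    </maintenanceEvent>
  </maintenanceHistory>
  <sources>
    <source>
      <sourceEntry>Finding aid for the Anna Example Collection</sourceEntry>
    </source>
  </sources>
</control>
```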
19. Basic structure : Control
Optional elements
Allow for local customization
Use of other identifiers for same entity
(e.g. from other thesauri, other national
libraries, etc.)
20. Basic structure : cpfDescription
Descriptive section <cpfDescription>
For most records: single <identity>
For complex identities
• many-in-one, corporate and compound
entities
• multiple <cpfDescription> elements
wrapped in <multipleIdentities> tag
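A sketch of the many-in-one case, for instance a writer and her pseudonym described in a single instance. Both names are invented, and the namespace shown is the one used by the 2010 EAC-CPF schema:

```xml
<eac-cpf xmlns="urn:isbn:1-931666-33-4">
  <control><!-- administrative data as on slide 18 --></control>
  <multipleIdentities>
    <!-- First identity: the legal name -->
    <cpfDescription>
      <identity>
        <entityType>person</entityType>
        <nameEntry><part>Example, Anna</part></nameEntry>
      </identity>
    </cpfDescription>
    <!-- Second identity: the pseudonym -->
    <cpfDescription>
      <identity>
        <entityType>person</entityType>
        <nameEntry><part>Sample, A. E.</part></nameEntry>
      </identity>
    </cpfDescription>
  </multipleIdentities>
</eac-cpf>
```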
21. Basic structure : cpfDescription
Required: <identity>
Optional:
<description>
<relations>
<alternativeSet> -- alternate records for
the same entity imported from a
different authority system, such as
LCSH, VIAF, or a different national
library.
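A sketch of pointing to an alternate record held in another authority system. In the schema the wrapper is named `<alternativeSet>` and each alternate record is a `<setComponent>`; the URL and heading below are placeholders, not real identifiers:

```xml
<alternativeSet xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- One setComponent per alternate record for the same entity -->
  <setComponent xlink:type="simple"
      xlink:href="http://example.org/authority/12345">
    <componentEntry>Example, Anna, 1901-1975</componentEntry>
  </setComponent>
</alternativeSet>
```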
22. Basic structure : cpfDescription
Descriptive section: required <identity>
Most complex element
Parallels RDA changes:
• Increased functionality for parallel and
variant forms of names
• Can distinguish between “authorized” and
“preferred” forms of a name
• Increased granularity (parts of names,
dates)
• Ability to qualify variant forms of names
by “use dates”
23. Basic structure : cpfDescription
Optional <description>
Very similar to RDA, but encoded in XML
<existDates>
• <date>, <dateRange>, <dateSet>
<places>
• May be qualified by dates and roles
• Place of residence, place of birth, place of
death, etc.
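A sketch of existence dates and two places, one of them qualified by a date range. All values are invented, and the order of child elements inside `<place>` should be checked against the schema before use:

```xml
<description>
  <existDates>
    <dateRange>
      <fromDate standardDate="1901-05-12">12 May 1901</fromDate>
      <toDate standardDate="1975">1975</toDate>
    </dateRange>
  </existDates>
  <places>
    <place>
      <placeRole>Place of birth</placeRole>
      <placeEntry>Köln (Deutschland)</placeEntry>
    </place>
    <place>
      <placeRole>Place of residence</placeRole>
      <placeEntry>New York (N.Y.)</placeEntry>
      <dateRange>
        <fromDate standardDate="1938">1938</fromDate>
        <toDate standardDate="1975">1975</toDate>
      </dateRange>
    </place>
  </places>
</description>
```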
24. Basic structure : cpfDescription
Optional <description>
All may be qualified by dates:
<occupations>
<functions>
<legalStatus> (corporate body)
<mandates> (corporate body)
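The same date qualification applies to the other descriptive elements. A short sketch for an occupation, with an invented term and dates:

```xml
<occupations>
  <occupation>
    <term>Teacher</term>
    <!-- Period during which the occupation was held -->
    <dateRange>
      <fromDate standardDate="1925">1925</fromDate>
      <toDate standardDate="1938">1938</toDate>
    </dateRange>
  </occupation>
</occupations>
```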
25. Basic structure : cpfDescription
Optional <description>
“Free text” descriptive sections:
<biogHist> -- same as in EAD
<generalContext> -- “general social
and cultural context”
<structureOrGenealogy>
• Structure of corporate bodies
• Genealogy of individuals, families
26. Basic structure : cpfDescription
Relations section:
<cpfRelation> -- relations to other
“entities”
<functionRelation>
<resourceRelation>
<objectXMLWrap> to include other
records, portions of other records
27. Basic structure : cpfDescription
Relations section:
All have “relation type” attributes to help
specify the type of relation:
• cpf: Family, associative, hierarchical-child,
hierarchical-parent, etc.
• Functions: controls, owns, performs, etc.
• Resources: creator, subject, etc.
To include other records, portions of other
records:
• <objectXMLWrap>
• <objectBinWrap>
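A sketch of one relation of each kind. The attribute values `family`, `creatorOf`, and `performs` are among those the schema defines; the names, the function, and the URL are invented placeholders:

```xml
<relations xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- Relation to another person, family, or corporate body -->
  <cpfRelation cpfRelationType="family">
    <relationEntry>Example, Josef</relationEntry>
  </cpfRelation>
  <!-- Relation to a resource, here a finding aid for the entity's papers -->
  <resourceRelation resourceRelationType="creatorOf"
      xlink:type="simple"
      xlink:href="http://example.org/findingaids/anna-example.xml">
    <relationEntry>Anna Example Collection</relationEntry>
  </resourceRelation>
  <!-- Relation to a function -->
  <functionRelation functionRelationType="performs">
    <relationEntry>Teaching</relationEntry>
  </functionRelation>
</relations>
```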
28. Implementation at CJH?
Via Digitool?
Similar to MARC to EAD
• Wholesale batch conversion
• Issues:
• data cleanup
• Skeletal data
• Resolving differences in existing biographical
notes, etc.
• Digitool’s interface – not good for “active”
records needing frequent syncing, updating
29. Implementation at CJH?
Via Digitool?
Steps required:
• Batch export of authority data from Aleph
• LCNAF is also available for download
• MARC to EAC stylesheet
• Google Refine: cleanup data
• Ingest to Digitool
• EAC to HTML stylesheet
• Google Refine: resolution with existing
EADs
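The "MARC to EAC stylesheet" step could start from something like the XSLT 1.0 sketch below. It is not one of the published stylesheets: it assumes personal-name authority records only (MARC 100 and 400 fields, subfields a and d), handles one record at a time, and emits just an `<identity>` fragment rather than a complete EAC-CPF instance:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns="urn:isbn:1-931666-33-4"
    exclude-result-prefixes="marc">

  <xsl:output method="xml" indent="yes"/>

  <!-- One MARC authority record becomes one <identity> fragment -->
  <xsl:template match="marc:record">
    <identity>
      <!-- Assumption: every record describes a person -->
      <entityType>person</entityType>
      <xsl:apply-templates
          select="marc:datafield[@tag='100' or @tag='400']"/>
    </identity>
  </xsl:template>

  <!-- Each heading (100) or see-from tracing (400) becomes a <nameEntry> -->
  <xsl:template match="marc:datafield">
    <nameEntry>
      <xsl:for-each select="marc:subfield[@code='a' or @code='d']">
        <part><xsl:value-of select="normalize-space(.)"/></part>
      </xsl:for-each>
    </nameEntry>
  </xsl:template>

  <!-- Suppress stray text outside the matched fields -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

Trailing MARC punctuation (for example the comma after a name in subfield a) is left in place here, which is one reason the cleanup step follows the conversion.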
30. Implementation at CJH?
Via Digitool?
Potential for *labor-intensive* edits
• Roles within collection
• Center-wide agreement on relator terms
(RDA?), manually updating EAD “role”
attributes
• Expansion of biogHist,
structureOrGenealogy, etc.
Custom database outside of Digitool?
Eventually – ArchivesSpace?
31. Future potential
Crowd-sourcing
Relationship data
Function data (“Is correspondent”, “Is
subject” etc)
Genealogical data
Harvesting biographical, historical and
genealogical data
DBPedia
JewishGen
32. Resources
MARC-XML to EAC stylesheets
Entire LCSH, LCNAF available for
download (MADS/RDF):
http://id.loc.gov/download/
EAC-Pages:
http://eac.staatsbibliothek-berlin.de/
EAC listserv = EAD listserv
Editor's Notes
Image found at: http://blogs.extremeexperts.com/tag/migration/