Archives hub ead 2010_extended
Extended version of Archives Hub presentation

  • This is just one way that the recipe could be marked up. This would be valid XML. Notice the pairing of the tags and that this is well nested.
  • Key UKAD partners: Access 2 Archives, Archives Hub, AIM25, Archives Wales, Genesis, Janus, National Register of Archives, Scottish Archives Network, A Vision of Britain
  • Transcript

    • 1. Lisa Jeskins and Bethan Ruddock Archives Hub Mimas
    • 2. By the end of today’s session we will have given you an introduction to: • what interoperability means • what XML is, what it does and why it is important • EAD structure and syntax • EAD and hierarchies • UK Archives Discovery Network (UKAD)
    • 3. Interoperability: the ability of two or more systems or components to exchange information and to use the information that has been exchanged (IEEE Standard Computer Dictionary)
    • 4.  the ability to exchange/share data  integration of information resources presented in different formats  within a domain or across domains  advantages of cross-searching  XML facilitates interoperability
    • 5.  Data exchange standards such as: ◦ Z39.50 ◦ SRU
    • 6.  user can easily search across and retrieve resources from a wealth of systems  moving beyond individual websites for individual resources (silo approach)
    • 7.  ◦ to explore, publicise and mobilise the benefits and practice of effective interoperability across diverse information sectors
    • 8.  Extensible Markup Language  XML is a grammatical system for creating languages: ◦ a meta-language  Use XML to design your own markup language, consisting of meaningful tags that describe the data they contain  Create a language for describing…anything
    • 9. • XML does not do anything itself. It is pure information wrapped in XML tags • You must use other means to send, receive or display the data [Diagram: XML is used by XML technologies to create a detailed description to view in a browser, a summary entry to view in a browser, or a PDF for print]
    • 10.  XML is not about content, though there might be certain restrictions on content  XML is essentially about structure  Creating a consistent structure via XML tagging enables content to be easily identified (by machines) and used flexibly
    • 11. <title> Alice in Wonderland </title> *XML allows you to define your tags* <book>Alice in Wonderland</book> <filmtitle>Alice in Wonderland</filmtitle> <tag> content </tag>
    • 12. • Attributes are simple name/value pairs associated with an element <tag attribute_name="attribute_value">content</tag> <language>English</language> <language langcode="eng">English</language> <date normal="2004">20 Sept 2004</date>
    • 13. <tag attribute_name="attribute_value">content</tag> <tree>hornbeam</tree> <tree type="deciduous">hornbeam</tree> <date normal="2004">20 May 2004</date> <date>20 May 2004</date> This is an XML element
    • 14. <trees> <tree type="deciduous"> <species>oak</species> <fruit>acorn</fruit> </tree> <tree type="coniferous"> <species>pine</species> <fruit>pine cone</fruit> </tree> </trees>
    • 15. <catalog> <cd> <title>OK Computer</title> <artist type="band">Radiohead</artist> <genre>pop</genre> <year>1997</year> </cd> <cd> <title>Stanley Road</title> <artist type="solo">Paul Weller</artist> <genre>pop</genre> <year>1995</year> </cd> </catalog> Or, with the type as an element rather than an attribute: <title>Stanley Road</title> <artist>Paul Weller</artist> <type>solo</type> <genre>pop</genre> <year>1995</year>
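The attribute and element examples above can be read back by a program. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` parser (not something the presentation itself uses), that loads the catalogue from slide 15 and reads both element content and the `type` attribute.

```python
import xml.etree.ElementTree as ET

# The catalogue from slide 15, with straight quotes around attribute values.
catalog_xml = """<catalog>
  <cd>
    <title>OK Computer</title>
    <artist type="band">Radiohead</artist>
    <genre>pop</genre>
    <year>1997</year>
  </cd>
  <cd>
    <title>Stanley Road</title>
    <artist type="solo">Paul Weller</artist>
    <genre>pop</genre>
    <year>1995</year>
  </cd>
</catalog>"""

catalog = ET.fromstring(catalog_xml)

# Element content comes from .text / findtext(); attribute values from .get().
artists = []
for cd in catalog.findall("cd"):
    artist = cd.find("artist")
    artists.append((cd.findtext("title"), artist.text, artist.get("type")))

for title, name, artist_type in artists:
    print(f"{title}: {name} ({artist_type})")
```

Whether a value such as "solo" is better held as an attribute or as a child element is a design choice; both forms are equally easy to read back.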
    • 16. Alice in Wonderland Lewis Carroll 1 volume hardback
    • 17. Title Alice in Wonderland Author Lewis Carroll Extent 1 volume Format hardback
    • 18. <books> <title>Alice in Wonderland</title> <author>Lewis Carroll</author> <extent>1 volume</extent> <format>hardback</format> </books>
    • 19.  a root element is required <catalog> …..all your tags and content… </catalog>  closing tags are required  case matters
    • 20. • elements must be properly nested. Correct: <physdesc> <extent>10 boxes</extent> </physdesc> Incorrect: <physdesc> <extent>10 boxes</physdesc> </extent>
    • 21. • attribute values must be enclosed in quotation marks, e.g. langcode="fre" • element names must obey some basic rules ◦ e.g. cannot start with numbers or punctuation characters, cannot contain spaces ◦ e.g. <cd name> or <?name> would be incorrect
    • 22. Look at the following recipe for Chocolate Brownies – How would you use XML to mark this up? (I’m reliably informed the recipe works!)
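The well-formedness rules on slides 19 to 21 can be checked mechanically. This is a small sketch, assuming Python's standard-library `xml.etree.ElementTree` parser; the helper name `is_well_formed` is made up for illustration. It only tests well-formedness, not validity against a DTD or Schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, otherwise False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


examples = {
    # Properly nested: <extent> opens and closes inside <physdesc>.
    "nested correctly": "<physdesc><extent>10 boxes</extent></physdesc>",
    # Closing tags in the wrong order.
    "badly nested": "<physdesc><extent>10 boxes</physdesc></extent>",
    # Case matters: <Extent> is not closed by </extent>.
    "case mismatch": "<physdesc><Extent>10 boxes</extent></physdesc>",
    # Attribute values must be enclosed in quotation marks.
    "unquoted attribute": "<language langcode=fre>French</language>",
    # Element names cannot contain spaces.
    "space in name": "<cd name>OK Computer</cd name>",
    # A single root element is required.
    "two roots": "<cd>OK Computer</cd><cd>Stanley Road</cd>",
}

results = {label: is_well_formed(text) for label, text in examples.items()}

for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed'}")
```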
    • 23. Chocolate Brownies • 375g butter • 375g dark chocolate • 1 tablespoon vanilla extract • 6 eggs • 500g sugar • 225g plain flour • Preheat the oven to 180°C, 350°F or gas mark 4. Grease a swiss roll tin or oblong baking dish. Melt the chocolate and butter in a bowl over a saucepan of hot water. Add the vanilla and set the mixture aside until it is lukewarm. • Whisk the eggs and sugar into the mixture. Sift in the flour and baking powder and fold gently until the mixture is just combined. Pour into the greased tin and bake for 20 to 30 minutes until the brownie is cooked around the edges, but still soft in the middle. • Cool and cut into squares. • Makes 48 brownies
    • 24. Possible XML markup for recipe: <recipe> <title>Chocolate Brownies</title> <ingredients> <item>375g butter</item> <item>375g dark chocolate</item> <item>1 tablespoon vanilla extract</item> <item>6 eggs</item> <item>500g sugar</item> <item>225g plain flour</item> </ingredients> <method> <p>Preheat the oven to <temp>180°C, 350°F or gas mark 4</temp>. Grease a swiss roll tin or oblong baking dish. Melt the chocolate and butter in a bowl over a saucepan of hot water. Add the vanilla and set the mixture aside until it is lukewarm. Whisk the eggs and sugar into the mixture.</p> <p>Sift in the flour and baking powder and fold gently until the mixture is just combined. Pour into the greased tin and bake for <bakingtime>20 to 30 minutes</bakingtime> until the brownie is cooked around the edges, but still soft in the middle.</p> <p>Cool and cut into squares.</p> </method> <serving>Makes 48 brownies</serving> </recipe>
    • 25. <ingredient>375 g butter</ingredient> Or <ingredient> <item>375 g butter</item> </ingredient> Or <ingredient> <type>butter</type> <quantity>375 g</quantity> </ingredient>
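Once the recipe is marked up, the tags make it easy for a program to pull out individual pieces. The sketch below uses a shortened copy of the slide 24 markup (the method text is abbreviated here) and assumes Python's standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the recipe markup; the method paragraphs are abbreviated.
recipe_xml = """
<recipe>
  <title>Chocolate Brownies</title>
  <ingredients>
    <item>375g butter</item>
    <item>375g dark chocolate</item>
    <item>6 eggs</item>
  </ingredients>
  <method>
    <p>Preheat the oven to <temp>180°C, 350°F or gas mark 4</temp>.</p>
    <p>Bake for <bakingtime>20 to 30 minutes</bakingtime>.</p>
  </method>
  <serving>Makes 48 brownies</serving>
</recipe>
"""

recipe = ET.fromstring(recipe_xml)

# Each value is located by the name of its tag, not by its position in the text.
title = recipe.findtext("title")
ingredients = [item.text for item in recipe.findall("ingredients/item")]
oven_temp = recipe.findtext("method/p/temp")
baking_time = recipe.findtext("method/p/bakingtime")

print(title)
print(ingredients)
print(oven_temp)
print(baking_time)
```

The finer-grained markup on slide 25 (separate `<type>` and `<quantity>` elements) would allow the same approach to return quantities on their own, at the cost of more tagging.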
    • 26.
    • 27. • Valid XML: rules specify which elements and attributes may be used, and how • Valid XML provides consistency and facilitates the exchange of data • Valid XML is important for displaying, processing and exchanging XML in a wider environment
    • 28.  A Document Type Definition or Schema defines the building blocks of an XML document  It specifies elements and attributes and defines how they can be used  People can agree to use a common DTD/Schema for interchanging data
    • 29. <?xml version="1.0" encoding="UTF-16"?> <!ELEMENT recipe (title, intro?, ingredients+, method, serving*)> <!ELEMENT title (#PCDATA)> <!ELEMENT intro (#PCDATA)> <!ELEMENT ingredients (item+)> <!ELEMENT item (#PCDATA)> <!ELEMENT method (p+)> <!ELEMENT p (#PCDATA | temp | bakingtime)*> <!ELEMENT temp (#PCDATA)> <!ELEMENT bakingtime (#PCDATA)> <!ELEMENT serving (#PCDATA)>
    • 30.  Schemas perform the same task as DTDs  Schemas use XML syntax  Schemas support complex data types  Easier to describe allowable content  One XML document can point to more than one schema
    • 31. <?xml version="1.0"?> <note xmlns="" xmlns:xsi="" xsi:schemaLocation=" note.xsd"> <to>Rachel</to> <from>John</from> <heading>Reminder</heading> <body>Don't forget the concert!</body> </note>
    • 32. <?xml version="1.0"?> <xs:schema xmlns:xs="" targetNamespace="" xmlns="" elementFormDefault="qualified"> <xs:element name="note"> <xs:complexType> <xs:sequence> <xs:element name="to" type="xs:string"/> <xs:element name="from" type="xs:string"/> <xs:element name="heading" type="xs:string"/> <xs:element name="body" type="xs:string"/> </xs:sequence> </xs:complexType> </xs:element> </xs:schema>
    • 33. [Diagram: XML file + DTD or Schema → Valid XML; example outputs: a "Blue Elephant Papers" description and a browse list]
    • 34.  Use XML technologies – for displaying, retrieving, transforming, manipulating  XSLT – Extensible Stylesheet Language for Transformations  Many technologies available to manipulate XML documents
    • 35. • transformation involves the reading in of an XML file and an XSLT file to a processor, which can then generate some output – typically HTML [Diagram: XML + XSLT → processor → HTML output]
    • 36.  HTML is ONLY for display, typically in a Web browser  HTML tags do not describe the content  HTML cannot easily be extracted by machines for different purposes  XML tags can be specified by anyone; HTML tags are prescribed
    • 37. HTML: <h1> Papers of Peter Rowe </h1> XML: <title> Papers of Peter Rowe </title> HTML: <b> 21 May 2004 </b> XML: <date> 21 May 2004 </date>
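The idea of turning descriptive XML tags into display-only HTML tags can be sketched in a few lines. This is not XSLT: Python's standard library has no XSLT processor, so the example below stands in for the transformation step with plain `xml.etree.ElementTree` code. The `<record>` wrapper element and the `TAG_TO_HTML` mapping are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical display rules: which HTML tag to use for each XML tag.
TAG_TO_HTML = {"title": "h1", "date": "b"}


def to_html(xml_text: str) -> str:
    """Render each child of the root element using the display rules above."""
    root = ET.fromstring(xml_text)
    parts = []
    for child in root:
        html_tag = TAG_TO_HTML.get(child.tag, "p")  # fall back to a paragraph
        text = escape((child.text or "").strip())
        parts.append(f"<{html_tag}>{text}</{html_tag}>")
    return "\n".join(parts)


# <record> is a made-up root element wrapping the two examples from slide 37.
record = (
    "<record>"
    "<title> Papers of Peter Rowe </title>"
    "<date> 21 May 2004 </date>"
    "</record>"
)

html_output = to_html(record)
print(html_output)
```

The XML keeps saying what each piece of content is; only the mapping decides how it looks, so a different mapping could produce a different display from the same file.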
    • 38.  International standard, supported by the W3C  It is open, licence free and platform neutral  It is human and machine readable  XML documents are text documents
    • 39.  XML does not determine the presentation of the data ◦ use stylesheets to present XML data ◦ with proprietary systems content is inextricably bound up with format  Hierarchical structure – good for archive descriptions!
    • 40.  XML is the main basis for defining data exchange languages  Meaningful tags facilitate extraction – data can be manipulated as required
    • 41.  All publicly funded bodies should use XML for data exchange (e-GIF)  XML has been widely adopted commercially as well as in the public sector
    • 42.  XML is: ◦ simple ◦ flexible ◦ great for data exchange  XML must be: ◦ well-formed ◦ valid  DTDs and Schemas: ◦ to create valid XML ◦ provide tags, attributes and rules  XML requires other XML technologies ◦ e.g. stylesheets can transform XML for display
    • 43.  EAD = Encoded Archival Description  EAD is XML for finding aids  A data structure standard – not a content standard  A structure that allows finding aids to be indexed, searched, retrieved and navigated  Compatible with ISAD(G)
    • 44. EAD is:  Flexible enough to deal with all types of finding aids: single or multi-level, long or short, lists or calendars etc.  Used to create new finding aids as well as converting old ones to standardised form  Used to share data between systems
    • 45.  EAD is maintained and developed by an international working group  Develops and publishes documentation and tools: tag library, guidelines, EAD Cookbook, websites
    • 46. <ead> <eadheader> </eadheader> <archdesc> <did></did> </archdesc> </ead>
    • 47. <ead> EAD root element <eadheader> EAD file information wrapper </eadheader> <archdesc> Finding aid wrapper <did></did> Core collection information wrapper </archdesc> </ead>
    • 48. <archdesc> <eadheader> <did> sub-fonds descriptions
    • 49. <eadheader>: EAD file information; <eadid>: Identifier; <filedesc> <titlestmt> <titleproper>: Title; <profiledesc>: Creation; <revisiondesc>: Revision
    • 50. Within <archdesc> there are elements for:  Description  Presentation  Hierarchy
    • 51. <archdesc>: Archival description; <did>: Descriptive information; <scopecontent>: Scope and Content; <bioghist>: Biographical/Admin. History; <arrangement>: Arrangement; <controlaccess>: Access points
    • 52. <did>: Descriptive information; <unitid>: Reference; <unittitle>: Title; <unitdate>: Covering dates; <origination>: Creator(s); <repository>: Repository; <physdesc>: Physical description; <extent>: Extent; <genreform>: Form; <physfacet>: Physical Facet; <physloc>: Location; <container>: Container type; <abstract>: Brief description
    • 53. <archdesc level="fonds"> <did> <unitid>GB 0001 Foster</unitid> <unittitle>Papers of Dr Foster</unittitle> <unitdate normal="1820-1833">1820-1833</unitdate> <repository>University of Gloucestershire</repository> <physdesc> <extent>1 box</extent> <physfacet>Four folders of letters, 230 folios</physfacet> </physdesc> <langmaterial><language langcode="eng">English</language> </langmaterial> <origination>Dr Foster</origination> </did>
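Because the `<did>` elements have fixed names, a program can build a summary entry from a collection-level description. This is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the record is the Dr Foster example with a closing `</archdesc>` added so that it parses on its own, and the keys of the `summary` dictionary are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The Dr Foster example, closed with </archdesc> so it is a complete document.
archdesc_xml = """
<archdesc level="fonds">
  <did>
    <unitid>GB 0001 Foster</unitid>
    <unittitle>Papers of Dr Foster</unittitle>
    <unitdate normal="1820-1833">1820-1833</unitdate>
    <repository>University of Gloucestershire</repository>
    <physdesc>
      <extent>1 box</extent>
      <physfacet>Four folders of letters, 230 folios</physfacet>
    </physdesc>
    <langmaterial><language langcode="eng">English</language></langmaterial>
    <origination>Dr Foster</origination>
  </did>
</archdesc>
"""

archdesc = ET.fromstring(archdesc_xml)
did = archdesc.find("did")

# Element text via findtext(); attribute values (level, normal, langcode) via get().
summary = {
    "level": archdesc.get("level"),
    "reference": did.findtext("unitid"),
    "title": did.findtext("unittitle"),
    "dates": did.find("unitdate").get("normal"),
    "extent": did.findtext("physdesc/extent"),
    "creator": did.findtext("origination"),
    "language_code": did.find("langmaterial/language").get("langcode"),
}

for field, value in summary.items():
    print(f"{field}: {value}")
```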
    • 54. <acqinfo>: Acquisition information; <custodhist>: Custodial history; <appraisal>: Appraisal and selection; <processinfo>: Process Information; <accruals>: Accruals information; <altformavail>: Copies; <accessrestrict>: Access restrictions; <userestrict>: User restrictions; <prefercite>: Citation information
    • 55. <bibliography>: Publication note; <fileplan>: Classification scheme; <otherfindaid>: Other finding aids; <relatedmaterial>: Related material; <separatedmaterial>: Separated material; <index>: Keywords
    • 56. <controlaccess>: Controlled access headings; <name>: Names (general); <corpname>: Corporate body name; <persname>: Personal name; <famname>: Family name; <geogname>: Place name; <occupation>: Occupations; <function>: Functions (administrative); <genreform>: Genre and Form; <subject>: Subject
    • 57. <head>: Headings; <p>, <lb>: Layout; <emph>, <blockquote>: Italics and quotes; <list><item>, <chronlist><chronitem>: Lists; <ref>, <ptr>, <dao>: References, pointers and links to digital objects
    • 58. <head>: Headings; <p>, <lb>: Layout; <emph>, <blockquote>: Italics and quotes; <list><item>, <chronlist><chronitem>: Lists; <ref>, <ptr>, <dao>: References, pointers and links to digital objects. NB: EAD is NOT about the presentation of your finding aids, but about their syntax. Separate software will take care of the display of the information.
    • 59. ISAD(G) (v.2) and the corresponding EAD 2002 elements: 3.1.1 Reference code(s): <unitid> countrycode and repositorycode attributes; 3.1.2 Title: <unittitle>; 3.1.3 Dates of creation: <unitdate>; 3.1.4 Level of description: <archdesc> and <c> level attribute; 3.1.5 Extent of the unit: <physdesc>, <extent>; 3.2.1 Name of creator: <origination>; 3.2.2 Administrative/Biographical history: <bioghist>; 3.2.3 Custodial history: <custodhist>; 3.2.4 Immediate source of acquisition: <acqinfo>; 3.3.1 Scope and content: <scopecontent>; 3.3.2 Appraisal, destruction and scheduling: <appraisal>
    • 60. 3.3.3 Accruals: <accruals>; 3.3.4 System of arrangement: <arrangement>; 3.4.1 Access conditions: <accessrestrict>; 3.4.2 Copyright/Reproduction: <userestrict>; 3.4.3 Language of material: <langmaterial>; 3.4.4 Physical characteristics: <phystech>; 3.4.5 Finding aids: <otherfindaid>; 3.5.1 Location of originals: <originalsloc>; 3.5.2 Existence of copies: <altformavail>; 3.5.3 Related units of description: <relatedmaterial> and <separatedmaterial>; 3.5.4 Publication note: <bibliography>; 3.6.1 Note: <odd>
    • 61.  EAD version 1 DTD  EAD 2002 DTD  EAD 2002 Schema  Available from  Human-readable version: EAD Tag Library (Society of American Archivists)
    • 62.  Library of Congress Official EAD site:  Tag Library:  EAD Roundtable Help Pages:
    • 63. ISAD(G) states that to be a conformant archival description a finding aid must:  Be hierarchical ◦ Description from the general to the specific ◦ Information relevant to the level of description ◦ Linking of descriptions (logical sequence) ◦ Non-repetition of information  Contain a minimum set of data elements
    • 64.  Recommended elements for lower level descriptions: ◦ reference code ◦ title ◦ date(s) ◦ extent of the unit of description ◦ level of description
    • 65. ISAD(G) levels and the corresponding EAD levels: Fonds: <archdesc>; Sub-fonds: <dsc><c01>; Series: <c02>; Sub-series: <c03>; File: <c04>; Item: <c05>
    • 66. <ead>… <archdesc> [collection level description here] <dsc> <c01>[series] description 1 <c02>[file] description 1</c02> <c02>[file] description 2 <c03>[item] 1</c03> <c03>[item] 2</c03> </c02> </c01> <c01>[series] description 2.... </c01> </dsc> </archdesc> </ead>
    • 67. <c01 level = "subfonds"> <did> <unitid>GB 0324 MS 54</unitid> <unittitle>Correspondence files</unittitle> <unitdate>1920-1945</unitdate> <physdesc><extent>4 files</extent></physdesc> </did> <scopecontent>…</scopecontent> <c02 level = "series"> <did>…</did> <scopecontent>…</scopecontent> </c02> </c01>
    • 68.  EAD supports two ways of representing levels  <c> is used in A2A, <c0*> on the Hub  Slightly easier to use <c0*>, as the numbers give you more of an idea of the level you are working at
    • 69. <dsc type="combined"> <c level="series"> <did> <unitid>Series 1</unitid> <unittitle>Correspondence</unittitle> </did> <scopecontent>[...]</scopecontent> <c level="subseries"> <did> <unitid>Subseries 1.1</unitid> <unittitle>Outgoing Correspondence</unittitle> </did> <c level="file"> <did> <unittitle>Abbinger-Aldrich</unittitle> </did> </c> </c> </c> </dsc>
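The nesting of components is what carries the hierarchy, so a program can recover the levels by walking the tree. The sketch below, assuming Python's standard-library `xml.etree.ElementTree`, accepts both unnumbered `<c>` and numbered `<c01>` to `<c12>` components; the function name `walk_components` is made up for illustration, and the sample is modelled on the `<dsc>` example from slide 69.

```python
import re
import xml.etree.ElementTree as ET

# Matches an unnumbered <c> or a numbered component <c01> ... <c12>.
COMPONENT_TAG = re.compile(r"^c(0[1-9]|1[0-2])?$")


def walk_components(parent, depth=1):
    """Yield (depth, level attribute, title) for every nested component."""
    for child in parent:
        if COMPONENT_TAG.match(child.tag):
            title = child.findtext("did/unittitle", default="")
            yield depth, child.get("level"), title
            # Recurse: components nested inside this one are one level deeper.
            yield from walk_components(child, depth + 1)


dsc_xml = """<dsc type="combined">
  <c level="series">
    <did><unitid>Series 1</unitid><unittitle>Correspondence</unittitle></did>
    <c level="subseries">
      <did><unitid>Subseries 1.1</unitid><unittitle>Outgoing Correspondence</unittitle></did>
      <c level="file">
        <did><unittitle>Abbinger-Aldrich</unittitle></did>
      </c>
    </c>
  </c>
</dsc>"""

dsc = ET.fromstring(dsc_xml)
components = list(walk_components(dsc))

# Print an indented outline of the hierarchy.
for depth, level, title in components:
    print("  " * (depth - 1) + f"[{level}] {title}")
```

With unnumbered `<c>` the depth has to be counted from the nesting, as here; with `<c0*>` the tag name states it directly, which is the convenience slide 68 describes.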
    • 70.  XML is a meta-language for creating mark-up languages  XML files require other technologies for display, processing, etc.  For archive finding aids EAD is the DTD/Schema to use
    • 71.  It is XML, which is an international standard  It is a simple and effective way of structuring content and providing meaning  Machines can manipulate the content in all sorts of ways  It is a great format to store finding-aids
    • 72.  Effective cross-searching requires: ◦ Interoperability  which requires ◦ Common standards
    • 73.  UKAD:  To promote the opening up of data and to offer capacity for such a cross-searching capability across the UK archive networks and online repository catalogues  To lead and support resource discovery through the promotion of relevant national and international standards  To support the development and use of name authorities
    • 74.  To advocate for the reduction of cataloguing backlogs and the retro-conversion of hard-copy catalogues  To promote access to digitized and digital archives via cross-searching resource discovery systems.  To work with other domains and potential funders to promote archive discovery
    • 75.  Fairly loose structure  Meetings about twice a year  Forum for discussion, sharing, connecting and collaborating  Creating a framework for activities (matrix) ◦ International/national/regional ◦ Meeting UKAD objectives, e.g. open up data; standards-based resource discovery; retro-conversion
    • 76. • Not many UK archives currently using EAD as a storage format • EAD will increasingly be used as an export format from proprietary database systems like CALM, for use in XML-based gateways such as AIM25 and the Archives Hub • New software becoming available all the time, which makes it easier to create, search and display XML – much of this is open source and often free
    • 77.  Differences in how EAD is used  Encourages interoperability but still requires work to ensure seamless cross-searching  EAD is flexible and includes a large number of tags which has advantages and disadvantages
    • 78.  XML is an international standard for sharing information  EAD is the XML language for archival finding aids  EAD is not a content standard  Use ISAD(G) for content guidelines and thesauri or authority files for index terms
    • 79. • You have used the Archives Hub’s EAD editor to create EAD records • XML editors, such as XMetaL or XMLSpy, can provide help with validating and with selecting tags and attributes • EAD will become increasingly important