Accessing treasure on lands and peoples
Peter Burnhill, Director, EDINA, University of Edinburgh
 

Presented by Peter Burnhill at the "Alexander Carmichael: Collecting, Controversy and Contexts" conference, Edinburgh, 23-24 June 2011

Published in: Education, Technology
  • The process of disintermediation
  • HLF funded; UoE; BBC; National Trust for Scotland. Launched Dec 2010.

    EDINA and the University of Edinburgh Information Services (IS) were contracted in 2007 to produce a Production Control Application and a Cataloguing Application (for web-based input of metadata) for the Tobar an Dualchais project. Launched in 2006, the multi-million-pound Heritage Lottery-funded project will preserve, digitise and make available online thousands of hours of recordings from the archives of BBC Scotland, the National Trust for Scotland and the School of Scottish Studies at the University of Edinburgh. The Tobar an Dualchais project is based at Sabhal Mòr Ostaig, the Gaelic-language college on Skye.

    The collections comprise a wide variety of material, including:
      - Stories recorded by John Lorne Campbell on wax cylinders in 1937
      - Folklore collected from all over Scotland by Calum Maclean in the 1950s
      - Scots songs recorded by Hamish Henderson for the School of Scottish Studies from travelling people in the 1960s
      - Conversations recorded on BBC Radio nan Gàidheal

    The project has employed more than 20 people, with skills in administration, computing, Gaelic and Scots. Digitising is being carried out in Edinburgh and South Uist in the Hebrides, and people have been employed as home cataloguers throughout Scotland. It is hoped that Tobar an Dualchais will stimulate the culture and economy of different parts of Scotland, including some of the areas which provided many of the original recordings. EDINA is developing the Tobar an Dualchais website.
  • Possible focus of future activities. Usage data: including possibilities for new services that might be offered around it. Second round of the Digging into Data Challenge.

    In the first round, in 2009, 90 international research teams competed; eight projects were awarded grants. In 2011, the Digging into Data Challenge has returned for a second round, this time much larger, with sponsorship from eight international research funders, representing Canada, the Netherlands, the United Kingdom, and the United States.

    The idea behind the Digging into Data Challenge is to address how "big data" changes the research landscape for the humanities and social sciences. Now that we have massive databases of materials used by scholars in the humanities and social sciences (ranging from digitized books, newspapers, and music to transactional data like web searches, sensor data or cell phone records), what new, computationally-based research methods might we apply? As the world becomes increasingly digital, new techniques will be needed to search, analyze, and understand these everyday materials. Digging into Data challenges the research community to help create the new research infrastructure for 21st century scholarship.

    Applicants will form international teams from at least two of the participating countries. Winning teams will receive grants from two or more of the funding agencies and, two years later, will be invited to show off their work at a special conference sponsored by the eight funders.
  • Initially 10 collections: 1,000 films; 300 hrs (stills and scripts). ETV; documentaries from Films of Scotland 1920-1982; PO Film Unit Archive; Healthcare Productions; Wellcome Trust; St George's Hospital; Sheffield Univ; IWF Media (biomedical sciences); Anglia TV; Trials of Alger Hiss. EMOL, additional films in 2004/05: Amber; OU Worldwide; ETV; Digital Himalaya. Shakespeare Nov 05; Culverhouse 06; and became FSOL Sep 06.
  • Official launch Oct ‘08
  • Getty: 50k images; 40k at launch; 10k added during course of service; approx. 3k images per decade from the late 19th C to the present day
  • All can search/browse metadata: free2web user experience. Click to play/view & download according to licence & credentials.
      - provide a single location from which to discover and explore the collections
      - provide an easy to use interface that meets user expectations
      - provide a good user experience in terms of content, usability and functionality
      - develop coherent and compelling search and browse functionality
      - support the use of multimedia content in teaching, learning and research
      - simplify access to the content and make transparent the terms and conditions of use
      - reduce the ongoing costs by running one single service rather than four
      - build upon the collections by licensing images using a ‘user driven’ model
    Content:
      - Portal: to view content, wherever hosted
      - Purchased content: DIE, FSOL
      - Images on demand: buy images and store them in Mediahub
      - Blog
  • Accessing Treasure on lands and peoples

    1. 1. Accessing treasure on lands and peoples Peter Burnhill Director, EDINA, University of Edinburgh
    2. 3. Inspired by a Keynote remark by Professor Gillies …
    3. 5. Credits: who planned the dive & dived the wreck
       The team within EDINA:
         - Des Reid, Senior Software Engineer
         - Dimitrios Sferopolous, Software Engineer
         - Neil Mayo, Software Engineer
         - Jackie Clark, Web Designer
         - led by Christine Rees, Head of Bibliographic & Multimedia
       And those to whom we all owe lots (in Centre for Research Collections, IS Library & Collections):
         - Kirsty Stewart (project manager and archivist)
         - Lesley Bryson née Doig (initial project manager)
         - Grant Buttars, Deputy University Archivist
         - Andrew Wiseman, Researcher, TEI expert
         - Donald William Stewart, Senior Project Researcher
         - led by Arnott Wilson (University Archivist) & John Scally (University Collections)
    4. 6. A treasure to be unlocked
    5. 8. Languages & Perspectives
       - Digital Library has mixed parentage: a ‘re-mix’ of the document tradition & the computation tradition
         - “approaches based on a concern with documents, with signifying records: archives, bibliography, documentation, librarianship, records management, and the like …” [Domain knowledge speak]
         - “approaches based on uses of formal techniques, whether mechanical (such as punch cards and data-processing equipment) or mathematical/computational (as in algorithmic procedures).” [Software engineer speak]
           - Prof. Michael Buckland, Presidential Address, American Society for Information Science, JASIS’s 50th (1998)
           - http://people.ischool.berkeley.edu/~buckland/asis62.html
    6. 9. Heard report on work from the Dive Team
       On from the marvels of reading and interpreting of marks on paper of the notebook entries … and the meticulous transcription into machine-readable text … and their tagging using Encoded Archival Description (EAD) with text in XML format*
       * mark-up that software can process more easily
    7. 10. Example of the XML EAD data (1)

       <!DOCTYPE ead PUBLIC "+//ISBN 1-931666-00-8//DTD ead.dtd (Encoded Archival Description (EAD) Version 2002)//EN" "ead.dtd">
       <c level="item" id="GB-237-Coll-97-CW114-42">
         <did>
           <unitid encodinganalog="isadg(2)311" label="Reference code">GB 237 Coll-97/CW114/42</unitid>
           <physloc label="Shelfmark" encodinganalog="shelfmark">CW102-121</physloc>
           <unittitle encodinganalog="isadg(2)312">Song about Uamh-an-Oir, accompanying story and notes</unittitle>
           <unitdate encodinganalog="isadg(2)313" certainty="certain" normal="" type="inclusive">1867</unitdate>
           <repository label="Repository" encodinganalog="NAHSTE31">Edinburgh University Library, Special Collections</repository>
           <physdesc label="Extent and Medium of the Unit of Description" encodinganalog="isadg(2)315" audience="external">
             <extent>folio 67v, line 17 to folio 68r, line 4</extent>
             <dimensions/>
           </physdesc>
           <!--Replace language code if other than English with ISO 639-2 three letter language code. Add further language tags if necessary-->
           <langmaterial>
             <language langcode="gla">Gaelic</language>
    8. 11. Work at the Refactory
       The XML files were passed to engineers at EDINA … import script in Perl that parses the XML and constructs the relational structure, with reference to an existing database schema as shown.
         - The green boxes indicate high-level entities: catalogue entry, its transcript and images.
         - The pink ‘cat_*’ boxes are links from catalogue entries to such things as places, people and subjects.
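The import step described on the "Work at the Refactory" slide can be sketched roughly as follows. This is a minimal illustration only: the slides say the real script was written in Perl and targeted an existing schema that is not reproduced in this transcript, so Python is swapped in here, and the table names (`catalogue_entry`, `cat_subject`, `cat_person`), their columns and the `import_item` function are all hypothetical. The sample record is abridged from the EAD slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Abridged EAD item based on the slide examples (attributes and fields trimmed).
SAMPLE_EAD = """
<c level="item" id="GB-237-Coll-97-CW114-42">
  <did>
    <unitid>GB 237 Coll-97/CW114/42</unitid>
    <unittitle>Song about Uamh-an-Oir, accompanying story and notes</unittitle>
    <unitdate>1867</unitdate>
  </did>
  <controlaccess>
    <controlaccess>
      <subject authfilenumber="218">Caves</subject>
      <subject authfilenumber="327">Dogs</subject>
      <subject>Waulking songs</subject>
    </controlaccess>
    <controlaccess>
      <persname authfilenumber="4278">MacNeil | Roderick | c1790-1875 | Ruaraidh an Rùma | crofter | Mingulay</persname>
    </controlaccess>
  </controlaccess>
</c>
"""

# Hypothetical schema: one high-level entity table plus 'cat_*' link tables.
SCHEMA = """
CREATE TABLE catalogue_entry (id TEXT PRIMARY KEY, reference TEXT, title TEXT, date TEXT);
CREATE TABLE cat_subject (entry_id TEXT, authority_id INTEGER, term TEXT);
CREATE TABLE cat_person (entry_id TEXT, authority_id INTEGER, name TEXT);
"""


def import_item(conn, xml_text):
    """Parse one EAD item and insert it, with its index terms, into the database."""
    item = ET.fromstring(xml_text)
    entry_id = item.get("id")
    did = item.find("did")
    conn.execute(
        "INSERT INTO catalogue_entry VALUES (?, ?, ?, ?)",
        (entry_id, did.findtext("unitid"), did.findtext("unittitle"), did.findtext("unitdate")),
    )
    # Index terms may lack an authority file number, so allow NULL there.
    for tag, table in (("subject", "cat_subject"), ("persname", "cat_person")):
        for element in item.iter(tag):
            authority = element.get("authfilenumber")
            conn.execute(
                f"INSERT INTO {table} VALUES (?, ?, ?)",
                (entry_id, int(authority) if authority else None, element.text.strip()),
            )
    return entry_id


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
import_item(conn, SAMPLE_EAD)
```

Keeping subjects and people in separate link tables mirrors the ‘cat_*’ idea on the slide: a catalogue entry stays a single row, while each index term becomes a row that points back to it.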
    9. 12. Example of the XML EAD data (2)

       <!--Insert controlaccess index terms here if needed-->
       <!--Delete any tags not required-->
       <controlaccess encodinganalog="NAHSTE38">
         <head>Index</head>
         <controlaccess encodinganalog="NAHSTE381">
           <head>Subjects</head>
           <subject authfilenumber="218">Caves</subject>
           <subject authfilenumber="327">Dogs</subject>
           <subject authfilenumber="2665">Hair</subject>
           <subject authfilenumber="3910">Loss (of people or things)</subject>
           <subject authfilenumber="3814">Men</subject>
           <subject authfilenumber="3936">Rescues</subject>
           <subject>Waulking songs</subject>
         </controlaccess>
         <controlaccess encodinganalog="NAHSTE382">
           <head>People</head>
           <persname authfilenumber="4708">| Mor Iain ic Dhòmhnaill Bhàin | fl1867 | Isle of Barra | Inverness-shire</persname>
           <persname authfilenumber="4278">MacNeil | Roderick | c1790-1875 | Ruaraidh an Rùma | crofter | Mingulay</persname>
         </controlaccess>
    10. 13. Example of the XML EAD data (3)

             <language langcode="eng">English</language>
           </langmaterial>
           <origination label="Name of Creator(s)" encodinganalog="isadg(2)321">Alexander Carmichael</origination>
         </did>
         <scopecontent encodinganalog="isadg(2)331">
           <head>Scope and Content</head>
           <p>Song about Uamh-an-Oir probably collected from Roderick MacNeil, aged 88, crofter, Miùghlaigh/Mingulay beginning 'Na minn bheaga na minn bheaga/theaga, Dol eir creagan dol sna creag' composed of thirteen lines. Uamh-an-Oir is described as starting at Cliata cliff and going under Barra to Gearragaal east of Orasay [Uamh an Òir, Cliaid, Orasaigh, Barraigh/Isle of Barra].
           The story tells how five men went into the cave with dogs but only the dogs returned and they were hairless. 'The smith of Loch an Duin [Loch an Dùin] put out the torches. Great men sent them in against their will.'
           Carmichael writes a note to himself to see Mor Iain ic Dhonuil Bhain [Mòr Iain ic Dhòmhnaill Bhàin] for the 'oran sith sung here at the luadh...She Knows all about the songs made'. A vocabulary note reads ' "Fiallan fiadhaich" An insect on the brain &amp;c!' Written transversely over the text in ink is 'Transcribed Book No III page 62 A[lexander] C[armichael]'.
           </p>
    11. 14. Work at the Refactory This structure is imported into Solr – software used … to control searching copies of the text (which have been normalised for more effective searching) … and for retrieval of text and images to be rendered on the website
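The slide does not say how the search copies were normalised. As one plausible illustration, and purely an assumption, a search copy could fold case and strip accents so that a query typed as "Uamh an Oir" still matches text transcribed as "Uamh an Òir". The function name `normalise_for_search` is hypothetical, and in practice Solr's own analysis chain may well have done this work rather than application code.

```python
import unicodedata


def normalise_for_search(text):
    """Return a search copy: accents removed, case folded, whitespace collapsed."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_marks.casefold().split())


display_text = "Uamh an Òir,  Loch an Dùin"
search_text = normalise_for_search(display_text)
```

The display text is kept untouched for rendering on the website; only the normalised copy is indexed and compared against a query that has been normalised the same way.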
    12. 20. Tobar an Dualchais
        6,000 new items now available to search & play
        Over 24,000 tracks of stories, songs, music, poetry and factual information recorded in Scotland and further afield, from 1930s onwards.
          - Thousands of oral recordings recorded in Scotland and further afield, from the 1930s onwards, including stories, songs, music, poetry and factual information.
          - HLF funding
          - Joint project: Sabhal Mòr Ostaig, University of Edinburgh, BBC Scotland, National Trust for Scotland
    13. 21. Early work between EDINA & Special Collections
          - SCIMSS: Special Collections Index of Manuscripts, 1995/96
            - Once an ‘advanced’ Web service, now retired: Wayback Machine ..
    14. 22. web index was created from the Special Collections’ departmental sets of 180 binders comprising, in alphabetical order, about 54,000 loose-leaf slips containing varied typescript dating from the 1930s.
    15. 23. Early work between EDINA & Special Collections
    16. 28. Early work between EDINA & Special Collections
    17. 29.
          - Launched in 2003, no subscription fee
          - Now used by 379 licensed institutions
          - Several hundred hours of film, across a wide range of subject areas and topics
          - Collections include: Imperial War Museum, Films of Scotland, Royal Mail Film Classics, Digital Himalaya, Culverhouse Classical Music, Logic Lane, Wellcome Film, Biochemical Society, Healthcare Productions, St George’s Medical School Collection, Education & Television Films Ltd, Amber Films, Performance Shakespeare
          - Followed BUFVC/OU project for metadata, digitisation & rights clearance
          - http://www.filmandsound.ac.uk/
    18. 30.
          - 3,000 hours of video footage
          - Collections include: Gaumont Newsreels, News at Ten, ITN News Reports, Channel 4 News, Reuters archives, Roving Report
          - 60,000 news stories
            - + 25,000 ITN programme scripts
            - + unreleased footage
          - Launched in 2008, no subscription fee, uptake now grown to 344 universities & colleges
          - http://www.nfo.ac.uk/
          - worked with BUFVC who led project for metadata, digitisation and rights clearance
    19. 31.
          - Initially launched in 2004
          - Getty Images to Sept 2010
          - Digital Images for Education from Oct 2010
          - Schools service started in 2008 and ran until Sept 2010
          - Engaging 88 subscribing universities & colleges
          - http://edina.ac.uk/eig/
    20. 32. 1 million image, video and sound resources to discover & use 45 Collections so far 8 Collections so far British Library Archival Sound Recordings
    21. 33. Future activities?
    22. 34. Future activities?
    23. 35. Future activities?
    24. 37. … a rich ecosystem … from food delivered to a one-time wreck
    25. 38. … a rich ecosystem … from food delivered to a one-time wreck Thank you http://edina.ac.uk