#nuntastic: transcribing Nano Nagle’s letters using collaborative transcription services - Audrey Drohan (University College Dublin)

CONUL Conference
Jun. 1, 2018

Editor's Notes

  1. Good afternoon. Thank you for staying until the very last talk of the conference, bar the closing remarks. Today I’m talking about collaborative transcription services…
  2. …which UCD Digital Library used to help expose the wonderful content of one of our newest collections. There’s nothing too technology heavy in this talk, as I know you must be exhausted at this stage!
  3. So…Nano Nagle is the Foundress of the Presentation Sisters, which were set up in 1775. The tercentenary of her birth will be marked in June 2018, and the commemoration will include the online publication of 17 of her letters. This collection was brought to us by Principal Investigator Prof. Deirdre Raftery, from the UCD School of Education, who had previously brought us another nun collection called Loreto 1916. #nuntastic seems to be an official tag on Twitter for describing these collections, hence my talk’s title. The physical letters are curated in Dublin, Cork, San Francisco and New York (clocking up an impressive 17,482 km between them), so by digitising the collection and facilitating the virtual reunification of such geographically dispersed material, we can save researchers quite a few air miles and a fair bit of money!
  4. Creating a digital collection from transatlantic archives was challenging enough: we could only scan the letters held in Dublin! As you can see, even the scans look different from each other. But this is a collection of handwritten letters, so we needed to figure out how to unlock the content of what are considered ‘OCR-resistant’ texts.
  5. So, we started how we normally start… The collection was profiled, digitised, and given rights statements. It was also fully catalogued using authorised names and subject headings, and then ingested into our digital repository, Fedora. But at this stage the text inside the letters…the content…was not fully searchable.
  6. And as you can see…it’s fairly legible…ish!
  7. But we could do with a bit of help! Deirdre and her team actually did the transcriptions of the letters for us using MS Word…so we technically crowdsourced from a population of three! We’ve dealt with transcription before but not in any kind of elegant way, and not in a way where we could get help from outside the team. So we needed a solution that enabled crowdsourcing of transcriptions, and that allowed us to add that content, and previously transcribed content, to the digital library.
  8. Now…the UCD Digital Library is already quite a complex infrastructure. We have repository software Fedora in the background, and we’ve implemented the IIIF framework, which Cillian has already described. IIIF allows us to do amazing things with images within the Mirador Image Viewer. And the framework also allows us to add value to the content. It starts with a canvas, to which you add the image, and through a manifest (or list of associated resources) you can connect in other things…including transcriptions.
  9. Now…we evaluated a few different transcription technology platforms, like Scribe and Transkribus, and we chose to go with FromThePage, as it had some features that really appealed to us, like the export of the transcriptions as TEI. What we needed, and what FromThePage provides, is the ability to push content out onto the platform, to let users go from our Image Viewer to the content to help transcribe it, and then to pull the transcriptions back into our system, where they are preserved as part of our preservation activities. Subsequent users can search the full text in the Digital Library, download the TEI to reuse for further research, view the transcribed text on the FtP platform, and eventually we hope to be able to offer the ability to view the transcribed text within the Image Viewer itself. And how do we do that? Well, thanks to interoperability between the two systems, we can use the IIIF API to push the content into FtP, and use the FtP Contribution API to pull the transcriptions back into the Digital Library.
  10. So, we start the whole workflow by ingesting the digital collection into Fedora and publishing online.
  11. At this stage, when you go in, there is no connection with FtP.
  12. To add a collection to FtP, you go to the platform’s dashboard and upload the collection using their import tools. You can load the files directly by uploading PDF or ZIP files, but in the case of Nano, we were able to use the IIIF manifest for the Nano collection in the Digital Library to pull the collection into FtP.
  13. Once Nano’s letters were uploaded onto the FtP platform, a small red pencil appeared in the Mirador Image Viewer to let us know the letters were available to be transcribed. We could also log in directly to the FtP platform. Clicking on the red pencil brings you directly into the FtP platform…
  14. …and into edit view, where you can transcribe. As I said, we already had the transcriptions done so for Nano we copied and pasted them into the editor.
  15. We then marked up some of the content as subjects. Here we only focused on People and Places, but you can customise what can be marked up. Once that was done, each letter got marked for review. The DL team also acted as editors, so once the collection was reviewed, the Needs Review box got unchecked and we moved on to editing the subjects…
  16. As you can see, the marked up text for the Subjects becomes hyperlinked and these can be mapped to other instances of the same name.
  17. You can go into the subject categories to review them, do further corrections, augment with additional information, and link to other instances within the Nano letters collection.
  18. You can also view the relationships between the subjects and the letters they appear in.
  19. Then the marked up transcriptions for the collection of letters were ready to go back into the Digital Library. You can export the content using the FtP dashboard; as I already said, we have it set up so that by using FtP’s Contributions API, our Fedora repository can pull the content back into the Digital Library, once it satisfies the criteria for being complete. Solr can then index the transcriptions for full text searching, and the TEI files become available to download through the letter’s descriptive record…
  20. Back in the Mirador Image Viewer, it now looks like this…complete with a blue icon to denote that there is a transcription available. Currently, to see the transcription and the letter side by side you have to go back into FtP. Hopefully, with future developments, you’ll be able to see the transcription on the same canvas within the Mirador Image Viewer.
  21. So…we may not have used external users with our collaborative transcription service but we did use external experts. As this is still a pilot service, we were interested to see how we could push a collection through its workflow. And I have to say, there are more Pros than Cons. The big thing for us was being able to get the transcribed content back into our preservation system, and being able to enable full text searching on it. All of this is possible thanks to IIIF and the interoperability between the UCD Digital Library and FromThePage.
  22. So, in conclusion: crowdsourced transcription platforms are great. We’ve added other collections to FromThePage and, even without promoting that fact, people are transcribing our content. We’ve learned loads ourselves through this process. We’ve had to do quite a bit of technical development, but the Brumfields who created FromThePage are a great team and are very open to making changes (so long as we pay them, bizarrely). And Nano’s collection of letters has been greatly enhanced by the process, and can now offer scholars the opportunity to engage with the content in new ways.
  23. The Collected Letters of Nano Nagle, complete with transcribed text, will be available from June 8th this year. Thank you for your attention.
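
For anyone curious about the canvas and manifest idea mentioned in note 8, here is a rough sketch of what that structure can look like. It assumes the IIIF Presentation API 2.x shape (a manifest holding a sequence of canvases, each canvas painted with an image and optionally pointing at an annotation list). All URLs, labels and function names are made up for illustration, and this is not the UCD Digital Library's actual code.

```python
import json

# JSON-LD context for IIIF Presentation API 2.x (assumed version for this sketch).
IIIF_CONTEXT = "http://iiif.io/api/presentation/2/context.json"


def build_canvas(canvas_id, label, image_url, width, height, transcription_list_url=None):
    """Build one canvas: the image is 'painted' onto it, and a transcription
    can be linked in as an annotation list via otherContent."""
    canvas = {
        "@id": canvas_id,
        "@type": "sc:Canvas",
        "label": label,
        "width": width,
        "height": height,
        "images": [
            {
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "resource": {
                    "@id": image_url,
                    "@type": "dctypes:Image",
                    "width": width,
                    "height": height,
                },
                "on": canvas_id,  # the image annotation targets its own canvas
            }
        ],
    }
    if transcription_list_url:
        canvas["otherContent"] = [
            {"@id": transcription_list_url, "@type": "sc:AnnotationList"}
        ]
    return canvas


def build_manifest(manifest_id, label, canvases):
    """Wrap the canvases in a single sequence inside a manifest."""
    return {
        "@context": IIIF_CONTEXT,
        "@id": manifest_id,
        "@type": "sc:Manifest",
        "label": label,
        "sequences": [{"@type": "sc:Sequence", "canvases": canvases}],
    }


def canvases_with_transcriptions(manifest):
    """Return the labels of canvases that already link to an annotation list."""
    return [
        canvas["label"]
        for sequence in manifest["sequences"]
        for canvas in sequence["canvases"]
        if canvas.get("otherContent")
    ]


# Hypothetical two-page letter; example.org URLs are placeholders.
BASE = "https://example.org/iiif/letter-1"
manifest = build_manifest(
    BASE + "/manifest",
    "Example letter",
    [
        build_canvas(BASE + "/canvas/1", "Page 1", BASE + "/page1.jpg", 2000, 3000, BASE + "/list/1"),
        build_canvas(BASE + "/canvas/2", "Page 2", BASE + "/page2.jpg", 2000, 3000),
    ],
)

manifest_json = json.dumps(manifest, indent=2)
print(canvases_with_transcriptions(manifest))
```

Because the manifest is plain JSON at a URL, any IIIF-aware tool can read it, which is the kind of interoperability that lets a transcription platform import a collection from a manifest rather than from uploaded files.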