Enhancing the value of the digitised record, including OCR transcriptions, captioning and Amplify

Jenna Bain, Digital Projects Leader, presented at Digital collecting for NSW public library staff, 27 May 2019

Published in: Government & Nonprofit

  1. Jenna Bain, Digital Projects Leader: Enhancing the value of your digital collections
  2. What do I mean by accessibility?
  3. What do I mean by accessibility? Web Content Accessibility Guidelines (WCAG 2.0 AA): Perceivable, Operable, Understandable, Robust
  4. Are you providing a meaningful research experience for your users?
  5. Optical Character Recognition (OCR): technology that enables you to convert documents, such as scanned PDFs or images of text, into machine-readable data that is editable and searchable.
  6. Trove
  7. Crowdsourcing
  8. Machine learning is a form of artificial intelligence (AI) that allows software applications to become more accurate in predicting outcomes and patterns without being explicitly programmed.
  9. Google Maps, Siri, Cortana, Netflix, Spotify, email spam filters, Facebook facial recognition
  10. TIGER
  11. Tagging Images Generically for Exploration & Research
  12. Animal – 99.84%; Horse – 99.84%; Mammal – 99.84%; Wagon – 99.58%; Horse cart – 99.58%; Vehicle – 99.58%; Transportation – 99.58%; Wheel – 99.27%; Person – 99.94%; Man – 95.39%; Drum – 94.01%; Musical instrument – 88.62%
  13. Insect – 97.39%; Invertebrate – 97.39%; Flea – 97.21%; Animal – 97.21%; Apparel – 99.76%; Bonnet – 98.67%; Hat – 98.67%; Portrait – 96.25%; Artwork – 95.24%
  14. Boat – 98.36%; Outdoor – 97.78%; Water – 96.18%; Rowboat – 95.95%; Nature – 95.95%; People – 95.92%; Outdoors – 99.99%; Nature – 99.99%; Landscape – 99.99%; Aerial view – 98.24%; Plane – 96.72%; Transportation – 91.12%; Aircraft – 91.12%
  15. Bird – 99.91%; Animal – 99.91%; Pelican – 92.97%
  16. Accessibility ≠ perfection
  17. “…a 25% error rate in word recognition would make a transcription unreadable but does not significantly reduce its usefulness for discovery by a search engine as peoples’ use of spoken language quite often includes a lot of redundancy” (Oard, 2012).
  18. Usage since launch: 13,000 sessions; 204,000 corrections; 6,100 users; 61 countries; 126 transcripts complete
  19. International visitation since launch
  20. International visitation since launch
  21. Amplify statewide rollout
  22. OCR: onlineocr.net, abbyy.com/en-au/finereader. Machine transcription: voicebase.com, trint.com. Crowd platforms: zooniverse.com, digivol.ala.org.au. Automated image tagging: cloud.google.com/vision, imagga.com, autotag.me
  23. Thank you. Questions?
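
The 25% word error rate quoted on slide 17 can be made concrete with a small calculation. The sketch below is illustrative only: the slides do not say how the error rate was measured, so it assumes the common definition of word error rate (word-level edit distance divided by the number of reference words). The function name and both sample sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumed definition; the slides do not specify one.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical human transcript and machine transcript of the same sentence.
reference = "the library holds historic photographs of sydney harbour"
hypothesis = "the library holds hysteric photographs of sidney harbour"

rate = word_error_rate(reference, hypothesis)
print(f"word error rate: {rate:.0%}")

# Keyword search still succeeds on the words that were recognised correctly.
print("harbour" in hypothesis.split())
```

Two of the eight words are wrong, which gives a 25% error rate, yet a search for "library", "photographs" or "harbour" would still find this transcript. That is the sense in which an imperfect transcription remains useful for discovery.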
