
An analysis of automatic transcription software – Work in Progress

This was an ALT Winter Conference webinar on 12th December 2018.
Webinar recording is available from https://eu.bbcollab.com/recording/617ff44008c6469b8274062821c08ce7

The use of video has increased dramatically over the years with the general reduction in the cost of production and distribution. Today over 300 hours of video are uploaded to YouTube every minute. Continued technological advances have made video easier to access than ever before on a variety of devices, making it an increasingly popular medium for education.
This increased use of video as an instructional medium can be observed especially in online courses. Captions and/or transcripts make these resources accessible to people with disabilities, and they also help all students, including international learners. However, creating transcripts and captions manually can take a lot of effort, both in terms of time and money.

Automatic Speech Recognition technologies have improved rapidly in the last couple of years. Both IBM and Microsoft have reported Word Error Rates (WER) almost on par with a professional human transcriber (Fogel, 2017; Lant, 2017). Transcript accuracy can be measured with WER, calculated as WER = (Substitutions + Deletions + Insertions) / N, where N is the total number of words in the reference transcript (Apone, Botkin, Brooks, & Goldberg, 2011).
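As a rough illustration of this formula, the sketch below computes WER by aligning the reference (human) transcript against the automatic one with a standard word-level edit-distance (Levenshtein) calculation, in which the minimum number of substitutions, deletions, and insertions is found automatically. The function name and example sentences are mine, not from the presentation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum number of edits turning ref[:i] into hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                            # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j                            # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat", "the cat sat down"))  # one insertion: 1/3
```

A perfect transcript scores 0.0; note that with many insertions WER can exceed 1.0, one of the known quirks of the metric.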

Given the legal obligation to create accessible content, it is important that new technologies and their capabilities are explored. In this presentation, I will be sharing work in progress on assessing six automatic transcription services, to explore how useful current automatic transcription software could be in creating learning materials for the built environment sector.

References:
Apone, T., Botkin, B., Brooks, M., & Goldberg, L. (2011). Caption Accuracy Metrics Project: Research into Automated Error Ranking of Real-time Captions in Live Television News Programs. Retrieved from http://ncam.wgbh.org/file_download/136

Fogel, S. (10 March 2017). IBM inches toward human-like accuracy for speech recognition. Retrieved from https://www.engadget.com/2017/03/10/ibm-speech-recognition-accuracy-record/

Lant, K. (23 August 2017). Microsoft’s Speech Recognition is Now as Good as a Human Transcriber. Retrieved from https://futurism.com/microsofts-speech-recognition-is-now-as-good-as-a-human-transcriber


An analysis of automatic transcription software – Work in Progress

  1. Realising your potential in the Built Environment ©UCEM An analysis of automatic transcription software: Work in progress Tharindu Liyanagunawardena
  2. ©UCEM Plan • Background • Accessibility • University College of Estate Management • The Study • Data Collection • Analysis • Activity • Preliminary Results • Next Steps
  3. ©UCEM Background • Dramatic increase in video use • Over 500 million people watch video on Facebook every day 1 • One billion hours of video watched on YouTube! 2 • 1.9 billion logged-in monthly YouTube users 2 • 70% of YouTube is watched on mobile devices 2 • 2018 Video in Education report from Kaltura 3 • Over 1,500 surveyed • Use of lecture capture up 21% to 79% • Use of video by students for assignments: 69% • Video feedback on student assignments in 35% of institutions 1. https://www.forbes.com/sites/tjmccue/2017/09/22/top-10-video-marketing-trends-and-statistics-roundup-2017/#3fe361da7103 2. https://www.youtube.com/intl/en-GB/yt/about/press/ 3. https://librarytechnology.org/pr/23610
  4. ©UCEM Background • 2018 Video in Education report from Kaltura 1 • Closed captions are in use at 52% of institutions • 34% use interactive video quizzes • Growing momentum for video creation • Students in K-12 (primary/secondary) institutions: at 21% of institutions over half of the students are creating video • 15% in Higher Education "Video-based learning experiences continue to expand and improve. I am excited to see usage trends on the rise, along with broader distribution of the various tools both in the hands of professors and students. Looking at our 2018/19 roadmap we continue to invest in the areas of video capture tools, interactivity, quizzing, and accessibility." Kaltura's Co-founder Dr. Michal Tsur 1. https://librarytechnology.org/pr/23610
  5. ©UCEM Accessibility • Captions and Transcripts • Manual Captioning: Resource Intensive • Expensive • Time Consuming • Rule of thumb: allow four hours of transcribing for an hour of recording (Punch & Oancea, 2015) • Technology Advancing • Automatic Speech Recognition • Apple Siri, Amazon Alexa • Google Duplex https://www.youtube.com/watch?v=7gh6_U7Nfjs Punch, K., & Oancea, A. (2015). Introduction to Research Methods in Education (2nd ed.). London: SAGE.
  6. ©UCEM Accessibility • EU Accessibility Directive 1 • New UK Regulations on Accessibility for Public Sector Bodies 2 • 23rd September 2018 • Accessible VLEs – Making the most of the new regulations 3 • The right thing to do • What stops us from doing the right thing? • Case of University of California, Berkeley 4 1. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016L2102 2. https://www.gov.uk/guidance/accessibility-requirements-for-public-sector-websites-and-apps 3. https://www.policyconnect.org.uk/appgat/research/accessible-vles-making-most-new-regulations 4. https://www.insidehighered.com/news/2017/03/06/u-california-berkeley-delete-publicly-available-educational-content
  7. ©UCEM University College of Estate Management (UCEM) • 1919 – 2019 • Postal → Online • College of Estate Management → University College • Online Distance Education Materials and Accessibility: Case Study of University College of Estate Management 1 • ALT South webinar: Online Learning Materials and Accessibility 2 • Webinars • Captioning videos for accessibility: A case study of University College of Estate Management 3 • A Rising Tide: How Closed Captions Can Benefit All Students 4 1. https://www.researchgate.net/publication/307438821_Online_Distance_Education_Materials_and_Accessibility_Case_Study_of_University_College_of_Estate_Management 2. https://www.alt.ac.uk/civicrm/event/info%3Fid%3D337%26amp%3Breset%3D1 3. https://altc.alt.ac.uk/2017/sessions/captioning-videos-for-accessibility-a-case-study-1676/#gref 4. https://er.educause.edu/articles/2017/8/a-rising-tide-how-closed-captions-can-benefit-all-students
  8. ©UCEM The study • Selected a set of software • Descript • IBM Watson Speech to Text (Watson) • Sonix • Synote • Trint • Zoom • 1000 Word Text • Property management • Construction management • Property and contract law • Building pathology • Sample
  9. ©UCEM
  10. ©UCEM
  11. ©UCEM Data Collection • 10 expressed an interest in participating • 7 recordings • Length varied from 5:57 to 8:21 minutes
      Participant   Gender   Native English Speaker   English Accent (as identified by participant)
      1             Male     Yes                      Generic British
      2             Female   No                       South American
      3             Male     Yes                      Generic Scottish
      4             Male     Yes                      Generic British
      5             Female   No                       South Asian
      6             Male     No                       Greek
      7             Male     No                       African
  12. ©UCEM Analysis • MP3/MP4 • Recordings transcribed • Check against recording • Microsoft Word Compare • Word Error Rate (WER): WER = (Substitutions + Deletions + Insertions) / N, where N is the total number of words in the reference transcript (Apone, Botkin, Brooks, & Goldberg, 2011).
  13. ©UCEM Word Error Rate • WER = (Substitutions + Deletions + Insertions) / N • Example:
      (actual)  this process     will be quick
      (caption) this proswilling **** **  quick
                     S           D    D
      "proswilling" substitutes "process" (S), and "will" and "be" are deleted (D, D), giving WER = 3/5 = 60%.
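Plugging the counts from this slide's alignment into the formula confirms the result; the variable names below are illustrative, not from the presentation:

```python
# Counts read off the slide's alignment of
# "this process will be quick" vs "this proswilling quick"
substitutions, deletions, insertions = 1, 2, 0
n_reference_words = 5  # words in the actual (reference) sentence
wer = (substitutions + deletions + insertions) / n_reference_words
print(wer)  # 0.6, i.e. 60% of the reference words were transcribed in error
```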
  14. ©UCEM Activity – Which recording will be transcribed best? • Can you guess which recording had the lowest Word Error Rate? In other words, which was transcribed with the fewest erroneous words? • Clips taken from each recording are available on the UCEM blog • Link available from the chat window
  15. ©UCEM Preliminary Results
  16. ©UCEM Next Steps • Quality of recording • Issues with WER • "Good enough" test • Experts on subject / disabled students
  17. ©UCEM • Questions → tharindu@ucem.ac.uk @Tharindu__
