©UCEM
Automatic transcription software: Good enough for accessibility?
A case study from built environment education
Dr Tharindu Liyanagunawardena
Plan
• Background
• Accessibility
• University College of Estate Management
• The Study
• Data Collection
• Analysis
• Findings
• Conclusion
Begin programming: Build your first mobile game
Background
• Dramatic increase in video use
• Over 500 million people watch video on Facebook every day [1]
• One billion hours of video watched on YouTube! [2]
• 70% of YouTube viewing is on mobile devices [2]
• 2018 Video in Education report from Kaltura [3]
• Over 1500 surveyed
• Use of lecture capture, ↑ 21% to 79%
• Use of video by students for assignments 69%
• Video feedback on student assignments in 35% of institutions
• Closed captions are in use at 52% of institutions
1. https://www.forbes.com/sites/tjmccue/2017/09/22/top-10-video-marketing-trends-and-statistics-roundup-2017/#3fe361da7103
2. https://www.youtube.com/intl/en-GB/yt/about/press/
3. https://librarytechnology.org/pr/23610
Accessibility
• The quality of being easily reached, entered, or used by people who have a disability (Oxford Living Dictionary)
• Captions and Transcripts
• Manual Captioning
• Resource Intensive
• Rule of thumb: allow four hours of transcribing for an hour of recording (Punch & Oancea, 2015)
• Automatic Speech Recognition
• Google Duplex https://www.youtube.com/watch?v=7gh6_U7Nfjs
• Apple Siri, Amazon Alexa
Punch, K. & Oancea, A. (2015). Introduction to Research Methods in Education (2nd ed.). London: SAGE.
Accessibility Continued
• EU Accessibility Directive [1]
• New UK Regulations on Accessibility for Public Sector Bodies [2]
• 23rd September 2018
• Accessible VLEs – Making the most of the new regulations [3]
• The right thing to do
• What stops us from doing the right thing?
• Case of University of California Berkeley [4]
1. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32016L2102
2. https://www.gov.uk/guidance/accessibility-requirements-for-public-sector-websites-and-apps
3. https://www.policyconnect.org.uk/appgat/research/accessible-vles-making-most-new-regulations
4. https://www.insidehighered.com/news/2017/03/06/u-california-berkeley-delete-publicly-available-educational-content
University College of Estate Management
• 1919 → 2019
• Postal → Online
• College → University College (2015)
• Online Distance Education Materials and Accessibility: Case Study of University College of Estate Management [1]
• ALT South webinar: Online Learning Materials and Accessibility [2]
1. https://www.researchgate.net/publication/307438821_Online_Distance_Education_Materials_and_Accessibility_Case_Study_of_University_College_of_Estate_Management
2. https://www.alt.ac.uk/civicrm/event/info%3Fid%3D337%26amp%3Breset%3D1
3. A Rising Tide: How Closed Captions Can Benefit All Students https://er.educause.edu/articles/2017/8/a-rising-tide-how-closed-captions-can-benefit-all-students
The study
• Selected software
• Descript
• IBM Watson Speech to Text (Watson)
• Sonix
• Synote
• Trint
• Zoom
Data Collection
• 7 recordings
• Recording length: from 5:57 to 8:21 minutes
Participant  Gender  Native English speaker  English accent (as identified by participant)
1            Male    Yes                     Generic British
2            Female  No                      South American
3            Male    Yes                     Generic Scottish
4            Male    Yes                     Generic British
5            Female  No                      South Asian
6            Male    No                      Greek
7            Male    No                      African
Analysis
• Recordings transcribed
• Check with recording
• Microsoft Word Compare
• Manual analysis
• Word Error Rate (WER)
WER = (Substitutions + Deletions + Insertions) / N
where N is the total number of words in the reference transcript (Apone, Botkin, Brooks, & Goldberg, 2011).
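The comparison step can be approximated in code. The sketch below is not the procedure used in the study (which relied on Microsoft Word Compare followed by manual analysis); it is an illustrative stand-in that uses Python's standard `difflib` module to list the word spans where an automatic transcript departs from the reference. The function name `word_diff` and the lower-casing of words are assumptions made for illustration, and the sample sentences are taken from the Word Error Rate slide.

```python
import difflib


def word_diff(reference: str, hypothesis: str):
    """List the word spans where the hypothesis differs from the reference.

    Hypothetical helper for illustration: each entry is
    (operation, reference_words, hypothesis_words).
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    # autojunk=False switches off difflib's heuristic for long sequences.
    matcher = difflib.SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, ref_words[i1:i2], hyp_words[j1:j2]))
    return changes


changes = word_diff("this process will be quick", "this proswilling quick")
for tag, ref_span, hyp_span in changes:
    print(tag, ref_span, "->", hyp_span)
```

Note that `difflib` looks for long matching runs rather than a minimum-edit alignment, so its spans are a reviewing aid for a human checker, not a substitute for the manual analysis.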
Apone, T., Botkin, B., Brooks, M., & Goldberg, L. (2011). Caption Accuracy Metrics Project: Research into
Automated Error Ranking of Real-time Captions in Live Television News Programs. Retrieved from
http://ncam.wgbh.org/file_download/136
Word Error Rate
• WER = (Substitutions + Deletions + Insertions) / N
• Example:
(actual)  this process will be quick
(caption) this proswilling quick

Aligned:
(actual)  this process     will be quick
(caption) this proswilling **** ** quick
               S           D    D
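In the example above there is one substitution and two deletions against a five-word reference, so WER = (1 + 2 + 0) / 5 = 0.6. A minimal sketch of how this count could be automated follows, assuming a standard word-level edit-distance (Levenshtein) alignment; the function name `wer_counts` and the whitespace tokenisation are illustrative assumptions, not the method used in the study, which combined Microsoft Word Compare with manual analysis.

```python
def wer_counts(reference: str, hypothesis: str) -> dict:
    """Count substitutions, deletions and insertions via word-level edit distance.

    Hypothetical helper for illustration; words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] = (total_errors, S, D, I) for ref[:i] against hyp[:j]
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # only deletions possible
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # only insertions possible

    for i in range(1, rows):
        for j in range(1, cols):
            if ref[i - 1] == hyp[j - 1]:
                table[i][j] = table[i - 1][j - 1]  # words match, no new error
                continue
            t, s, d, n = table[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = table[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = table[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Tuples compare on total errors first, so the minimum is kept.
            table[i][j] = min(substitution, deletion, insertion)

    total, s, d, n = table[-1][-1]
    return {"S": s, "D": d, "I": n, "N": len(ref), "WER": total / len(ref)}


result = wer_counts("this process will be quick", "this proswilling quick")
print(result)
```

When several alignments share the same minimum number of errors, the split between substitutions, deletions and insertions can vary, although the WER itself does not.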
Results
[Figure: Transcription errors (per 1000 words) by transcript number (2 to 7) for Trint, Sonix, Descript, Watson, Zoom and Synote; vertical axis from 0 to 800]
Findings
Expert disciplines (column grouping not recoverable): Property Management; Construction Management; Building Pathology; Property and Contract Law

Transcript  Expert 1  Expert 2     Expert 3     Expert 4            Expert 5     Expert 6     Expert 7
2           x         x            x            x                   x            x            x
3           x         x            x            x                   x            x            x
4           x         Good enough  Good enough  x                   Good enough  Good enough  x
5           x         Good enough  x            Almost good enough  x            x            x
6           x         x            x            x                   x            x            x
7           x         x            x            x                   x            x            x
Conclusions
• Quality of recording matters
• In a technical discipline, WER may not be a sufficient predictor of quality
• Good starting point for creating a transcript as an accessibility aid
• At present, off-the-shelf automatic transcription software does not produce a high enough level of accuracy for the creation of accessibility aids for the built environment sector
• Questions?
tharindu@ucem.ac.uk
@Tharindu__
