Easing Transcripts for MOOC Videos with an ASR (Automated Speech Recognition) System
Carlos Turró, Jorge Civera and Jaime Busquets
Universitat Politècnica de València
The result of not having a screwdriver
• Pain
• Frustration
• Select a different tool
How can I transcribe a video?
• Manually transcribing a video takes about 10 times the length of the video (a real-time factor, RTF, of 10)
• Boring
• It’s worse if you don’t know the topic of the video
Automated Speech Recognition (ASR)
• How good is it?
• Will it recognize my special
words?
• Will it really help me?
UPValenciaX MOOCs - Transcribing
• https://media.upv.es/?id=b444d12e-db23-9a4f-9b3b-d1d9275d4cb4
• https://www.youtube.com/watch?v=dKrbzX5NjTs
• 30 MOOC courses
UPValenciaX MOOCs - Transcribing
ASR
• API
• Just after recording
Review
• RTF 3
• Teaching Assistants
70% less time
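The "70% less time" figure follows from the two real-time factors quoted in the slides: about 10 for manual transcription and 3 for reviewing ASR output. A minimal sketch of that arithmetic; the function name and the 10-minute example video are illustrative assumptions, not part of the original deck.

```python
def time_saved(manual_rtf: float, review_rtf: float) -> float:
    """Fraction of human time saved when reviewing ASR output
    replaces transcribing from scratch."""
    return (manual_rtf - review_rtf) / manual_rtf


MANUAL_RTF = 10.0  # manual transcription, from the slides
REVIEW_RTF = 3.0   # teaching-assistant review of ASR output, from the slides

saving = time_saved(MANUAL_RTF, REVIEW_RTF)
print(f"{saving:.0%} less time")

# Hypothetical 10-minute video, for illustration only.
video_minutes = 10
print(f"manual: {video_minutes * MANUAL_RTF:.0f} min, "
      f"review: {video_minutes * REVIEW_RTF:.0f} min")
```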
Transcription and Translation Platform
• Post-editing web interface (in HTML5)
Crowdsourcing
• We are crowdsourcing the on-campus courses using our own Paella
video player.
How to get good transcription quality
• Transcription systems learn to transcribe from examples
– At least 50 hours of previously transcribed video (audio) in the source language, to learn the acoustic model
– Text on the order of millions of words, to learn the language model
Language     Videos (hours)   Text (Mwords)
Dutch        532              628
English      620              464000
Estonian     130              410
French       88               1800
German       36               135
Portuguese   54               573
Italian      54               868
Slovene      27               224
Spanish      128              654
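The 50-hour guideline can be checked against the table directly. The sketch below copies the hours from the table and lists the languages that fall short of the guideline; the variable names are illustrative, and the slides do not say how those languages were handled.

```python
MIN_ACOUSTIC_HOURS = 50  # guideline stated in the slides

# Transcribed video hours per language, copied from the table above.
training_hours = {
    "Dutch": 532, "English": 620, "Estonian": 130, "French": 88,
    "German": 36, "Portuguese": 54, "Italian": 54, "Slovene": 27,
    "Spanish": 128,
}

# Languages with less transcribed audio than the guideline asks for.
below_guideline = sorted(
    lang for lang, hours in training_hours.items()
    if hours < MIN_ACOUSTIC_HOURS
)
print(below_guideline)
```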
How to get good transcription quality (II)
•Adaptation of transcription systems to the specific videos is key for high accuracy
•Availability of videos manually transcribed with similar acoustic conditions
•Availability of text resources related to the video in question
· Title is used to retrieve related documents
· Slides contain most of the special words used by the lecturer
· Documents: text content from the course, additional text resources (bibliography)
• Sound quality of the video has a direct relationship with transcription quality
• No noise, no background music, please
Try it yourself
http://mllp.upv.es
Our next step
Translations!
Conclusions
• ASR technology is mature enough to help a lot in captioning
• However, there should be a review phase
• Quality can be enhanced by providing transcribed videos
• At UP Valencia we transcribed our 30 MOOC courses at a teaching assistant (TA) review cost of 3x (RTF 3)
Thanks!
Questions?
Why transcription of MOOC video files?
• Accessibility
• Searching into a video file
• Searching into a video repository
• Topic identification
• …and much more
Measuring Quality: Word Error Rate
$$\mathrm{WER} = \frac{S + D + I}{N}$$
Where
S is the number of word substitutions,
D is the number of word deletions,
I is the number of word insertions,
N is the number of words in the reference text
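The definition above can be turned into a short computation. This is a minimal sketch, assuming plain whitespace tokenization and no text normalization (real evaluations usually normalize case and punctuation first); the function name is illustrative. The minimum word-level edit distance equals S + D + I for the best alignment between reference and hypothesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (S + D + I) / N, using the minimum word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference text must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.1%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.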
Measuring Quality: Word Error Rate
Language   WER
English    20.8
Dutch      24.5
Italian    17.7
Spanish    14.4
Estonian   27.1
French     22.7
Attributions
• Fingerspelling & tools Wikipedia
• Bored https://www.flickr.com/photos/left-hand/3132070992/
• Siri https://www.flickr.com/photos/smemon/8070397213/
