J. D. Valor, J. A. Silvestre-Cerdà, C. Turró, J. Civera, A. Juan - Efficient generation of high-quality multilingual subtitles for video lecture repositories - MLLP UPV
The European Conference on Technology Enhanced Learning (EC-TEL) 2015, with the theme Design for Teaching and Learning in a Networked World, took place in Toledo from 15 to 18 September. It addressed the pressing need to shape learning arrangements so that they exploit the potential and meet the requirements of a networked world. It was a unique opportunity for researchers, practitioners, educational developers, and policy makers to address current challenges and advances in the field. Numerous examples of excellence and innovation were highlighted in the work presented at the conference.
At the conference, Juan Valor, an active member of the MLLP research group, which participates in the EMMA project, presented a demo of the “MLLP transcription and translation platform” used in that project. Some of the most recent work in automatic multilingual subtitling was also presented in the oral presentation of the paper “Efficient generation of high-quality multilingual subtitles for video lecture repositories”. During the presentations, the EMMA project generated considerable interest among attendees from universities around Europe.
Find out more about EMMA: http://project.europeanmoocs.eu/
1. Efficient generation of high-quality multilingual
subtitles for video lecture repositories
Juan Daniel Valor Miró, Joan Albert Silvestre-Cerdà,
Carlos Turró, Jorge Civera and Alfons Juan.
{jvalor,jsilvestre,jcivera,ajuan}@dsic.upv.es
turro@cc.upv.es
September 10, 2015
2. Introduction
• Video lectures are fast becoming an everyday educational resource.
• The utility of video lectures could be further extended by adding subtitles.
– Easier discovery of content-related videos and improved searchability.
– Content accessible to students of different languages.
– Content accessible to students with disabilities.
– Assist student note-taking and content summarisation.
• Manual subtitling and translation of video lectures is costly.
• We propose cost-effective solutions using state-of-the-art ASR and SMT.
• Our proposal has been evaluated with real users at the UPV.
– Review of automatic transcriptions in Spanish, Catalan, English.
– Review of automatic translations from Spanish into English.
Juan Daniel Valor Miró et al - EC-TEL 2015 2 / 11
3. transLectures project
• Acronym of “Transcription and Translation of Video Lectures”.
• Advanced ASR and SMT techniques are being tested on large educational repositories.
• Spread our innovative, open-source solutions over the worldwide educational community.
• Period: November 2011 - November 2014.
• Consortium:
– UPV: Universitat Politècnica de València, Valencia, Spain (Coordinator).
– XEROX: Xerox S.A.S., Grenoble, France.
– JSI: Jozef Stefan Institute, Ljubljana, Slovenia.
– RWTH: Rheinisch-Westfaelische Technische Hochschule, Aachen, Germany.
– EML: European Media Laboratory GmbH, Heidelberg, Germany.
– DDS: Deluxe Digital Studios Ltd, London, UK.
• Web page: translectures.eu
4. poliMedia repository
• Service for the creation and distribution of multimedia educational content at the UPV.
• Allows professors to record their courses as videos accompanied by time-aligned slides.
• It serves more than 36,000 students and 2,800 university lecturers.
• It has been exported to several universities in Spain and South America.
• Basic statistics of the poliMedia repository:
  Lectures                         9222
  Duration (hours)                 2102
  Avg. lecture length (minutes)      13
  Speakers                         1302
  Avg. lectures per speaker           7
6. Study
• Study carried out with real users in the Docencia en Red UPV program.
– Review of automatic transcriptions in Spanish, Catalan, English.
– Review of automatic translations from Spanish into English.
• Evaluation measures:
– WER (word error rate): Percentage of incorrectly recognised words in an automatic transcription.
∗ High-quality: WER < 20
∗ Good quality: WER < 35
– TER (translation edit rate): Percentage of incorrectly translated words in an automatic translation.
∗ High-quality: TER < 35
∗ Good quality: TER < 50
– RTF (real-time factor): Time spent reviewing a transcription or translation, relative to the video duration.
∗ A manual transcription costs 10 RTF (10 times the duration of the video).
∗ A manual translation costs 30 RTF (30 times the duration of the video).
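The two transcription measures above can be sketched in a few lines. This is a minimal illustration, not the evaluation code used in the study: it assumes the standard edit-distance definition of WER (substitutions, insertions and deletions divided by the number of reference words), which the slide only describes informally, and the function names and example sentences are made up for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER in percent: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic dynamic programming row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def real_time_factor(review_seconds: float, video_seconds: float) -> float:
    """RTF: time spent reviewing, relative to the duration of the video."""
    if video_seconds <= 0:
        raise ValueError("video duration must be positive")
    return review_seconds / video_seconds


# Hypothetical example: one substitution and one deletion over six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
# Hypothetical example: 27 minutes of review for a 10-minute video.
rtf = real_time_factor(review_seconds=1620, video_seconds=600)
print(f"WER = {wer:.1f}, RTF = {rtf:.1f}")
```

Under the thresholds above, the hypothetical transcript would count as good quality (WER < 35) but not high quality (WER < 20).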
7. Spanish transcriptions
• Reviewed 135 video lectures by 39 lecturers, amounting to 18.3 hours.
• A linear regression model RTF = β · WER was fitted to all data.
• A significant dependency (β = 0.184, Sig < 2.2 × 10^-16) between RTF and WER was found.
• Average WER of 12.0 (very high-quality), and average user review RTF of 2.7.
• User effort is reduced by 73% compared with transcribing from scratch.
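A model of the form RTF = β · WER has no intercept, so a least-squares fit reduces to a single ratio: β = Σ(WER·RTF) / Σ(WER²). The sketch below assumes ordinary least squares, which the slides do not state explicitly, and uses invented (WER, RTF) pairs purely for illustration; they are not data from the study.

```python
from typing import Sequence


def fit_through_origin(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope for y = beta * x (no intercept term)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    sum_xx = sum(xi * xi for xi in x)
    if sum_xx == 0:
        raise ValueError("x must contain at least one non-zero value")
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    return sum_xy / sum_xx


# Hypothetical per-lecture measurements, invented for this example only.
wer_values = [5.0, 10.0, 20.0, 30.0]
rtf_values = [1.0, 2.0, 4.0, 6.0]

beta = fit_through_origin(wer_values, rtf_values)
print(f"beta = {beta:.3f}")
```

Read this way, β is the extra review time, in multiples of the video duration, caused by each additional WER point.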
8. English and Catalan transcriptions
• English:
– Reviewed 57 video lectures by 12 lecturers, amounting to 7.9 hours.
– A significant dependency (β = 0.168, Sig < 2.2 × 10^-16) for RTF = β · WER was found.
– Average WER of 36.0, and average user review RTF of 6.2 (user-effort reduction of 38%).
– WER has since been reduced to 21.4, so we expect a greater user-effort reduction.
• Catalan:
– Reviewed 19 video lectures by 5 lecturers, amounting to 1.5 hours.
– A significant dependency (β = 0.138, Sig < 2.2 × 10^-16) for RTF = β · WER was found.
– Average WER of 40.4, and average user review RTF of 5.6 (user-effort reduction of 44%).
– WER has since been reduced to 17.4, so we expect a greater user-effort reduction.
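As a rough illustration of what a greater reduction could look like, the fitted slopes can be applied to the newer WER figures. This is only an extrapolation of the RTF = β · WER model under the assumption that β stays unchanged for the improved systems; the resulting numbers are not measurements from the study, and the function name is made up for the example.

```python
MANUAL_TRANSCRIPTION_RTF = 10.0  # cost of transcribing from scratch, as stated in the study setup


def predicted_review_rtf(beta: float, error_rate: float) -> float:
    """Review effort suggested by the fitted line RTF = beta * error_rate."""
    return beta * error_rate


# (language, fitted beta, updated WER), all taken from the bullets above.
cases = [("English", 0.168, 21.4), ("Catalan", 0.138, 17.4)]

for language, beta, wer in cases:
    rtf = predicted_review_rtf(beta, wer)
    reduction = 100.0 * (1.0 - rtf / MANUAL_TRANSCRIPTION_RTF)
    print(f"{language}: predicted RTF {rtf:.1f}, effort reduction about {reduction:.0f}%")
```

If the linear relationship holds, both languages would move from good-quality to roughly high-quality territory, with review effort well below the averages measured in the study.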
9. Spanish into English translations
• Reviewed 13 video lectures by 10 lecturers, amounting to 2.1 hours.
• A linear regression model RTF = β · TER was fitted to all data.
• A significant dependency (β = 0.255, Sig = 3.73 × 10^-7) between RTF and TER was found.
• Average TER of 41.9 (good-quality), and average user review RTF of 12.2.
• User effort is reduced by 59% compared with translating from scratch.
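The reported user-effort reductions appear to follow from comparing the average review RTF with the manual baseline (10 RTF for transcription, 30 RTF for translation). The sketch below assumes the formula reduction = 1 − review RTF / manual RTF, which the slides do not spell out, and checks it against the averages reported in the study.

```python
def effort_reduction(review_rtf: float, manual_rtf: float) -> float:
    """Percentage of effort saved by reviewing instead of working from scratch."""
    if manual_rtf <= 0:
        raise ValueError("manual RTF must be positive")
    return 100.0 * (1.0 - review_rtf / manual_rtf)


# (task, average review RTF, manual RTF), all figures taken from the slides.
tasks = [
    ("Spanish transcription", 2.7, 10.0),
    ("English transcription", 6.2, 10.0),
    ("Catalan transcription", 5.6, 10.0),
    ("Spanish-English translation", 12.2, 30.0),
]

for task, review_rtf, manual_rtf in tasks:
    print(f"{task}: {effort_reduction(review_rtf, manual_rtf):.0f}% less effort")
```

Under this assumption the four values round to 73%, 38%, 44% and 59%, matching the reductions reported for each task.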
10. Conclusions
• This work describes the efficient generation of high-quality multilingual subtitles.
• We obtain significant reductions in review time to generate subtitles.
• We found that review times depend on the automatic transcription or translation quality.
• The final quality of the subtitles is the same as that of subtitles produced from scratch.
• This approach is now used in real-life scenarios such as:
– The whole UC3M and UPV poliMedia repositories (es, en, ca).
– The European EMMA project (en, es, ca, it, pt, et, fr, nl).
11. Live system
Upload and review your own videos:
ttp.mllp.upv.es
Visualize and review poliMedia videos:
media.upv.es