EMMA presentation - Alfons Juan - Language technologies for Education: recent results by the MLLP group - UPV, Spain

EUmoocs
Language technologies for Education:
recent results by the MLLP group
Alfons Juan
2nd Internet of Education Conference 2015
18 September 2015, Sarajevo
• Research group at the Univ. Politècnica de València (Spain)
• Research areas:
– Machine Learning and Applications
– Natural Language Processing
– Educational Technologies and Big Data
• Recent research projects:
– transLectures: Transcription and translation of video lectures
– EMMA: European Multiple MOOC Aggregator
– Active Interaction for Speech Transcription and Translation
transLectures (Nov 2011 – Oct 2014)
Main goal: to develop innovative, cost-effective (quasi-automatic)
solutions to produce accurate subtitles for educational videos
Two pilots:
• VideoLectures.NET: transcription of En and Sl; translation En→{De,Sl,Fr,Es} and Sl→En
• poliMedia: transcription of Es and Ca; translation Es↔En and Ca↔{Es,En}
Three scientific and technological objectives:
• Massive adaptation to improve subtitling quality
• Intelligent interaction to improve subtitling quality
• Integration into Opencast to enable real-life evaluation
transLectures: VideoLectures.NET
>20000 videos (45 min. on avg.): 85% English, 13% Slovenian, . . .
transLectures: poliMedia
27000 videos (2-10 min.): 88% Spanish, 3% Catalan, . . .
transLectures: Massive adaptation
[Figure: WER (ASR) per project month (M06 to M36) for Sl, Ca, En and Es; BLEU (MT) per project month (M12 to M36) for EnEs, EnFr, EsEn, EnDe, SlEn and EnSl.]
• Key advances: neural networks and “extra” resources (slides)
• ASR: all error rates below 30 % (Sl: 26.9 %, others below 20 %)
• SMT: all BLEU scores above 20 (En→Sl: 20.1, mostly above 25)
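For reference, WER is usually defined as the word-level edit distance between the recognizer output and a reference transcript, divided by the number of reference words. A minimal Python sketch of that standard definition follows; the function name `word_error_rate` and the example sentences are illustrative and are not taken from the transLectures tools.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Invented example: one substitution and one deletion over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER = {wer:.1%}")
```

Under this definition, a WER below 20% means fewer than one word in five needs correcting, on average.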
transLectures: Intelligent interaction
[Figure: values from 9 to 23 plotted per project month (M12 to M36); labels: WERPM, WP3, II 10%.]
poliMedia transcription
The 10% of words recognized with the lowest confidence
account for 30% of the actual word recognition errors
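The selection idea behind this result can be sketched as follows: rank the recognized words by confidence, hand the lowest-scoring 10% to a human reviewer, and measure what share of the real errors falls inside that subset. The data layout (`(word, confidence, is_error)` tuples) and the function names are assumptions made for illustration; the toy values are invented and do not reproduce the 30% figure.

```python
def select_for_review(words, fraction=0.10):
    """Indices of the lowest-confidence words (rounded down, at least one)."""
    k = max(1, int(len(words) * fraction))
    ranked = sorted(range(len(words)), key=lambda i: words[i][1])
    return ranked[:k]


def error_coverage(words, selected):
    """Share of all recognition errors that lie inside the selected subset."""
    total_errors = sum(1 for w in words if w[2])
    if total_errors == 0:
        return 0.0
    caught = sum(1 for i in selected if words[i][2])
    return caught / total_errors


# Invented toy transcript: (word, confidence, is_error)
words = [
    ("the", 0.98, False), ("model", 0.91, False), ("is", 0.95, False),
    ("trained", 0.40, True), ("on", 0.97, False), ("lecture", 0.88, False),
    ("data", 0.60, True), ("from", 0.93, False), ("poly", 0.75, True),
    ("media", 0.85, False),
]

selected = select_for_review(words)
print(selected, error_coverage(words, selected))
```

Sorting by confidence keeps the reviewer's effort fixed (here 10% of the words) while concentrating it where errors are most likely.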
transLectures: Integration into Opencast
>300 different institutions worldwide [demo]
transLectures: try our tools
Try our tools at ttp.mllp.upv.es
190 users and 1082 videos (254 hours) since May 2014
transLectures: try our tools (cont.)
Institution                      Country      Language
Univ. Politècnica de València    Spain        ES, CA, EN
Univ. Carlos III de Madrid       Spain        ES, EN
Univ. Oberta de Catalunya        Spain        ES, CA
Open Universiteit Nederland      Netherlands  NL
Univ. of Naples Federico II      Italy        IT
Université de Bourgogne          France       FR
Tallinn University               Estonia      ET, EN
University of Leicester          UK           EN
Examples:
• Lecture recorded at poliMedia studios. [link]
• Lee Rubenstein, edX VP of Business Development: “Reinventing
education”, 30 June 2015, València. [link]
• Text-to-Speech demo. [link]
EMMA (Feb 2014 – Jul 2016)
European Multiple MOOC Aggregator (EMMA)
Main goal: to provide multilingual access to European MOOCs
Motivation:
• Most MOOCs are offered in few languages (En, Es, Fr)
• Language barrier is keeping many learners from taking MOOCs
• MOOC components: texts, images, videos, forums
• EMMA uses transLectures tools to translate videos and texts
– A few hours of video in 7 languages: En, Es, It, Nl, Et, Pt and Fr
– Source language is the national language of the MOOC provider
– Target languages: En, Es and It
EMMA: cost of manually translating MOOCs
Texts:
• Manual translation rate is approximately 2500 words per day
• A 6-week course with 75000 words takes 1.5 PM
Videos:
• Before translating, videos are manually transcribed (10 RTF)
• Then, transcriptions are translated (30 RTF)
• A course including 2 hours of video takes 0.5 PM
Solutions to lower costs:
• Crowdsourcing (e.g. TED talks)
• ASR and MT: user effort is reduced to 30% (2 → 0.6 PM)
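The person-month (PM) figures above can be reproduced with simple arithmetic. The sketch below assumes 8 working hours per day and 20 working days per person-month; these two conversion factors are not stated on the slide, they are simply the values under which the slide's numbers come out. RTF is read here as hours of work per hour of video.

```python
HOURS_PER_DAY = 8   # assumption, not stated in the talk
DAYS_PER_PM = 20    # assumption, not stated in the talk


def text_cost_pm(words, words_per_day=2500):
    """Person-months needed to translate `words` manually."""
    return words / words_per_day / DAYS_PER_PM


def video_cost_pm(video_hours, transcription_rtf=10, translation_rtf=30):
    """Person-months needed to transcribe and then translate video manually."""
    work_hours = video_hours * (transcription_rtf + translation_rtf)
    return work_hours / HOURS_PER_DAY / DAYS_PER_PM


text_pm = text_cost_pm(75_000)    # 75000 words -> 30 days -> 1.5 PM
video_pm = video_cost_pm(2)       # 2 h x 40 RTF -> 80 h -> 10 days -> 0.5 PM
manual_pm = text_pm + video_pm    # 2.0 PM in total
assisted_pm = 0.30 * manual_pm    # effort reduced to 30% -> 0.6 PM

print(text_pm, video_pm, manual_pm, assisted_pm)
```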
EMMA: automatic video subtitling
1. Generation of automatic transcriptions from video
2. Manual review of automatic transcriptions
3. Generation of automatic translations from transcriptions
4. Manual review of automatic translations
EMMA: automatic document translation
• Course text is ingested into the translation system
• Source and target texts are reviewed in parallel
• Preview of source and target texts also available
• Translated text is imported back into the EMMA platform
EMMA: evaluations
Video subtitling
Language pair        Transcription RTF (10)   Translation RTF (30)   Total RTF (40)
Spanish → English    3                        7                      10
English → Spanish    6                        17                     23

Document translation
Language pair        Translation RTF
Spanish → English    7
English → Spanish    17
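To relate the video subtitling measurements to manual work, the sketch below divides each measured total RTF by the manual total. It reads the figures in parentheses in the table headers (10, 30, 40) as the manual reference, which matches the manual transcription and translation rates quoted earlier in the talk; that reading and the dictionary layout are assumptions for illustration.

```python
MANUAL_RTF = {"transcription": 10, "translation": 30}  # assumed manual reference

measured = {
    "Spanish → English": {"transcription": 3, "translation": 7},
    "English → Spanish": {"transcription": 6, "translation": 17},
}


def relative_effort(rtf_by_step):
    """Measured review time as a fraction of the manual reference time."""
    return sum(rtf_by_step.values()) / sum(MANUAL_RTF.values())


for pair, steps in measured.items():
    print(f"{pair}: {relative_effort(steps):.1%} of manual effort")
```

For the two pairs in the table this gives 10/40 = 25% and 23/40 = 57.5% of the manual effort.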
EMMA: conclusions
• Multilingual access to your course boosts visibility
• The cost of manually translating your course is high (2 PM)
• Automatic translation can reduce the time cost to 30% of the manual effort
• Accuracy of automatic translation depends on several factors:
– Languages involved
– Availability of annotated data resources related to your course
– Specificity of the course
• Designing a multilingual MOOC should also take into account:
slides, images, application interfaces (demos), bibliography