
AI and Conference Interpretation – From Smart Assistants for the Human Interpreter to Automatic Solutions

Georg Rehm. AI and Conference Interpretation - From Smart Assistants for the Human Interpreter to Automatic Solutions. DG Interpretation Lunchtime Session on Digital Transformation. European Commission, Brussels, November 2018. November 12, 2018. Invited talk.

Published in: Technology

  1. 1. Georg Rehm DFKI GmbH Language Technology Lab – Berlin, Germany georg.rehm@dfki.de Artificial Intelligence and Conference Interpretation From Smart Assistants for the Human Interpreter to Automatic Solutions
  2. 2. Outline • Artificial Intelligence • Language Technology • Multilingualism in Europe • AI and Interpretation • Recommendations SCIC Lunchtime Session on Digital Transformation – 12 Nov. 2018 2
  5. 5. Artificial Intelligence • Current breakthroughs are based on Machine Learning (“Deep Learning”) • Also still in use: symbolic, rule-based methods and expert systems • Huge data sets + powerful learning algorithms + very fast hardware • Enormous potential for disruptions in all sectors and areas • Important: “The AI” doesn’t exist – there are only single-task systems! (Diagram labels: Data, Intelligence)
  6. 6. Language Technology • Language Technology transfers theoretical results from language-oriented research into technologies and applications that are ready for production use • Uses results from, e.g.: – Artificial Intelligence – Computer Science – Computational Linguistics – Natural Language Processing – Psychology, Psycholinguistics – Cognitive Science Example Applications • Spell checkers • Dictation systems • Translation systems • Search engines • Report generation • Expert systems • Dialogue systems • Text summarisers
  7. 7. META-NET and Multilingual Europe
  8. 8. • Multilingualism is at the heart of the European idea • 24 EU languages – all have the same status • Dozens of regional and minority languages as well as languages of immigrants and trade partners • Many economic and social challenges: – The Digital Single Market needs to be multilingual – Cross-border, cross-lingual, cross-cultural communication • This presentation is about very general application scenarios and use cases: day-to-day communication, ecommerce, mobility, health, tourism etc.
  9. 9. 60 research centres in 34 countries (founded in 2010) • Chair of Executive Board: Jan Hajic (CUNI) • Deputies: J. van Genabith (DFKI), A. Vasiljevs (Tilde) • General Secretary: Georg Rehm (DFKI) • Multilingual Europe Technology Alliance (META): 900+ members in 67 countries • Projects: T4ME (META-NET), CESAR, METANET4U, META-NORD • Publications shown: one published in 2013; one series of 31 volumes published in 2012
  10. 10. • Basque • Bulgarian* • Catalan • Croatian* • Czech* • Danish* • Dutch* • English* • Estonian* • Finnish* • French* • Galician • German* • Greek* • Hungarian* • Icelandic • Irish* • Italian* • Latvian* • Lithuanian* • Maltese* • Norwegian • Polish* • Portuguese* • Romanian* • Serbian • Slovak* • Slovene* • Spanish* • Swedish* • Welsh (* Official EU language) http://www.meta-net.eu/whitepapers
  11. 11. • MT – excellent: (none); good: English; moderate: French, Spanish; fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian; weak or no support through LT: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh • Speech – excellent: (none); good: English; moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish; fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish; weak or no support through LT: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh • Text Analytics – excellent: (none); good: English; moderate: Dutch, French, German, Italian, Spanish; fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish; weak or no support through LT: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh • Resources – excellent: (none); good: English; moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish; fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene; weak or no support through LT: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
  12. 12. [Chart: level of support (weak/none, fragmentary, moderate, good, excellent) per language, in the order Welsh, Maltese, Lithuanian, Latvian, Icelandic, Irish, Croatian, Serbian, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English. Languages with names in red have little or no MT support.] META-NET White Paper Series: Europe's Languages in the Digital Age. Springer, Heidelberg, New York, Dordrecht, London, September 2012. Georg Rehm and Hans Uszkoreit (series editors)
  13. 13. [Chart as on slide 12: level of support per language] Important: even current state-of-the-art technologies are far from being perfect!
  14. 14. [Chart as on slide 12: level of support per language] Important: 20+ European languages are severely under-supported and face the danger of digital extinction.
  15. 15. [Chart as on slide 12: level of support per language] • Our languages have the same status, but most EU languages are severely threatened by digital language extinction. • The big challenge of multilingualism in Europe must not be ignored or outsourced to other continents. → Europe has a huge demand for Language Technologies made in Europe!
  16. 16. META-NET SRA, published in early 2013 • First strategic research agenda of our field • Complex process for collecting and structuring technology visions • Approx. 200 researchers have participated SRIA V0.5 presented at […] (Strategic Agenda for the Multilingual Digital Single Market – Technologies for Overcoming Language Barriers towards a truly integrated European Online Market, Version 0.5 – April 22, 2015) • Based on strategy papers and roadmaps prepared by several EU projects including the META-NET SRA (see above) SRIA V0.9 presented at […] (Strategic Research and Innovation Agenda – Language as a Data Type and Key Challenge for Big Data – Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content, SRIA Editorial Team, Version 0.9 – July 2016) • Prepared, presented and endorsed by the Cracking the Language Barrier federation • Explains how the LT community can realise the Multilingual Digital Single Market SRIA V1.0 presented at […] (Strategic Research and Innovation Agenda – Language Technologies for Multilingual Europe – Towards a Human Language Project, SRIA Editorial Team, Version 1.0 beta – November 2017) • Complements and supports the STOA study • Crucial recommendation: Kickstart the Human Language Project! References: Georg Rehm and Hans Uszkoreit, editors. The META-NET Strategic Research Agenda for Multilingual Europe 2020. Springer, Heidelberg, New York, Dordrecht, London, 2013. Georg Rehm, editor. Language Technologies for Multilingual Europe: Towards a Human Language Project. Strategic Research and Innovation Agenda. Dec. 2017. Version 1.0. Unveiled at META-FORUM 2017 in Brussels, Belgium, on Nov. 13/14, 2017. Prepared by the Cracking the Language Barrier federation, supported by CRACKER. Georg Rehm, editor. Language as a Data Type and Key Challenge for Big Data. Strategic Research and Innovation Agenda for the Multilingual Digital Single Market. CRACKER and Cracking the Language Barrier federation, July 2016. Version 0.9. 04 July 2016. Supported by CRACKER and LT_Observatory. Georg Rehm, editor. Strategic Agenda for the Multilingual Digital Single Market – Technologies for Overcoming Language Barriers towards a truly integrated European Online Market. CRACKER and LT_Observatory, April 2015. Version 0.5. 22 April 2015. Prepared by the EU-funded projects CRACKER and LT_Observatory.
  17. 17. STOA study (cover: STUDY, EPRS | European Parliamentary Research Service, Scientific Foresight Unit (STOA), PE 581.621, Science and Technology Options Assessment) • STOA Workshop, European Parliament, 10 January 2017 • STOA study – published in March 2017 • Recommends that the EC initiate the Human Language Project (HLP) • Three important research policy recommendations: – Strengthen research and focus upon the HLP – Support a European LT platform for data and services – Bridge the technology gap between Europe’s languages
  18. 18. “Language equality” Resolution • EP resolution “Language equality in the digital age” P8_TA(2018)0332 – partially based on the STOA study • Voting in the EP on 11 September 2018: 592 votes in favour vs. 45 votes against! • Three of the 45 recommendations: – 25. Establish a large-scale, long-term LT funding programme – 27. Europe has to secure its leadership in language-centric AI – 29. Create a European LT platform for sharing of services [Excerpt from the resolution shown on the slide:] European Parliament 2014-2019, TEXTS ADOPTED, Provisional edition, P8_TA-PROV(2018)0332, Language equality in the digital age. European Parliament resolution of 11 September 2018 on language equality in the digital age (2018/2028(INI)). The European Parliament, – having regard to Articles 2 and 3(3) of the Treaty on the Functioning of the European Union (TFEU), – having regard to Articles 21(1) and 22 of the Charter of Fundamental Rights of the European Union, – having regard to the 2003 UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage, – having regard to Directive 2003/98/EC of the European Parliament and of the Council of 17 November 2003 on the re-use of public sector information [1], – having regard to Directive 2013/37/EU of the European Parliament and of the Council of 26 June 2013 amending Directive 2003/98/EC on the re-use of public sector information [2], – having regard to Decision (EU) 2015/2240 of the European Parliament and of the Council of 25 November 2015 establishing a programme on interoperability solutions and common frameworks for European public administrations, businesses and citizens (ISA² programme) as a means for modernising the public sector [3], – having regard to the Council resolution of 21 November 2008 on a European strategy for multilingualism (2008/C 320/01) [4], – having regard to the Council decision of 3 December 2013 establishing the specific programme implementing Horizon 2020 – the Framework Programme for Research and […] Footnotes: [1] OJ L 345, 31.12.2003, p. 90. [2] OJ L 175, 27.6.2013, p. 1. [3] OJ L 318, 4.12.2015, p. 1. [4] OJ C 320, 16.12.2008, p. 1.
  19. 19. ?
  20. 20. ELG – The Primary Platform for Language Technology in Europe • Development of a functional language technology cloud platform for Europe • Market place for European LT business space (directory of stakeholders) • Hundreds of LT services and resources – easy-to-use and easy-to-integrate • Evaluation through 15-20 pilot projects feeding back into the platform • 30+ national competence centres for a strong European network • Services can be made available by the community • Boosting the emerging Multilingual Digital Single Market • Towards a thriving and flourishing European LT community Consortium • DFKI GmbH (Coordinator) (DE) • ILSP, R.C. “Athena” (GR) • University of Sheffield (UK) • Charles University (CZ) • ELDA (FR) • Tilde (LV) • SAIL LABS GmbH (AT) • Expert System Iberia (ES) • University of Edinburgh (UK) • 1 Jan. 2019 – 31 Dec. 2021 • ICT-29-2018: Multilingual Next Generation Internet → Two sub topics, budget 25M€ • ICT-29 a) European Language Grid → One Innovation Action, 7M€ • ICT-29 b) Domain-specific/challenge-oriented HLT → Six Research and Innovation Actions, 3M€ each [Diagram labels: Web Interface; APIs; European Language Grid – Content Catalogue; LT Services, Tools, Components, Technologies; Language Resources and Data Sets; Organisations, Languages, Service Types etc.; Cloud Infrastructure]
  21. 21. Human Language Project • Goal: Deep Natural Language Understanding by 2030 • All official European and many additional languages • Broad coverage, high quality, high precision • Create new approaches, algorithms, data sets • Across modalities: text, text types, speech, video etc. • Across platforms: messaging, telephony, social, mobile, IoT, robots, smart devices, conversational technologies etc. • Across cultures: knowledge, customs, formalities, humour, emotion, subjectivity, biases, opinions, filter bubble etc. • How? As the next EU FET Flagship Project!
  22. 22. Proposal • FET Flagship Projects: Up to 1B€ of funding for up to ten years • Flagships: Human Brain Project, Graphene, Quantum (2019) • H2020 FETFLAG-01-2018 called for preparatory actions, i.e., small (1M€) projects to prepare the full flagship proposal • Our proposal: “Human Language Project Preparation” • Consortium of 16 partners (coordinated by DFKI) • 375+ letters of support including 16 ministries and 24 national language institutions • More information: http://human-language-project.eu
  23. 23. HLP Partnering Projects • Language-specific and/or regional consortia doing research on their own languages • Close collaboration with the Core Project • Overlap between CP and PP in terms of partners [Diagram: HLP Core Project and HLP Partnering Projects (PP) for Spanish, Italian, Greek, German, Polish, Baltic languages, Dutch] HLP Core Project • Coordination of the flagship (CP and PPs) • Continuous roadmap development • General technology and algorithm development • Digital data, resources, computing and collaboration infrastructure Important dates and next steps • 11 Sept 2018: EP “Language equality” resolution • 18 Sept 2018: HLP Prep proposal submitted • 27 Sept 2018: “Language equality” conference (EP) • 29 Nov 2018: Results to be circulated to the consortia • 4 Dec 2018: Official announcement of winning proposals at ICT 2018 • 1 March 2019: Start of the up to six FET Flagship preparation projects • 29 Feb 2020: End of the up to six projects
  24. 24. AI and Interpretation
  25. 25. Translation and Interpretation • Since approx. 2015, with breakthroughs in neural technologies, Machine Translation has been getting better and better. • All areas of AI look for “super-human performance” but language is fundamentally different and much more complex. • Neural AI approaches cannot understand language, they process it according to huge collected data sets. • In many use cases, mistakes can be tolerated. • But: translation and interpretation are often mission-critical! • Mistakes can have serious consequences (politics, medicine).
  26. 26. Speech Translation • Example: Lecture Translator – University lectures are automatically transcribed and translated, in near-real time, into several languages – Students can follow the translation – or only the transcription – through a web interface • Example: Presentation Translator – Presenter can have the speech automatically translated – Translations are displayed as subtitles • Example: Call Translator – Internet telephony provider offers automatic voice translation
  27. 27. 5-Year-Vision • Lexifone – http://www.lexifone.com – Israeli startup, launched a telephone-based service in 2013 – Enables real-time translation and/or transcription for any conference call. • VoiceTra – http://voicetra.nict.go.jp – The translation app of Japan’s National Institute of Information and Communications Technology (NICT) covers 27 languages. • Mymanu – https://www.mymanu.com – Mymanu is deploying “smart” earbuds to make conversations in multiple languages easier. • TYWI – http://www.translateyourworld.com/ – Enables across-language communication at full speed.
  28. 28. Issues and Limitations • The three example applications work surprisingly well for their respective domains and registers. But: – They are far from being perfect. – They aren’t robust. – They cannot cope with unforeseen situations. – They cannot understand language as humans do. – They are not (yet?) suited for conference interpretation. → Limitations as regards their fields of application. → Interpretation is often mission-critical. → Human interpreters won’t be replaced anytime soon.
  29. 29. [Chart repeated from slide 12: level of support per language; languages with names in red have little or no MT support]
  30. 30. [Chart repeated from slide 12: level of support per language] Why machines won’t replace all human interpreters anytime soon …
  31. 31. [Chart repeated from slide 12: level of support per language] And then there’s another challenge: Human-level performance!
  32. 32. https://slator.com/features/ai-interpreter-fail-at-china-summit-sparks-debate-about-future-of-profession/
  33. 33. Summary & Conclusions → AI is disrupting all industries – including translation and, increasingly, also interpretation. → But: perfect, robust, precise language technologies (incl. written/spoken MT and interpretation) are still far away. → The Human Language Project is still a vision. Its goal: develop new breakthroughs in Language Technology. → More and more demand for LT – including interpretation. → The machine will support human interpreters and help them become more efficient – it will not replace them. → At least not in important or critical use cases. → In other use cases the machine opens up interpretation.
  34. 34. Augmented Interpretation • SCIC repo, 4,000 speeches (3,000 public, 1,000 private) • Many R&D groups currently work on TED talk data sets • Suggestion 1: establish bridges between SCIC and LT research – share your data set (maybe in ELITR?). Maybe organise a shared task around the data set? • Help build the next generation of AI tools for interpreters • Tailored to the needs, topics, domains and established workflows of conference interpreters in the EC/EP • Both for preparing a meeting and for use in the booth • Glossary production, summarization, ASR, translation etc. • Suggestion 2: make use of services provided by the ELG to build smart tools for human interpreters (2020).
  35. 35. Thank you very much! Dr. Georg Rehm, Senior Researcher, DFKI Research Fellow, DFKI Berlin • georg.rehm@dfki.de • http://georg-re.hm • http://de.linkedin.com/in/georgrehm • https://www.slideshare.net/georgrehm
