Human Language Technologies in a Multilingual Europe

Georg Rehm. Human Language Technologies in a Multilingual Europe. Workshop "Language Equality in the Digital Age: Towards a Human Language Project". Science and Technology Options Assessment (STOA), European Parliament, Brussels, Belgium, January 10, 2017.

Published in: Technology

  1. Human Language Technologies in a Multilingual Europe
     Georg Rehm (georg.rehm@dfki.de)
     DFKI GmbH, Language Technology Lab, Berlin, Germany
     META-NET, General Secretary
  2. Outline
     • Multilingual Europe
     • Analysis I: Technology Support for Europe’s Languages
     • Analysis II: Status and Current Developments
     • Example: LT for the Digital Single Market
     • Missions and Opportunities
     • Towards the Human Language Project
     EP STOA Workshop: Language Equality in the Digital Age (10 Jan. 2017)
  3. • Multilingualism is at the very heart of the European idea.
     • 24 EU languages: all languages have the same status.
     • Dozens of regional and minority languages as well as languages of immigrants and trade partners.
     • Economic challenges:
       – If the DSM is not multilingual, there will be 20+ isolated markets!
       – Language barriers are market barriers!
     • Social and public challenges:
       – Empower all citizens to use their mother tongues.
       – Provide multilingual digital public services.
       – Enable cross-border, cross-lingual, cross-cultural communication. Towards a European public sphere and e-participation.
       – Restore trust in media (fake news debate, filter bubble issue etc.)
  4. Analysis I: Technology Support for Europe’s Languages
  5. 5. q 60 research centres in 34 countries (founded in 2010) Chair of Executive Board: Jan Hajic (CUNI) Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde) General Secretary: Georg Rehm (DFKI) q Multilingual Europe Technology Alliance. 826 members in 67 countries (published in 2013) (31 volumes; published in 2012) T4ME (META-NET) CESAR METANET4UMETA-NORDMultilingual Europe Technology AllianceNET
  6. 6. q Basque q Bulgarian* q Catalan q Croatian* q Czech* q Danish* q Dutch* q English* q Estonian* q Finnish* q French* q Galician q German* q Greek* q Hungarian* q Icelandic q Irish* q Italian* q Latvian* q Lithuanian* q Maltese* q Norwegian q Polish* q Portuguese* q Romanian* q Serbian q Slovak* q Slovene* q Spanish* q Swedish* q Welsh * Official EU languagehttp://www.meta-net.eu/whitepapers
  7. MT
     – excellent: (none)
     – good: English
     – moderate: French, Spanish
     – fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
     – weak or no support through LT: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh
     Speech
     – excellent: (none)
     – good: English
     – moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
     – fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
     – weak or no support through LT: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh
     Text Analytics
     – excellent: (none)
     – good: English
     – moderate: Dutch, French, German, Italian, Spanish
     – fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
     – weak or no support through LT: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
     Resources
     – excellent: (none)
     – good: English
     – moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
     – fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
     – weak or no support through LT: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
  8. [Chart: level of support per language, on the scale Weak/none, Fragmentary, Moderate, Good, Excellent. Languages shown: Welsh, Maltese, Lithuanian, Latvian, Icelandic, Irish, Croatian, Serbian, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English.]
     Languages with names in red have little or no MT support.
     Source: META-NET White Paper Series: Europe's Languages in the Digital Age. Springer, Heidelberg, New York, Dordrecht, London, September 2012. Georg Rehm and Hans Uszkoreit (series editors).
     • Important: even current state-of-the-art technologies are far from perfect!
     • Important: 20+ European languages are severely under-supported and face the danger of digital extinction.
  9. [Chart: Language Technology Support (Weak/no support, Fragmentary, Moderate, Good, Excellent) plotted against Millions of Native Speakers (Worldwide), axis range 0 to 400. Languages shown: Yiddish, Welsh, Vlax Romani, Turkish, Scots, Romany, Occitan, Maltese, Macedonian, Luxembourgish, Lithuanian, Limburgish, Latvian, Icelandic, Friulian, Frisian, Breton, Bosnian, Asturian, Albanian, Irish, Croatian, Serbian, Hebrew, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English.]
     Source: Georg Rehm, Hans Uszkoreit, Ido Dagan, Vartkes Goetcherian, Mehmet Ugur Dogan, Coskun Mermer, Tamás Váradi, Sabine Kirchmeier-Andersen, Gerhard Stickel, Meirion Prys Jones, Stefan Oeter, and Sigve Gramstad. An Update and Extension of the META-NET Study “Europe's Languages in the Digital Age”. In Proceedings of the Workshop on Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era (CCURL 2014), Reykjavik, Iceland, May 2014.
  10. Analysis II: Status and Current Developments
  11. Status and Current Developments
     • Multilingual Europe: our languages enjoy equal status, yet digital extinction of the majority of European languages is a very severe danger.
     • Language Technology Research and Innovation in Europe: world-class research, excellent results (examples: Moses, recent NMT results of QT21), strong SME base, thousands of LSPs; fragmentation; need for coordination.
     • Big need for high-quality, high-coverage, precise, robust, deployable Language Technologies: translation, conversational interfaces, text and media analytics, personal assistants, multilingual DSM etc.
     • Artificial Intelligence: important breakthroughs and massive investments in R&D and applications (mostly in the US and Asia). A huge opportunity for Europe!
     • The European Language Challenge cannot be abandoned or outsourced.
       – Europe must not make its digital infrastructure dependent on non-European solutions. This is why the EU is building GALILEO as an alternative to GPS, GLONASS and BeiDou.
     • Big need for Language Technologies made in Europe for Europe!
  12. Example: Language Technology for the Digital Single Market
  13. 13. q Top priority in the European Union. q Expected to add 400b€ to European GDP and hundreds of thousands of new jobs. q Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).
  14. MDSM: Needed Applications
     • Crosslingual SME presales communication and aftersales services
     • Multilingual websites, product catalogues, product descriptions
     • Crosslingual business intelligence (e.g., based on UGC)
     • Crosslingual communication for SMEs, public institutions, citizens
     • Multilingual (big) data, language and knowledge value chains
     • Multilingual knowledge bases and knowledge graphs (and services)
     • Multilingual conversational interfaces for connected devices (IoT)
     • Crosslingual social media analytics for EU-wide societal issues
     • Multilingual text and report generation (knowledge/data to text)
     • All services must be domain-adaptable (avoid one size fits all)
     • etc.
  15. Multilingual Value Programme
     • Multilingual Value Programme
       – Suggested three-year programme
       – Requires modest investment
     • “Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content”
     • Three components address the main needs of the Multilingual DSM (MDSM) and how to put them into practice:
       1. Multilingual Application Areas
       2. Multilingual Services
       3. Research
     [Document cover: Strategic Research and Innovation Agenda. Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content. SRIA Editorial Team. Version 0.9, July 2016.]
     Version 1.0 to be published in 2017
  16. Missions and Opportunities
  17. Missions and Opportunities
     • Languages & European Society: Enable all European citizens to communicate and operate in their mother tongues (online & offline).
     • Languages & Media: Address, technologically, the massively increasing social, political and commercial relevance of content and communication (fake news debate, filter bubble challenge).
     • Languages & Market: Realise the Multilingual DSM, including multilingual content, crosslingual text analytics, multilingual generation.
     • Languages & Digital Tech: Future-proof our languages.
     • Languages & Devices: Robust, precise, high-quality spoken language interfaces for billions of connected things, and for all languages.
     • Excellent opportunity for Europe, European research, European education, European industry, European innovation, European culture!
     • Goal: Move Europe into the pole position in this field!
  18. Towards the Human Language Project
  19. Multilingual Europe through Technology
     [Diagram: Multilingual Europe in January 2017]
     Areas:
     • Multilingual Strategy of the EU: more tech support for multilingualism
     • Language Technologies for Europe's digital public services
     • Technologies for the Multilingual Digital Single Market
     • Language Technologies for Big Data text analytics
     • Language Technologies R&D&I (H2020, WP 2018-20)
     • The Human Language Project: long-term R&D&I, post-H2020
     Annotations:
     • Open calls and upcoming service contracts
     • Dec. 2016: EC brainstorming meeting on future LT priorities in Horizon 2020 and FP9. Need for a new strategy paper?
     • Jan. 2017: STOA workshop and study on LT for Europe
     • Dec. 2017: LT Session at BDVA Summit in Valencia
     • 2017: MDSM SRIA V1.0
     • Policy change and initiative towards a European digital public sphere enabled by MT/LT
     • DG CONNECT; DGT and DG CONNECT; DG CONNECT
     • WP 2018-20 (incl. IoT, I4.0, assistants, robots etc.)
     • Shared programme between EU and MS
     • Suggested MLV Programme
     • CEF AT; ELRC
     [Document cover, shown twice: Strategic Research and Innovation Agenda. Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content. SRIA Editorial Team. Version 0.9, July 2016.]
  20. Multilingual Europe through Technology
     Observations:
     • Current initiatives are too small and unbalanced; they concentrate on innovation and technology deployment.
     • Danger of losing touch with research and novel, potentially paradigm-shifting developments.
     • Difficult to kick-start new, paradigm-shifting research.
     • We need a coordinated, concerted and consolidated push in basic research, applied R&D and innovation!
  21. Human Language Project – Interdisciplinary R&D&I Programme
     • Basic Research: results in new methods, approaches
     • Applied R&D: results in novel technologies
     • Innovation: results in novel or improved products or services
     Research Themes – Needs and Gaps (market-driven)
     • Computational Linguistics
     • Artificial Intelligence
     • Language Technology
     • Linguistics
     • Computer Science
     • Cognitive Science
     • other related fields
     • New, groundbreaking methods, paradigms, approaches
     • Foster technologies, products, innovation, economy
     • Foster education
     HLP: Umbrella programme to turbo-charge and to coordinate all European R&D&I activities in a systematic way including EP, EC, Member States.
  22. Human Language Project
     • Goal: Deep Natural Language Understanding.
     • Breakthroughs in Artificial Intelligence plus a fresh look at Linguistics for the Next Generation of LT!
     • All official European and many additional languages
     • Broad coverage, high quality, high precision
     • Across modalities: text, text types, speech, image, video etc.
     • Across platforms: messaging, telephony, social, mobile, IoT etc.
     • Across cultures: knowledge, customs, formalities, humour, emotion, subjectivity, biases, opinions, filter bubble etc.
  23. Human Language Project
     • Collaboration and coordination between EC, EP, Member States and all other stakeholders.
     • Mix of funding sources:
       – EU projects: Horizon 2020 (WP 2018-2020) + FP9 (2021+)
       – National/regional funding sources
     • Setup: basic research, applied research, innovation, commercialisation (tightly intertwined)
     • Timeframe: 10 years
     • Policy change towards “LT-enabled multilingualism”
     • Public procurement: EU/EC, MS administrations should demand certain language technologies
  24. HLP Topics: Key Ingredients for Future European LT Research
     Artificial Intelligence, including cognition, perception, vision, cross-modal, cross-platform, cross-culture, IoT etc.
     Knowledge Technology
     • Extend knowledge bases
     • Semantic Web, ontologies, linked data, interoperability
     • More complex models
     • Multilingual resources that are grounded, extensible
     • Subjectivity, objectivity, further novel dimensions
     • Web-scale reasoning
     Machine Learning
     • Combine DNNs and symbolic processing
     • ML for knowledge acquisition and extension
     • DNNs embedded into modular systems including symbolic knowledge bases
     • Make it possible to inspect and also to optimise DNNs (beyond end-to-end)
     Language Technology
     • (Computational) Linguistics research towards deep language understanding
     • From corpora to DNNs to annotated data to highly improved symbolic methods
     • Language portability
     • Full and Deep Language Understanding by 2030 – Human Language Project
  25. Human Language Project
     • Truly Multilingual Europe
     • European Economy (MDSM)
     • Attractive jobs for high potentials
     • Education and young researchers
     • Massive boost for research
     • Foster innovation and new companies
  26. Thank you!
     Georg Rehm
     georg.rehm@dfki.de
