Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda for the Multilingual Digital Single Market

Georg Rehm. Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda for the Multilingual Digital Single Market. Future and Emerging Trends in Language Technologies, Machine Learning and Big Data (FETLT 2016), Seville, Spain, November 2016. November 30, 2016.

Published in: Technology

  1. Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda for the Multilingual Digital Single Market
     Georg Rehm, Coordinator CRACKER, General Secretary META-NET, DFKI, Germany (georg.rehm@dfki.de)
     FETLT 2016, 2nd International Workshop, Seville, Spain, 30th November 2016
     Funding: META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER (grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119), CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).
  2. Outline
     • Initiatives for Multilingual Europe
     • Towards the Multilingual Digital Single Market
     • The MDSM SRIA V0.9
     • Multilingual Europe in late 2016: where do we stand?
     http://www.meta-net.eu – http://www.cracker-project.eu
  3. 3. q 60 research centres in 34 countries (founded in 2010) Chair of Executive Board: Jan Hajic (CUNI) Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde) General Secretary: Georg Rehm (DFKI) q Multilingual Europe Technology Alliance. 826 members in 67 countries (published in 2013) (31 volumes; published in 2012) T4ME (META-NET) CESAR METANET4UMETA-NORDMultilingual Europe Technology AllianceNET
  4. 4. q Basque q Bulgarian* q Catalan q Croatian* q Czech* q Danish* q Dutch* q English* q Estonian* q Finnish* q French* q Galician q German* q Greek* q Hungarian* q Icelandic q Irish* q Italian* q Latvian* q Lithuanian* q Maltese* q Norwegian q Polish* q Portuguese* q Romanian* q Serbian q Slovak* q Slovene* q Spanish* q Swedish* q Welsh * Official EU languagehttp://www.meta-net.eu/whitepapers
  5. Language technology support per language, by area. Levels: excellent, good, moderate, fragmentary, weak or no support through LT. No language is listed under "excellent" in any area.
     • MT
         • good: English
         • moderate: French, Spanish
         • fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
         • weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh
     • Speech
         • good: English
         • moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
         • fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
         • weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh
     • Text Analytics
         • good: English
         • moderate: Dutch, French, German, Italian, Spanish
         • fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
         • weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
     • Resources
         • good: English
         • moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
         • fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
         • weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
  6. [Figure: Results of the META-NET White Paper Study (2012). Level of support (weak/none, fragmentary, moderate, good, excellent) per language, in the order Welsh, Maltese, Lithuanian, Latvian, Icelandic, Irish, Croatian, Serbian, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English. Languages with names in red have little or no MT support.]
  7. Strategic Research Agenda (2013)
     • Addresses the problems we identified when preparing the white papers.
     • Can put Europe ahead of its competitors in this technology area.
     • 200 contributors; >2 years. 54% industry; 46% research; 4% (inter)national institutions.
     • Presented and discussed at 90+ conferences and major workshops.
     • Published in early 2013.
     • http://www.meta-net.eu/sra
  8. Priority Research Themes
     • Three priority research themes:
         • Translingual Cloud
         • Social Intelligence and e-Participation
         • Socially-Aware Interactive Assistants
     • Two additional themes:
         • European Service Platform for Language Technologies
         • Core Technologies for Language Analysis and Production
  9. CRACKER (Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research)
     Coordination and Support Action, H2020-ICT17, 2015–2017, 36 months – http://www.cracker-project.eu
     Consortium:
       1. DFKI, Germany, Georg Rehm
       2. CUNI, Czech Republic, Jan Hajic
       3. ELDA, France, Khalid Choukri
       4. FBK, Italy, Marcello Federico
       5. ATHENA RC, Greece, Stelios Piperidis
       6. UEDIN, UK, Philipp Koehn
       7. USFD, UK, Lucia Specia
     Communities:
       • META-NET incl. META-SHARE and META [...]
       • MT evaluation initiatives: WMT, IWSLT, MT Marathons
       • MT and other LT industry
       • Language resources: META-SHARE, ELRA
       • HT/MT evaluation tools: translate5
       • Translation industry, translation profession
       • MT user communities
     Strategic Agenda for the Multilingual Digital Single Market:
       • Version 0.5 presented at META-FORUM 2015 (Riga)
       • Version 0.9 presented at META-FORUM 2016 (Lisbon)
     SRIA cover: Strategic Research and Innovation Agenda. Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content. SRIA Editorial Team. Version 0.9 – July 2016.
     [Infographic, partly illegible] Three priority areas for achieving the Multilingual Digital Single Market. Area 1: Multilingual access to all digital goods and services across Europe.
       • Geo-blocking and language-blocking are barriers to access. Geo-blocking: due to nationality, location, or residence. Language-blocking: languages they do not speak [...] however, current online translation is insufficient [...]. Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens.
       • Customers are six times more likely to buy from sites in their native language. [Chart: likelihood of purchasing, site in buyer's native language vs. site in foreign language; 6x more likely to purchase.]
       • Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken.
       • Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support [...] in European businesses.
       • Language can be expensive for SMEs. Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar [...] and marketing costs. Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty.
       • English is not the answer. 52% of EU customers do not purchase [...]. Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations [...] today to increase market share.
  10. Selected Activities
     [Timeline figure, 2015 to 2017 (M1, M12, M24, M36): Kick-off meeting for all ICT-17 Projects; translate5 (version 1, version 2); WMT 2016; WMT 2017; IWSLT 2015; IWSLT 2016; IWSLT 2017; QT Marathon 2015; QT Marathon 2016; Roadmap for European MT Research; Survey on the State of HQMT in Industry and LSPs; SRIA (initial version); SRIA (update); SRIA (final).]
     • Production of resources (e.g., for WMT 2016 and 2017, IWSLT 2015-2017)
     • Tools (quality control, evaluations)
     • Strategies and roadmaps (SRIA, Roadmap for European MT Research)
     • Exchange and sharing facility for resources (META-SHARE)
     Recent or Upcoming Events
     • LREC Workshop on MT Eval. (May 25)
     • META-FORUM 2016 (July 4/5, Lisbon)
     • WMT 2016 (Aug. 11/12, Berlin)
     • IWSLT 2016 (Dec. 8/9, Seattle)
     Cracking the Language Barrier (http://www.cracking-the-language-barrier.eu)
     • Federation of organisations and projects working on technologies for multilingual Europe.
     • 10 organisations; 24 projects.
     • Areas of collaboration: data management and repositories, tools, shared tasks, evaluations.
     • Goal: provide one umbrella organisation for the whole community.
  11. Cracking the Language Barrier federation
     • Riga Summit 2015 and Riga Declaration.
     • Federation of European projects and organisations working on technologies for a multilingual Europe.
     • Multi-lateral Memorandum of Understanding; 10 organisations and 24 projects on board.
     • Getting new members on a regular basis.
     • Selected areas of collaboration: data management and repositories, tools, shared tasks, evaluations, events.
     • Goal: provide one umbrella organisation for the whole community.
  12. 12. q Top priority in the European Union. q Expected to add 400b€ to European GDP and hundreds of thousands of new jobs. q Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).
  13. A. Ansip’s May 2016 Blog Post
     • Posted on 27 May 2016.
     • First public acknowledgment by the EC that the language topic is of very high relevance for the Digital Single Market.
     • “Overcoming language barriers is vital for building the DSM, which is by definition multilingual. It is now time to reduce and remove the language barriers that are holding back its advance, and turn them into competitive advantages.”
  15. MDSM SRIA
     • Version 0.5 unveiled at META-FORUM 2015
     • Version 0.9 unveiled at META-FORUM 2016
     • Version 1.0 foreseen for early 2017
     • Prepared and presented by Cracking the Language Barrier federation (editorial team: 13 colleagues)
     • SRIA addresses how the LT community is going to act united in order to make the DSM multilingual
     • Aligned to three of the BDVA SRIA V2.0’s technical priorities: Data Management, Data Analysis, Data Processing.
     [Cover of the draft: Strategic Agenda for the Multilingual Digital Single Market. Technologies for Overcoming Language Barriers towards a truly integrated European Online Market. Draft, Version 0.5 – April 22, 2015.]
  16. Strategic Research and Innovation Agenda
     Language as a Data Type and Key Challenge for Big Data
     Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content
     SRIA Editorial Team, Version 0.9 – July 2016
     http://www.cracker-project.eu – http://www.cracking-the-language-barrier.eu
  17. MDSM: Goals and Needs
     • Crosslingual communication for SMEs, public institutions, citizens
     • Crosslingual SME presales communication and aftersales services
     • Multilingual (big) data, language and knowledge value chains
     • Multilingual websites, product catalogues, product descriptions
     • Multilingual knowledge bases and knowledge graphs (and services)
     • Multilingual conversational interfaces for connected devices (IoT)
     • Crosslingual business intelligence (e.g., based on UGC)
     • Crosslingual social media analytics for EU-wide societal issues
     • Multilingual text and report generation (knowledge/data to text)
     • All services must be domain-adaptable (no one size fits all)
     • Translation Centre (Cloud): HQ automated translation for all
  18. MLV Programme
     • Multilingual Value Programme*
         • Three-year programme
         • Requires modest investment
     • “Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content”
     • Three components address the main needs of the Multilingual DSM (MDSM) and how to put them into practice:
         1. Multilingual Application Areas
         2. Multilingual Services
         3. Research
     * SRIA V0.9 and MLV Programme devised before re-organisation of DG CONNECT.
  19. [Diagram: the MLV Programme for the Multilingual Digital Single Market. Labels recoverable from the figure, grouped as they appear:
     • Target groups: Citizens, Public, Business
     • Multilingual Applications: E-Commerce; Content, Media, Verticals; Translation, Language, Knowledge, Data
     • Multilingual Services: Automated Translation; Knowledge and Data Repositories; Crosslingual Big Data Language Analytics; Meaning, Semantics, Knowledge; High-Quality Machine Translation; Multimodal Interaction; Conversational Technologies
     • Research: Language Processing, Analysis and Production – Language Resources
     • Actors: SMEs, CEF DSIs, IT Integrators, Research
     • Funding: H2020 RIAs; H2020 CSAs, IAs, RIAs; H2020 CSAs, RAs, national funding
     • Annotations: provide innovative applications; fills gaps; interoperable and standardised; collaboration with member states]
  20. Application Areas (Selection)
     • Multilingual E-commerce
         • Customer-facing vs. back-office facing (after-market, after-sales)
         • Crosslingual search, CRM, helpdesks, processes, workflows
         • Semantic, crosslingual product descriptions and catalogues
         • Online dispute resolution
     • Multilingual Content, Media, Verticals
         • Content analytics, curation, generation (incl. authoring support)
         • Multimodal communication (conversational, written, IoT)
         • Vertical domains: health, government, mobility, energy, legal.
     • Translation, Language, Knowledge, Data
         • Translation Cloud: written/spoken, automatic/human
         • Crosslingual public and social intelligence, business intelligence
         • HQ resources, under-resourced languages, domain-specific LRs
  21. Setup – Timeframe – Costs
     • Close collaboration with EC, EP and all other stakeholders (including SMEs, research centres, universities, NGOs etc.).
     • Mix of funding sources:
         • Horizon 2020 (WP 2018-2020) for EU projects (RA, RIA, CSA)
         • National/regional funding sources for work on monolingual LTs and LRs and also to support and grow SMEs in this area
         • Include, strengthen and broaden role of CEF AT (public services)
     • Estimated costs for basic MLV implementation: ca. 175-200M€
         • Includes set of mission-critical services and applications
         • Timeframe: 2018, 2019, 2020
  22. 22. q Multilingual Europe: danger of digital language extinction; all languages are equal; multilingual DSM; world class LT research in Europe. q Artificial Intelligence: Important breakthroughs and massive investments (USA, Asia) in AI R&D and applications (deep learning, DNNs). q Need for LT: not only Multilingual DSM but also Translation, Internet of Things, Industrie 4.0, HCI, smart personal assistants etc. q Need for European LT: US and other non-European technologies are not the solution! Europe must not make its crucial IT infrastructure dependent on non-European solutions (same reason why EU is building GALILEO). q Digitalisation of our continent: SMEs, enterprises, public administrations are struggling to cope with the digital revolution (see Industrie 4.0, IoT etc.). q Security and Privacy: Secure systems on European servers are essential for large-scale industry adoption. q Growing need for Language Technologies made in Europe for Europe. http://www.meta-net.eu – http://www.cracker-project.eu 24 Context – Current Developments
  23. Multilingual Europe in late 2016. Multilingual Europe through Technology: Current Initiatives and Activities
     [Diagram; labels recoverable from the figure:]
     • Initiatives:
         • Multilingual Strategy of the EU: more tech support for multilingualism
         • Language Technologies for Europe's digital public services
         • Technologies for the Multilingual Digital Single Market
         • Language Technologies for Big Data text analytics
         • Language Technologies R&D&I (H2020, WP 2018-20)
         • The Human Language Project: long-term R&D&I, post-H2020
     • Status and upcoming events:
         • Open calls and upcoming service contracts
         • Dec. 2016: EC brainstorming meeting on future LT priorities in and post Horizon 2020. Maybe a new document is needed?
         • Jan. 2017: STOA workshop and study on LT for Europe
         • Dec. 2017: LT Session at BDVA Summit in Valencia
         • Q1 2017: MDSM SRIA V1.0
         • Policy change and initiative towards a European digital public sphere enabled by MT/LT
     • Responsible bodies and programmes: DG CONNECT; DGT and DG CONNECT; DG CONNECT WP 2018-20 (incl. IoT, I4.0, assistants, robots etc.); shared programme between EU and MS; MLV Programme; CEF AT; ELRC
  24. Thank you for your attention.
     georg.rehm@dfki.de
     http://www.meta-net.eu
     http://www.facebook.com/META.Alliance
     http://www.cracker-project.eu
     http://www.cracking-the-language-barrier.eu
