The META-NET Strategic Research Agenda for Multilingual Europe 2020
Georg Rehm, Network Manager META-NET
DFKI GmbH, Berlin...
Update on the META-NET Strategic Research Agenda for Multilingual Europe 2020: Final Version and Next Steps


1. The META-NET Strategic Research Agenda for Multilingual Europe 2020
Georg Rehm, Network Manager META-NET, DFKI GmbH, Berlin, Germany

European society is multilingual: the diversity of its cultural heritage is an asset and an opportunity.
• Europe is and will remain a multilingual, integrative and inclusive society.
• Geographical Europe has more than 80 languages, including the EU's 23 official languages as well as minority and immigrant languages.
• Languages without sufficient technological support will become marginalised and threatened by digital extinction.
• Decent technologies exist for English but Europe's other languages are under-supported, many of them seriously.

Language barriers must be overcome: language technology is a key enabler which will help solve this problem.
• Language barriers are hindering the free flow of information, goods, knowledge, thought and innovation.
• If the European community makes a dedicated push, we can get rid of many language barriers by 2020 and thus fully realise the single digital space and marketplace.
• Europeans will be able to communicate with one another, with their governments and with web services in their mother tongue.

Language technology is a key enabling technology for the next IT revolution.
• The next generation of IT will be able to handle human language, knowledge and emotion in meaningful ways.
• LT will enable a host of powerful innovative services (in big data, knowledge use and transmission, control of technology, learning etc.).
• Europe has a splendid chance to become a leading actor and economic beneficiary of this revolution.
• The problem in Europe is the lack of take-up by industry because research and innovation funding in LT has fallen short of the scale, coordination and breadth needed to drive the ball into the goal.

Through a focused, concerted, major interdisciplinary LT research effort, Europe can preserve its languages, benefit from language diversity and from existing strengths, and play a leading role in the next IT revolution.
• The SRA outlines three priority themes: 1. Translingual Cloud; 2. Social Intelligence and e-Participation; 3. Socially Aware Interactive Assistants.
• A needed horizontal effort is the coordinated development, improvement and sharing of base technologies and resources for all European languages.
• A cloud platform is proposed for providing free and commercial LT services including access to a wealth of public and private web services in any European language.
• The massive push needs to be accompanied by policy making such as regulations supporting the multilingual setup of our society and the effective utilisation of language data for research and technology development.
• The proposed measures have the power to bring about a quantum leap in the evolution of IT, to put Europe in a leading position in a core area of economic growth and to allow our languages to thrive in the digital age.
• Next steps: further specification of the META-NET roadmaps, see

Priority Research Theme 1: Translingual Cloud
Written (Twitter, blog, article, newspaper, text with/without metadata etc.) or spoken input (spontaneous spoken language, video/audio, multiple speakers)
Extending translation with semantic data and linked open data
Modular combination of analysis, transfer and generation models
From very fast but lower quality to slower but very high quality (including instant quality upgrades)
Exploiting strong monolingual analysis and generation methods and resources
Domain, task and genre specialisation models
Any device
Multiple target formats
Single access point
Data protection
Services and Technologies:
• Automatic translation and interpretation
• Language checking
• Post-editing
• Workbenches for creative translations
• Novel translation and authoring workflows
• Quality assurance
• Computer-supported human translation
• Multilingual content production and text authoring
• Trusted service centre (privacy, confidentiality, security of source data)
Applications:
• Cross-lingual communication, translation and search
• Real-time subtitling, voice-over generation and translating speech from live events
• Mobile interactive interpretation
• Multilingual content production (media, web, technical, legal documents)
• Showcases: translingual spaces for ambient translation
Target groups: European citizens, language professionals, organisations, companies, European institutions, software applications

Priority Research Theme 2: Social Intelligence and e-Participation
Mapping large, heterogeneous, unstructured volumes of online content to structured, actionable representations
From shallow to deep, from coarse-grained to detailed processing techniques
Making language technologies interoperable with knowledge representation and the semantic web
"Semantification" of the web: tight integration with the Semantic Web and Linked Open Data
Make use of the wisdom of the crowds
Unleashing social intelligence by detecting and monitoring opinions, demands, needs and problems
Modelling evolution of opinions
Understanding influence diffusion across social media
Improved efficiency and quality of decision processes
Services and Technologies:
• Intelligent analysis of web content, especially social media, comments, blogs, forums
• Detection and cross-lingual analysis of decision-relevant information
• Multilingual, problem-specific decision support
• Text analytics (named entity recognition, event recognition, relation extraction, sentiment analysis and opinion mining including the temporal dimension)
• Syntactic, semantic, rhetorical analysis and text structure identification
• Resolution of coreference or modality cues
• Extraction of semantic representations from arbitrary online content
• Clustering, categorising, summarising, visualising discussions and opinion statements
• High performance web-scale content analysis technologies
• Events/trend detection and prediction
Applications:
• Technologies for decision support, collective deliberation and e-participation
• Public discussion platform for Europe-wide deliberation on pressing issues
• Visualisation of social intelligence data
Target groups: European citizens, European institutions, discussion participants, companies

Priority Research Theme 3: Socially Aware Interactive Assistants
Noisy environments, any speaker, open vocabulary
Error recovery, self-assessment
Interacting naturally with and in groups
Multilingual capabilities
Include human-computer, human-artificial agent and computer-mediated human-human communication
Learning and forgetting information
Adaptable to the user's needs and preferences and the environment
Can be personalised to individual communication abilities including special needs
Can learn incrementally from all interactions and other sources of information
Interacts naturally with humans, in any language and modality
Proactive, self-aware, user-adaptable
Services and Technologies:
• Robust, accurate, incremental speech recognition
• Natural, incremental speech generation and synthesis, providing expressive voices
• Robust dialogue systems
• From speech recognition to speech understanding
• Develop methods for the support of incremental conversational speech
• Context-aware semantic and pragmatic models of human communication
• Parsing with support for temporal inter-dependencies
• Strong connections to the other two priority themes
Applications:
• Use language in connection with other modalities (visual, tactile, haptic)
• Education, language training, e-learning
• Provide access to knowledge
• Robust analysis of user's age, gender, verbal/non-verbal behaviour, social context
• Question answering
• Generalised and specialised interactive dialogue systems
• Support people interacting with their environment

European Service Platform for Language Technologies (Cloud or Sky Computing Platform)
Providers of operational and research technologies and services: National Language Institutions, Research Centres, Language Service Providers, Language Technology Providers, Universities, Other companies (SMEs, startups etc.), European Institutions
Beneficiaries/users of the platform and processes: European Institutions, Research Centres, Public Administrations, European Citizens, LT User Industries, Enterprises, Universities
Interfaces (web, speech, mobile etc.)
Platform elements: Tools, Data Sets, Resources, Components, Metadata, Standards, Interfaces, APIs, Catalogues, Quality Assurance, Data Import/Export, Input/Output, Storage, Performance, Availability, Scalability
Features: Language Processing, Language Understanding, Text analytics, Multilingual technologies, Text generation, Information and relation extraction, Knowledge, Language checking, Emotion/Sentiment, Sentiment analysis, Named entity recognition, Summarisation, Knowledge access and management

[Figure: European languages labelled on the poster: Icelandic, Finnish, Norwegian, Estonian, Swedish, Lithuanian, Danish, Irish, Latvian, Polish, Slovak, English, Dutch, German, Romanian, Croatian, Portuguese, Hungarian, Slovene, French, Basque, Serbian, Catalan, Galician, Czech, Bulgarian, Italian, Spanish, Greek, Maltese]
[Figure: Translingual Cloud; Translation Technologies; Technologies for Knowledge Discovery and Analysis; Social Intelligence; Core Technologies for Language Analysis & Production; Technologies for Speech Recognition and Analysis; Socially Aware Interactive Assistants]