
Multilingualism for Digital Europe


Published on November 15, 2016

Georg Rehm. Mehrsprachigkeit für das Digitale Europa (Multilingualism for Digital Europe). Ringvorlesung Digitale Lebenswelten, University of Hildesheim, Germany, November 2016.

Published in: Technology


  1. Multilingualism for Digital Europe
     Georg Rehm, General Secretary META-NET, Coordinator CRACKER, DFKI, Germany (georg.rehm@dfki.de)
     Ringvorlesung Digitale Lebenswelten, Universität Hildesheim, 15 November 2016
     META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER (grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119), CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).
  2. Outline
     • A Multilingual Europe Initiative: META-NET
       ◦ LT Support – META-NET White Paper Series
       ◦ LT Strategy – META-NET SRA
     • Continuing the Initiative – Recent Developments
       ◦ The Digital Single Market and Multilingualism
       ◦ Cracking the Language Barrier
       ◦ META-FORUM 2015/2016 – MDSM SRIA V0.5/V0.9
     • Goals and Next Steps
     http://www.meta-net.eu
  3. META-NET and META: Brief History
  4. Multilingual Europe in 2010
     • Challenge: Providing each language community with the most advanced technologies for communication and information so that maintaining their mother tongue does not turn into a disadvantage.
     • While research has made considerable progress in recent years, the pace of progress is not fast enough to meet the challenge within the next 10-20 years.
     • All stakeholders – researchers, LT industries, policy makers, language communities, funding programmes – should team up in a strategic alliance for a major dedicated push.
  5. 5. q 60 research centres in 34 countries (founded in 2010) Chair of Executive Board: Jan Hajic (CUNI) Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde) General Secretary: Georg Rehm (DFKI) q Multilingual Europe Technology Alliance. 826 members in 67 countries (published in 2013) (31 volumes; published in 2012) T4ME (META-NET) CESAR METANET4UMETA-NORDMultilingual Europe Technology AllianceNET
  6. META-NET White Paper Series
  7. 7. q Basque q Bulgarian* q Catalan q Croatian* q Czech* q Danish* q Dutch* q English* q Estonian* q Finnish* q French* q Galician q German* q Greek* q Hungarian* q Icelandic q Irish* q Italian* q Latvian* q Lithuanian* q Maltese* q Norwegian q Polish* q Portuguese* q Romanian* q Serbian q Slovak* q Slovene* q Spanish* q Swedish* q Welsh * Official EU languagehttp://www.meta-net.eu/whitepapers
  8. Cross-Lingual Comparison
     • Four areas: 1. Machine Translation, 2. Text Analytics, 3. Speech Processing/Synthesis, 4. Language Resources
     • Ranking: from excellent LT support to weak/no LT support.
     • Cross-lingual comparison discussed and finalised at a network meeting with representatives of all languages (October 2011).
  9. Cross-lingual comparison by area (levels: excellent, good, moderate, fragmentary, weak or no support through LT)
     • MT
       ◦ Excellent: none
       ◦ Good: English
       ◦ Moderate: French, Spanish
       ◦ Fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
       ◦ Weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh
     • Speech
       ◦ Excellent: none
       ◦ Good: English
       ◦ Moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
       ◦ Fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
       ◦ Weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh
     • Text Analytics
       ◦ Excellent: none
       ◦ Good: English
       ◦ Moderate: Dutch, French, German, Italian, Spanish
       ◦ Fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
       ◦ Weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
     • Resources
       ◦ Excellent: none
       ◦ Good: English
       ◦ Moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
       ◦ Fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
       ◦ Weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
  10. [Figure: Results of the META-NET White Paper Study (2012). Level of support per language on a five-step scale (weak/none, fragmentary, moderate, good, excellent). Languages in chart order: Welsh, Maltese, Lithuanian, Latvian, Icelandic, Irish, Croatian, Serbian, Estonian, Slovene, Slovak, Romanian, Norwegian, Greek, Galician, Danish, Bulgarian, Basque, Swedish, Portuguese, Finnish, Catalan, Polish, Hungarian, Czech, Italian, German, Dutch, Spanish, French, English. Languages with names in red have little or no MT support.]
  11. Observations and Results
      • When it comes to technology support, there are massive differences between Europe’s languages and technology areas.
      • Support for English is ahead of any other language.
      • But: even support for English is far from being perfect.
      • Several languages get the weakest score in all four areas (e.g., Icelandic, Latvian, Lithuanian, Maltese)!
  12. Digital Language Extinction!
      • “At Least 21 European Languages in Danger of Digital Extinction!”
      • Press release on European Day of Languages (Sept. 26, 2012).
      • Huge global interest in the topic and our key findings!
      • 600+ mentions in the press.
      • News from 40+ countries in 35+ different languages.
      • 20+ television reports and 30+ broadcast interviews (radio, TV) with META-NET representatives.
      • Two Parliamentary Questions in the EP on the “digital extinction of languages” topic.
      • These results led to a STOA Workshop in the EP (Dec. 3, 2013).
  13. Press clippings on the study (translated from Danish and Greek)
      • Berlingske (Denmark), by Jens Ejsing (ejs@berlingske.dk): “Dårlig sprogteknologi truer dansk på nettet” (Poor language technology threatens Danish on the net). Researchers are working to improve Danish translations on the internet.
        The Danish language is struggling in the digital world. That is the finding of Danish language researchers and experts in connection with the new international study META-NET, which looks at how a large number of smaller European languages such as Danish are faring in the digital world. The researchers, from the University of Copenhagen and the Danish Language Council (Dansk Sprognævn) among others, conclude that Danish may have an even harder time in the digital world in future, because Google Translate, GPS devices, smartphone apps and other language technology programs cannot adequately handle the many nuances of the Danish language.
        Bolette Sandford Pedersen, professor of language technology at the University of Copenhagen, believes there is a need for a kind of digital Danish language bank filled with data, so that translations, among other things, become as precise and good as possible. With the help of the language bank, researchers could, according to the professor, help companies improve programs that have to handle linguistic knowledge in areas such as machine translation, speech recognition and information retrieval. That would make erroneous translations rarer, such as when “hæld olie på panden” (pour oil into the pan) becomes “pour oil on the forehead” in English with Google Translate. At worst, translations are so imprecise that Danes end up opting out of their own language in the digital world.
        Language help for companies: She acknowledges, however, that “the technology for automatic translation is in many ways fantastic”. “It is just not good enough when it comes to Danish,” she says. “It is as if we are, to a certain extent, leaving it in the hands of Google or other companies to decide whether Danish is to be handled well enough or not. But the Danish market is not big for them. The question is therefore whether we should not do more ourselves to ensure that the necessary data material is available, so that we get good translations and other good language technology. That could, for example, be by making an effort to set up a language bank with a lot of enriched material about Danish.” “If we constantly find that translations are riddled with errors, we dare not trust them,” she says, stressing that “erroneous translations can lead to major misunderstandings”.
        According to the director of the Danish Language Council, Sabine Kirchmeier-Andersen, poor language technology can have consequences for many Danes who are not so good at English. “If we have ambitions to use the Danish language in the technological universe of the future, an effort must be made now to retain expertise and build on the knowledge we have,” she says. “Otherwise we risk that only people who speak fluent English will benefit from the new generations of web, telecom and robot technology that are on the way.”
        Fact box, Languages in Europe: There are around 80 languages in the EU. For 21 of them, including Danish, there are major gaps in language technology when it comes to, among other things, machine translation, speech recognition and information retrieval. According to an EU survey, a growing number of European internet users buy goods or services online where the language used is not their own. This applies to more than half of users. More than one in three use a foreign language to write emails or posts online.
      • Eleftheros Typos (ΕΛΕΥΘΕΡΟΣ ΤΥΠΟΣ, Greece), Life section, Thursday 27 September 2012, by Eleni Vergou (evergou@e-typos.com): “Με ψηφιακή εξαφάνιση κινδυνεύουν τα ελληνικά” (Greek at risk of digital extinction). Kicker: Many European languages are considered technologically... outdated.
        In the digital age they do not... speak Greek, nor several other European languages, according to a pan-European report signed by more than 200 experts. The study was published by the scientific network META-NET on the occasion of yesterday's European Day of Languages. For the purposes of their research, linguists from 34 countries of the Old Continent rated the available language services and produced a “White Paper” for each European language. The experts looked, among other things, for four basic electronic tools: the existence of automatic translation, the possibility of voice interaction and of digital text analysis, while the availability of language resources was also examined.
        In a first phase they examined the websites that allow users to translate online, such as the IT giant's Google Translate service. At the same time they examined the “communication” of Greek-speaking users with their... devices, for example whether someone can “talk” to a GPS in their mother tongue. The researchers concluded that such devices exist but are not as widespread as the English-language ones.
        The “gold” medal goes, as is only logical, to English. English-speaking users have the best possible technological support, which favours the further spread of the language. Icelandic, Latvian, Lithuanian and Maltese are most at risk of “technological exclusion”, while Greek, Bulgarian, Hungarian and Polish are slightly better off, having, as the study states, “fragmentary” technological support. Support for users of Dutch, French, German, Italian and Spanish is rated “moderate”.
        The heads of the scientific team, Hans Uszkoreit and Georg Rehm, state: “There are dramatic differences in language technology support between the various European languages. The gap between ‘small’ and ‘large’ languages keeps widening. We must ensure that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies. Otherwise these languages are doomed to digital extinction.” The experts stress that without decisive action these languages will struggle to... survive in the digital world of the 21st century.
        Maria Gavrilidou, a member of the scientific team from the Institute for Language and Speech Processing, Athena Research Centre, tells “E.T.”: “This study does not say that the Greek language will not live on or that it is in danger of extinction.” The expert explains that as long as there are people who speak, write and communicate in a language, it will continue to exist. It is important, however, that all users are able to “talk” to machines, such as their GPS, in Greek, and that they have computer language tools at their disposal. Among these “tools” are spelling and syntax checkers, which are used daily by hundreds of Greek users and are based on language technology. Nevertheless, she stresses that the digital spread of a language is important: “It is not in the hands of the average user. Governments, the European Union and the private sector must fund the development of this technology for all languages,” she says, and continues: “Users, however, must demand that these means also exist in their language and not be satisfied with English.”
        Sidebar, “The language of alienation... Greeklish”: Most young people in Greece now communicate in Greeklish in text messages or email. Although language tools that allow the use of the Greek script have existed in recent years, teenagers and young adults do not seem to have “embraced” these technologies. Professor of Linguistics Georgios Babiniotis tells “E.T.”: “Greeklish is a problem for the Greek language, especially for young people, for a purely linguistic reason. By using Greeklish they become alienated from the form of the word, or, as we say, the etymological image that is expressed by the spelling of the word and is connected both with the word's meaning and with its origin.” The danger facing young people is alienation from the written form of the language. Familiarity with that form helps in understanding both the meaning and the origin of a word. “This alienation is not without significance,” says the expert, who explains that the process of writing helps imprint the word and link it with other words from the same root. “When this form of communication is used, they are destroyed, they weaken. It is not fatal, but it will do damage,” says Mr Babiniotis, who advises users to choose the Greek script.
      • Greek newspaper clipping, World section (ΚΟΣΜΟΣ), Sunday 30 September 2012: “Τη γλώσσα µού... έχασαν” (They... lost my language). Most European languages are at risk of digital extinction.
        26 September has been established by the Council of Europe as the European Day of Languages, but according to a new European scientific report, 21 of Europe's 30 languages, Greek among them, face the danger of digital extinction. The study sounds the alarm, as it found that digital support for most European languages is deficient or entirely non-existent for users.
        The report, in the form of a series of White Papers (entitled “Languages in the European Information Society”), by the scientific network META-NET, which brings together 60 research centres in 34 countries, points out that languages spoken by a relatively small number of people are at risk because they lack the technological support that widely used languages have. White Papers have been prepared for the following European languages: English, Basque, Bulgarian, Galician, French, German, Danish, Greek, Estonian, Irish, Icelandic, Spanish, Italian, Catalan, Croatian, Latvian, Lithuanian, Maltese, Norwegian (Bokmål and Nynorsk), Dutch, Hungarian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Swedish, Czech and Finnish. Each White Paper is written in the language it covers and is translated into English.
        Four great dangers: According to the new study, Icelandic, Latvian, Lithuanian and Maltese face the greatest danger of extinction in a European technological society that increasingly promotes the use of particular languages, above all English. But other languages, such as Greek, Bulgarian, Hungarian and Polish, are also at risk in the modern digital world. The META-NET study, to which more than 200 experts contributed, assesses the risk for each language on four basic criteria at the technological/digital level: the existence of automatic translation for the language, the possibility of voice interaction, the possibility of digital text analysis, and the availability of relevant digital language resources.
        The strong ones: The language with the best score on the criteria is of course English, which enjoys comparatively the best technological support (though not the best possible), which facilitates its further spread. Next, with satisfactory or moderate technological/digital support, come Dutch, French, German, Italian and Spanish. Greek, as well as Basque, Catalan, Polish, Hungarian and others, are ranked among the languages with only “fragmentary” support, which is exactly why they are considered languages at high risk of extinction.
        Dramatic differences: According to the study's editors, Hans Uszkoreit and Georg Rehm, “there are dramatic differences in language technology support between the various European languages and technology areas. The gap between ‘small’ and ‘large’ languages keeps widening. We must ensure that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies, otherwise these languages are doomed to digital extinction.” The hope for these languages is seen in improving and making wider use of language technology software, which enables spoken and written processing of the various languages. Examples of these capabilities are electronic spelling and syntax checkers, the interactive personal “assistants” of smartphones (e.g. Siri on the iPhone), automatic translation systems, the electronic dialogue systems of call centres, search engines, and the synthetic voice in car navigation systems.
        The basic problem: What matters, according to the report, is that all these capabilities are also offered to users in their mother tongue, which is at risk of extinction. Without decisive action, the ominous prediction is that these languages will struggle to survive in the digital world of the 21st century. One problem is that the software of these language technology systems relies on statistical methods that require huge amounts of written or spoken data, yet so much data is hard to obtain for languages spoken by relatively few people. Moreover, even for widely used languages such as English, the relevant language technology still has weaknesses, evident for example in highly inadequate, error-ridden automatic translations. The report proposes a coordinated large-scale effort in Europe to gradually create or improve the necessary technologies and to help the languages that are digitally sidelined.
  14. Update of the Study (2014)
      • Study comprised 31 volumes/languages.
      • Many languages missing! Need for extension – at least of the comparison.
      • We invited three language community bodies to participate in the update:
        ◦ European Federation of National Institutions for Language (EFNIL)
        ◦ Network to Promote Linguistic Diversity (NPLD)
        ◦ Experts Committee of the European Language Charter (Council of Europe)
      CCURL 2014 – Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era
  15. Extended cross-lingual comparison by area (levels: excellent, good, moderate, fragmentary, weak or no support)
      • MT
        ◦ Excellent: none
        ◦ Good: English
        ◦ Moderate: French, Spanish
        ◦ Fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
        ◦ Weak or no support: Albanian, Asturian, Basque, Bosnian, Breton, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Frisian, Friulian, Galician, Greek, Hebrew, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Norwegian, Occitan, Portuguese, Romany, Scots, Serbian, Slovak, Slovene, Swedish, Turkish, Vlax Romani, Welsh, Yiddish
      • Speech
        ◦ Excellent: none
        ◦ Good: English
        ◦ Moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
        ◦ Fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish, Turkish
        ◦ Weak or no support: Albanian, Asturian, Bosnian, Breton, Croatian, Frisian, Friulian, Hebrew, Icelandic, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romanian, Romany, Scots, Vlax Romani, Welsh, Yiddish
      • Text Analytics
        ◦ Excellent: none
        ◦ Good: English
        ◦ Moderate: Dutch, French, German, Hebrew, Italian, Spanish
        ◦ Fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
        ◦ Weak or no support: Albanian, Asturian, Bosnian, Breton, Croatian, Estonian, Frisian, Friulian, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romany, Scots, Serbian, Turkish, Vlax Romani, Welsh, Yiddish
      • Resources
        ◦ Excellent: none
        ◦ Good: English
        ◦ Moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
        ◦ Fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Hebrew, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
        ◦ Weak or no support: Albanian, Asturian, Bosnian, Breton, Frisian, Friulian, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romany, Scots, Turkish, Vlax Romani, Welsh, Yiddish
  16. 16. Extension of the META-NET White Paper Study (2013/2014). Chart: Language Technology Support (Excellent, Good, Moderate, Fragmentary, Weak/no support) against Millions of Native Speakers (Worldwide), scale 0 to 400.
  17. 17. META-NET Strategic Research Agenda (SRA)
  18. 18. Three Ingredients: Appropriate Programme (Vision & Agenda); Appropriate Actors (Research & Commercialisation); Appropriate Support (Funding)
  19. 19. Planning Process (2010, 2011, 2012): Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; Expert meeting minutes; Priority Themes Paper; Strategic Research Agenda
  20. 20. Planning Process: Documents (2010, 2011, 2012): Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; Expert meeting minutes; Priority Themes Paper; Strategic Research Agenda
      • The Future European Multilingual Information Society: Vision Paper for a Strategic Research Agenda. “People can’t share knowledge if they don’t speak a common language.” (Davenport, Thomas H, and Laurence Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business School, Boston, 1997, p. 98.) Join the discussion at www.meta-et.eu/forum. www.meta-net.eu, office@meta-net.eu, T: +49 30 23895 1833
      • LT 2020: Vision and Priority Themes for Language Technology Research in Europe until the Year 2020. Towards the META-NET Strategic Research Agenda. The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement 270893) and META-NORD (Grant Agreement 270899). Do you have comments, ideas or suggestions with regard to the content of this document? Please send them to office@meta-net.eu or discuss them online: http://www.meta-net.eu/sra.
      • Vision Document, Vision Group Translation and Localisation: Results of first two meetings. Editors: Aljoscha Burchardt, Georg Rehm. Dissemination Level: Public. Date: 3 December 2010
      • Vision Document, Vision Group Media and Information Services: Results of first two meetings. Editors: Maria Koutsombogera, Stelios Piperidis. Dissemination Level: Public. Date: 10 November 2010
      • Vision Document, Vision Group Interactive Systems: Results of first two meetings. Editors: Joseph Mariani, Bernardo Magnini. Dissemination Level: Public. Date: 28 December 2010
      A Network of Excellence forging the Multilingual Europe Technology Alliance. This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”, co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
  21. 21. Strategic Research Agenda
      • Addresses the problems we identified when preparing the white papers.
      • Can put Europe ahead of its competitors in this technology area.
      • 200 contributors; >2 years. 54% industry; 46% research; 4% (inter)national institutions.
      • Presented and discussed at 90+ conferences and major workshops.
      • Published in early 2013.
      • http://www.meta-net.eu/sra
  22. 22. Priority Research Themes
      • Three priority research themes:
        - Translingual Cloud
        - Social Intelligence and e-Participation
        - Socially-Aware Interactive Assistants
      • Two additional themes:
        - European Service Platform for Language Technologies
        - Core Technologies for Language Analysis and Production
  23. 23. European Service Platform for Language Technologies (Cloud or Sky Computing Platform)
      Providers of operational and research technologies and services: Research Centres; European Institutions; Other companies (SMEs, startups etc.); National Language Institutions; Language Technology Providers; Language Service Providers; Universities
      Beneficiaries/users of the platform: European Institutions; Research Centres; Public Administrations; Enterprises; LT User Industries; Universities; European Citizens
      Interfaces (web, speech, mobile etc.)
      Priority Research Theme 1: Translingual Cloud
      Priority Research Theme 2: Social Intelligence & e-Participation
      Priority Research Theme 3: Socially Aware Interactive Assistants
      Multilingual technologies; Text analytics; Text generation; Language checking; Sentiment analysis; Named entity recognition; Summarisation; Knowledge access and management; Information and relation extraction
      Language Processing; Language Understanding; Knowledge; Emotion/Sentiment
      Data protection; Tools; Data Sets; Resources; Components; Metadata; Standards; Interfaces; APIs; Catalogues; Quality Assurance; Data Import/Export; Input/Output; Storage; Performance; Availability; Scalability; Features
  24. 24. Concrete result of these activities: One call for proposals around Machine Translation in Horizon 2020 WP 2015-17.
  25. 25. CRACKER
  26. 26. Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research
      Coordination and Support Action, H2020-ICT17, 2015–2017, 36 months – http://www.cracker-project.eu
      1 DFKI, Germany, Georg Rehm
      2 CUNI, Czech Republic, Jan Hajic
      3 ELDA, France, Khalid Choukri
      4 FBK, Italy, Marcello Federico
      5 ATHENA RC, Greece, Stelios Piperidis
      6 UEDIN, UK, Philipp Koehn
      7 USFD, UK, Lucia Specia
      THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
      1 Multilingual access to all digital goods and services across Europe
      Geo-blocking: due to nationality, location, or residence customers
      Language-blocking: languages they do not speak however, current online translation is insufficient trying to conduct common languages
      Geo-blocking and language-blocking are barriers to access: Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens. Customers are six times more likely to buy from sites in their native language. Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken. Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in European businesses.
      Language can be expensive for SMEs: Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar and marketing costs. Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty.
      English is not the answer: 52% of EU customers do not purchase Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations today to increase market share.
      Chart: Likelihood of purchasing, site in buyer’s native language vs. site in foreign language: 6x more likely to purchase
      Communities
      • META-NET incl. META-SHARE and META
      • MT evaluation initiatives – WMT, IWSLT, MT Marathons
      • MT and other LT industry
      • Language resources – META-SHARE, ELRA
      • HT/MT evaluation tools – translate5
      • Translation industry, translation profession
      • MT user communities
      Strategic Agenda for the Multilingual Digital Single Market
      • Version 0.5 presented at META-FORUM 2015 (Riga)
      • Version 0.9 presented at META-FORUM 2016 (Lisbon)
      Strategic Research and Innovation Agenda: Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content. SRIA Editorial Team, Version 0.9 – July 2016
  27. 27. Selected Activities (timeline 2015, 2016, 2017; M1, M12, M24, M36)
      Kick-off meeting for all ICT-17 Projects; translate5; WMT 2016; WMT 2017; IWSLT 2015; IWSLT 2016; IWSLT 2017; QT Marathon 2015; QT Marathon 2016; Roadmap for European MT Research; Survey on the State of HQMT in Industry and LSPs; SRIA (initial version); SRIA (update); SRIA (final); version 2; version 1
      • Production of resources (e.g., for WMT 2016 and 2017, IWSLT 2015-2017)
      • Tools (quality control, evaluations)
      • Strategies and roadmaps (SRIA, Roadmap for European MT Research)
      • Exchange and sharing facility for resources (META-SHARE)
      Recent or Upcoming Events
      • LREC Workshop on MT Eval. (May 25)
      • META-FORUM 2016 (July 4/5, Lisbon)
      • WMT 2016 (Aug. 11/12, Berlin)
      • IWSLT 2016 (Dec. 8/9, Seattle)
      • Federation of organisations and projects working on technologies for multilingual Europe.
      • 10 organisations; 24 projects.
      • Areas of collaboration: data management and repositories, tools, shared tasks, evaluations.
      • Goal: provide one umbrella organisation for the whole community.
      http://www.cracking-the-language-barrier.eu
  28. 28. q META-FORUM 2016 – July 04/05, Lisbon, Portugal Beyond Multilingual Europe q META-FORUM 2015 – April 27, Riga, Latvia Technologies for the Multilingual Digital Single Market q META-FORUM 2013 – Sept. 19/20, Berlin, Germany Connecting Europe for New Horizons q META-FORUM 2012 – June 20/21, Brussels, Belgium A Strategy for Multilingual Europe q META-FORUM 2011 – June 27/28, Budapest, Hungary Solutions for Multilingual Europe q META-FORUM 2010 – Nov. 17/18, Brussels, Belgium Challenges for Multilingual Europe http://www.meta-net.eu 28
  29. 29. The Multilingual Digital Single Market
  30. 30. q Top priority in the European Union. q Expected to add 400b€ to European GDP and hundreds of thousands of new jobs. q Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).
  32. 32. Facts and Figures
      THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
      1 Multilingual access to all digital goods and services across Europe
      Customers are six times more likely to buy from sites in their native language. Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken.
      Language can be expensive for SMEs: Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar
      Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such
      English is not the answer: 52% of EU customers do not purchase Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations today to increase market share.
      Chart: Likelihood of purchasing, site in buyer’s native language vs. site in foreign language: 6x more likely to purchase
  33. 33. Facts and Figures
      Geo-blocking: due to nationality, location, or residence customers
      Language-blocking: languages they do not speak however, current online translation is insufficient trying to conduct common languages
      Geo-blocking and language-blocking are barriers to access: Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens.
      Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in European businesses.
      and marketing costs.
      responsiveness is needed to achieve customer satisfaction and build brand loyalty.
  34. 34. The MDSM Fact Sheet
      Why Europe needs a Multilingual Digital Single Market
      Current eCommerce growth within Europe is about half that of the US, due partially to a lack of language coverage from European SMEs. Less than 5% of European SMEs currently sell cross-language.
      No single language accounts for more than 20% of the potential Multilingual Digital Single Market. Most account for less than 3% of the DSM. Without a solution, the European Digital Single Market will remain fragmented.
      Europe’s 24 official languages present a tremendous opportunity for European business: Removing language barriers within Europe would open access to 73% (with >€25 trillion in annual revenue!) of the world’s digitally accessible market to European enterprise.
      Europe today is not a single market: it is separated into 20+ small language markets.
      Chart: Online Population. Chinese (510 million); World Spanish (165 million); World Portuguese (83 million); English (565 million); Japanese (100 million); Russian (60 million); Europe today (Many small markets); LANGUAGE TECHNOLOGY: The Multilingual Digital Single Market. Source: Internet World Stats (Miniwatt Marketing Group)
      THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
      1 Multilingual access to all digital goods and services across Europe
      Geo-blocking: due to nationality, location, or residence customers
      Language-blocking: languages they do not speak however, current online translation is insufficient trying to conduct common languages
      Geo-blocking and language-blocking are barriers to access: Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens. Customers are six times more likely to buy from sites in their native language. Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken. Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in European businesses.
      Language can be expensive for SMEs: Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar and marketing costs. Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty.
      English is not the answer: 52% of EU customers do not purchase Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations today to increase market share.
      Chart: Likelihood of purchasing, site in buyer’s native language vs. site in foreign language: 6x more likely to purchase
      Chart: Language Technology Support* (Good, Moderate, Fragmentary, Weak/no support) against Millions of Native Speakers (Worldwide), scale 0 to 400, with the Language Technology Danger Zone (≈150 million EU citizens) marked.
      140 million EU citizens are in the Language Technology Danger Zone, where language technology is inadequate to support the DSM.
      Current online automatic translation provided by US tech giants does not solve less than 30% of automatically translated content is truly useful for online commerce. Only three European languages
      Chart: Online Automatic Translation Quality (good, bad, ugly)
      2 Boosting commerce through multilingual technologies
      Translation opens 20 times its cost in revenue opportunity. However, translation remains too expensive for many European SMEs, blocking this opportunity and limiting economic growth in Europe. Lowering these costs is a strategic opportunity
      Chart: Translation Costs vs. Increase in Revenue
      3 Connecting citizens to European digital public services
      Without Language Technology, the European Commission has no way to respond effectively to citizen participation. Current language technology is inadequate for over half of the EU official languages to help the European Commission solve its citizen engagement problem.
      Most local governmental services are monolingual only. This poses a problem for tourists, expatriates, and linguistic minorities. Language technology can provide the
      Multilingual eParticipation can help build the European Identity with one another in their respective native languages with sophisticated machine translation working behind the scenes. Only when EU citizens can interact in their own languages will they truly develop a sense of European identity and community.
      Over half of EU citizens are language blocked from interacting with the European Commission’s web resources for citizen participation.
      Chart: 290 million EU citizens excluded. Speakers of other languages are language blocked from full participation; speakers of English, French, German can participate fully.
      Strategic Agenda for the Multilingual Digital Single Market http://rigasummit2015.eu.
      META, the Multilingual Europe Technology Alliance, has more than 750 members (http://www.meta-net.eu).
      LT-Innovate, the European Association of the Language Technology Industry, has 180 corporate members throughout Europe (http://lt-innovate.eu).
      * Technology support has improved for some languages since this study was completed.
      Technology Solutions: Investment in the following solutions will help achieve the Multilingual Digital Single Market
      • Unified Customer Experience: care, customer relationship, discussion fora,
      • Multimodal User Experience for Connected Devices: interfaces household appliances, and consumer
      • Voice of the Customer: market research
      • Content Curation and Production
      • Digital Translation Centre: customers, citizens
      The forthcoming Strategic Agenda for the Multilingual Digital Single Market will provide additional details on these and other solutions for the needs of the Multilingual Digital Single Market.
      Download this fact sheet from http://cracker-project.eu. For more information contact Dr. Georg Rehm (DFKI) at georg.rehm@dfki.de.
      http://cracker-project.eu/wp-content/uploads/2015/11/mDSM-Fact-Sheet.pdf
  35. 35. META-FORUM 2015 AND MDSM SRIA V0.5
  36. 36. Open Letter to the EC
      • On Friday, March 20, 2015, we published an open letter to the EC on http://multilingualeurope.eu.
      • On Monday, March 23, 2015, we informed President Juncker and all Commissioners about the campaign and the 1300+ signatures.
      • By now more than 3600 signatures!
      Who signed?
      • 5 Members of the European Parliament
      • 150+ high-level representatives from industry (CxO level)
      • 1200+ professors
      • 400+ project or research managers
      • 20+ entrepreneurs and founders
      • hundreds of language and language technology professionals, officials, researchers, administrators and representatives from related stakeholder groups
  37. 37. META-FORUM 2015
      • April 27 in Riga, Latvia
      • Riga Summit 2015 on the Multilingual Digital Single Market
      • Two important components:
        - MDSM SRIA Version 0.5
        - Further community fusing
      • http://www.meta-forum.eu
  38. 38. Joint EFNIL and NPLD Panel
      • Joint EFNIL and NPLD panel at META-FORUM 2015.
      • Joint position paper.
      Joint NPLD/EFNIL Position Paper on the Multilingual Digital Single Market. Initially presented at META-FORUM 2015 and the Riga Summit 2015 on the Multilingual Digital Single Market, April 27, 2015, www.rigasummit2015.eu
      “Languages are not only a means of communication. They also have embedded in them people’s values, aspirations and hopes.” (European Roadmap for Linguistic Diversity 2015, NPLD)
      “Many European languages run the risk of becoming victims of the digital age as they are under-represented and under-resourced online. Huge regional market opportunities remain untapped because of language barriers.” (Multilingual Europe: A challenge for language tech. MultiLingual. April/May 2011, page 51/52)
  39. 39. Timeline 2010, 2011, 2012, 2013, 2014, 2015: Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; Expert meeting minutes; Priority Themes Paper; META-NET Strategic Research Agenda for Multilingual Europe 2020; Strategic Research and Innovation Agenda; roadmaps, agendas and any other input from other initiatives …; Strategic Agenda for the Multilingual Digital Single Market: Technologies for Overcoming Language Barriers towards a truly integrated European Online Market (DRAFT, Version 0.5 – April 22, 2015)
  40. 40. Strategic Agenda for MDSM
      • Presented at META-FORUM 2015 and Riga Summit for the first time.
      • Version 0.5 – work in progress
      • Builds upon many strategy papers and roadmaps prepared by several European projects, incl. the META-NET SRA (2013).
      • Input and feedback collected at the Riga Summit 2015 to be used for upcoming versions.
      Strategic Agenda for the Multilingual Digital Single Market: Technologies for Overcoming Language Barriers towards a truly integrated European Online Market (DRAFT, Version 0.5 – April 22, 2015)
  41. 41. A Strategy for the MDSM
      • Strategic R&I Agenda for the Multilingual Digital Single Market
      • Core: Technology Solutions
      • Data economy is an inherent component – LT for effective multilingual data value chains.
  42. 42. Strategic Agenda for the Multilingual Digital Single Market – Version 0.5 – April, 2015
      Contents
      Executive Summary
      1 The Digital Single Market is a Multilingual Challenge
        1.1 Overcoming Language Barriers with Technologies
        1.2 Language Technologies Made for Europe – in Europe
        1.3 Online Use of Languages
        1.4 Multilingual Big Data Text Analytics for the European Data Economy
        1.5 EC and Language Technology – Past and Present
        1.6 The Economic Power of Language Technology and Services
      2 A Strategic Programme for the Multilingual Digital Single Market
        2.1 Layer 1: Innovative Technology Solutions
        2.2 Layer 2: Language Technology Services, Platforms, Infrastructures
        2.3 Layer 3: Priority Research Themes
        2.4 Related Areas, Applications, and Societal Challenges
        2.5 Summary
      3 Layer 1: Innovative Technology Solutions for the Multilingual Digital Single Market
        3.1 Technology Solutions for Businesses
          3.1.1 Unified Customer Experience and Cross-Cultural CRM (E-Commerce)
          3.1.2 Digital Translation Centre
          3.1.3 Content Curation and Content Production
          3.1.4 Virtual and Real Translingual Spaces
          3.1.5 Voice of the Customer
          3.1.6 Business Intelligence using Big Data
          3.1.7 Multimodal User Experience for Connected Devices
          3.1.8 Smart Multilingual Assistants
        3.2 Technology Solutions for Public Services
          3.2.1 Voice of the Citizen – Social Intelligence on Big Data
          3.2.2 Online Dispute Resolution
          3.2.3 E-Participation
          3.2.4 E-Government
          3.2.5 E-Health
          3.2.6 E-Learning
      4 Layer 2: Language Technology Services, Platforms, Infrastructures
      5 Layer 3: Priority Research Themes
      6 Horizontal Framework Aspects
        6.1 Language Policies and Public Procurement
        6.2 Standards and Interoperability
        6.3 Open Source
        6.4 Copyright and Data Protection
      7 Conclusions
        7.1 Expected Economic Impact
        7.2 Relevance to the EC’s Digital Single Market Strategy
        7.3 Potential Funding Sources
        7.4 Next Steps
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Appendix A. Input Documents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 Appendix B. Digital Language Extinction in Europe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
  43. 43. q Letter from Andrus Ansip (June 2015) q “We invite the European language technology community to further develop the ideas presented in the draft Strategic Agenda for the multilingual Digital Single Market”
  44. 44. Cracking the Language Barrier
  45. 45. Riga Declaration
q 12 organisations present at META-FORUM 2015 and the Riga Summit 2015 drafted and signed the “Declaration of Common Interests”.
q CRACKER: community building, mostly among projects.
q We combined these into the Cracking the Language Barrier federation.
q Important goal: measure against community fragmentation.
DECLARATION OF COMMON INTERESTS
We, the undersigned, declare here, at the Riga Summit on the Multilingual Digital Single Market, encouraged by the letter Vice President Andrus Ansip sent to its participants, that we stand united in our goal and interest to:
- support multilingualism in Europe by employing language technology in business, society and governance, to create a truly Multilingual Digital Single Market,
- exchange and share information in our efforts to promote our goals and interests at local, national and European levels,
- raise awareness in society at large using channels available to our associations, alliances and societies.
In the near future, we foresee the establishment of a Memorandum of Understanding among our organisations towards a “Coalition for a Multilingual Europe”, to better serve our members address the language barrier challenges towards establishing a truly integrated Multilingual Digital Single Market.
Riga, 29. April 2015
Signed by (in alphabetical order):
BDVA: Laure Le Bars
CITIA: Steve Renals
CLARIN: Steven Krauwer
EFNIL: Sabine Kirchmeier-Andersen, Tamás Váradi
ELEN: Davyth Hicks, Claudia Soria
ELRA: Nicoletta Calzolari, Khalid Choukri
GALA: Laura Brandon, Robert E. Etches, Sergey Gladkov
LT Innovate: Jochen Hummel, Philippe Wacker
META-NET: Jan Hajic, Josef van Genabith, Georg Rehm, Andrejs Vasiljevs
NPLD: Meirion Prys Jones
TAUS: Jaap van der Meer
W3C: Richard Ishida, Felix Sasaki
For any questions, please contact Georg.Rehm@dfki.de.
  46. 46. http://www.cracker-project.eu • http://www.meta-net.eu • A federation of European projects and organisations working on technologies for a multilingual Europe. • Multi-lateral Memorandum of Understanding; 10 organisations and 24 projects on board already (including FP7 and H2020-ICT15). • Getting new members on a regular basis. • Selected areas of collaboration: data management and repositories, tools, shared tasks, evaluations, events. • Goal: provide one umbrella organisation for the whole community.
  47. 47. Project Members Organisation Members
  48. 48. http://www.cracker-project.eu • http://www.meta-net.eu • Website: information about the initiative, all projects and organisations • Downloadable documents • List of events • LREC 2016 MT Eval Workshop • Several new members will join the initiative soon http://www.cracking-the-language-barrier.eu
  49. 49. META-FORUM 2016 AND MDSM SRIA V0.9
  50. 50. Andrus Ansip’s Blog Post q Posted on 27 May 2016. q First public acknowledgement by the EC that the language topic is of very high relevance for the Digital Single Market. q “Overcoming language barriers is vital for building the DSM, which is by definition multilingual. It is now time to reduce and remove the language barriers that are holding back its advance, and turn them into competitive advantages.”
  51. 51. Reorganisation of DG CONNECT (01/07/2016)
DG CONNECT: Communications Networks, Content & Technology (organisation chart dated 01/07/2016)
Director-General: R. Viola; Assistants: O. Bringer, P. Stuckmann
Deputy Director-General in charge of Directorates A, C, E & H: G. Kent (acting); Assistant: E. Mitjana
Deputy Director-General in charge of Directorates B, D, F, G & I: C. Bury; Assistant: P. Lamotte
Principal Advisers: F. Lupescu, M. Richards
Advisers: Legal & Legislative Issues: Ž. Bahovec; cross-cutting Policy/Research Issues: G. Santucci; International Relations linked to Future Networks: P. Blixt; Societal Issues: N. Dewandre; Organisational Transition (Finance): vacant; Societal Challenges: vacant; Innovation Systems: B. Salmelin
Directorate A, Digital Industry (K. Rouhana): A.1 Robotics & Artificial Intelligence (J. Heikkilä); A.2 Technologies & Systems for Digitising Industry (M. Lemke); A.3 Competitive Electronics Industry (W. Van Puymbroeck); A.4 Photonics (C. Maloney); A.5 Administration & Finance * (A. Fiala)
Directorate B, Electronic Communications Networks & Services (A. Whelan): B.1 Electronic Communications Policy (V. Terävä); B.2 Implementation of the Regulatory Framework (W-D. Grussmann); B.3 Markets (R. Krüger); B.4 Radio Spectrum Policy (A. Geiss); B.5 Investment in High-Capacity Networks (A. Krzyżanowska)
Directorate C, Digital Excellence & Science Infrastructure (Th. Skordas, acting): C.1 eInfrastructure & Science Cloud (A. Burgueño Arjona); C.2 High Performance Computing & Quantum Technology (G. Kalbe); C.3 Future & Emerging Technologies (FET) (V. Peca); C.4 Flagships (Th. Skordas)
Directorate D, Policy Strategy & Outreach (L. Corugedo Steneberg): D.1 Research Strategy & Programme Coordination (M. Fjalland); D.2 Policy Implementation & Planning (E. Forti); D.3 Policy Outreach & International Affairs (A. Angelova-Krasteva); D.4 Communication (D. Ringrose)
Directorate E, Future Networks (M. Campolargo): E.1 Future Connectivity Systems (B. Barani, acting); E.2 Cloud & Software (P. O’Donohue); E.3 Next-Generation Internet (J. Villasante); E.4 Internet of Things (M. Rohen)
Directorate F, Digital Single Market (G. de Graaf): F.1 Digital Policy Development & Coordination (M. Bailey, acting); F.2 E-Commerce & Platforms (P. Agarwal, acting); F.3 Start-ups & Innovation (P. Zilgalvis); F.4 Digital Economy & Skills (L. Sioli)
Directorate G, Data (J. Hernández-Ros, acting): G.1 Data Policy & Innovation (M. Nagy-Rothengass); G.2 Data Applications & Creativity (J. Hernández-Ros); G.3 Learning, Multilingualism & Accessibility (M. Marsella, acting); G.4 Administration & Finance (G. Kalbe, acting)
Directorate H, Digital Society, Trust & Cybersecurity (P. Timmers): H.1 Cybersecurity & Digital Privacy (J. Boratynski); H.2 Smart Mobility & Living (E. Hartog); H.3 E-Health, Well-Being & Ageing (M. González-Sancho); H.4 E-Government & Trust (A. Servida); H.5 Administration & Finance ** (G. Van Caenegem, acting)
Directorate I, Media Policy (G. Abbamonte): I.1 Audiovisual & Media Services Policy (L. Boix Alonso); I.2 Copyright (M. Martin-Prat); I.3 Audiovisual Industry & Media Programme (L. Recalde Langarica); I.4 Media Convergence & Social Media (J. Cotta)
Directorate R, Resources & Support (G. Kent): R.1 Human Resources & Competences (I. Mariën-Dusak); R.2 Budget & Finance (M-C. Laffineur); R.3 Knowledge Management & Support Systems (F. Accordino); R.4 Compliance & Planning (K. Engelbosch); R.5 Programme Operations & Common Services (I. Malekos)
Mirror units: REA.A.5 Fostering Novel Ideas: FET-Open (T. Hallantie); EACEA.B.2 Creative Europe: MEDIA (H. Trettenbrein); REA.C.4 Expert Contracting & Payments (A. Oram)
Reporting lines: R. Viola for Directorate R; G. Kent (acting) for Directorates A, C, E, H; C. Bury for Directorates B, D, F, G, I.
Legend: Luxembourg; To be transferred to Luxembourg. * Shared Administration & Finance Unit for Directorates A, B, C, D & F. ** Shared Administration & Finance Unit for Directorates E, H & I.
Unit G.1 “Data Policy & Innovation”
• Support the data economy in the Digital Single Market
• Policy initiatives addressing new and emerging issues.
• Advance the Commission open data policy by ensuring the correct implementation of the PSI Directive and the Pan-European Open Data Portal
• Promote the emergence of an ecosystem comprising all the players of the data value chain.
• Steers together with industry the SRIA.
• Addresses key framework conditions of the data economy
• Fund research and innovation in data technologies and applications inter alia by driving the big data PPP.
Unit G.3 “Learning, Multilingualism & Accessibility”
• Make the DSM more accessible, secure and inclusive.
• Support policy, research, innovation and deployment of learning technologies
• Support key enabling digital language technologies and services to allow all European consumers and businesses to fully benefit from the Digital Single Market.
• Responsible for Web Accessibility Directive
• Promote a better Internet for children by protecting and empowering children online, and improving the quality of content available to them.
  52. 52. Communities & Stakeholders ... and many more research centres, companies, EU projects etc.
  53. 53. MDSM SRIA q Version 0.5 unveiled at META-FORUM 2015 q Version 0.9 unveiled at META-FORUM 2016 q Version 1.0 foreseen for Nov./Dec. 2016 q Prepared and presented by Cracking the Language Barrier federation (editorial team: 13 colleagues) q SRIA addresses how the LT community is going to act united in order to make the DSM multilingual q Document available on http://www.cracker-project.eu and also on http://www.cracking-the-language-barrier.eu Strategic Agenda for the Multilingual Digital Single Market: Technologies for Overcoming Language Barriers towards a truly integrated European Online Market. DRAFT, Version 0.5 – April 22, 2015
  54. 54. MLV Programme q Multilingual Value Programme* § Three-year programme § Requires modest investment q “Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content” q Three components address the main needs of the Multilingual DSM (MDSM) and how to put them into practice: 1. Multilingual Application Areas 2. Multilingual Services 3. Research Strategic Research and Innovation Agenda Language as a Data Type and Key Challenge for Big Data Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content SRIA Editorial Team Version 0.9 – July 2016 * SRIA V0.9 and MLV Programme devised before re-organisation of DG CONNECT.
  55. 55. MDSM: Goals and Needs q Crosslingual communication for SMEs, public institutions, citizens q Crosslingual SME presales communication and aftersales services q Multilingual (big) data, language and knowledge value chains q Multilingual websites, product catalogues, product descriptions q Multilingual knowledge bases and knowledge graphs (and services) q Multilingual conversational interfaces for connected devices (IoT) q Crosslingual business intelligence (e.g., based on UGC) q Crosslingual social media analytics for EU-wide societal issues q Multilingual text and report generation (knowledge/data to text) q All services must be domain-adaptable (no one size fits all) q Translation Centre (Cloud) – HQ automated translation for all
  56. 56. MLV Programme [Diagram; labels in extraction order] Multilingual Digital Single Market; Automated Translation; E-Commerce; Content, Media, Verticals; Translation, Language, Knowledge, Data; Knowledge and Data Repositories; Multilingual Applications; Multilingual Services; Research; Crosslingual Big Data Language Analytics; Meaning, Semantics, Knowledge; High-Quality Machine Translation; SMEs; CEF DSIs; IT Integrators; Research; provide innovative applications; fills gaps; H2020 RIAs; H2020 CSAs, IAs, RIAs; H2020 CSAs, RAs, national funding; Multimodal Interaction; Language Processing, Analysis and Production – Language Resources; Citizens; Public; Business; interoperable and standardised; collaboration with member states; Conversational Technologies
  57. 57. Application Areas (Selection) q Multilingual E-commerce § Customer-facing vs. back-office facing (after-market, after-sales) § Crosslingual search, CRM, helpdesks, processes, workflows § Semantic, crosslingual product descriptions and catalogues § Online dispute resolution q Multilingual Content, Media, Verticals § Content analytics, curation, generation (incl. authoring support) § Multimodal communication (conversational, written, IoT) § Vertical domains: health, government, mobility, energy, legal. q Translation, Language, Knowledge, Data § Translation Cloud – written/spoken, automatic/human § Crosslingual public and social intelligence, business intelligence § HQ resources, under-resourced languages, domain-specific LRs
  58. 58. Setup – Timeframe – Costs q Close collaboration with EC, EP and all other stakeholders (including SMEs, research centres, universities, NGOs etc.). q Mix of funding sources: § Horizon 2020 (WP 2018-2020) for EU projects (RA, RIA, CSA) § National/regional funding sources for work on monolingual LTs and LRs and also to support and grow SMEs in this area § Include, strengthen and broaden role of CEF AT (public services) q Estimated costs for basic MLV implementation: ca. 175-200M€ § Includes set of mission-critical services and applications § Timeframe: 2018, 2019, 2020
  59. 59. Conclusions and Next Steps
  60. 60. q There is a lot of traction for the multilingualism/language topic. q The EU should develop a Multilingual Strategy (incl. technology). q Strategy must take into account several stakeholders: citizens, business/innovation, DSM, research (multiple communities). q Most components in place: Communities, SRIAs, STOA Study etc. q We need the political will to establish language policy change to support multilingualism (both member state level, EU level). q Some Member States are ahead (DK, IE, EE, ES, LT, LV, NL, SL). q Coordinate, intensify the push and keep up the pressure from Member States, EP, EC, research community, businesses etc. q Goal: a shared programme (EU/MSs) as a concerted action. http://www.meta-net.eu 63 Conclusions
  61. 61. Next Steps q Several tightly interconnected goals: § Multilingual Technologies for Europe § Technologies for the Multilingual Digital Single Market § Multilingual Strategy of the European Union § The Human Language Project 1. Discuss and further shape MLV Programme V0.9 with EC 2. Extend the Cracking the Language Barrier federation 3. LT brainstorming meeting at EC, Unit G.3 (Dec. 2016) 4. EP STOA Workshop on Language Technologies (Jan. 2017) 5. MDSM SRIA V1.0 to be finalised (Q1 2017)
  62. 62. Thank you. office@meta-net.eu http://www.meta-net.eu http://www.facebook.com/META.Alliance
  63. 63. Language Technology Topics q Multilingual Europe – Technologies for all European languages q Machine Translation, Text Analytics, Semantic Web etc. q Healthcare, societal challenges (ageing population, refugees etc.) q IoT, Smart Assistants and Conversational Interaction Technologies q E-Learning – Language Technology for E-Learning q Smart Homes, Cities, Manufacturing q Smart Virtual Assistants q Social Media Analytics q E-Participation q Games q etc.
  64. 64. Digital Language Extinction q Many smaller languages are experiencing problems digitally: § Loss of function – other languages take over entire functional areas such as, e.g., texting, email, search, e-commerce etc. § Loss of prestige – if it’s not on the web, the language doesn’t exist § Loss of competence – can you raise a digital native in your language? q András Kornai’s classification – corresponds to the amount of digital communication in that language: 1. digitally thriving languages (comfort zone languages) 2. vital languages 3. heritage languages 4. still/moribund/dead languages q Implications for the European/global multilingual web? potentially facing digital extinction …
  65. 65. http://www.meta-net.eu q Pan-European infrastructure, bringing together providers and consumers of language data, tools and services. q LRs are documented, uploaded, stored, catalogued, downloaded, shared – to improve visibility, documentation, identification, availability, interoperability. q Caters for datasets, tools, services for LT research and development (both academic and commercial); META-SHARE includes repository software, a metadata model, licensing kit, statistics. q 29 distributed repositories maintained by 37 organisations in 25 countries. q 2,600+ resources (corpora: 49%, lexical: 38%, tools/services: 12%), covering ca. 100 languages. q 7,000+ downloads in total; ca. 70% of all LRs have been downloaded.
  66. 66. Preparation of the SRA q Strategic Research Agendas of other initiatives were screened. q Many suggestions as input from Vision Group members. q We discussed procedures, input and structure of the SRA in four meetings of the META Technology Council. § Brussels, Belgium, November 16, 2010 § Venice, Italy, May 25, 2011 § Berlin, Germany, September 30, 2011 § Brussels, Belgium, June 19, 2012 q Additional input in talks, meetings, workshops, discussions, etc. § Example: Three HLT Expert Meetings organised by the EC (end of 2011) q Almost 200 experts contributed to the SRA (54% from industry; 46% from research; 4% from national/international institutions).
  67. 67. • Published in early 2013. • First strategic research agenda for our field. • Complex process of collecting and shaping technology visions. • Hundreds of researchers participated. • Broad topics around multilingual Europe in general.
  68. 68. PT1: Translingual Cloud q Europe has a big need for translations of publishable quality. q Focus on high-quality translation. q New research paradigms § Inclusion of professional translators into the research process § Inclusion of technologists into research on human translation processes q Different technological approaches § Stronger emphasis on the properties of individual languages § A central role for semantics q Methods for specific genres & domains
  69. 69. Priority Research Theme 1: Translingual Cloud Any device Target groups: European citizen, language professional, organisations, companies, European institutions, software applications Multiple target formats Single access point Automatic translation and interpretation Language checking Post-editing Workbenches for creative translations Novel translation and authoring workflows Quality assurance Computer-supported human translation Multilingual content production and text authoring Trusted service centre (privacy, confidentiality, security of source data) Services and Technologies: Crosslingual communication, translation and search Real-time subtitling, voice-over generation and translating speech from live events Mobile interactive interpretation Multilingual content production (media, web, technical, legal documents) Showcases: translingual spaces for ambient translation Applications: Written (twitter, blog, article, newspaper, text with/without metadata etc.) or spoken input (spontaneous spoken language, video/audio, multiple speakers) Modular combination of analysis, transfer and generation models From very fast but lower quality to slower but very high quality (including instant quality upgrades) Exploiting strong monolingual analysis and generation methods and resources Multiple target formats Domain, task and genre specialisation models Extending translation with semantic data and linked open data
  70. 70. PT2: Social Intelligence q Better decisions by monitoring social media q Inclusion of citizens into collective decision processes q Opinion formation, consensus building, decision making q Evolution of new solutions q New forms of democracy: e-democracy, massive participation, transparency q Dialogues and debates across language boundaries and across parties, political alliances, social classes q Better than binary voting q Documented transparent decision processes
  71. 71. Priority Research Theme 2: Social Intelligence and e-Participation From shallow to deep, from coarse-grained to detailed processing techniques Making language technologies interoperable with knowledge representation and the semantic web “Semantification” of the web: tight integration with the Semantic Web and Linked Open Data Mapping large, heterogeneous, unstructured volumes of online content to structured, actionable representations Unleashing social intelligence by detecting and monitoring opinions, demands, needs and problems Target groups: European citizen, European institutions, discussion participants, companies Make use of the wisdom of the crowds Improved efficiency and quality of decision processes Understanding influence diffusion across social media especially social media, comments, blogs, forums decision-relevant information support sentiment analysis and opinion mining including the temporal dimension) cues from arbitrary online content visualising discussions and opinion statements Services and Technologies: collective deliberation and e-participation - wide deliberation on pressing issues and processes; modeling evolution of opinions analysis technologies Applications:
  72. 72. Priority Research Theme 3: Socially-Aware Interactive Assistants Interacting naturally with and in groups Learning and forgetting information Adaptable to the user’s needs and preferences and the environment Include human-computer, human-artificial agent and computer-mediated human-human communication Proactive, self-aware, user-adaptable Interacts naturally with humans, in any language and modality Can be personalised to individual communication abilities including special needs Can learn incrementally from all interactions and other sources of information recognition and synthesis, providing expressive voices understanding incremental conversational speech models of human communication inter-dependencies priority themes Services and Technologies: Applications: dialogue systems environment modalities (visual, tactile, haptic) verbal/non-verbal behaviour, social context ments, any vocabulary recovery, self-assessment Multilingual capabilities
  74. 74. q European Parliament § Upcoming STOA Study and Workshop (Jan. 2017) q European Commission § DG CONNECT: Horizon 2020 WP 2018-2020 (G1) § DG CONNECT: New Unit “Learning, Multilingualism, Inclusion” (G3) § DG Translation: Connecting Europe Facility, AT q Language Communities: EFNIL and NPLD § Joint position paper META-FORUM 2015, 2016 q EU Member States and Non-Member States § National and regional funding agencies (ES, NL etc.) q Research Communities, especially Big Data community (BDVA SRIA V3.0), Web community and many others (Robotics, IoT etc.) q Standardisation – W3C and others http://www.meta-net.eu 80 Multilingual Europe Stakeholders
  75. 75. Multilingual Success Stories q Moses SMT toolkit as well as research and technology ecosystem q CEF AT for public online services – good and timely development q eBay: MT to Russian – 50% increase in sales q Hugo.lv for Latvian public services – better than Google Translate q Hundreds of European startups in Language Technology and AI q Conversational interfaces (Siri, Echo, Cortana): the next big thing q IBM Watson – a billion dollar LT business q Great Neural MT results reported by European researchers (QT21) q Very rapid development – many opportunities for European R&D&I
