Georg Rehm. Mehrsprachigkeit für das Digitale Europa. Ringvorlesung Digitale Lebenswelten, University of Hildesheim, Germany, November 15, 2016.
Multilingualism for Digital Europe
1. META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER
(grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119),
CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).
Multilingualism
for Digital Europe
Georg Rehm
General Secretary META-NET, Coordinator CRACKER
DFKI, Germany
georg.rehm@dfki.de
Ringvorlesung Digitale Lebenswelten – Universität Hildesheim, 15th November 2016
2. Outline
• A Multilingual Europe Initiative: META-NET
  - LT Support – META-NET White Paper Series
  - LT Strategy – META-NET SRA
• Continuing the Initiative – Recent Developments
  - The Digital Single Market and Multilingualism
  - Cracking the Language Barrier
  - META-FORUM 2015/2016 – MDSM SRIA V0.5/V0.9
• Goals and Next Steps
http://www.meta-net.eu 2
4. Multilingual Europe in 2010
• Challenge: Providing each language community with the most advanced technologies for communication and information so that maintaining their mother tongue does not turn into a disadvantage.
• While research has made considerable progress in recent years, the pace of progress is not fast enough to meet the challenge within the next 10-20 years.
• All stakeholders – researchers, LT industries, policy makers, language communities, funding programmes – should team up in a strategic alliance for a major dedicated push.
5. META-NET
• 60 research centres in 34 countries (founded in 2010)
  Chair of Executive Board: Jan Hajic (CUNI)
  Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde)
  General Secretary: Georg Rehm (DFKI)
• Multilingual Europe Technology Alliance: 826 members in 67 countries
[Publications shown: Strategic Research Agenda (published in 2013); White Paper Series (31 volumes; published in 2012)]
[Logos: T4ME (META-NET), CESAR, METANET4U, META-NORD, Multilingual Europe Technology Alliance]
8. Cross-Lingual Comparison
• Four areas: 1. Machine Translation 2. Text Analytics 3. Speech Processing/Synthesis 4. Language Resources
• Ranking: from excellent LT support to weak/no LT support.
• Cross-lingual comparison discussed and finalised at a network meeting with representatives of all languages (Oct. 2011).
9. MT
  excellent: (no language)
  good: English
  moderate: French, Spanish
  fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
  weak or no support through LT: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh

Speech
  excellent: (no language)
  good: English
  moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
  weak or no support through LT: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh

Text Analytics
  excellent: (no language)
  good: English
  moderate: Dutch, French, German, Italian, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
  weak or no support through LT: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh

Resources
  excellent: (no language)
  good: English
  moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
  fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
  weak or no support through LT: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
11. Observations and Results
• When it comes to technology support, there are massive differences between Europe’s languages and technology areas.
• Support for English is ahead of any other language.
• But: even support for English is far from being perfect.
• Several languages get the weakest score in all four areas (e.g., Icelandic, Latvian, Lithuanian, Maltese)!
12. Digital Language Extinction!
• “At Least 21 European Languages in Danger of Digital Extinction!”
• Press release on European Day of Languages (Sept. 26, 2012).
• Huge global interest in the topic and our key findings!
• 600+ mentions in the press.
• News from 40+ countries in 35+ different languages.
• 20+ television reports and 30+ broadcast interviews (radio, TV) with META-NET representatives.
• Two Parliamentary Questions in the EP on the “digital extinction of languages” topic.
• These results led to a STOA Workshop in the EP (Dec. 3, 2013).
13. [Newspaper clipping, Berlingske, Denmark; translated from Danish]

Poor language technology threatens Danish on the web
Words. Researchers are working to improve Danish translations on the internet.

By Jens Ejsing
// ejs@berlingske.dk

The Danish language is having a hard time in the digital world.

That is the conclusion of Danish language researchers and experts in connection with the new international study META-NET, which takes a closer look at how a large number of smaller European languages such as Danish are faring in the digital world.

The researchers, from the University of Copenhagen and the Danish Language Council (Dansk Sprognævn) among others, conclude that Danish may have an even harder time in the digital world in future, because Google Translate, GPS devices, smartphone applications and other language technology programs are not sufficiently able to handle the many nuances of the Danish language.

Bolette Sandford Pedersen, professor of language technology at the University of Copenhagen, believes there is a need for a kind of digital Danish language bank filled with data, so that translations, among other things, become as precise and good as possible. With help from the language bank, researchers can, according to the professor, help companies improve programs that have to handle linguistic knowledge about machine translation, speech recognition and information retrieval, among other things.

That would make erroneous translations less frequent, such as when »hæld olie på panden« [pour oil into the pan] becomes »pour oil on the forehead« in English with Google Translate. Translations that are, at worst, so imprecise that Danes end up opting out of their own language in the digital world.

Language help for companies

She acknowledges, however, that »the technology for automatic translation is fantastic in many ways«.

»It just isn't good enough when it comes to Danish,« she says:

»It is as if we are, to some extent, leaving it in the hands of Google or other companies to decide whether Danish is to be handled well enough or not. But the Danish market is not big for them. The question is therefore whether we should not, to a greater extent, do something ourselves to ensure that the necessary data is available, so that we get good translations and other good language technology. That could, for example, be by making an effort to have a language bank set up with a lot of enriched material about Danish.«

»If we constantly find that translations are riddled with errors, we do not dare to trust them,« she says, stressing that »erroneous translations can lead to major misunderstandings«.

According to the director of Dansk Sprognævn, Sabine Kirchmeier-Andersen, poor language technology can have consequences for many Danes who are not that good at English.

»If we have ambitions of using the Danish language in the technological universe of the future, an effort must be made now to retain expertise and expand the knowledge we have,« she says:

»Otherwise we risk that only people who speak fluent English will benefit from the new generations of web, telecom and robot technology that are on the way.«

Facts: Languages in Europe
• There are around 80 languages in the EU. For 21 of them, including Danish, there are major language technology gaps when it comes to machine translation, speech recognition and information retrieval, among other things.
• According to an EU study, an increasing number of European internet users buy goods or services online where the language used is not their own. This applies to more than half of users.
• More than one in three uses a foreign language to write e-mail or posts online.
[Newspaper clipping, ELEFTHEROS TYPOS, "Life" section, Thursday 27 September 2012; translated from Greek]

MANY EUROPEAN LANGUAGES ARE CONSIDERED TECHNOLOGICALLY... OUTDATED
Greek is in danger of digital extinction

ELENI VERGOU
evergou@e-typos.com

In the digital age they do not... speak Greek, nor several other European languages, according to a pan-European report signed by more than 200 experts. The study was published by the scientific network META-NET on the occasion of yesterday's European Day of Languages.

For the purposes of their research, linguists from 34 countries of the Old Continent rated the available language services and produced a "White Paper" for each European language. In their study, the experts looked for, among other things, four basic electronic tools, namely the existence of automatic translation, the possibility of voice interaction and of digital text analysis, while the availability of language resources was also examined.

In a first phase they examined the websites that allow users to translate online, such as Google Translate, the service of the IT giant. At the same time, the "communication" of Greek-speaking users with their... devices was also examined, for example the possibility of "speaking" to a GPS in one's mother tongue. The researchers concluded that such devices exist, but that they are not as widespread as the English-language ones.

The "gold" medal goes, as is only logical, to the English language. English-speaking users have the best possible technological support, which favours the further spread of the language. Icelandic, Latvian, Lithuanian and Maltese are most at risk of "technological exclusion", while Greek, Bulgarian, Hungarian and Polish are slightly better off; according to the study they have "fragmentary" technological support.

Support for users of Dutch, French, German, Italian and Spanish is rated "moderate". The heads of the scientific team, Hans Uszkoreit and Georg Rehm, state: "There are dramatic differences in language technology support between the various European languages. The gap between 'small' and 'large' languages keeps widening. We must ensure that the smaller and less digitally resource-rich languages are equipped with the necessary basic technologies. Otherwise, these languages are doomed to digital extinction."

Indeed, the experts stress that without decisive action these languages will hardly... survive in the digital world of the 21st century. Ms Maria Gavrilidou, member of the scientific team from the Institute for Language and Speech Processing, Athena Research Centre, tells "E.T.": "This study does not say that the Greek language will not live on or that it is threatened with extinction." The expert explains that as long as there are people who speak, write and communicate in a language, it will continue to exist. It is important, however, that all users are able to "speak" to machines, such as their GPS, in Greek and have computer language tools at their disposal.

Among these "tools" are spelling and grammar checkers, which are used daily by hundreds of Greek users and are based on language technology.

Nevertheless, she stresses that the digital spread of a language is important. "It is not in the hands of the average user. The respective governments, the European Union and the private sector must fund the development of this technology for all languages," she says, and continues: "Users, however, must demand that these means also exist in their language and not be satisfied with English." ■

[Pull quote] English-speaking users have the best possible technological support, a fact that favours the further spread of the language.

GREEKLISH
The language of alienation...

Most young people in our country now communicate in GREEKLISH in messages or e-mail. Although language tools that allow the use of the Greek script have existed in recent years, teenagers and young adults do not seem to have "embraced" these technologies. Professor of Linguistics Georgios Babiniotis tells "E.T.": "Greeklish is a problem for the Greek language, especially for young people, for a purely linguistic reason. By using Greeklish they become alienated from the form of the word or, as we say, the etymological image that is expressed by the spelling of the word and is linked both to the word's meaning and to its origin." The danger that young people face is alienation from the written form of the language. This "familiarity", however, also helps in understanding the meaning and the origin of the word. "This alienation is not without significance," says the expert, who explains that the process of writing helps the word to be imprinted and to be connected with other words of the same root. "When this form of communication is used, they are destroyed, they weaken. It is not fatal, but it will do damage," says Mr Babiniotis, who advises users to choose the Greek script.

[Photo caption] Georgios Babiniotis.
[Newspaper clipping, page 49, Sunday 30 September 2012; translated from Greek]

They have lost... my language
Most European languages are in danger of digital extinction
The smaller and less digitally resource-rich languages must be equipped with the necessary basic technologies

26 September has been established by the Council of Europe as the European Day of Languages, but, according to a new European scientific report, 21 of the 30 languages of Europe, Greek among them, face the danger of digital extinction.

The study sounds the alarm, as it found that digital support for most European languages is deficient or entirely non-existent for users.

Squeezed out by the widely used languages

The report, in the form of a series of White Papers (entitled "Languages in the European Information Society") from the scientific network META-NET, which brings together 60 research centres in 34 countries, points out that languages spoken by a relatively small number of people are at risk because they do not have the technological support that widely used languages have. White Papers have been prepared for the following European languages: English, Basque, Bulgarian, Galician, French, German, Danish, Greek, Estonian, Irish, Icelandic, Spanish, Italian, Catalan, Croatian, Latvian, Lithuanian, Maltese, Norwegian (Bokmål and Nynorsk), Dutch, Hungarian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Swedish, Czech and Finnish. Each White Paper is written in the language it concerns and is translated into English.

Four major dangers

According to the new study, Icelandic, Latvian, Lithuanian and Maltese face the greatest risk of extinction in a European technological society that increasingly promotes the use of particular languages, English above all. But other languages, such as Greek, Bulgarian, Hungarian and Polish, are also at risk in the modern digital world.

The META-NET study, to which more than 200 experts contributed, assesses the risk for each language on the basis of four basic criteria at the technological/digital level: the existence of automatic translation for the language in question, the possibility of voice interaction, the possibility of digital text analysis, and the availability of the relevant digital language resources.

The strong ones

The language with the best score on the criteria is of course English, which enjoys comparatively the best technological support (although not the best possible), a fact that facilitates its further spread.

It is followed, with satisfactory or moderate technological/digital support, by Dutch, French, German, Italian and Spanish. Greek, like Basque, Catalan, Polish, Hungarian and others, is ranked among the languages with only "fragmentary" support, which is precisely why they are considered languages at high risk of extinction.

Dramatic differences

According to the editors of the study, Hans Uszkoreit and Georg Rehm, "there are dramatic differences in language technology support between the various European languages and technology areas. The gap between 'small' and 'large' languages keeps widening. We must ensure that the smaller and less digitally resource-rich languages are equipped with the necessary basic technologies, otherwise these languages are doomed to digital extinction."

The hope for these languages is seen in the improvement and wider use of language technology software, which enables the spoken and written processing of the various languages.

Examples of these capabilities are electronic spelling and grammar checkers, the interactive personal "assistants" of smartphones (e.g. Siri on the iPhone), automatic translation systems, the electronic dialogue systems of call centres, search engines, the synthetic voice in car navigation systems, and so on.

The basic problem

What matters, according to the report, is that all these capabilities are also offered to users in their mother tongue that is threatened with extinction. Without decisive action, the report makes the gloomy prediction that these languages will hardly survive in the digital world of the 21st century.

One problem is that the software of these language technology systems relies on statistical methods that require huge amounts of written or spoken data, yet that much data is hard to obtain for languages spoken by relatively few people.

Moreover, even for widely used languages such as English, the relevant language technology still has weaknesses, which are evident, for example, in highly inadequate, error-ridden automatic translations. The report proposes that a coordinated large-scale effort be undertaken in Europe in order gradually to create or improve the necessary technologies and to help the languages that are digitally sidelined.
14. Update of the Study (2014)
• Study comprised 31 volumes/languages.
• Many languages missing! Need for extension – at least of the comparison.
• We invited three language community bodies to participate in the update:
  - European Federation of National Institutions for Language (EFNIL)
  - Network to Promote Linguistic Diversity (NPLD)
  - Experts Committee of the European Language Charter (Council of Europe)
CCURL 2014 – Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era
19. Planning Process
[Diagram, timeline 2010–2012: Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; expert meeting minutes (three sets); Priority Themes Paper; Strategic Research Agenda]
20. Planning Process: Documents
[Diagram, timeline 2010–2012: Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; expert meeting minutes (three sets); Priority Themes Paper; Strategic Research Agenda. Shown with the document cover pages that follow.]
www.meta-net.eu
office@meta-net.eu
T: +49 30 23895 1833
The Future European Multilingual
Information Society
Vision Paper for a Strategic Research Agenda
“People can’t share knowledge
if they don’t speak a common language.”
Davenport, Thomas H, and Laurence Prusak, Working Knowledge:
How Organizations Manage What They Know, Harvard Business School,
Boston, 1997, p. 98.
Join the discussion at
www.meta-net.eu/forum
LT 2020
Vision and Priority Themes for
Language Technology Research
in Europe until the Year 2020
Towards the META-NET Strategic Research Agenda
The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement 270893) and META-NORD (Grant Agreement 270899).
Do you have comments, ideas or suggestions
with regard to the content of this document?
Please send them to office@meta-net.eu or
discuss them online: http://www.meta-net.eu/sra.
This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”,
co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
A Network of Excellence forging the
Multilingual Europe Technology Alliance
Vision Document
Vision Group Translation and Localisation
Results of first two meetings
Editors: Aljoscha Burchardt, Georg Rehm
Dissemination Level: Public
Date: 3 December 2010
This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”,
co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
A Network of Excellence forging the
Multilingual Europe Technology Alliance
Vision Document
Vision Group Media and Information Services:
Results of first two meetings
Editors: Maria Koutsombogera, Stelios Piperidis
Dissemination Level: Public
Date: 10 November 2010
This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”,
co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
A Network of Excellence forging the
Multilingual Europe Technology Alliance
Vision Document
Vision Group Interactive Systems:
Results of first two meetings
Editors: Joseph Mariani, Bernardo Magnini
Dissemination Level: Public
Date: 28 December 2010
21. Strategic Research Agenda
• Addresses the problems we identified when preparing the white papers.
• Can put Europe ahead of its competitors in this technology area.
• 200 contributors; >2 years. 54% industry; 46% research; 4% (inter)national institutions.
• Presented and discussed at 90+ conferences and major workshops.
• Published in early 2013.
• http://www.meta-net.eu/sra
22. Priority Research Themes
• Three priority research themes:
  - Translingual Cloud
  - Social Intelligence and e-Participation
  - Socially-Aware Interactive Assistants
• Two additional themes:
  - European Service Platform for Language Technologies
  - Core Technologies for Language Analysis and Production
23. European Service Platform for Language Technologies (Cloud or Sky Computing Platform)
[Architecture diagram, summarised by layer]
• Providers of operational and research technologies and services: Universities; Research Centres; European Institutions; National Language Institutions; Language Technology Providers; Language Service Providers; other companies (SMEs, startups etc.)
• Beneficiaries/users of the platform: European Citizens; Enterprises; LT User Industries; Public Administrations; Universities; Research Centres; European Institutions
• Interfaces (web, speech, mobile etc.)
• Priority Research Theme 1: Translingual Cloud
• Priority Research Theme 2: Social Intelligence & e-Participation
• Priority Research Theme 3: Socially Aware Interactive Assistants
• Services and technologies: Multilingual technologies; Text analytics; Text generation; Language checking; Sentiment analysis; Named entity recognition; Summarisation; Knowledge access and management; Information and relation extraction
• Core areas: Language Processing; Language Understanding; Knowledge; Emotion/Sentiment
• Platform elements: Tools; Data Sets; Resources; Components; Metadata; Standards; Interfaces; APIs; Catalogues; Quality Assurance; Data Import/Export; Input/Output; Storage; Data protection
• Features: Performance; Availability; Scalability
26. CRACKER: Cracking the Language Barrier
Coordination, Evaluation and Resources for European MT Research
Coordination and Support Action, H2020-ICT17, 2015–2017, 36 months – http://www.cracker-project.eu

No.  Partner     Country          Name
1    DFKI        Germany          Georg Rehm
2    CUNI        Czech Republic   Jan Hajic
3    ELDA        France           Khalid Choukri
4    FBK         Italy            Marcello Federico
5    ATHENA RC   Greece           Stelios Piperidis
6    UEDIN       UK               Philipp Koehn
7    USFD        UK               Lucia Specia
Communities
• META-NET incl. META-SHARE and META
• MT evaluation initiatives – WMT, IWSLT, MT Marathons
• MT and other LT industry
• Language resources – META-SHARE, ELRA
• HT/MT evaluation tools – translate5
• Translation industry, translation profession
• MT user communities
Strategic Agenda for the Multilingual Digital Single Market
• Version 0.5 presented at META-FORUM 2015 (Riga)
• Version 0.9 presented at META-FORUM 2016 (Lisbon)
Strategic Research and Innovation Agenda
Language as a Data Type and
Key Challenge for Big Data
Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content
SRIA Editorial Team
Version 0.9 – July 2016
27. Selected Activities
[Timeline 2015–2017, project months M1, M12, M24, M36: kick-off meeting for all ICT-17 projects; translate5 (version 1, version 2); WMT 2016 and 2017; IWSLT 2015, 2016 and 2017; QT Marathon 2015 and 2016; Roadmap for European MT Research; Survey on the State of HQMT in Industry and LSPs; SRIA (initial version, update, final)]
• Production of resources (e.g., for WMT
2016 and 2017, IWSLT 2015-2017)
• Tools (quality control, evaluations)
• Strategies and roadmaps (SRIA,
Roadmap for European MT Research)
• Exchange and sharing facility for
resources (META-SHARE)
Recent or Upcoming Events
• LREC Workshop on MT Eval. (May 25)
• META-FORUM 2016 (July 4/5, Lisbon)
• WMT 2016 (Aug. 11/12, Berlin)
• IWSLT 2016 (Dec. 8/9, Seattle)
• Federation of organisations and
projects working on technologies
for multilingual Europe.
• 10 organisations; 24 projects.
• Areas of collaboration: data
management and repositories,
tools, shared tasks, evaluations.
• Goal: provide one umbrella
organisation for the whole
community.
http://www.cracking-the-language-barrier.eu
28. META-FORUM
• META-FORUM 2016 – July 04/05, Lisbon, Portugal: Beyond Multilingual Europe
• META-FORUM 2015 – April 27, Riga, Latvia: Technologies for the Multilingual Digital Single Market
• META-FORUM 2013 – Sept. 19/20, Berlin, Germany: Connecting Europe for New Horizons
• META-FORUM 2012 – June 20/21, Brussels, Belgium: A Strategy for Multilingual Europe
• META-FORUM 2011 – June 27/28, Budapest, Hungary: Solutions for Multilingual Europe
• META-FORUM 2010 – Nov. 17/18, Brussels, Belgium: Challenges for Multilingual Europe
30. The Digital Single Market
• Top priority in the European Union.
• Expected to add €400 billion to European GDP and hundreds of thousands of new jobs.
• Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).
34. Facts and Figures
THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
1 Multilingual access to all digital goods and services across Europe
Customers are six times more likely to buy from sites in their native language.
Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken.
Language can be expensive for SMEs
Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar […] and marketing costs.
Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty.
English is not the answer
52% of EU customers do not purchase […]
Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations today […] to increase market share.
[Chart: likelihood of purchasing, site in buyer’s native language vs. site in foreign language: 6x more likely to purchase]
35. Facts and Figures
Geo-blocking and language-blocking are barriers to access
Geo-blocking: […] due to nationality, location, or residence […] customers
Language-blocking: […] languages they do not speak […] however, current online translation is insufficient […] trying to conduct […] common languages
Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens.
Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in […] European businesses.
36. The MDSM Fact Sheet
Current eCommerce growth within Europe is about half that of the US,
due partially to a lack of language coverage from European SMEs.
Less than 5% of European SMEs currently sell cross-language.
Why Europe needs a
Multilingual Digital Single Market
No single language accounts
for more than 20% of the
potential Multilingual
Digital Single Market.
Most account for less than
3% of the DSM.
Without a solution, the
European Digital Single
Market will remain
fragmented.
Europe’s 24 official
languages present
a tremendous
opportunity for
European business
Removing language barriers within
Europe would open access to 73%
(with >€25 trillion in annual
revenue!) of the world’s digitally
accessible market to European
enterprise.
Europe today is not a single market:
it is separated into 20+ small language markets.
www.meta-net.eu
[Chart: Online Population]
Chinese (510 million)
World Spanish (165 million)
World Portuguese (83 million)
English (565 million)
Japanese (100 million)
Russian (60 million)
Europe today (Many small markets)
LANGUAGE TECHNOLOGY
The Multilingual Digital Single Market
Source: Internet World Stats (Miniwatt Marketing Group)
[Chart: Language Technology Support* (Good, Moderate, Fragmentary, Weak/no support) plotted against Millions of Native Speakers (Worldwide), axis 0 to 400]
Language Technology Danger Zone (≈150 million EU citizens)
Languages shown: Spanish, English, Portuguese, German, French, Italian, Polish, Romanian, Dutch, Greek, Hungarian, Czech, Swedish, Bulgarian, Danish, Croatian, Slovak, Finnish, Lithuanian, Slovene, Latvian, Estonian, Maltese, Irish
140 million EU citizens are in the Language Technology Danger
Zone, where language technology is inadequate to
support the DSM.
Current online automatic
translation provided by US
tech giants does not solve
less than 30% of
automatically translated
content is truly useful for
online commerce.
Only three European languages
2. Boosting commerce through multilingual technologies
3. Connecting citizens to European digital public services
Without Language Technology, the European Commission has no way to respond effectively to citizen participation.
Current language technology is inadequate for over half
of the EU official languages to help the European
Commission solve its citizen engagement problem.
Translation opens 20 times its cost in revenue opportunity.
However, translation remains too expensive for many
European SMEs, blocking this opportunity and limiting economic
growth in Europe. Lowering these costs is a strategic opportunity
[Chart: Translation Costs vs. Increase in Revenue]
[Chart: Online Automatic Translation Quality (good, bad, ugly)]
Most local governmental services are monolingual only.
This poses a problem for tourists, expatriates, and
linguistic minorities. Language technology can provide the
Multilingual eParticipation can help build the European Identity
with one another in their respective native languages with sophisticated machine translation working behind the scenes. Only
when EU citizens can interact in their own languages will they truly develop a sense of European identity and community.
Over half of EU citizens are language blocked from interacting with
the European Commission’s web resources for citizen participation.
290 million EU citizens excluded
Speakers of other languages are language blocked from full participation
Speakers of English, French, German can participate fully
Strategic Agenda for the Multilingual Digital Single Market http://rigasummit2015.eu.
META, the Multilingual Europe Technology Alliance, has more than 750 members (http://www.meta-net.eu)
LT-Innovate, the European Association of the Language Technology Industry, has 180 corporate members throughout Europe (http://lt-innovate.eu)
Technology support has improved for some languages since this study was completed.
Technology Solutions
Investment in the following solutions will help achieve the
Multilingual Digital Single Market
Unified Customer Experience
care, customer relationship, discussion fora,
Multimodal User Experience for
Connected Devices
interfaces
household appliances, and consumer
Voice of the Customer
market research
Content Curation and Production
Digital Translation Centre
customers, citizens
The forthcoming Strategic Agenda for the Multilingual Digital Single Market will provide additional
details on these and other solutions for the needs of the Multilingual Digital Single Market.
Download this fact sheet from http://cracker-project.eu.
For more information contact Dr. Georg Rehm (DFKI) at georg.rehm@dfki.de.
http://cracker-project.eu/wp-content/uploads/2015/11/mDSM-Fact-Sheet.pdf
38. Open Letter to the EC
q On Friday, March 20, 2015, we published an open letter to the EC on
http://multilingualeurope.eu.
q On Monday, March 23, 2015, we informed
President Juncker and all Commissioners
about the campaign and the 1300+ signatures.
q By now more than 3600 signatures!
Who signed?
q 5 Members of the European Parliament
q 150+ high-level representatives from industry (CxO level)
q 1200+ professors
q 400+ project or research managers
q 20+ entrepreneurs and founders
q hundreds of language and language technology professionals, officials,
researchers, administrators and representatives from related
stakeholder groups
39. META-FORUM 2015
q April 27 in Riga, Latvia
q Riga Summit 2015 on the Multilingual
Digital Single Market
q Two important components:
§ MDSM SRIA Version 0.5
§ Further community fusing
q http://www.meta-forum.eu
40. Joint EFNIL and NPLD Panel
q Joint EFNIL and NPLD panel at META-FORUM 2015.
q Joint position paper.
Initially presented at META-FORUM 2015 and the Riga Summit 2015
on the Multilingual Digital Single Market, April 27, 2015
www.rigasummit2015.eu
Joint NPLD/EFNIL
Position Paper on the
Multilingual Digital Single Market
“Languages are not only a means of communication. They also have embedded in them people’s
values, aspirations and hopes.” (European Roadmap for Linguistic Diversity 2015, NPLD)
“Many European languages run the risk of becoming victims of the digital age as they are
under-represented and under-resourced online. Huge regional market opportunities remain
untapped because of language barriers.” (Multilingual Europe: A challenge for language tech.
MultiLingual. April/May 2011, page 51/52)
41. Vision Paper
Vision Group Translation and Localisation Report
Vision Group Interactive Systems Report
Vision Group Media and Information Services Report
Priority Themes Paper
Expert meeting minutes
Expert meeting minutes
Expert meeting minutes
META-NET Strategic Research Agenda for Multilingual Europe 2020
Timeline: 2010 2011 2012 2013 2014 2015
www.meta-net.eu
office@meta-net.eu
T: +49 30 23895 1833
The Future European Multilingual
Information Society
Vision Paper for a Strategic Research Agenda
“People can’t share knowledge
if they don’t speak a common language.”
Davenport, Thomas H, and Laurence Prusak, Working Knowledge:
How Organizations Manage What They Know, Harvard Business School,
Boston, 1997, p. 98.
Join the discussion at
www.meta-net.eu/forum
LT 2020
Vision and Priority Themes for
Language Technology Research
in Europe until the Year 2020
Towards the META-NET Strategic Research Agenda
The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the
European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement
270893) and META-NORD (Grant Agreement 270899).
Do you have comments, ideas or suggestions with regard to the content of this document?
Please send them to office@meta-net.eu or discuss them online: http://www.meta-net.eu/sra.
This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”,
co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
A Network of Excellence forging the
Multilingual Europe Technology Alliance
Vision Document
Vision Group Translation and Localisation:
Results of first two meetings
Editors: Aljoscha Burchardt, Georg Rehm
Dissemination Level: Public
Date: 3 December 2010
Vision Document
Vision Group Media and Information Services:
Results of first two meetings
Editors: Maria Koutsombogera, Stelios Piperidis
Dissemination Level: Public
Date: 10 November 2010
Vision Document
Vision Group Interactive Systems:
Results of first two meetings
Editors: Joseph Mariani, Bernardo Magnini
Dissemination Level: Public
Date: 28 December 2010
Strategic
Research and
Innovation Agenda
roadmaps, agendas and any
other input from other initiatives
…
Strategic Agenda for the
Multilingual Digital Single Market
Technologies for Overcoming Language Barriers towards
a truly integrated European Online Market
DRAFT
Version 0.5 – April 22, 2015
42. Strategic Agenda for MDSM
q Presented at META-FORUM 2015
and Riga Summit for the first time.
q Version 0.5 – work in progress
q Builds upon many strategy papers
and roadmaps prepared by
several European projects,
incl. the META-NET SRA (2013).
q Input and feedback collected at the
Riga Summit 2015 to be used for
upcoming versions.
43. A Strategy for the MDSM
q Strategic R&I Agenda for the
Multilingual Digital Single Market
q Core: Technology Solutions
q Data economy is an inherent
component – LT for effective
multilingual data value chains.
45. q Letter from Andrus Ansip (June 2015)
q “We invite the European language
technology community to further
develop the ideas presented in the
draft Strategic Agenda for the
multilingual Digital Single Market”
47. Riga Declaration
q 12 organisations present at
META-FORUM 2015 and the
Riga Summit 2015 drafted and
signed the “Declaration of
Common Interests”.
q CRACKER: community building,
mostly among projects.
q We combined these into the
Cracking the Language Barrier
federation.
q Important goal: measure against
community fragmentation.
DECLARATION OF COMMON INTERESTS
We, the undersigned, declare here, at the Riga Summit on the Multilingual Digital Single
Market, encouraged by the letter Vice President Andrus Ansip sent to its participants, that we
stand united in our goal and interest to:
- support multilingualism in Europe by employing language technology in business,
society and governance, to create a truly Multilingual Digital Single Market,
- exchange and share information in our efforts to promote our goals and interests at
local, national and European levels,
- raise awareness in society at large using channels available to our associations,
alliances and societies.
In the near future, we foresee the establishment of a Memorandum of Understanding among
our organisations towards a “Coalition for a Multilingual Europe”, to better serve our
members address the language barrier challenges towards establishing a truly integrated
Multilingual Digital Single Market.
Riga, 29. April 2015
Signed by (in alphabetical order):
BDVA: Laure Le Bars
CITIA: Steve Renals
CLARIN: Steven Krauwer
EFNIL: Sabine Kirchmeier-Andersen, Tamás Váradi
ELEN: Davyth Hicks, Claudia Soria
ELRA: Nicoletta Calzolari, Khalid Choukri
GALA: Laura Brandon, Robert E. Etches, Sergey Gladkov
LT Innovate: Jochen Hummel, Philippe Wacker
META-NET: Jan Hajic, Josef van Genabith, Georg Rehm, Andrejs Vasiljevs
NPLD: Meirion Prys Jones
TAUS: Jaap van der Meer
W3C: Richard Ishida, Felix Sasaki
For any questions, please contact Georg.Rehm@dfki.de.
48. http://www.cracker-project.eu • http://www.meta-net.eu
• A federation of European projects and
organisations working on technologies
for a multilingual Europe.
• Multi-lateral Memorandum of Understanding;
10 organisations and 24 projects on board
already (including FP7 and H2020-ICT15).
• Getting new members on a regular basis.
• Selected areas of collaboration: data
management and repositories, tools,
shared tasks, evaluations, events.
• Goal: provide one umbrella organisation
for the whole community.
50. http://www.cracker-project.eu • http://www.meta-net.eu
• Website: information about the initiative,
all projects and organisations
• Downloadable documents
• List of events
• LREC 2016 MT Eval Workshop
• Several new members will
join the initiative soon
http://www.cracking-the-language-barrier.eu
52. Andrus Ansip’s Blog Post
q Posted on 27 May 2016.
q First public acknowledgment
of the EC that the language
topic is of very high relevance
for the Digital Single Market.
q “Overcoming language
barriers is vital for building the
DSM, which is by definition
multilingual. It is now time to
reduce and remove the
language barriers that are
holding back its advance, and
turn them into competitive
advantages.”
53. Reorganisation of DG CONNECT (01/07/2016)
DG CONNECT: Communications Networks, Content & Technology
Director-General: R. Viola (60240)
Assistants: O. Bringer (92067), P. Stuckmann (21097)
Deputy Director-General in charge of Directorates A, C, E & H: G. Kent (acting) (91945)
Assistant: E. Mitjana (81149)
Deputy Director-General in charge of Directorates B, D, F, G & I: C. Bury (60499)
Assistant: P. Lamotte (98892)
q Directorate A, Digital Industry: K. Rouhana (68057)
§ A.1: Robotics & Artificial Intelligence, J. Heikkilä (35325)
§ A.2: Technologies & Systems for Digitising Industry, M. Lemke (91575)
§ A.3: Competitive Electronics Industry, W. Van Puymbroeck (68138)
§ A.4: Photonics, C. Maloney (69082)
§ A.5: Administration & Finance *, A. Fiala (64787)
q Directorate B, Electronic Communications Networks & Services: A. Whelan (50941)
§ B.1: Electronic Communications Policy, V. Terävä (92381)
§ B.2: Implementation of the Regulatory Framework, W-D. Grussmann (58559)
§ B.3: Markets, R. Krüger (61555)
§ B.4: Radio Spectrum Policy, A. Geiss (59466)
§ B.5: Investment in High-Capacity Networks, A. Krzyżanowska (87246)
q Directorate C, Digital Excellence & Science Infrastructure: Th. Skordas (acting) (68908)
§ C.1: eInfrastructure & Science Cloud, A. Burgueño Arjona (92471)
§ C.2: High Performance Computing & Quantum Technology, G. Kalbe (32866)
§ C.3: Future & Emerging Technologies (FET), V. Peca (57843)
§ C.4: Flagships, Th. Skordas (68908)
q Directorate D, Policy Strategy & Outreach: L. Corugedo Steneberg (96383)
§ D.1: Research Strategy & Programme Coordination, M. Fjalland (50021)
§ D.2: Policy Implementation & Planning, E. Forti (65172)
§ D.3: Policy Outreach & International Affairs, A. Angelova-Krasteva (91145)
§ D.4: Communication, D. Ringrose (93913)
q Directorate E, Future Networks: M. Campolargo (63479)
§ E.1: Future Connectivity Systems, B. Barani (acting) (69616)
§ E.2: Cloud & Software, P. O’Donohue (91280)
§ E.3: Next-Generation Internet, J. Villasante (63521)
§ E.4: Internet of Things, M. Rohen (63674)
q Directorate F, Digital Single Market: G. de Graaf (68466)
§ F.1: Digital Policy Development & Coordination, M. Bailey (acting) (69176)
§ F.2: E-Commerce & Platforms, P. Agarwal (acting) (87153)
§ F.3: Start-ups & Innovation, P. Zilgalvis (50935)
§ F.4: Digital Economy & Skills, L. Sioli (51262)
q Directorate G, Data: J. Hernández-Ros (acting) (34533)
§ G.1: Data Policy & Innovation, M. Nagy-Rothengass (31680)
§ G.2: Data Applications & Creativity, J. Hernández-Ros (34533)
§ G.3: Learning, Multilingualism & Accessibility, M. Marsella (acting) (32750)
§ G.4: Administration & Finance, G. Kalbe (acting) (32866)
q Directorate H, Digital Society, Trust & Cybersecurity: P. Timmers (90245)
§ H.1: Cybersecurity & Digital Privacy, J. Boratynski (69452)
§ H.2: Smart Mobility & Living, E. Hartog (90084)
§ H.3: E-Health, Well-Being & Ageing, M. González-Sancho (52918)
§ H.4: E-Government & Trust, A. Servida (58186)
§ H.5: Administration & Finance **, G. Van Caenegem (acting) (61895)
q Directorate I, Media Policy: G. Abbamonte (93573)
§ I.1: Audiovisual & Media Services Policy, L. Boix Alonso (90009)
§ I.2: Copyright, M. Martin-Prat (65157)
§ I.3: Audiovisual Industry & Media Programme, L. Recalde Langarica (91281)
§ I.4: Media Convergence & Social Media, J. Cotta (66407)
q Directorate R, Resources & Support: G. Kent (91945)
§ R.1: Human Resources & Competences, I. Mariën-Dusak (92376)
§ R.2: Budget & Finance, M-C. Laffineur (68515)
§ R.3: Knowledge Management & Support Systems, F. Accordino (98272)
§ R.4: Compliance & Planning, K. Engelbosch (54693)
§ R.5: Programme Operations & Common Services, I. Malekos (52902)
q Mirror units
§ Mirror-Unit REA.A.5, Fostering Novel Ideas: FET-Open, T. Hallantie (68167)
§ Mirror-Unit EACEA.B.2, Creative Europe: MEDIA, H. Trettenbrein (84955)
§ Mirror-Unit REA.C.4, Expert Contracting & Payments, A. Oram (97805)
q Advisers
§ Principal Adviser: F. Lupescu (68538)
§ Principal Adviser: M. Richards (62443)
§ Adviser for Legal & Legislative Issues: Ž. Bahovec (88284)
§ Adviser for cross-cutting Policy/Research Issues: G. Santucci (68963)
§ Adviser for International Relations linked to Future Networks: P. Blixt (68048)
§ Adviser for Societal Issues: N. Dewandre (94925)
§ Adviser for Organisational Transition (Finance): Vacant
§ Adviser for Societal Challenges: Vacant
§ Adviser for Innovation Systems: B. Salmelin (69564)
Reporting lines are:
- R. Viola for Directorate R;
- G. Kent (acting) for Directorates A, C, E, H;
- C. Bury for Directorates B, D, F, G, I.
Luxembourg;
To be transferred to Luxembourg.
* Shared Administration & Finance Unit for Directorates A, B, C, D & F.
** Shared Administration & Finance Unit for Directorates E, H & I.
Unit G.1 “Data Policy & Innovation”
• Support the data economy in the Digital Single Market
• Policy initiatives addressing new and emerging issues.
• Advance the Commission open data policy by ensuring the
correct implementation of the PSI Directive and the Pan-European
Open Data Portal
• Promote the emergence of an ecosystem comprising all the
players of the data value chain.
• Steers together with industry the SRIA.
• Addresses key framework conditions of the data economy
• Fund research and innovation in data technologies and
applications inter alia by driving the big data PPP.
Unit G.3 “Learning, Multilingualism & Accessibility”
• Make the DSM more accessible, secure and inclusive.
• Support policy, research, innovation and deployment of learning
technologies
• Support key enabling digital language technologies and
services to allow all European consumers and businesses
to fully benefit from the Digital Single Market.
• Responsible for Web Accessibility Directive
• Promote a better Internet for children by protecting and
empowering children online, and improving the quality of content
available to them.
56. MDSM SRIA
q Version 0.5 unveiled at META-FORUM 2015
q Version 0.9 unveiled at META-FORUM 2016
q Version 1.0 foreseen for Nov./Dec. 2016
q Prepared and presented by Cracking the Language
Barrier federation (editorial team: 13 colleagues)
q SRIA addresses how the LT community is going
to act united in order to make the DSM multilingual
q Document available on http://www.cracker-project.eu
and also on http://www.cracking-the-language-barrier.eu
57. MLV Programme
q Multilingual Value Programme*
§ Three-year programme
§ Requires modest investment
q “Enabling the Multilingual Digital Single
Market through technologies for
translating, analysing, processing and
curating natural language content”
q Three components address the main
needs of the Multilingual DSM (MDSM)
and how to put them into practice:
1. Multilingual Application Areas
2. Multilingual Services
3. Research
Strategic Research and Innovation Agenda
Language as a Data Type and
Key Challenge for Big Data
Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content
SRIA Editorial Team
Version 0.9 – July 2016
* SRIA V0.9 and MLV Programme devised
before re-organisation of DG CONNECT.
58. MDSM: Goals and Needs
q Crosslingual communication for SMEs, public institutions, citizens
q Crosslingual SME presales communication and aftersales services
q Multilingual (big) data, language and knowledge value chains
q Multilingual websites, product catalogues, product descriptions
q Multilingual knowledge bases and knowledge graphs (and services)
q Multilingual conversational interfaces for connected devices (IoT)
q Crosslingual business intelligence (e.g., based on UGC)
q Crosslingual social media analytics for EU-wide societal issues
q Multilingual text and report generation (knowledge/data to text)
q All services must be domain-adaptable (no one size fits all)
q Translation Centre (Cloud) – HQ automated translation for all
59. Multilingual Digital Single Market
Automated Translation
E-Commerce
Content, Media,
Verticals
Translation, Language,
Knowledge, Data
Knowledge and
Data Repositories
Multilingual Applications
Multilingual Services
Research
Crosslingual Big
Data Language
Analytics
Meaning,
Semantics,
Knowledge
High-Quality
Machine
Translation
SMEs CEF DSIs IT Integrators Research
provide innovative
applications
fills gaps
H2020 RIAs
H2020 CSAs, IAs, RIAs
H2020 CSAs, RAs, national funding
Multimodal Interaction
Language Processing, Analysis and Production – Language Resources
Citizens Public Business
interoperable and standardised
collaboration with member states
Conversational
Technologies
MLV Programme
60. Application Areas (Selection)
q Multilingual E-commerce
§ Customer-facing vs. back-office facing (after-market, after-sales)
§ Crosslingual search, CRM, helpdesks, processes, workflows
§ Semantic, crosslingual product descriptions and catalogues
§ Online dispute resolution
q Multilingual Content, Media, Verticals
§ Content analytics, curation, generation (incl. authoring support)
§ Multimodal communication (conversational, written, IoT)
§ Vertical domains: health, government, mobility, energy, legal.
q Translation, Language, Knowledge, Data
§ Translation Cloud – written/spoken, automatic/human
§ Crosslingual public and social intelligence, business intelligence
§ HQ resources, under-resourced languages, domain-specific LRs
61. Setup – Timeframe – Costs
q Close collaboration with EC, EP and all other stakeholders
(including SMEs, research centres, universities, NGOs etc.).
q Mix of funding sources:
§ Horizon 2020 (WP 2018-2020) for EU projects (RA, RIA, CSA)
§ National/regional funding sources for work on monolingual LTs
and LRs and also to support and grow SMEs in this area
§ Include, strengthen and broaden role of CEF AT (public services)
q Estimated costs for basic MLV implementation: ca. 175-200M€
§ Includes set of mission-critical services and applications
§ Timeframe: 2018, 2019, 2020
63. Conclusions
q There is a lot of traction for the multilingualism/language topic.
q The EU should develop a Multilingual Strategy (incl. technology).
q Strategy must take into account several stakeholders: citizens,
business/innovation, DSM, research (multiple communities).
q Most components in place: Communities, SRIAs, STOA Study etc.
q We need the political will to establish language policy change to
support multilingualism (both member state level, EU level).
q Some Member States are ahead (DK, IE, EE, ES, LT, LV, NL, SL).
q Coordinate, intensify the push and keep up the pressure from
Member States, EP, EC, research community, businesses etc.
q Goal: a shared programme (EU/MSs) as a concerted action.
64. Next Steps
q Several tightly interconnected goals:
§ Multilingual Technologies for Europe
§ Technologies for the Multilingual Digital Single Market
§ Multilingual Strategy of the European Union
§ The Human Language Project
1. Discuss and further shape MLV Programme V0.9 with EC
2. Extend the Cracking the Language Barrier federation
3. LT brainstorming meeting at EC, Unit G.3 (Dec. 2016)
4. EP STOA Workshop on Language Technologies (Jan. 2017)
5. MDSM SRIA V1.0 to be finalised (Q1 2017)
67. Language Technology Topics
q Multilingual Europe – Technologies for all European languages
q Machine Translation, Text Analytics, Semantic Web etc.
q Healthcare, societal challenges (ageing population, refugees etc.)
q IoT, Smart Assistants and Conversational Interaction Technologies
q E-Learning – Language Technology for E-Learning
q Smart Homes, Cities, Manufacturing
q Smart Virtual Assistants
q Social Media Analytics
q E-Participation
q Games
q etc.
68. Digital Language Extinction
q Many smaller languages are experiencing problems digitally:
§ Loss of function – other languages take over entire functional areas
such as texting, email, search, e-commerce etc.
§ Loss of prestige – if it’s not on the web, the language doesn’t exist
§ Loss of competence – can you raise a digital native in your language?
q Andras Kornai’s classification – corresponds to the amount of digital
communication in that language:
1. digitally thriving languages (comfort zone languages)
2. vital languages
3. heritage languages
4. still/moribund/dead languages
q Implications for the European/global multilingual web?
potentially facing digital extinction …
69. META-SHARE
q Pan-European infrastructure, bringing together providers and consumers of
language data, tools and services.
q LRs are documented, uploaded, stored, catalogued, downloaded, shared – to
improve visibility, documentation, identification, availability, interoperability.
q Caters for datasets, tools, services for LT research and development (both
academic and commercial); META-SHARE includes repository software, a
metadata model, licensing kit, statistics.
q 29 distributed repositories maintained
by 37 organisations in 25 countries.
q 2,600+ resources (corpora: 49%,
lexical: 38%, tools/services: 12%),
covering ca. 100 languages.
q 7,000+ downloads in total; ca. 70%
of all LRs have been downloaded.
71. Preparation of the SRA
q Strategic Research Agendas of other initiatives were screened.
q Many suggestions as input from Vision Group members.
q We discussed procedures, input and structure of the SRA in four
meetings of the META Technology Council.
§ Brussels, Belgium, November 16, 2010
§ Venice, Italy, May 25, 2011
§ Berlin, Germany, September 30, 2011
§ Brussels, Belgium, June 19, 2012
q Additional input in talks, meetings, workshops, discussions, etc.
§ Example: Three HLT Expert Meetings organised by the EC (end of 2011)
q Almost 200 experts contributed to the SRA (54% from industry; 46%
from research; 4% from national/international institutions).
72. • Published in early 2013.
• First strategic research
agenda for our field.
• Complex process of
collecting and shaping
technology visions.
• Hundreds of researchers
participated.
• Broad topics around multilingual
Europe in general.
73. PT1: Translingual Cloud
q Europe has a big need for translations of publishable quality.
q Focus on high-quality translation.
q New research paradigms
§ Inclusion of professional translators into the
research process
§ Inclusion of technologists into research on
human translation processes
q Different technological approaches
§ Stronger emphasis on the properties of
individual languages
§ A central role for semantics
q Methods for specific genres & domains
74. Priority Research Theme 1: Translingual Cloud
Any
device
Target groups: European citizen, language
professional, organisations, companies, European
institutions, software applications
Multiple target
formats
Single access
point
Automatic translation and
interpretation
Language checking
Post-editing
Workbenches for creative
translations
Novel translation and authoring
workflows
Quality assurance
Computer-supported human
translation
Multilingual content production and
text authoring
Trusted service centre (privacy,
confidentiality, security of source
data)
Services and Technologies:
Crosslingual communication,
translation and search
Real-time subtitling, voice-over
generation and translating speech
from live events
Mobile interactive interpretation
Multilingual content production
(media, web, technical, legal
documents)
Showcases: translingual spaces for
ambient translation
Applications:
Written (twitter, blog, article, newspaper,
text with/without metadata etc.) or
spoken input (spontaneous spoken
language, video/audio, multiple speakers)
Modular combination
of analysis, transfer
and generation
models
From very fast but lower
quality to slower but very
high quality (including
instant quality upgrades)
Exploiting strong
monolingual analysis
and generation methods
and resources
Multiple target
formats
Domain, task and
genre specialisation
models
Extending
translation with
semantic data and
linked open data
75. PT2: Social Intelligence
q Better decisions by monitoring social media
q Inclusion of citizens into collective decision processes
q Opinion formation, consensus building, decision making
q Evolution of new solutions
q New forms of democracy: e-democracy,
massive participation, transparency
q Dialogues and debates across language
boundaries and across parties, political
alliances, social classes
q Better than binary voting
q Documented transparent
decision processes
76. Priority Research Theme 2: Social Intelligence and e-Participation
From shallow to deep,
from coarse-grained to
detailed processing
techniques
Making language
technologies interoperable
with knowledge representation
and the semantic web
“Semantification” of the
web: tight integration
with the Semantic Web
and Linked Open Data
Mapping large, heterogeneous,
unstructured volumes of online
content to structured, actionable
representations
Unleashing social intelligence by
detecting and monitoring opinions,
demands, needs and problems
Target groups: European citizen,
European institutions, discussion
participants, companies
Make use of the
wisdom of the
crowds
Improved
efficiency and
quality of decision
processes
Understanding influence
diffusion across social media
especially social media, comments,
blogs, forums
decision-relevant information
support
sentiment analysis and opinion mining
including the temporal dimension)
cues
from arbitrary online content
visualising discussions and opinion
statements
Services and Technologies:
collective deliberation and
e-participation
-
wide deliberation on pressing issues
and processes; modeling evolution of
opinions
analysis technologies
Applications:
77. Priority Research Theme 3: Socially-Aware Interactive Assistants
Interacting
naturally
with and in
groups
Learning
and
forgetting
information
Adaptable to the
user’s needs and
preferences and
the environment
Include human-computer,
human-artificial agent and
computer-mediated human-
human communication
Proactive,
self-aware,
user-adaptable
Interacts naturally with
humans, in any
language and modality
Can be personalised to
individual communication
abilities including special needs
Can learn incrementally
from all interactions and
other sources of information
recognition
and synthesis, providing expressive
voices
understanding
incremental conversational speech
models of human communication
inter-dependencies
priority themes
Services and Technologies:
Applications:
dialogue systems
environment
modalities (visual, tactile, haptic) verbal/non-verbal behaviour, social
context
ments, any
vocabulary
recovery,
self-
assessment
Multilingual
capabilities
80. Multilingual Europe Stakeholders
q European Parliament
§ Upcoming STOA Study and Workshop (Jan. 2017)
q European Commission
§ DG CONNECT: Horizon 2020 WP 2018-2020 (G1)
§ DG CONNECT: New Unit “Learning, Multilingualism, Inclusion” (G3)
§ DG Translation: Connecting Europe Facility, AT
q Language Communities: EFNIL and NPLD
§ Joint position paper META-FORUM 2015, 2016
q EU Member States and Non-Member States
§ National and regional funding agencies (ES, NL etc.)
q Research Communities, especially Big Data community (BDVA
SRIA V3.0), Web community and many others (Robotics, IoT etc.)
q Standardisation – W3C and others
81. Multilingual Success Stories
q Moses SMT toolkit as well as research and technology ecosystem
q CEF AT for public online services – good and timely development
q eBay: MT to Russian – 50% increase in sales
q Hugo.lv for Latvian public services – better than Google Translate
q Hundreds of European startups in Language Technology and AI
q Conversational interfaces (Siri, Echo, Cortana): the next big thing
q IBM Watson – a billion dollar LT business
q Great Neural MT results reported by European researchers (QT21)
q Very rapid development – many opportunities for European R&D&I