META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER
(grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119),
CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).
Multilingualism
for Digital Europe
Georg Rehm
General Secretary META-NET, Coordinator CRACKER
DFKI, Germany
georg.rehm@dfki.de
Ringvorlesung Digitale Lebenswelten – Universität Hildesheim, 15th November 2016
Outline
q A Multilingual Europe Initiative: META-NET
§ LT Support – META-NET White Paper Series
§ LT Strategy – META-NET SRA
q Continuing the Initiative – Recent Developments
§ The Digital Single Market and Multilingualism
§ Cracking the Language Barrier
§ META-FORUM 2015/2016 – MDSM SRIA V0.5/V0.9
q Goals and Next Steps
http://www.meta-net.eu 2
META-NET and META:
Brief History
Multilingual Europe in 2010
q Challenge: Providing each language community with the most
advanced technologies for communication and information so that
maintaining their mother tongue does not turn into a disadvantage.
q While research has made considerable progress in recent years, the
pace of progress is not fast enough to meet the challenge within the
next 10-20 years.
q All stakeholders – researchers, LT industries, policy makers,
language communities, funding programmes – should
team up in a strategic alliance for a major dedicated push.
• META-NET: 60 research centres in 34 countries (founded in 2010)
  Chair of Executive Board: Jan Hajic (CUNI)
  Deputies: J. van Genabith (DFKI), A. Vasiljevs (Tilde)
  General Secretary: Georg Rehm (DFKI)
• META: Multilingual Europe Technology Alliance. 826 members in 67 countries
• Strategic Research Agenda (published in 2013); White Paper Series (31 volumes; published in 2012)
• Projects: T4ME (META-NET), CESAR, METANET4U, META-NORD
META-NET
White Paper Series
q Basque
q Bulgarian*
q Catalan
q Croatian*
q Czech*
q Danish*
q Dutch*
q English*
q Estonian*
q Finnish*
q French*
q Galician
q German*
q Greek*
q Hungarian*
q Icelandic
q Irish*
q Italian*
q Latvian*
q Lithuanian*
q Maltese*
q Norwegian
q Polish*
q Portuguese*
q Romanian*
q Serbian
q Slovak*
q Slovene*
q Spanish*
q Swedish*
q Welsh
* Official EU languagehttp://www.meta-net.eu/whitepapers
Cross-Lingual Comparison
q 1. Machine Translation 2. Text Analytics
3. Speech Processing/Synthesis 4. Language Resources
q Ranking: from excellent LT support to weak/no LT support.
q Cross-lingual comparison discussed and finalised at a network
meeting with representatives of all languages (Oct., 2011).
Machine Translation
  excellent: (none)
  good: English
  moderate: French, Spanish
  fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
  weak or no support through LT: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh

Speech
  excellent: (none)
  good: English
  moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
  weak or no support through LT: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh

Text Analytics
  excellent: (none)
  good: English
  moderate: Dutch, French, German, Italian, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
  weak or no support through LT: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh

Resources
  excellent: (none)
  good: English
  moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
  fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
  weak or no support through LT: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
[Figure: Results of the META-NET White Paper Study (2012). Level of support per language on a five-step scale (weak/none, fragmentary, moderate, good, excellent) for 31 languages, from Welsh to English. Languages with names in red have little or no MT support.]
Observations and Results
q When it comes to technology
support, there are massive
differences between Europe’s
languages and technology areas.
q Support for English is ahead of
any other language.
q But: even support for English is
far from being perfect.
q Several languages get the weakest
score in all four areas (e.g., Icelan-
dic, Latvian, Lithuanian, Maltese)!
Digital Language Extinction!
q “At Least 21 European Languages in Danger of Digital Extinction!”
q Press release on European Day of Languages (Sept. 26, 2012).
q Huge global interest in the topic and our key findings!
q 600+ mentions in the press.
q News from 40+ countries in 35+ different languages.
q 20+ television reports and 30+ broadcast interviews (radio, tv) with
META-NET representatives.
q Two Parliamentary Questions in the EP on the “digital extinction of
languages” topic.
q These results lead to a STOA Workshop in the EP (Dec. 3, 2013).
[Press clipping, Berlingske (Denmark), translated from Danish]

Poor language technology threatens Danish on the net
Words. Researchers are working to improve Danish translations on the internet.
By Jens Ejsing // ejs@berlingske.dk

The Danish language is having a hard time in the digital world.
That is the finding of Danish language researchers and experts in connection with the new international META-NET study, which looks more closely at how a large number of smaller European languages such as Danish are faring in the digital world.
The researchers, from the University of Copenhagen and the Danish Language Council (Dansk Sprognævn) among others, conclude that Danish may have an even harder time in the digital world in the future, because Google Translate, GPS devices, smartphone apps and other language technology programs are not sufficiently able to handle the many nuances of the Danish language.
Bolette Sandford Pedersen, professor of language technology at the University of Copenhagen, believes there is a need for a kind of digital Danish language bank filled with data, so that translations, among other things, become as precise and good as possible. With help from the language bank, researchers can, according to the professor, help companies improve programs that have to handle linguistic knowledge in areas such as machine translation, speech recognition and information retrieval.
That would make erroneous translations rarer, such as when "hæld olie på panden" (pour oil into the pan) becomes "pour oil on the forehead" in English with Google Translate. In the worst case, such translations are so imprecise that Danes end up opting out of their own language in the digital world.

Language help for companies
She acknowledges, however, that "the technology for automatic translation is in many ways fantastic".
"It is just not good enough when it comes to Danish," she says:
"It is as if we, to some extent, leave it in the hands of Google or other companies to decide whether Danish is handled well enough or not. But the Danish market is not large for them. The question is therefore whether we should not do more ourselves to ensure that the necessary data material is available, so that we get good translations and other good language technology. That could be, for example, by making an effort to set up a language bank with a lot of enriched material about Danish."
"If we constantly find that translations are riddled with errors, we do not dare trust them," she says, stressing that "erroneous translations can lead to major misunderstandings".
According to the director of the Danish Language Council, Sabine Kirchmeier-Andersen, poor language technology can have consequences for many Danes who are not so good at English.
"If we have ambitions to use the Danish language in the technological universe of the future, an effort must be made now to retain expertise and expand the knowledge we have," she says:
"Otherwise we risk that only people who speak fluent English will benefit from the new generations of web, telecom and robot technology that are on the way."

Facts: Languages in Europe
• There are around 80 languages in the EU. For 21 of them, including Danish, there are major gaps in language technology in areas such as machine translation, speech recognition and information retrieval.
• According to an EU survey, a growing number of European internet users buy goods or services online where the language used is not their own. This applies to more than half of users.
• More than one in three uses a foreign language to write email or posts online.
[Press clipping, Eleftheros Typos (Greece), "Life" section, Thursday 27 September 2012, translated from Greek]

MANY EUROPEAN LANGUAGES ARE CONSIDERED TECHNOLOGICALLY... OUTDATED
Greek at risk of digital extinction
Eleni Vergou, evergou@e-typos.com

In the digital age they do not... speak Greek, nor several other European languages, according to a pan-European report signed by more than 200 experts. The study was published by the scientific network META-NET on the occasion of yesterday's European Day of Languages.
For their research, linguists from 34 countries of the Old Continent rated the available language services and created a "White Paper" for each European language. In their study, the experts looked, among other things, for four basic electronic tools, namely the existence of automatic translation, the possibility of voice interaction and of digital text analysis, while the availability of language resources was also investigated.
First they examined the websites that allow users to translate online, such as Google Translate, the service of the IT giant. At the same time, they examined the "communication" of Greek-speaking users with their... devices, such as the ability to "talk" to a GPS in one's mother tongue. The researchers concluded that such devices exist, but are not as widespread as English-language ones.
The "gold" medal goes, as is only logical, to English. English-speaking users have the best possible technological support, which favours the further spread of the language. Icelandic, Latvian, Lithuanian and Maltese are most at risk of "technological exclusion", while Greek, Bulgarian, Hungarian and Polish are slightly better off, having, as the study reports, "fragmentary" technological support.
Support for users of Dutch, French, German, Italian and Spanish is rated "moderate". The heads of the scientific team, Hans Uszkoreit and Georg Rehm, state: "There are dramatic differences in language technology support between the various European languages. The gap between 'small' and 'large' languages keeps widening. We must ensure that the smaller languages, and those less rich in digital resources, are equipped with the necessary basic technologies. Otherwise these languages are doomed to digital extinction."
Indeed, the experts stress that without decisive action these languages will struggle to... survive in the digital world of the 21st century. Ms Maria Gavrilidou, a member of the scientific team from the Institute for Language and Speech Processing, Athena Research Centre, tells "E.T.": "This study does not say that the Greek language will not live or that it is in danger of extinction." The expert explains that as long as there are people who speak, write and communicate in a language, it will continue to exist. It is important, however, that all users can "talk" to machines, such as their GPS, in Greek and have computer language tools at their disposal.
Among these "tools" are spelling and grammar checkers, which are used daily by hundreds of Greek users and are based on language technology.
Nevertheless, she stresses that the digital spread of a language is important. "It is not in the hands of the average user. Governments, the European Union and the private sector must fund the development of this technology for all languages," she says and continues: "Users, however, must demand that these tools exist in their language too and not be satisfied with English."

The language of alienation... GREEKLISH
Most young people in our country now communicate in Greeklish via text messages or email. Although language tools that allow the use of the Greek script have existed in recent years, teenagers and young adults do not seem to have "embraced" these technologies. Professor of Linguistics Georgios Babiniotis tells "E.T.": "Greeklish is a problem for the Greek language, especially for young people, for a purely linguistic reason. By using Greeklish they become alienated from the form of the word or, as we say, the etymological image that is expressed by the word's spelling and is linked both to the word's meaning and to its origin." The danger young people face is alienation from the written form of the language. This "familiarity", however, also helps in understanding a word's meaning and origin. "This alienation is not without significance," says the expert, who explains that the process of writing helps the word to be imprinted and linked to other cognate words. "When this form of communication is used, they are destroyed, they weaken. It is not fatal, but it will do damage," says Mr Babiniotis, who advises users to choose the Greek script.
[Photo caption: Georgios Babiniotis.]
Date 30 September 2012
Page 16
[Press clipping, Greek newspaper, Sunday 30 September 2012, page 49, translated from Greek]

They lost... my language
Most European languages are at risk of digital extinction
The smaller languages, and those less rich in digital resources, must be equipped with the necessary basic technologies

26 September has been established by the Council of Europe as the European Day of Languages but, according to a new European scientific report, 21 of Europe's 30 languages, Greek among them, face the risk of digital extinction.
The study sounds the alarm, having found that digital support for most European languages is inadequate or entirely non-existent for users.

Crowded out by the widely used languages
The report, in the form of a series of White Papers (entitled "Languages in the European Information Society") from the scientific network META-NET, which brings together 60 research centres in 34 countries, points out that languages spoken by a relatively small number of people are at risk because they lack the technological support that widely used languages have. White Papers have been prepared for the following European languages: English, Basque, Bulgarian, Galician, French, German, Danish, Greek, Estonian, Irish, Icelandic, Spanish, Italian, Catalan, Croatian, Latvian, Lithuanian, Maltese, Norwegian (Bokmål and Nynorsk), Dutch, Hungarian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Swedish, Czech and Finnish. Each White Paper is written in the language it covers and is translated into English.

Four major risks
According to the new study, Icelandic, Latvian, Lithuanian and Maltese face the greatest risk of extinction in a European technological society that increasingly promotes the use of particular languages, above all English. But other languages, such as Greek, Bulgarian, Hungarian and Polish, are also at risk in the modern digital world.
The META-NET study, to which more than 200 experts contributed, assesses the risk for each language on the basis of four key criteria at the technological/digital level: the existence of automatic translation for the language, the possibility of voice interaction, the possibility of digital text analysis, and the availability of the relevant digital language resources.

The strong ones
The language with the best score on these criteria is of course English, which enjoys comparatively the best technological support (though not the best possible), a fact that facilitates its further spread.
Next, with satisfactory or moderate technological/digital support, come Dutch, French, German, Italian and Spanish. Greek, like Basque, Catalan, Polish, Hungarian and others, is ranked among the languages with only "fragmentary" support, which is exactly why they are considered languages at high risk of extinction.

Dramatic differences
According to the study's editors, Hans Uszkoreit and Georg Rehm, "there are dramatic differences in language technology support between the various European languages and technology areas. The gap between 'small' and 'large' languages keeps widening. We must ensure that the smaller languages, and those less rich in digital resources, are equipped with the necessary basic technologies; otherwise these languages are doomed to digital extinction."
The hope for these languages is seen in the improvement and wider use of language technology software, which enables the spoken and written processing of the various languages.
Examples of these capabilities are electronic spelling and grammar checkers, the interactive personal "assistants" of smartphones (e.g. Siri on the iPhone), automatic translation systems, the electronic dialogue systems of call centres, search engines, synthetic voices in car navigation systems, and others.

The basic problem
What matters, according to the report, is that all these capabilities are also offered to users in their mother tongue that is at risk of extinction. Without decisive action, the report makes the ominous forecast that these languages will struggle to survive in the digital world of the 21st century.
One problem is that the software of these language technology systems relies on statistical methods that require huge quantities of written or spoken data, yet so much data is hard to obtain for languages spoken by relatively few people.
Moreover, even for widely used languages such as English, the relevant language technology still has weaknesses, which are evident, for example, in highly inadequate and error-ridden automatic translations. The report proposes that a coordinated, large-scale effort be undertaken in Europe to gradually create or improve the necessary technologies and to help the languages that are digitally sidelined.
Update of the Study (2014)
q Study comprised 31 volumes/languages.
q Many languages missing! Need for
extension – at least of the comparison.
q We invited three language community
bodies to participate in the update:
European Federation of National
Institutions for Language (EFNIL)
Network to Promote Linguistic
Diversity (NPLD)
Experts Committee of the European
Language Charter (Council of Europe)
CCURL 2014 – Collaboration and Computing for Under-
Resourced Languages in the Linked Open Data Era
Machine Translation
  excellent: (none)
  good: English
  moderate: French, Spanish
  fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
  weak or no support: Albanian, Asturian, Basque, Bosnian, Breton, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Frisian, Friulian, Galician, Greek, Hebrew, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Norwegian, Occitan, Portuguese, Romany, Scots, Serbian, Slovak, Slovene, Swedish, Turkish, Vlax Romani, Welsh, Yiddish

Speech
  excellent: (none)
  good: English
  moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish, Turkish
  weak or no support: Albanian, Asturian, Bosnian, Breton, Croatian, Frisian, Friulian, Hebrew, Icelandic, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romanian, Romany, Scots, Vlax Romani, Welsh, Yiddish

Text Analytics
  excellent: (none)
  good: English
  moderate: Dutch, French, German, Hebrew, Italian, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
  weak or no support: Albanian, Asturian, Bosnian, Breton, Croatian, Estonian, Frisian, Friulian, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romany, Scots, Serbian, Turkish, Vlax Romani, Welsh, Yiddish

Resources
  excellent: (none)
  good: English
  moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
  fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Hebrew, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
  weak or no support: Albanian, Asturian, Bosnian, Breton, Frisian, Friulian, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romany, Scots, Turkish, Vlax Romani, Welsh, Yiddish
[Figure: Extension of the META-NET White Paper Study (2013/2014). Language technology support (excellent, good, moderate, fragmentary, weak/no support) shown together with millions of native speakers worldwide (axis from 0 to 400) for 47 languages, from Yiddish to English.]
META-NET
Strategic Research
Agenda (SRA)
Three Ingredients
• Appropriate Programme: Vision & Agenda
• Appropriate Actors: Research & Commercialisation
• Appropriate Support: Funding
Planning Process (timeline: 2010, 2011, 2012)
• Vision Paper
• Vision Group reports: Translation and Localisation; Interactive Systems; Media and Information Services
• Expert meeting minutes (three meetings)
• Priority Themes Paper
• Strategic Research Agenda
Planning Process: Documents
www.meta-net.eu
office@meta-net.eu
T: +49 30 23895 1833
The Future European Multilingual
Information Society
Vision Paper for a Strategic Research Agenda
“People can’t share knowledge
if they don’t speak a common language.”
Davenport, Thomas H, and Laurence Prusak, Working Knowledge:
How Organizations Manage What They Know, Harvard Business School,
Boston, 1997, p. 98.
Join the discussion at
www.meta-net.eu/forum
LT 2020
Vision and Priority Themes for
Language Technology Research
in Europe until the Year 2020
Towards the META-NET Strategic Research Agenda
The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement 270893) and META-NORD (Grant Agreement 270899).
Do you have comments, ideas or suggestions
with regard to the content of this document?
Please send them to office@meta-net.eu or
discuss them online: http://www.meta-net.eu/sra.
This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”,
co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
A Network of Excellence forging the
Multilingual Europe Technology Alliance
Vision Document
Vision Group Translation and Localisation
Results of first two meetings
Editors: Aljoscha Burchardt, Georg Rehm
Dissemination Level: Public
Date: 3 December 2010
Vision Document
Vision Group Media and Information Services:
Results of first two meetings
Editors: Maria Koutsombogera, Stelios Piperidis
Dissemination Level: Public
Date: 10 November 2010
Vision Document
Vision Group Interactive Systems:
Results of first two meetings
Editors: Joseph Mariani, Bernardo Magnini
Dissemination Level: Public
Date: 28 December 2010
Strategic Research Agenda
q Addresses the problems we identified
when preparing the white papers.
q Can put Europe ahead of its
competitors in this technology area.
q 200 contributors; >2 years.
54% industry; 46% research;
4% (inter)national institutions.
q Presented and discussed at 90+
conferences and major workshops.
q Published in early 2013.
q http://www.meta-net.eu/sra
Priority Research Themes
q Three priority research themes:
§ Translingual Cloud
§ Social Intelligence and
e-Participation
§ Socially-Aware Interactive
Assistants
q Two additional themes:
§ European Service Platform
for Language Technologies
§ Core Technologies for
Language Analysis and Production
[Diagram: European Service Platform for Language Technologies (Cloud or Sky Computing Platform)]
• Providers of operational and research technologies and services: research centres, universities, European institutions, national language institutions, language technology providers, language service providers, other companies (SMEs, startups etc.)
• Beneficiaries/users of the platform: European citizens, enterprises, LT user industries, public administrations, universities, research centres, European institutions
• Interfaces (web, speech, mobile etc.)
• Priority Research Theme 1: Translingual Cloud
• Priority Research Theme 2: Social Intelligence & e-Participation
• Priority Research Theme 3: Socially Aware Interactive Assistants
• Services and technologies: multilingual technologies, text analytics, text generation, language checking, sentiment analysis, named entity recognition, summarisation, knowledge access and management, information and relation extraction, language processing, language understanding, knowledge, emotion/sentiment
• Platform components and features: data protection, tools, data sets, resources, components, metadata, standards, interfaces, APIs, catalogues, quality assurance, data import/export, input/output, storage, performance, availability, scalability
[Two map graphics labelled with the languages covered: Icelandic, French, Catalan, Italian, Maltese, Greek, Bulgarian, Romanian, Serbian, Croatian, Slovene, Hungarian, Slovak, Czech, German, Danish, Lithuanian, Latvian, Estonian, Finnish, Swedish, Norwegian, Basque, Spanish, Portuguese, Galician, English, Irish, Polish, Dutch]
Concrete result of these activities: one call for proposals around Machine Translation in Horizon 2020 WP 2015-17.
CRACKER
1 DFKI Germany Georg Rehm
2 CUNI Czech Republic Jan Hajic
3 ELDA France Khalid Choukri
4 FBK Italy Marcello Federico
5 ATHENA RC Greece Stelios Piperidis
6 UEDIN UK Philipp Koehn
7 USFD UK Lucia Specia
Coordination and Support Action, H2020-ICT17, 2015–2017, 36 months – http://www.cracker-project.eu
Cracking the Language Barrier
Coordination, Evaluation and Resources for European MT Research
THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
1. Multilingual access to all digital goods and services across Europe
Geo-blocking:
due to nationality, location, or residence
customers
Language-blocking:
languages they do not speak
however, current online translation is insufficient
trying to conduct
common languages
Geo-blocking and language-blocking are barriers to access
Both geo-blocking and language-blocking are
daily problems for tens of millions of EU citizens.
Customers are six times more likely to buy from sites in their native language.
Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those
languages are spoken.
Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in
European businesses.
Language can be expensive for SMEs
Online businesses face around €5,000 in up-front costs for each
new language they translate their websites into, plus similar
and marketing costs.
Even when sites are translated, the vast majority of
SMEs cannot respond to support requests or
customer feedback in other languages. Such
responsiveness is needed to achieve customer
satisfaction and build brand loyalty.
English is not the answer
52% of EU customers do not purchase
Adding even a few languages to an SME’s website beyond English
can have a major impact on revenue. Large organizations today
to increase market share.
6x more
likely to
purchase
Site in buyer’s
native language
Site in foreign
language
Likelihoodofpurchasing
Communities
• META-NET incl. META-SHARE and META
• MT evaluation initiatives – WMT, IWSLT, MT Marathons
• MT and other LT industry
• Language resources – META-SHARE, ELRA
• HT/MT evaluation tools – translate5
• Translation industry, translation profession
• MT user communities
Strategic Agenda for the Multilingual Digital Single Market
• Version 0.5 presented at META-FORUM 2015 (Riga)
• Version 0.9 presented at META-FORUM 2016 (Lisbon)
Strategic Research and Innovation Agenda
Language as a Data Type and
Key Challenge for Big Data
Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content
SRIA Editorial Team
Version 0.9 – July 2016
Selected Activities (timeline 2015-2017, M1 to M36)
• Kick-off meeting for all ICT-17 Projects
• translate5
• WMT 2016, WMT 2017
• IWSLT 2015, IWSLT 2016, IWSLT 2017
• QT Marathon 2015, QT Marathon 2016
• Roadmap for European MT Research
• Survey on the State of HQMT in Industry and LSPs
• SRIA (initial version), SRIA (update), SRIA (final)
• version 1, version 2
• Production of resources (e.g., for WMT 2016 and 2017, IWSLT 2015-2017)
• Tools (quality control, evaluations)
• Strategies and roadmaps (SRIA, Roadmap for European MT Research)
• Exchange and sharing facility for resources (META-SHARE)
Recent or Upcoming Events
• LREC Workshop on MT Eval. (May 25)
• META-FORUM 2016 (July 4/5, Lisbon)
• WMT 2016 (Aug. 11/12, Berlin)
• IWSLT 2016 (Dec. 8/9, Seattle)
• Federation of organisations and
projects working on technologies
for multilingual Europe.
• 10 organisations; 24 projects.
• Areas of collaboration: data
management and repositories,
tools, shared tasks, evaluations.
• Goal: provide one umbrella
organisation for the whole
community.
http://www.cracking-the-language-barrier.eu
q META-FORUM 2016 – July 04/05, Lisbon, Portugal
Beyond Multilingual Europe
q META-FORUM 2015 – April 27, Riga, Latvia
Technologies for the Multilingual Digital Single Market
q META-FORUM 2013 – Sept. 19/20, Berlin, Germany
Connecting Europe for New Horizons
q META-FORUM 2012 – June 20/21, Brussels, Belgium
A Strategy for Multilingual Europe
q META-FORUM 2011 – June 27/28, Budapest, Hungary
Solutions for Multilingual Europe
q META-FORUM 2010 – Nov. 17/18, Brussels, Belgium
Challenges for Multilingual Europe
The Multilingual
Digital Single Market
q Top priority in the European Union.
q Expected to add 400b€ to European GDP
and hundreds of thousands of new jobs.
q Unfortunately, the language topic is not
included in the EC’s Digital Single Market
strategy (published in May 2015).
The MDSM Fact Sheet
Current eCommerce growth within Europe is about half that of the US,
due partially to a lack of language coverage from European SMEs.
Less than 5% of European SMEs currently sell cross-language.
Why Europe needs a Multilingual Digital Single Market
No single language accounts for more than 20% of the potential Multilingual Digital Single Market. Most account for less than 3% of the DSM. Without a solution, the European Digital Single Market will remain fragmented.
Europe’s 24 official languages present a tremendous opportunity for European business
Removing language barriers within Europe would open access to 73% (with >€25 trillion in annual revenue!) of the world’s digitally accessible market to European enterprise.
Europe today is not a single market: it is separated into 20+ small language markets.
www.meta-net.eu
LANGUAGE TECHNOLOGY
The Multilingual Digital Single Market
[Chart: Online Population. English (565 million), Chinese (510 million), World Spanish (165 million), Japanese (100 million), World Portuguese (83 million), Russian (60 million), Europe today (many small markets). Source: Internet World Stats (Miniwatt Marketing Group)]
THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
1 Multilingual access to all digital goods and services across Europe
Geo-blocking:
due to nationality, location, or residence
customers
Language-blocking:
languages they do not speak
however, current online translation is insufficient
trying to conduct
common languages
Geo-blocking and language-blocking are barriers to access
Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens.
Customers are six times more likely to buy from sites in their native language.
Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken.
Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in
European businesses.
Language can be expensive for SMEs
Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar
and marketing costs.
Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty.
English is not the answer
52% of EU customers do not purchase
Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations today
to increase market share.
[Chart: Likelihood of purchasing; site in buyer’s native language vs. site in foreign language; 6x more likely to purchase]
[Chart: Language Technology Support* (Good, Moderate, Fragmentary, Weak/no support) against Millions of Native Speakers (Worldwide, 0 to 400) for Spanish, English, Portuguese, German, French, Italian, Polish, Romanian, Dutch, Greek, Hungarian, Czech, Swedish, Bulgarian, Danish, Croatian, Slovak, Finnish, Lithuanian, Slovene, Latvian, Estonian, Maltese, Irish. Marked region: Language Technology Danger Zone (≈150 million EU citizens)]
140 million EU citizens are in the Language Technology Danger Zone, where language technology is inadequate to support the DSM.
Current online automatic translation provided by US tech giants does not solve
less than 30% of automatically translated content is truly useful for online commerce.
Only three European languages
2 Boosting commerce through multilingual technologies
3 Connecting citizens to European digital public services
Without Language Technology, the European Commission has no way to respond effectively to citizen participation.
Current language technology is inadequate for over half of the EU official languages to help the European Commission solve its citizen engagement problem.
Translation opens 20 times its cost in revenue opportunity. However, translation remains too expensive for many European SMEs, blocking this opportunity and limiting economic growth in Europe. Lowering these costs is a strategic opportunity
[Chart: Translation Costs vs. Increase in Revenue]
[Chart: Online Automatic Translation Quality (good, bad, ugly)]
Most local governmental services are monolingual only. This poses a problem for tourists, expatriates, and linguistic minorities. Language technology can provide the
Multilingual eParticipation can help build the European Identity
with one another in their respective native languages with sophisticated machine translation working behind the scenes. Only when EU citizens can interact in their own languages will they truly develop a sense of European identity and community.
Over half of EU citizens are language blocked from interacting with the European Commission’s web resources for citizen participation.
[Chart: 290 million EU citizens excluded. Speakers of English, French, German can participate fully; speakers of other languages are language blocked from full participation.]
Strategic Agenda for the Multilingual Digital Single Market http://rigasummit2015.eu.
META, the Multilingual Europe Technology Alliance, has more than 750 members (http://www.meta-net.eu).
LT-Innovate, the European Association of the Language Technology Industry, has 180 corporate members throughout Europe (http://lt-innovate.eu).
Technology support has improved for some languages since this study was completed.
Technology Solutions
Investment in the following solutions will help achieve the
Multilingual Digital Single Market
Unified Customer Experience
care, customer relationship, discussion fora,
Multimodal User Experience for
Connected Devices
interfaces
household appliances, and consumer
Voice of the Customer
market research
Content Curation and Production
Digital Translation Centre
customers, citizens
The forthcoming Strategic Agenda for the Multilingual Digital Single Market will provide additional details on these and other solutions for the needs of the Multilingual Digital Single Market.
Download this fact sheet from http://cracker-project.eu.
For more information contact Dr. Georg Rehm (DFKI) at georg.rehm@dfki.de.
http://cracker-project.eu/wp-content/uploads/2015/11/mDSM-Fact-Sheet.pdf
META-FORUM 2015
AND MDSM SRIA V0.5
Open Letter to the EC
q On Friday, March 20, 2015, we published an open letter to the EC on
http://multilingualeurope.eu.
q On Monday, March 23, 2015, we informed
President Juncker and all Commissioners
about the campaign and the 1300+ signatures.
q By now more than 3600 signatures!
q 5 Members of the European
Parliament
q 150+ high-level representatives from
industry (CxO level)
q 1200+ professors
q 400+ project or research managers
q 20+ entrepreneurs and founders
q hundreds of language and language
technology professionals, officials,
researchers, administrators and
representatives from related
stakeholder groups
Who  signed?
META-FORUM 2015
q April 27 in Riga, Latvia
q Riga Summit 2015 on the Multi-
lingual Digital Single Market
q Two important components:
§ MDSM SRIA Version 0.5
§ Further community fusing
q http://www.meta-forum.eu
Joint EFNIL and NPLD Panel
q Joint EFNIL and NPLD panel at META-FORUM 2015.
q Joint position paper.
Initially presented at META-FORUM 2015 and the Riga Summit 2015
on the Multilingual Digital Single Market, April 27, 2015
www.rigasummit2015.eu
Joint NPLD/EFNIL
Position Paper on the
Multilingual Digital Single Market
“Languages are not only a means of communication. They also have embedded in them people’s values, aspirations and hopes.” (European Roadmap for Linguistic Diversity 2015, NPLD)
“Many European languages run the risk of becoming victims of the digital age as they are under-represented and under-resourced online. Huge regional market opportunities remain untapped because of language barriers.” (Multilingual Europe: A challenge for language tech. MultiLingual. April/May 2011, page 51/52)
[Diagram: documents feeding into the META-NET Strategic Research Agenda for Multilingual Europe 2020, on a timeline from 2010 to 2015: Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; Priority Themes Paper; expert meeting minutes]
www.meta-net.eu
office@meta-net.eu
T: +49 30 23895 1833
The Future European Multilingual
Information Society
Vision Paper for a Strategic Research Agenda
“People can’t share knowledge
if they don’t speak a common language.”
Davenport, Thomas H, and Laurence Prusak, Working Knowledge:
How Organizations Manage What They Know, Harvard Business School,
Boston, 1997, p. 98.
Join the discussion at
www.meta-et.eu/forum
LT 2020
Vision and Priority Themes for
Language Technology Research
in Europe until the Year 2020
Towards the META-NET Strategic Research Agenda
The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement 270893) and META-NORD (Grant Agreement 270899).
Do you have comments, ideas or suggestions with regard to the content of this document? Please send them to office@meta-net.eu or discuss them online: http://www.meta-net.eu/sra.
This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”,
co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119.
A Network of Excellence forging the
Multilingual Europe Technology Alliance
Vision Document
Vision Group Translation and Localisation
Results of first two meetings
Editors: Aljoscha Burchardt, Georg Rehm
Dissemination Level: Public
Date: 3 December 2010
Vision Document
Vision Group Media and Information Services:
Results of first two meetings
Editors: Maria Koutsombogera, Stelios Piperidis
Dissemination Level: Public
Date: 10 November 2010
Vision Document
Vision Group Interactive Systems:
Results of first two meetings
Editors: Joseph Mariani, Bernardo Magnini
Dissemination Level: Public
Date: 28 December 2010
Strategic
Research and
Innovation Agenda
roadmaps, agendas and any other input from other initiatives
…
Strategic Agenda for the
Multilingual Digital Single Market
Technologies for Overcoming Language Barriers towards
a truly integrated European Online Market
Version 0.5 – April 22, 2015
Strategic Agenda for MDSM
q Presented at META-FORUM 2015
and Riga Summit for the first time.
q Version 0.5 – work in progress
q Builds upon many strategy papers
and roadmaps prepared by
several European projects,
incl. the META-NET SRA (2013).
q Input and feedback collected at the
Riga Summit 2015 to be used for
upcoming versions.
A Strategy for the MDSM
q Strategic R&I Agenda for the
Multilingual Digital Single Market
q Core: Technology Solutions
q Data economy is an inherent
component – LT for effective
multilingual data value chains.
Strategic Agenda for the Multilingual Digital Single Market – Version 0.5 – April, 2015
Contents
Executive Summary
1 The Digital Single Market is a Multilingual Challenge
1.1 Overcoming Language Barriers with Technologies
1.2 Language Technologies Made for Europe – in Europe
1.3 Online Use of Languages
1.4 Multilingual Big Data Text Analytics for the European Data Economy
1.5 EC and Language Technology – Past and Present
1.6 The Economic Power of Language Technology and Services
2 A Strategic Programme for the Multilingual Digital Single Market
2.1 Layer 1: Innovative Technology Solutions
2.2 Layer 2: Language Technology Services, Platforms, Infrastructures
2.3 Layer 3: Priority Research Themes
2.4 Related Areas, Applications, and Societal Challenges
2.5 Summary
3 Layer 1: Innovative Technology Solutions for the Multilingual Digital Single Market
3.1 Technology Solutions for Businesses
3.1.1 Unified Customer Experience and Cross-Cultural CRM (E-Commerce)
3.1.2 Digital Translation Centre
3.1.3 Content Curation and Content Production
3.1.4 Virtual and Real Translingual Spaces
3.1.5 Voice of the Customer
3.1.6 Business Intelligence using Big Data
3.1.7 Multimodal User Experience for Connected Devices
3.1.8 Smart Multilingual Assistants
3.2 Technology Solutions for Public Services
3.2.1 Voice of the Citizen – Social Intelligence on Big Data
3.2.2 Online Dispute Resolution
3.2.3 E-Participation
3.2.4 E-Government
3.2.5 E-Health
3.2.6 E-Learning
4 Layer 2: Language Technology Services, Platforms, Infrastructures
5 Layer 3: Priority Research Themes
6 Horizontal Framework Aspects
6.1 Language Policies and Public Procurement
6.2 Standards and Interoperability
6.3 Open Source
6.4 Copyright and Data Protection
7 Conclusions
7.1 Expected Economic Impact
7.2 Relevance to the EC’s Digital Single Market Strategy
7.3 Potential Funding Sources
7.4 Next Steps
Appendix A. Input Documents
Appendix B. Digital Language Extinction in Europe
q Letter from Andrus Ansip (June 2015)
q “We invite the European language
technology community to further
develop the ideas presented in the
draft Strategic Agenda for the
multilingual Digital Single Market”
Cracking the
Language Barrier
Riga Declaration
q 12 organisations present at
META-FORUM 2015 and the
Riga Summit 2015 drafted and
signed the “Declaration of
Common Interests”.
q CRACKER: community building,
mostly among projects.
q We combined these into the
Cracking the Language Barrier
federation.
q Important goal: measure against
community fragmentation.
DECLARATION OF COMMON INTERESTS
We, the undersigned, declare here, at the Riga Summit on the Multilingual Digital Single
Market, encouraged by the letter Vice President Andrus Ansip sent to its participants, that we
stand united in our goal and interest to:
- support multilingualism in Europe by employing language technology in business,
society and governance, to create a truly Multilingual Digital Single Market,
- exchange and share information in our efforts to promote our goals and interests at
local, national and European levels,
- raise awareness in society at large using channels available to our associations,
alliances and societies.
In the near future, we foresee the establishment of a Memorandum of Understanding among
our organisations towards a “Coalition for a Multilingual Europe”, to better serve our
members address the language barrier challenges towards establishing a truly integrated
Multilingual Digital Single Market.
Riga, 29. April 2015
Signed by (in alphabetical order):
BDVA: Laure Le Bars
CITIA: Steve Renals
CLARIN: Steven Krauwer
EFNIL: Sabine Kirchmeier-Andersen, Tamás Váradi
ELEN: Davyth Hicks, Claudia Soria
ELRA: Nicoletta Calzolari, Khalid Choukri
GALA: Laura Brandon, Robert E. Etches, Sergey Gladkov
LT Innovate: Jochen Hummel, Philippe Wacker
META-NET: Jan Hajic, Josef van Genabith, Georg Rehm, Andrejs Vasiljevs
NPLD: Meirion Prys Jones
TAUS: Jaap van der Meer
W3C: Richard Ishida, Felix Sasaki
For any questions, please contact Georg.Rehm@dfki.de.
http://www.cracker-project.eu • http://www.meta-net.eu
• A federation of European projects and
organisations working on technologies
for a multilingual Europe.
• Multi-lateral Memorandum of Understanding;
10 organisations and 24 projects on board
already (including FP7 and H2020-ICT15).
• Getting new members on a regular basis.
• Selected areas of collaboration: data
management and repositories, tools,
shared tasks, evaluations, events.
• Goal: provide one umbrella organisation
for the whole community.
Project Members
Organisation Members
• Website: information about the initiative, all projects and organisations
• Downloadable documents
• List of events
• LREC 2016 MT Eval Workshop
• Several new members will
join the initiative soon
http://www.cracking-the-language-barrier.eu
META-FORUM 2016
AND MDSM SRIA V0.9
Andrus Ansip’s Blog Post
q Posted on 27 May 2016.
q First public acknowledgment
of the EC that the language
topic is of very high relevance
for the Digital Single Market.
q “Overcoming language
barriers is vital for building the
DSM, which is by definition
multilingual. It is now time to
reduce and remove the
language barriers that are
holding back its advance, and
turn them into competitive
advantages.”
Reorganisation of DG CONNECT (01/07/2016)
01/07/2016
DG CONNECT
Communications Networks,
Content & Technology
Director-General
R. Viola (60240
Assistants
O. Bringer (92067
P. Stuckmann (21097
Deputy Director-General
in charge of Directorates
A, C, E & H
G. Kent (acting) (91945
Assistant
E. Mitjana (81149
Deputy Director-General
in charge of Directorates
B, D, F, G & I
C. Bury (60499
Assistant
P. Lamotte (98892
Directorate F
Digital Single Market
G. de Graaf
(68466
Directorate E
Future Networks
M. Campolargo
(63479
Directorate D
Policy Strategy
& Outreach
L. Corugedo
Steneberg (96383
Directorate C
Digital Excellence
& Science Infrastructure
Th. Skordas (acting)
(68908
Directorate B
Electronic
Communications
Networks & Services
A. Whelan (50941
Directorate A
Digital Industry
K. Rouhana
(68057
Principal Adviser
F. Lupescu
(68538
Directorate R
Resources
& Support
G. Kent
(91945
Directorate I
Media Policy
G. Abbamonte
(93573
Directorate H
Digital Society, Trust
& Cybersecurity
P. Timmers
(90245
Directorate G
Data
J. Hernández-Ros
(acting) (34533
F.1: Digital Policy
Development &
Coordination
M. Bailey (acting)
(69176
E.1: Future
Connectivity
Systems
B. Barani (acting)
(69616
D.1: Research
Strategy &
Programme
Coordination
M. Fjalland (50021
C.1: eInfrastructure &
Science Cloud
A. Burgueño Arjona
(92471
B.1: Electronic
Communications
Policy
V. Terävä
(92381
A.1: Robotics
& Artificial
Intelligence
J. Heikkilä
(35325
R.1: Human
Resources &
Competences
I. Mariën-Dusak
(92376
I.1: Audiovisual &
Media Services
Policy
L. Boix Alonso
(90009
H.1: Cybersecurity
& Digital Privacy
J. Boratynski
(69452
G.1: Data Policy &
Innovation
M. Nagy-Rothengass
(31680
F.2: E-Commerce &
Platforms
P. Agarwal (acting)
(87153
E.2: Cloud &
Software
P. O’Donohue
(91280
D.2: Policy
Implementation &
Planning
E. Forti
(65172
C.2: High Performance
Computing &
Quantum Technology
G. Kalbe
(32866
B.2: Implementation
of the Regulatory
Framework
W-D. Grussmann
(58559
A.2: Technologies
& Systems for
Digitising Industry
M. Lemke
(91575
R.2: Budget &
Finance
M-C. Laffineur
(68515
I.2: Copyright
M. Martin-Prat
(65157
H.2: Smart
Mobility & Living
E. Hartog
(90084
G.2: Data
Applications &
Creativity
J. Hernández-Ros
(34533
F.3: Start-ups &
Innovation
P. Zilgalvis
(50935
E.3: Next-
Generation Internet
J. Villasante
(63521
D.3: Policy Outreach
& International
Affairs
A. Angelova-Krasteva
(91145
C.3: Future &
Emerging
Technologies (FET)
V. Peca
(57843
B.3: Markets
R. Krüger
(61555
A.3: Competitive
Electronics
Industry
W. Van Puymbroeck
(68138
R.3: Knowledge
Management &
Support Systems
F. Accordino
(98272
I.3: Audiovisual
Industry & Media
Programme
L. Recalde Langarica
(91281
H.3: E-Health,
Well-Being &
Ageing
M. González-Sancho
(52918
G.3: Learning,
Multilingualism &
Accessibility
M. Marsella (acting)
(32750
F.4: Digital
Economy & Skills
L. Sioli
(51262
E.4: Internet of
Things
M. Rohen
(63674
D.4: Communication
D. Ringrose
(93913
C.4: Flagships
Th. Skordas
(68908
B.4: Radio
Spectrum Policy
A. Geiss
(59466
A.4: Photonics
C. Maloney
(69082
R.4: Compliance &
Planning
K. Engelbosch
(54693
I.4: Media
Convergence &
Social Media
J. Cotta
(66407
H.4: E-Government
& Trust
A. Servida
(58186
G.4: Administration
& Finance
G. Kalbe (acting)
(32866
A.5: Administration
& Finance *
A. Fiala
(64787
B.5: Investment in
High-Capacity
Networks
A. Krzyżanowska
(87246
H.5: Administration
& Finance **
G. Van Caenegem
(acting) (61895
R.5: Programme
Operations &
Common Services
I. Malekos
(52902
Mirror-Unit REA.A.5
Fostering Novel
Ideas: FET-Open
T. Hallantie
(68167
Mirror-Unit EACEA.B.2
Creative Europe:
MEDIA
H. Trettenbrein
(84955
Mirror-Unit REA.C.4
Expert Contracting
& Payments
A. Oram
(97805
Principal Adviser
M. Richards
(62443
Adviser for Legal
& Legislative Issues
Ž. Bahovec (88284
Adviser for cross-cutting
Policy/Research Issues
G. Santucci (68963
Adviser for International
Relations linked to Future
Networks
P. Blixt (68048
Adviser for Societal
Issues
N. Dewandre (94925
Adviser for Organisational
Transition (Finance)
Vacant
Adviser for Societal
Challenges
Vacant
Adviser for Innovation
Systems
B. Salmelin (69564
Reporting lines are:
- R. Viola for Directorate R;
- G. Kent (acting) for Directorates A, C, E, H;
- C. Bury for Directorates B, D, F, G, I.
Luxembourg;
To be transferred to Luxembourg.
* Shared Administration & Finance Unit for Directorates A, B, C, D & F.
** Shared Administration & Finance Unit for Directorates E, H & I.
Unit G.1 “Data Policy & Innovation”
• Support the data economy in the Digital Single Market.
• Policy initiatives addressing new and emerging issues.
• Advance the Commission's open data policy by ensuring the correct implementation of the PSI Directive and the Pan-European Open Data Portal.
• Promote the emergence of an ecosystem comprising all the players of the data value chain.
• Steer the SRIA together with industry.
• Address key framework conditions of the data economy.
• Fund research and innovation in data technologies and applications, inter alia by driving the big data PPP.
Unit G.3 “Learning, Multilingualism & Accessibility”
• Make the DSM more accessible, secure and inclusive.
• Support policy, research, innovation and deployment of learning technologies.
• Support key enabling digital language technologies and services to allow all European consumers and businesses to fully benefit from the Digital Single Market.
• Responsible for the Web Accessibility Directive.
• Promote a better Internet for children by protecting and empowering children online, and improving the quality of content available to them.
Communities & Stakeholders
... and many more research centres, companies, EU projects etc.
MDSM SRIA
q Version 0.5 unveiled at META-FORUM 2015
q Version 0.9 unveiled at META-FORUM 2016
q Version 1.0 foreseen for Nov./Dec. 2016
q Prepared and presented by Cracking the Language
Barrier federation (editorial team: 13 colleagues)
q SRIA addresses how the LT community is going
to act united in order to make the DSM multilingual
q Document available on http://www.cracker-project.eu
and also on http://www.cracking-the-language-barrier.eu
Strategic Agenda for the
Multilingual Digital Single Market
Technologies for Overcoming Language Barriers towards
a truly integrated European Online Market
Version 0.5 – April 22, 2015
MLV Programme
q Multilingual Value Programe*
§ Three-year programme
§ Requires modest investment
q “Enabling the Multilingual Digital Single
Market through technologies for
translating, analysing, processing and
curating natural language content”
q Three components address the main
needs of the Multilingual DSM (MDSM)
and how to put them into practice:
1. Multilingual Application Areas
2. Multilingual Services
3. Research
Strategic Research and Innovation Agenda
Language as a Data Type and
Key Challenge for Big Data
Enabling the Multilingual Digital Single Market
through technologies for translating, analysing, processing
and curating natural language content
SRIA Editorial Team
Version 0.9 – July 2016
* SRIA V0.9 and MLV Programme devised
before re-organisation of DG CONNECT.
MDSM: Goals and Needs
q Crosslingual communication for SMEs, public institutions, citizens
q Crosslingual SME presales communication and aftersales services
q Multilingual (big) data, language and knowledge value chains
q Multilingual websites, product catalogues, product descriptions
q Multilingual knowledge bases and knowledge graphs (and services)
q Multilingual conversational interfaces for connected devices (IoT)
q Crosslingual business intelligence (e.g., based on UGC)
q Crosslingual social media analytics for EU-wide societal issues
q Multilingual text and report generation (knowledge/data to text)
q All services must be domain-adaptable (no one size fits all)
q Translation Centre (Cloud) – HQ automated translation for all
Multilingual Digital Single Market
Automated Translation
E-Commerce
Content, Media,
Verticals
Translation, Language,
Knowledge, Data
Knowledge and
Data Repositories
Multilingual Applications
Multilingual Services
Research
Crosslingual Big
Data Language
Analytics
Meaning,
Semantics,
Knowledge
High-Quality
Machine
Translation
SMEs CEF DSIs IT Integrators Research
provide innovative
applications
fills gaps
H2020 RIAs
H2020 CSAs, IAs, RIAs
H2020 CSAs, RAs, national funding
Multimodal Interaction
Language Processing, Analysis and Production – Language Resources
Citizens Public Business
interoperable and standardised
collaboration with member states
Conversational
Technologies
MLV Programme
Application Areas (Selection)
q Multilingual E-commerce
§ Customer-facing vs. back-office facing (after-market, after-sales)
§ Crosslingual search, CRM, helpdesks, processes, workflows
§ Semantic, crosslingual product descriptions and catalogues
§ Online dispute resolution
q Multilingual Content, Media, Verticals
§ Content analytics, curation, generation (incl. authoring support)
§ Multimodal communication (conversational, written, IoT)
§ Vertical domains: health, government, mobility, energy, legal.
q Translation, Language, Knowledge, Data
§ Translation Cloud – written/spoken, automatic/human
§ Crosslingual public and social intelligence, business intelligence
§ HQ resources, under-resourced languages, domain-specific LRs
Setup – Timeframe – Costs
q Close collaboration with EC, EP and all other stakeholders
(including SMEs, research centres, universities, NGOs etc.).
q Mix of funding sources:
§ Horizon 2020 (WP 2018-2020) for EU projects (RA, RIA, CSA)
§ National/regional funding sources for work on monolingual LTs
and LRs and also to support and grow SMEs in this area
§ Include, strengthen and broaden role of CEF AT (public services)
q Estimated costs for basic MLV implementation: ca. 175-200M€
§ Includes set of mission-critical services and applications
§ Timeframe: 2018, 2019, 2020
Conclusions
and Next Steps
q There is a lot of traction for the multilingualism/language topic.
q The EU should develop a Multilingual Strategy (incl. technology).
q Strategy must take into account several stakeholders: citizens,
business/innovation, DSM, research (multiple communities).
q Most components in place: Communities, SRIAs, STOA Study etc.
q We need the political will to establish language policy change to
support multilingualism (both member state level, EU level).
q Some Member States are ahead (DK, IE, EE, ES, LT, LV, NL, SL).
q Coordinate, intensify the push and keep up the pressure from
Member States, EP, EC, research community, businesses etc.
q Goal: a shared programme (EU/MSs) as a concerted action.
http://www.meta-net.eu 63
Conclusions
Next Steps
q Several tightly interconnected goals:
§ Multilingual Technologies for Europe
§ Technologies for the Multilingual Digital Single Market
§ Multilingual Strategy of the European Union
§ The Human Language Project
1. Discuss and further shape MLV Programme V0.9 with EC
2. Extend the Cracking the Language Barrier federation
3. LT brainstorming meeting at EC, Unit G.3 (Dec. 2016)
4. EP STOA Workshop on Language Technologies (Jan. 2017)
5. MDSM SRIA V1.0 to be finalised (Q1 2017)
Thank you.
office@meta-net.eu
http://www.meta-net.eu
http://www.facebook.com/META.Alliance
Language Technology Topics
q Multilingual Europe – Technologies for all European languages
q Machine Translation, Text Analytics, Semantic Web etc.
q Healthcare, societal challenges (ageing population, refugees etc.)
q IoT, Smart Assistants and Conversational Interaction Technologies
q E-Learning – Language Technology for E-Learning
q Smart Homes, Cities, Manufacturing
q Smart Virtual Assistants
q Social Media Analytics
q E-Participation
q Games
q etc.
Digital Language Extinction
q Many smaller languages are experiencing problems digitally:
§ Loss of function – other languages take over entire functional areas
such as, e.g., texting, email, search, e-commerce etc.
§ Loss of prestige – if it’s not on the web, the language doesn’t exist
§ Loss of competence – can you raise a digital native in your language?
q Andras Kornai’s classification – corresponds to the amount of digital
communication in that language:
1. digitally thriving languages (comfort zone languages)
2. vital languages
3. heritage languages
4. still/moribund/dead languages
q Implications for the European/global multilingual web?
potentially facing digital extinction …
q Pan-European infrastructure, bringing together providers and consumers of
language data, tools and services.
q LRs are documented, uploaded, stored, catalogued, downloaded, shared – to
improve visibility, documentation, identification, availability, interoperability.
q Caters for datasets, tools, services for LT research and development (both
academic and commercial); META-SHARE includes repository software, a
metadata model, licensing kit, statistics.
q 29 distributed repositories maintained
by 37 organisations in 25 countries.
q 2.600+ resources (corpora: 49%,
lexical: 38%, tools/services: 12%),
covering ca. 100 languages.
q 7.000+ downloads in total; ca. 70%
of all LRs have been downloaded.
Preparation of the SRA
q Strategic Research Agendas of other initiatives were screened.
q Many suggestions as input from Vision Group members.
q We discussed procedures, input and structure of the SRA in four
meetings of the META Technology Council.
§ Brussels, Belgium, November 16, 2010
§ Venice, Italy, May 25, 2011
§ Berlin, Germany, September 30, 2011
§ Brussels, Belgium, June 19, 2012
q Additional input in talks, meetings, workshops, discussions, etc.
§ Example: Three HLT Expert Meetings organised by the EC (end of 2011)
q Almost 200 experts contributed to the SRA (54% from industry; 46%
from research; 4% from national/international institutions).
• Published in early 2013.
• First strategic research
agenda for our field.
• Complex process of
collecting and shaping
technology visions.
• Hundreds of researchers
participated.
• Broad topics around multilingual Europe in general.
PT1: Translingual Cloud
q Europe has a big need for translations of publishable quality.
q Focus on high-quality translation.
q New research paradigms
§ Inclusion of professional translators into the
research process
§ Inclusion of technologists into research on
human translation processes
q Different technological approaches
§ Stronger emphasis on the properties of
individual languages
§ A central role for semantics
q Methods for specific genres & domains
Priority Research Theme 1: Translingual Cloud
Any
device
Target groups: European citizen, language
professional, organisations, companies, European
institutions, software applications
Multiple target
formats
Single access
point
Automatic translation and
interpretation
Language checking
Post-editing
Workbenches for creative
translations
Novel translation and authoring
workflows
Quality assurance
Computer-supported human
translation
Multilingual content production and
text authoring
Trusted service centre (privacy,
confidentiality, security of source
data)
Services and Technologies:
Crosslingual communication,
translation and search
Real-time subtitling, voice-over
generation and translating speech
from live events
Mobile interactive interpretation
Multilingual content production
(media, web, technical, legal
documents)
Showcases: translingual spaces for
ambient translation
Applications:
Written (twitter, blog, article, newspaper,
text with/without metadata etc.) or
spoken input (spontaneous spoken
language, video/audio, multiple speakers)
Modular combination
of analysis, transfer
and generation
models
From very fast but lower
quality to slower but very
high quality (including
instant quality upgrades)
Exploiting strong
monolingual analysis
and generation methods
and resources
Multiple target
formats
Domain, task and
genre specialisation
models
Extending
translation with
semantic data and
linked open data
PT2: Social Intelligence
q Better decisions by monitoring social media
q Inclusion of citizens into collective decision processes
q Opinion formation, consensus building, decision making
q Evolution of new solutions
q New forms of democracy: e-democracy,
massive participation, transparency
q Dialogues and debates across language
boundaries and across parties, political
alliances, social classes
q Better than binary voting
q Documented transparent
decision processes
Priority Research Theme 2: Social Intelligence and e-Participation
From shallow to deep,
from coarse-grained to
detailed processing
techniques
Making language technologies interoperable
with knowledge representation and the semantic web
“Semantification” of the
web: tight integration
with the Semantic Web
and Linked Open Data
Mapping large, heterogeneous,
unstructured volumes of online
content to structured, actionable
representations
Unleashing social intelligence by
detecting and monitoring opinions,
demands, needs and problems
Target groups: European citizen,
European institutions, discussion
participants, companies
Make use of the
wisdom of the
crowds
Improved
efficiency and
quality of decision
processes
Understanding influence
diffusion across social media
especially social media, comments,
blogs, forums
decision-relevant information
support
sentiment analysis and opinion mining
including the temporal dimension)
cues
from arbitrary online content
visualising discussions and opinion
statements
Services and Technologies:
collective deliberation and
e-participation
-
wide deliberation on pressing issues
and processes; modeling evolution of
opinions
analysis technologies
Applications:
Priority Research Theme 3: Socially-Aware Interactive Assistants
Interacting
naturally
with and in
groups
Learning
and
forgetting
information
Adaptable to the
user’s needs and
preferences and
the environment
Include human-computer, human-artificial agent and
computer-mediated human-human communication
Proactive,
self-aware,
user-adaptable
Interacts naturally with
humans, in any
language and modality
Can be personalised to
individual communication
abilities including special needs
Can learn incrementally
from all interactions and
other sources of information
recognition
and synthesis, providing expressive
voices
understanding
incremental conversational speech
models of human communication
inter-dependencies
priority themes
Services and Technologies:
Applications:
dialogue systems
environment
modalities (visual, tactile, haptic) verbal/non-verbal behaviour, social
context
ments, any
vocabulary
recovery,
self-
assessment
Multilingual
capabilities
Strategic Agenda for the Multilingual Digital Single Market – Version 0.5 – April 2015
Contents
Executive Summary
1 The Digital Single Market is a Multilingual Challenge
1.1 Overcoming Language Barriers with Technologies
1.2 Language Technologies Made for Europe – in Europe
1.3 Online Use of Languages
1.4 Multilingual Big Data Text Analytics for the European Data Economy
1.5 EC and Language Technology – Past and Present
1.6 The Economic Power of Language Technology and Services
2 A Strategic Programme for the Multilingual Digital Single Market
2.1 Layer 1: Innovative Technology Solutions
2.2 Layer 2: Language Technology Services, Platforms, Infrastructures
2.3 Layer 3: Priority Research Themes
2.4 Related Areas, Applications, and Societal Challenges
2.5 Summary
3 Layer 1: Innovative Technology Solutions for the Multilingual Digital Single Market
3.1 Technology Solutions for Businesses
3.1.1 Unified Customer Experience and Cross-Cultural CRM (E-Commerce)
3.1.2 Digital Translation Centre
3.1.3 Content Curation and Content Production
3.1.4 Virtual and Real Translingual Spaces
3.1.5 Voice of the Customer
3.1.6 Business Intelligence using Big Data
3.1.7 Multimodal User Experience for Connected Devices
3.1.8 Smart Multilingual Assistants
3.2 Technology Solutions for Public Services
3.2.1 Voice of the Citizen – Social Intelligence on Big Data
3.2.2 Online Dispute Resolution
3.2.3 E-Participation
3.2.4 E-Government
3.2.5 E-Health
3.2.6 E-Learning
4 Layer 2: Language Technology Services, Platforms, Infrastructures
5 Layer 3: Priority Research Themes
6 Horizontal Framework Aspects
6.1 Language Policies and Public Procurement
6.2 Standards and Interoperability
6.3 Open Source
6.4 Copyright and Data Protection
7 Conclusions
7.1 Expected Economic Impact
7.2 Relevance to the EC’s Digital Single Market Strategy
7.3 Potential Funding Sources
7.4 Next Steps
Appendix A. Input Documents
Appendix B. Digital Language Extinction in Europe
q European Parliament
§ Upcoming STOA Study and Workshop (Jan. 2017)
q European Commission
§ DG CONNECT: Horizon 2020 WP 2018-2020 (G1)
§ DG CONNECT: New Unit “Learning, Multilingualism, Inclusion” (G3)
§ DG Translation: Connecting Europe Facility, AT
q Language Communities: EFNIL and NPLD
§ Joint position paper META-FORUM 2015, 2016
q EU Member States and Non-Member States
§ National and regional funding agencies (ES, NL etc.)
q Research Communities, especially Big Data community (BDVA
SRIA V3.0), Web community and many others (Robotics, IoT etc.)
q Standardisation – W3C and others
http://www.meta-net.eu 80
Multilingual Europe Stakeholders
Multilingual Success Stories
q Moses SMT toolkit as well as research and technology ecosystem
q CEF AT for public online services – good and timely development
q eBay: MT to Russian – 50% increase in sales
q Hugo.lv for Latvian public services – better than Google Translate
q Hundreds of European startups in Language Technology and AI
q Conversational interfaces (Siri, Echo, Cortana): the next big thing
q IBM Watson – a billion dollar LT business
q Great Neural MT results reported by European researchers (QT21)
q Very rapid development – many opportunities for European R&D&I
http://www.meta-net.eu 81

More Related Content

What's hot

The META-NET Strategic Research Agenda for Multilingual Europe 2020
The META-NET Strategic Research Agenda for Multilingual Europe 2020The META-NET Strategic Research Agenda for Multilingual Europe 2020
The META-NET Strategic Research Agenda for Multilingual Europe 2020Georg Rehm
 
AI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual EuropeAI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual EuropeGeorg Rehm
 
Towards a Human Language Project for Multilingual Europe: AI and Interpretation
Towards a Human Language Project for Multilingual Europe: AI and InterpretationTowards a Human Language Project for Multilingual Europe: AI and Interpretation
Towards a Human Language Project for Multilingual Europe: AI and InterpretationGeorg Rehm
 
The Strategic Impact of META-NET on the Regional, National and International ...
The Strategic Impact of META-NET on the Regional, National and International ...The Strategic Impact of META-NET on the Regional, National and International ...
The Strategic Impact of META-NET on the Regional, National and International ...Georg Rehm
 
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...TAUS - The Language Data Network
 
ELSE IF 2019: What’s next for Multilingual Europe?
ELSE IF 2019: What’s next for Multilingual Europe?ELSE IF 2019: What’s next for Multilingual Europe?
ELSE IF 2019: What’s next for Multilingual Europe?PretaLLOD
 
Language Technologies for Multilingual Europe - Towards a Human Language Proj...
Language Technologies for Multilingual Europe - Towards a Human Language Proj...Language Technologies for Multilingual Europe - Towards a Human Language Proj...
Language Technologies for Multilingual Europe - Towards a Human Language Proj...Georg Rehm
 
The META-NET Language White Paper Series
The META-NET Language White Paper SeriesThe META-NET Language White Paper Series
The META-NET Language White Paper SeriesGeorg Rehm
 
META-NET: Language Technology for Europe
META-NET: Language Technology for EuropeMETA-NET: Language Technology for Europe
META-NET: Language Technology for EuropeGeorg Rehm
 
Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Computational Morphology and the META-NET Strategic Research Agenda for Multi...Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Computational Morphology and the META-NET Strategic Research Agenda for Multi...Georg Rehm
 
Is MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
Is MT ready for e-Government? The Latvian Story. Indra Samite, TildeIs MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
Is MT ready for e-Government? The Latvian Story. Indra Samite, TildeABBYY Language Serivces
 
META-NET: Towards a Strategic Research Agenda for Multilingual Europe
META-NET: Towards a Strategic Research Agenda for Multilingual EuropeMETA-NET: Towards a Strategic Research Agenda for Multilingual Europe
META-NET: Towards a Strategic Research Agenda for Multilingual EuropeGeorg Rehm
 
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...Europeana
 
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...Europeana
 
The META-NET Strategic Research Agenda and Linked Open Data
The META-NET Strategic Research Agenda and Linked Open DataThe META-NET Strategic Research Agenda and Linked Open Data
The META-NET Strategic Research Agenda and Linked Open DataGeorg Rehm
 
ELKL 5 Language documentation for linguistics and technology
ELKL 5 Language documentation for linguistics and technologyELKL 5 Language documentation for linguistics and technology
ELKL 5 Language documentation for linguistics and technologyDafydd Gibbon
 
Centre of Competence in digitisation. Clemens Neudecker
Centre of Competence in digitisation. Clemens NeudeckerCentre of Competence in digitisation. Clemens Neudecker
Centre of Competence in digitisation. Clemens NeudeckerBiblioteca Nacional de España
 
ELKL 4, Language Technology: learning from endangered languages
ELKL 4, Language Technology: learning from endangered languagesELKL 4, Language Technology: learning from endangered languages
ELKL 4, Language Technology: learning from endangered languagesDafydd Gibbon
 
Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)Workflow Development for OCR (and beyond)
Workflow Development for OCR (and beyond)cneudecker
 

Multilingualism for Digital Europe

  • 1. META-NET has received funding from the EU’s Horizon 2020 research and innovation programme through the contract CRACKER (grant agreement no.: 645357). Formerly co-funded by FP7 and ICT PSP through the contracts T4ME (grant agreement no.: 249119), CESAR (grant agreement no.: 271022), METANET4U (grant agreement no.: 270893) and META-NORD (grant agreement no.: 270899).
    Multilingualism for Digital Europe
    Georg Rehm, General Secretary META-NET, Coordinator CRACKER, DFKI, Germany (georg.rehm@dfki.de)
    Ringvorlesung Digitale Lebenswelten – Universität Hildesheim, 15th November 2016
  • 2. Outline q A Multilingual Europe Initiative: META-NET § LT Support – META-NET White Paper Series § LT Strategy – META-NET SRA q Continuing the Initiative – Recent Developments § The Digital Single Market and Multilingualism § Cracking the Language Barrier § META-FORUM 2015/2016 – MDSM SRIA V0.5/V0.9 q Goals and Next Steps http://www.meta-net.eu 2
  • 3. META-NET and META: Brief History
  • 4. Multilingual Europe in 2010
    - Challenge: Providing each language community with the most advanced technologies for communication and information so that maintaining their mother tongue does not turn into a disadvantage.
    - While research has made considerable progress in recent years, the pace of progress is not fast enough to meet the challenge within the next 10-20 years.
    - All stakeholders – researchers, LT industries, policy makers, language communities, funding programmes – should team up in a strategic alliance for a major dedicated push.
  • 5. q 60 research centres in 34 countries (founded in 2010) Chair of Executive Board: Jan Hajic (CUNI) Dep.: J. van Genabith (DFKI), A. Vasiljevs (Tilde) General Secretary: Georg Rehm (DFKI) q Multilingual Europe Technology Alliance. 826 members in 67 countries (published in 2013) (31 volumes; published in 2012) T4ME (META-NET) CESAR METANET4UMETA-NORDMultilingual Europe Technology AllianceNET
  • 7. q Basque q Bulgarian* q Catalan q Croatian* q Czech* q Danish* q Dutch* q English* q Estonian* q Finnish* q French* q Galician q German* q Greek* q Hungarian* q Icelandic q Irish* q Italian* q Latvian* q Lithuanian* q Maltese* q Norwegian q Polish* q Portuguese* q Romanian* q Serbian q Slovak* q Slovene* q Spanish* q Swedish* q Welsh * Official EU languagehttp://www.meta-net.eu/whitepapers
  • 8. Cross-Lingual Comparison
    - 1. Machine Translation; 2. Text Analytics; 3. Speech Processing/Synthesis; 4. Language Resources
    - Ranking: from excellent LT support to weak/no LT support.
    - Cross-lingual comparison discussed and finalised at a network meeting with representatives of all languages (Oct., 2011).
  • 9. LT support by area (clusters: excellent, good, moderate, fragmentary, weak or no support through LT).
MT. Excellent: none. Good: English. Moderate: French, Spanish. Fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian. Weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh.
Speech. Excellent: none. Good: English. Moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish. Fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish. Weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh.
Text Analytics. Excellent: none. Good: English. Moderate: Dutch, French, German, Italian, Spanish. Fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish. Weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh.
Resources. Excellent: none. Good: English. Moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish. Fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene. Weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh.
  • 11. Observations and Results. When it comes to technology support, there are massive differences between Europe’s languages and technology areas. Support for English is ahead of any other language. But even support for English is far from perfect. Several languages get the weakest score in all four areas (e.g., Icelandic, Latvian, Lithuanian, Maltese)!
  • 12. Digital Language Extinction! “At Least 21 European Languages in Danger of Digital Extinction!” Press release on the European Day of Languages (Sept. 26, 2012). Huge global interest in the topic and our key findings: 600+ mentions in the press; news from 40+ countries in 35+ different languages; 20+ television reports and 30+ broadcast interviews (radio, TV) with META-NET representatives. Two Parliamentary Questions in the EP on the “digital extinction of languages” topic. These results led to a STOA Workshop in the EP (Dec. 3, 2013).
  • 13. Berlingske (Denmark): “Poor language technology threatens Danish online”. Words. Researchers are working to improve Danish translations on the internet. By Jens Ejsing // ejs@berlingske.dk
The Danish language is struggling in the digital world. That is the finding of Danish language researchers and experts in connection with the new international study META-NET, which looks more closely at how a large number of smaller European languages such as Danish are faring in the digital world. The researchers, from the University of Copenhagen and the Danish Language Council (Dansk Sprognævn) among others, conclude that Danish may have an even harder time in the digital world in future, because Google Translate, GPS devices, smartphone apps and other language technology programs are not sufficiently able to handle the many nuances of the Danish language.
Bolette Sandford Pedersen, professor of language technology at the University of Copenhagen, believes there is a need for a kind of digital Danish language bank filled with data, so that translations, among other things, become as precise and good as possible. With help from the language bank, according to the professor, researchers can help companies improve programs that have to handle linguistic knowledge in areas such as machine translation, speech recognition and information retrieval. Erroneous translations would then become rarer, as when Google Translate turns “hæld olie på panden” (“pour oil into the pan”) into “pour oil on the forehead” in English. These are translations that, in the worst case, are so imprecise that Danes end up opting out of their own language in the digital world.
Language help for companies. She acknowledges, however, that “the technology for automatic translation is in many ways fantastic”. “It is just not good enough when it comes to Danish,” she says: “It is as if, to a certain extent, we leave it in the hands of Google or other companies to decide whether Danish is to be handled well enough or not. But the Danish market is not large for them. The question is therefore whether we should not do more ourselves to ensure that the necessary data is available, so that we get good translations and other good language technology. That could, for example, be done by making an effort to set up a language bank with a lot of enriched material about Danish.” “If we constantly find that translations are riddled with errors, we do not dare to trust them,” she says, stressing that “erroneous translations can lead to major misunderstandings”.
According to the director of the Danish Language Council, Sabine Kirchmeier-Andersen, poor language technology can have consequences for the many Danes who are not so good at English. “If we have ambitions to use the Danish language in the technological universe of the future, an effort must be made now to retain expertise and build on the knowledge we have,” she says: “Otherwise we risk that only people who speak fluent English will benefit from the new generations of web, telecom and robot technology that are on the way.”
Facts: Languages in Europe. There are around 80 languages in the EU. For 21 of them, including Danish, there are major gaps in language technology in areas such as machine translation, speech recognition and information retrieval. According to an EU survey, a growing number of European internet users buy goods or services online where the language used is not their own. This applies to more than half of users. More than one in three use a foreign language to write e-mails or posts online.
Eleftheros Typos (Greece), Life section, Thursday 27 September 2012: “Many European languages are considered technologically... outdated. Greek is in danger of digital extinction”. By Eleni Vergou, evergou@e-typos.com
The digital age does not... speak Greek, nor several other European languages, according to a pan-European report signed by more than 200 experts. The study was published by the scientific network META-NET on the occasion of yesterday’s European Day of Languages. For the purposes of their research, linguists from 34 countries of the Old Continent rated the available language services and produced a “White Paper” for each European language. In their study, the experts looked for, among other things, four basic electronic tools, namely the existence of machine translation and the possibility of voice interaction and digital text analysis, while the availability of language resources was also examined.
As a first step they examined the websites that allow users to translate online, such as, for example, the service of the IT giant, Google Translate. At the same time, the “communication” of Greek-speaking users with their... devices was examined, for example the possibility of “talking” to a GPS in one’s mother tongue. The researchers concluded that such devices exist, but are not as widespread as the English-language ones.
The “gold” medal goes, as is only logical, to the English language. English-speaking users have the best possible technological support, which favours the further spread of the language. Icelandic, Latvian, Lithuanian and Maltese are most at risk of “technological exclusion”, while Greek, Bulgarian, Hungarian and Polish are slightly better off; according to the study they have “fragmentary” technological support. Support for users of Dutch, French, German, Italian and Spanish is rated “moderate”. The heads of the scientific team, Hans Uszkoreit and Georg Rehm, state: “There are dramatic differences in language technology support between the various European languages. The gap between ‘small’ and ‘large’ languages keeps widening. We must ensure that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies. Otherwise these languages are doomed to digital extinction.”
Indeed, the experts stress that without decisive action these languages will hardly... survive in the digital world of the 21st century. Ms Maria Gavrilidou, a member of the scientific team from the Institute for Language and Speech Processing, Athena Research Centre, tells “E.T.”: “This study does not say that the Greek language will not live on or that it is in danger of extinction.” The expert explains that as long as there are people who speak, write and communicate in a language, it will continue to exist. It is important, however, that all users are able to “talk” to machines, such as their GPS, in Greek and have computer language tools at their disposal. Among these “tools” are spelling and grammar checkers, which are used daily by hundreds of Greek users and are based on language technology.
Nevertheless, she stresses that the digital spread of a language is important: “It is not in the hands of the average user. The governments of the day, the European Union and the private sector must fund the development of this technology for all languages,” she says, and continues: “Users, however, must demand that these means also exist in their own language and not be satisfied with English.”
Sidebar: “The language of alienation... Greeklish”. Most young people in our country now communicate in Greeklish via text messages or email. Although language tools that allow the use of the Greek character set have existed in recent years, teenagers and young adults do not seem to have “embraced” these technologies. The professor of linguistics, Mr Georgios Babiniotis, tells “E.T.”: “Greeklish is a problem for the Greek language, especially for young people, for a purely linguistic reason. By using Greeklish they become alienated from the form of the word or, as we call it, the etymological image, which is expressed by the spelling of the word and is linked both to the meaning of the word and to its origin.” The danger facing young people is alienation from the written form of the language. This “familiarity”, however, helps in understanding both the meaning and the origin of a word. “This alienation is not insignificant,” says the expert, who explains that the process of writing helps the word to be imprinted and to be linked with other words of the same root. “When this form of communication is used, [these links] are destroyed, they fade. It is not fatal, but it will do harm,” says Mr Babiniotis, who advises users to choose the Greek character set. (Photo caption: Georgios Babiniotis.)
Clipping dated 30 September 2012, page 16.
Kosmos section, Sunday 30 September 2012, page 49: “My language... they have lost. Most European languages are in danger of digital extinction”.
The Council of Europe has established 26 September as the European Day of Languages, but, according to a new European scientific report, 21 of the 30 languages of Europe, Greek among them, face the danger of digital extinction. The study sounds the alarm, as it found that digital support for most European languages is deficient or completely non-existent for users.
Swallowed by the common languages. The report, in the form of a series of White Papers (entitled “Languages in the European Information Society”) from the scientific network META-NET, which brings together 60 research centres in 34 countries, points out that languages spoken by a relatively small number of people are at risk because they do not have the technological support that widely used languages have. White Papers have been prepared for the following European languages: English, Basque, Bulgarian, Galician, French, German, Danish, Greek, Estonian, Irish, Icelandic, Spanish, Italian, Catalan, Croatian, Latvian, Lithuanian, Maltese, Norwegian (Bokmål and Nynorsk), Dutch, Hungarian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Swedish, Czech and Finnish. Each White Paper is written in the language it covers and is translated into English.
Four at greatest risk. According to the new study, Icelandic, Latvian, Lithuanian and Maltese face the greatest danger of extinction in a European technological society that increasingly promotes the use of particular languages, above all English. But other languages, such as Greek, Bulgarian, Hungarian and Polish, are also at risk in the modern digital world. The META-NET study, to which more than 200 experts contributed, assesses the risk for each language on the basis of four basic criteria at the technological/digital level: the existence of machine translation for the language, the possibility of voice interaction, the possibility of digital text analysis, and the availability of the relevant digital language resources.
The strong ones. The language with the best score on the criteria is, of course, English, which enjoys comparatively the best technological support (though not the best possible), a fact that facilitates its further spread. It is followed, with satisfactory or moderate technological/digital support, by Dutch, French, German, Italian and Spanish. Greek, as well as Basque, Catalan, Polish, Hungarian and others, are classed among the languages with only “fragmentary” support, which is exactly why they are considered languages at high risk of extinction.
Dramatic differences. According to the editors of the study, Hans Uszkoreit and Georg Rehm, “there are dramatic differences in language technology support between the various European languages and technology areas. The gap between ‘small’ and ‘large’ languages keeps widening. We must ensure that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies, otherwise these languages are doomed to digital extinction.” The hope for these languages is considered to be the improvement and wider use of language technology software, which allows the spoken and written processing of the various languages. Examples of these capabilities are electronic spelling and grammar checkers, the interactive personal “assistants” of smartphones (e.g. Siri on the iPhone), machine translation systems, the electronic dialogue systems of call centres, search engines, the synthetic voice in car navigation systems, and others.
The basic problem. What matters, according to the report, is that all these capabilities are also offered to users in their mother tongue that is in danger of extinction. Without decisive action, the ominous prediction is that these languages will hardly survive in the digital world of the 21st century. One problem is that the software of these language technology systems relies on statistical methods that require enormous quantities of written or spoken data, yet so much data is hard to obtain for languages spoken by relatively few people. Moreover, even for widely used languages such as English, the relevant language technology still has weaknesses, evident for example in the highly inadequate and error-ridden automatic translations. The report proposes that a coordinated, large-scale effort should be undertaken in Europe in order gradually to create or improve the necessary technologies and to help the languages that are digitally sidelined.
  • 14. Update of the Study (2014). The study comprised 31 volumes/languages. Many languages missing! Need for extension, at least of the comparison. We invited three language community bodies to participate in the update: European Federation of National Institutions for Language (EFNIL); Network to Promote Linguistic Diversity (NPLD); Experts Committee of the European Language Charter (Council of Europe). CCURL 2014 – Collaboration and Computing for Under-Resourced Languages in the Linked Open Data Era
  • 15. LT support by area, updated comparison (clusters: excellent, good, moderate, fragmentary, weak or no support).
MT. Excellent: none. Good: English. Moderate: French, Spanish. Fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian. Weak or no support: Albanian, Asturian, Basque, Bosnian, Breton, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Frisian, Friulian, Galician, Greek, Hebrew, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Norwegian, Occitan, Portuguese, Romany, Scots, Serbian, Slovak, Slovene, Swedish, Turkish, Vlax Romani, Welsh, Yiddish.
Speech. Excellent: none. Good: English. Moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish. Fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish, Turkish. Weak or no support: Albanian, Asturian, Bosnian, Breton, Croatian, Frisian, Friulian, Hebrew, Icelandic, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romanian, Romany, Scots, Vlax Romani, Welsh, Yiddish.
Text Analytics. Excellent: none. Good: English. Moderate: Dutch, French, German, Hebrew, Italian, Spanish. Fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish. Weak or no support: Albanian, Asturian, Bosnian, Breton, Croatian, Estonian, Frisian, Friulian, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romany, Scots, Serbian, Turkish, Vlax Romani, Welsh, Yiddish.
Resources. Excellent: none. Good: English. Moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish. Fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Hebrew, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene. Weak or no support: Albanian, Asturian, Bosnian, Breton, Frisian, Friulian, Icelandic, Irish, Latvian, Limburgish, Lithuanian, Luxembourgish, Macedonian, Maltese, Occitan, Romany, Scots, Turkish, Vlax Romani, Welsh, Yiddish.
  • 18. Three Ingredients. Appropriate Programme: Vision & Agenda. Appropriate Actors: Research & Commercialisation. Appropriate Support: Funding.
  • 19. Planning Process (2010, 2011, 2012). Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; Priority Themes Paper; expert meeting minutes (three sets); Strategic Research Agenda.
  • 20. Planning Process: Documents (2010, 2011, 2012). Vision Paper; Vision Group Translation and Localisation Report; Vision Group Interactive Systems Report; Vision Group Media and Information Services Report; Priority Themes Paper; expert meeting minutes (three sets); Strategic Research Agenda. www.meta-net.eu, office@meta-net.eu, T: +49 30 23895 1833.
Vision Paper: “The Future European Multilingual Information Society. Vision Paper for a Strategic Research Agenda”. Epigraph: “People can’t share knowledge if they don’t speak a common language.” (Davenport, Thomas H., and Laurence Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business School, Boston, 1997, p. 98.) Join the discussion at www.meta-et.eu/forum
Priority Themes Paper: “LT 2020. Vision and Priority Themes for Language Technology Research in Europe until the Year 2020. Towards the META-NET Strategic Research Agenda”. The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement 270893) and META-NORD (Grant Agreement 270899). Do you have comments, ideas or suggestions with regard to the content of this document? Please send them to office@meta-net.eu or discuss them online: http://www.meta-net.eu/sra.
Vision Group documents (“A Network of Excellence forging the Multilingual Europe Technology Alliance”; each is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”, co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119; Dissemination Level: Public):
Vision Group Translation and Localisation: Results of first two meetings. Editors: Aljoscha Burchardt, Georg Rehm. Date: 3 December 2010.
Vision Group Media and Information Services: Results of first two meetings. Editors: Maria Koutsombogera, Stelios Piperidis. Date: 10 November 2010.
Vision Group Interactive Systems: Results of first two meetings. Editors: Joseph Mariani, Bernardo Magnini. Date: 28 December 2010.
  • 21. Strategic Research Agenda. Addresses the problems we identified when preparing the white papers. Can put Europe ahead of its competitors in this technology area. 200 contributors; >2 years. 54% industry; 46% research; 4% (inter)national institutions. Presented and discussed at 90+ conferences and major workshops. Published in early 2013. http://www.meta-net.eu/sra
  • 22. Priority Research Themes. Three priority research themes: Translingual Cloud; Social Intelligence and e-Participation; Socially-Aware Interactive Assistants. Two additional themes: European Service Platform for Language Technologies; Core Technologies for Language Analysis and Production.
  • 23. Providers of operational and research technologies and services: Research Centres, European Institutions, Other companies (SMEs, startups etc.), National Language Institutions, Language Technology Providers, Language Service Providers, Universities. Beneficiaries/users of the platform: European Institutions, Research Centres, Public Administrations, Enterprises, LT User Industries, Universities, European Citizens. Interfaces (web, speech, mobile etc.). Priority Research Theme 1: Translingual Cloud. Priority Research Theme 2: Social Intelligence & e-Participation. Priority Research Theme 3: Socially Aware Interactive Assistants. European Service Platform for Language Technologies (Cloud or Sky Computing Platform). Multilingual technologies; Text analytics; Text generation; Language checking; Sentiment analysis; Named entity recognition; Summarisation; Knowledge access and management; Information and relation extraction; Language Processing; Language Understanding; Knowledge; Emotion/Sentiment; Data protection. Tools; Data Sets; Resources; Components; Metadata; Standards; Interfaces; APIs; Catalogues; Quality Assurance; Data Import/Export; Input/Output; Storage; Performance; Availability; Scalability; Features.
  • 26. Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research. Coordination and Support Action, H2020-ICT17, 2015–2017, 36 months – http://www.cracker-project.eu. Partners: 1 DFKI (Germany), Georg Rehm; 2 CUNI (Czech Republic), Jan Hajic; 3 ELDA (France), Khalid Choukri; 4 FBK (Italy), Marcello Federico; 5 ATHENA RC (Greece), Stelios Piperidis; 6 UEDIN (UK), Philipp Koehn; 7 USFD (UK), Lucia Specia.
Fact sheet excerpt: Three priority areas for achieving the Multilingual Digital Single Market. 1: Multilingual access to all digital goods and services across Europe. Geo-blocking: […] due to nationality, location, or residence. Language-blocking: […] languages they do not speak; however, current online translation is insufficient […] trying to conduct […] common languages. Geo-blocking and language-blocking are barriers to access: both are daily problems for tens of millions of EU citizens. Customers are six times more likely to buy from sites in their native language. Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken. Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support […]) […] in European businesses. Language can be expensive for SMEs: online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar […] and marketing costs. Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty. English is not the answer: 52% of EU customers do not purchase […]. Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations […] today to increase market share. Chart (likelihood of purchasing): site in buyer’s native language vs. site in foreign language; 6x more likely to purchase.
Communities: META-NET incl. META-SHARE and META; MT evaluation initiatives – WMT, IWSLT, MT Marathons; MT and other LT industry; Language resources – META-SHARE, ELRA; HT/MT evaluation tools – translate5; Translation industry, translation profession; MT user communities.
Strategic Agenda for the Multilingual Digital Single Market: Version 0.5 presented at META-FORUM 2015 (Riga); Version 0.9 presented at META-FORUM 2016 (Lisbon). Cover: “Strategic Research and Innovation Agenda. Language as a Data Type and Key Challenge for Big Data. Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content”. SRIA Editorial Team. Version 0.9 – July 2016.
  • 27. Selected Activities. Timeline 2015–2017 (M1, M12, M24, M36): Kick-off meeting for all ICT-17 Projects; translate5; WMT 2016, WMT 2017; IWSLT 2015, IWSLT 2016, IWSLT 2017; QT Marathon 2015, QT Marathon 2016; Roadmap for European MT Research; Survey on the State of HQMT in Industry and LSPs; SRIA (initial version), SRIA (update), SRIA (final); version 1, version 2.
Production of resources (e.g., for WMT 2016 and 2017, IWSLT 2015–2017). Tools (quality control, evaluations). Strategies and roadmaps (SRIA, Roadmap for European MT Research). Exchange and sharing facility for resources (META-SHARE).
Recent or Upcoming Events: LREC Workshop on MT Eval. (May 25); META-FORUM 2016 (July 4/5, Lisbon); WMT 2016 (Aug. 11/12, Berlin); IWSLT 2016 (Dec. 8/9, Seattle).
Federation of organisations and projects working on technologies for multilingual Europe: 10 organisations; 24 projects. Areas of collaboration: data management and repositories, tools, shared tasks, evaluations. Goal: provide one umbrella organisation for the whole community. http://www.cracking-the-language-barrier.eu
  • 28. q META-FORUM 2016 – July 04/05, Lisbon, Portugal Beyond Multilingual Europe q META-FORUM 2015 – April 27, Riga, Latvia Technologies for the Multilingual Digital Single Market q META-FORUM 2013 – Sept. 19/20, Berlin, Germany Connecting Europe for New Horizons q META-FORUM 2012 – June 20/21, Brussels, Belgium A Strategy for Multilingual Europe q META-FORUM 2011 – June 27/28, Budapest, Hungary Solutions for Multilingual Europe q META-FORUM 2010 – Nov. 17/18, Brussels, Belgium Challenges for Multilingual Europe http://www.meta-net.eu 28
  • 29. The Multilingual Digital Single Market
  • 30. q Top priority in the European Union. q Expected to add 400b€ to European GDP and hundreds of thousands of new jobs. q Unfortunately, the language topic is not included in the EC’s Digital Single Market strategy (published in May 2015).
  • 31.
  • 32.
  • 34. Facts and Figures. Three priority areas for achieving the Multilingual Digital Single Market. 1: Multilingual access to all digital goods and services across Europe. Customers are six times more likely to buy from sites in their native language. Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken. Language can be expensive for SMEs: online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar […]. Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such […]. English is not the answer: 52% of EU customers do not purchase […]. Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations […] today to increase market share. Chart (likelihood of purchasing): site in buyer’s native language vs. site in foreign language; 6x more likely to purchase.
  • 35. Facts and Figures. Geo-blocking: […] due to nationality, location, or residence. […] customers […]. Language-blocking: […] languages they do not speak; however, current online translation is insufficient […] trying to conduct […] common languages. Geo-blocking and language-blocking are barriers to access: both are daily problems for tens of millions of EU citizens. Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support […]) […] in European businesses. […] and marketing costs. […] responsiveness is needed to achieve customer satisfaction and build brand loyalty.
  • 36. The MDSM Fact Sheet
      Why Europe needs a Multilingual Digital Single Market
      - Current eCommerce growth within Europe is about half that of the US, due partially to a lack of language coverage from European SMEs. Less than 5% of European SMEs currently sell cross-language.
      - No single language accounts for more than 20% of the potential Multilingual Digital Single Market. Most account for less than 3% of the DSM. Without a solution, the European Digital Single Market will remain fragmented.
      - Europe’s 24 official languages present a tremendous opportunity for European business. Removing language barriers within Europe would open access to 73% (with >€25 trillion in annual revenue!) of the world’s digitally accessible market to European enterprise.
      - Europe today is not a single market: it is separated into 20+ small language markets.
      - Chart (Online Population; source: Internet World Stats, Miniwatt Marketing Group): Chinese (510 million), World Spanish (165 million), World Portuguese (83 million), English (565 million), Japanese (100 million), Russian (60 million), Europe today (many small markets); with language technology: the Multilingual Digital Single Market.
      THREE PRIORITY AREAS FOR ACHIEVING THE MULTILINGUAL DIGITAL SINGLE MARKET
      1. Multilingual access to all digital goods and services across Europe
      - Geo-blocking: due to nationality, location, or residence customers
      - Language-blocking: languages they do not speak however, current online translation is insufficient trying to conduct common languages
      - Geo-blocking and language-blocking are barriers to access: Both geo-blocking and language-blocking are daily problems for tens of millions of EU citizens.
      - Customers are six times more likely to buy from sites in their native language. Most EU languages address less than 3% of the market, fundamentally limiting SMEs operating in countries where those languages are spoken.
      - Lack of language technology support (automatic translation, tools to assist human translators, and multilingual support in European businesses.
      - Language can be expensive for SMEs: Online businesses face around €5,000 in up-front costs for each new language they translate their websites into, plus similar […] and marketing costs. Even when sites are translated, the vast majority of SMEs cannot respond to support requests or customer feedback in other languages. Such responsiveness is needed to achieve customer satisfaction and build brand loyalty.
      - English is not the answer: 52% of EU customers do not purchase […] Adding even a few languages to an SME’s website beyond English can have a major impact on revenue. Large organizations today to increase market share.
      - Chart (likelihood of purchasing): 6x more likely to purchase from a site in the buyer’s native language than from a site in a foreign language.
      - Chart: Language Technology Support* (Good, Moderate, Fragmentary, Weak/no support) against Millions of Native Speakers (Worldwide), scale 0 to 400, for Spanish, English, Portuguese, German, French, Italian, Polish, Romanian, Dutch, Greek, Hungarian, Czech, Swedish, Bulgarian, Danish, Croatian, Slovak, Finnish, Lithuanian, Slovene, Latvian, Estonian, Maltese, Irish. Marked area: Language Technology Danger Zone (≈150 million EU citizens).
      - 140 million EU citizens are in the Language Technology Danger Zone, where language technology is inadequate to support the DSM.
      2. Boosting commerce through multilingual technologies
      - Current online automatic translation provided by US tech giants does not solve […] less than 30% of automatically translated content is truly useful for online commerce. Only three European languages […]
      - Translation opens 20 times its cost in revenue opportunity. However, translation remains too expensive for many European SMEs, blocking this opportunity and limiting economic growth in Europe. Lowering these costs is a strategic opportunity.
      - Charts: Translation Costs vs. Increase in Revenue; Online Automatic Translation Quality (good, bad, ugly).
      3. Connecting citizens to European digital public services
      - Without Language Technology, the European Commission has no way to respond effectively to citizen participation. Current language technology is inadequate for over half of the EU official languages to help the European Commission solve its citizen engagement problem.
      - Most local governmental services are monolingual only. This poses a problem for tourists, expatriates, and linguistic minorities. Language technology can provide the […]
      - Multilingual eParticipation can help build the European Identity […] with one another in their respective native languages with sophisticated machine translation working behind the scenes. Only when EU citizens can interact in their own languages will they truly develop a sense of European identity and community.
      - Over half of EU citizens are language blocked from interacting with the European Commission’s web resources for citizen participation. 290 million EU citizens excluded: speakers of other languages are language blocked from full participation; speakers of English, French, German can participate fully.
      Notes
      - Strategic Agenda for the Multilingual Digital Single Market: http://rigasummit2015.eu.
      - META, the Multilingual Europe Technology Alliance, has more than 750 members (http://www.meta-net.eu).
      - LT-Innovate, the European Association of the Language Technology Industry, has 180 corporate members throughout Europe (http://lt-innovate.eu).
      - * Technology support has improved for some languages since this study was completed.
      Technology Solutions
      Investment in the following solutions will help achieve the Multilingual Digital Single Market:
      - Unified Customer Experience: care, customer relationship, discussion fora,
      - Multimodal User Experience for Connected Devices: interfaces household appliances, and consumer
      - Voice of the Customer: market research
      - Content Curation and Production
      - Digital Translation Centre: customers, citizens
      The forthcoming Strategic Agenda for the Multilingual Digital Single Market will provide additional details on these and other solutions for the needs of the Multilingual Digital Single Market. Download this fact sheet from http://cracker-project.eu. For more information contact Dr. Georg Rehm (DFKI) at georg.rehm@dfki.de.
      http://cracker-project.eu/wp-content/uploads/2015/11/mDSM-Fact-Sheet.pdf
  • 37. META-FORUM 2015 AND MDSM SRIA V0.5
  • 38. Open Letter to the EC
      - On Friday, March 20, 2015, we published an open letter to the EC on http://multilingualeurope.eu.
      - On Monday, March 23, 2015, we informed President Juncker and all Commissioners about the campaign and the 1300+ signatures.
      - By now more than 3600 signatures!
      Who signed?
      - 5 Members of the European Parliament
      - 150+ high-level representatives from industry (CxO level)
      - 1200+ professors
      - 400+ project or research managers
      - 20+ entrepreneurs and founders
      - hundreds of language and language technology professionals, officials, researchers, administrators and representatives from related stakeholder groups
  • 39. META-FORUM 2015
      - April 27 in Riga, Latvia
      - Riga Summit 2015 on the Multilingual Digital Single Market
      - Two important components:
          - MDSM SRIA Version 0.5
          - Further community fusing
      - http://www.meta-forum.eu
  • 40. Joint EFNIL and NPLD Panel
      - Joint EFNIL and NPLD panel at META-FORUM 2015.
      - Joint position paper.
      Joint NPLD/EFNIL Position Paper on the Multilingual Digital Single Market. Initially presented at META-FORUM 2015 and the Riga Summit 2015 on the Multilingual Digital Single Market, April 27, 2015, www.rigasummit2015.eu
      “Languages are not only a means of communication. They also have embedded in them people’s values, aspirations and hopes.” (European Roadmap for Linguistic Diversity 2015, NPLD)
      “Many European languages run the risk of becoming victims of the digital age as they are under-represented and under-resourced online. Huge regional market opportunities remain untapped because of language barriers.” (Multilingual Europe: A challenge for language tech. MultiLingual. April/May 2011, page 51/52)
  • 41. Timeline 2010 to 2015 (diagram): Vision Paper; Vision Group reports (Translation and Localisation, Interactive Systems, Media and Information Services); expert meeting minutes; Priority Themes Paper; META-NET Strategic Research Agenda for Multilingual Europe 2020; roadmaps, agendas and any other input from other initiatives; Strategic Research and Innovation Agenda.
      Document covers shown:
      - The Future European Multilingual Information Society: Vision Paper for a Strategic Research Agenda. “People can’t share knowledge if they don’t speak a common language.” (Davenport, Thomas H, and Laurence Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business School, Boston, 1997, p. 98.) Join the discussion at www.meta-et.eu/forum. Contact: www.meta-net.eu, office@meta-net.eu, T: +49 30 23895 1833.
      - LT 2020: Vision and Priority Themes for Language Technology Research in Europe until the Year 2020. Towards the META-NET Strategic Research Agenda. The development of this paper has been funded by the Seventh Framework Programme and the ICT Policy Support Programme of the European Commission under contracts T4ME (Grant Agreement 249119), CESAR (Grant Agreement 271022), METANET4U (Grant Agreement 270893) and META-NORD (Grant Agreement 270899). Do you have comments, ideas or suggestions with regard to the content of this document? Please send them to office@meta-net.eu or discuss them online: http://www.meta-net.eu/sra.
      - Vision Document, Vision Group Translation and Localisation: Results of first two meetings. Editors: Aljoscha Burchardt, Georg Rehm. Dissemination Level: Public. Date: 3 December 2010.
      - Vision Document, Vision Group Media and Information Services: Results of first two meetings. Editors: Maria Koutsombogera, Stelios Piperidis. Dissemination Level: Public. Date: 10 November 2010.
      - Vision Document, Vision Group Interactive Systems: Results of first two meetings. Editors: Joseph Mariani, Bernardo Magnini. Dissemination Level: Public. Date: 28 December 2010.
      - Each Vision Document carries the note: This document is part of the Network of Excellence “Multilingual Europe Technology Alliance (META-NET)”, co-funded by the 7th Framework Programme of the European Commission through the T4ME grant agreement no.: 249119. A Network of Excellence forging the Multilingual Europe Technology Alliance.
      - Strategic Agenda for the Multilingual Digital Single Market: Technologies for Overcoming Language Barriers towards a truly integrated European Online Market. DRAFT Version 0.5 – April 22, 2015.
  • 42. Strategic Agenda for MDSM
      - Presented at META-FORUM 2015 and Riga Summit for the first time.
      - Version 0.5 – work in progress
      - Builds upon many strategy papers and roadmaps prepared by several European projects, incl. the META-NET SRA (2013).
      - Input and feedback collected at the Riga Summit 2015 to be used for upcoming versions.
      Cover shown: Strategic Agenda for the Multilingual Digital Single Market: Technologies for Overcoming Language Barriers towards a truly integrated European Online Market. DRAFT Version 0.5 – April 22, 2015.
  • 43. A Strategy for the MDSM
      - Strategic R&I Agenda for the Multilingual Digital Single Market
      - Core: Technology Solutions
      - Data economy is an inherent component – LT for effective multilingual data value chains.
  • 44. Strategic Agenda for the Multilingual Digital Single Market – Version 0.5 – April, 2015
      Contents
      Executive Summary
      1 The Digital Single Market is a Multilingual Challenge
        1.1 Overcoming Language Barriers with Technologies
        1.2 Language Technologies Made for Europe – in Europe
        1.3 Online Use of Languages
        1.4 Multilingual Big Data Text Analytics for the European Data Economy
        1.5 EC and Language Technology – Past and Present
        1.6 The Economic Power of Language Technology and Services
      2 A Strategic Programme for the Multilingual Digital Single Market
        2.1 Layer 1: Innovative Technology Solutions
        2.2 Layer 2: Language Technology Services, Platforms, Infrastructures
        2.3 Layer 3: Priority Research Themes
        2.4 Related Areas, Applications, and Societal Challenges
        2.5 Summary
      3 Layer 1: Innovative Technology Solutions for the Multilingual Digital Single Market
        3.1 Technology Solutions for Businesses
          3.1.1 Unified Customer Experience and Cross-Cultural CRM (E-Commerce)
          3.1.2 Digital Translation Centre
          3.1.3 Content Curation and Content Production
          3.1.4 Virtual and Real Translingual Spaces
          3.1.5 Voice of the Customer
          3.1.6 Business Intelligence using Big Data
          3.1.7 Multimodal User Experience for Connected Devices
          3.1.8 Smart Multilingual Assistants
        3.2 Technology Solutions for Public Services
          3.2.1 Voice of the Citizen – Social Intelligence on Big Data
          3.2.2 Online Dispute Resolution
          3.2.3 E-Participation
          3.2.4 E-Government
          3.2.5 E-Health
          3.2.6 E-Learning
      4 Layer 2: Language Technology Services, Platforms, Infrastructures
      5 Layer 3: Priority Research Themes
      6 Horizontal Framework Aspects
        6.1 Language Policies and Public Procurement
        6.2 Standards and Interoperability
        6.3 Open Source
        6.4 Copyright and Data Protection
      7 Conclusions
        7.1 Expected Economic Impact
        7.2 Relevance to the EC’s Digital Single Market Strategy
        7.3 Potential Funding Sources
        7.4 Next Steps
      Appendix A. Input Documents
      Appendix B. Digital Language Extinction in Europe
  • 45. q Letter from Andrus Ansip (June 2015) q “We invite the European language technology community to further develop the ideas presented in the draft Strategic Agenda for the multilingual Digital Single Market”
  • 47. Riga Declaration
      - 12 organisations present at META-FORUM 2015 and the Riga Summit 2015 drafted and signed the “Declaration of Common Interests”.
      - CRACKER: community building, mostly among projects.
      - We combined these into the Cracking the Language Barrier federation.
      - Important goal: measure against community fragmentation.
      DECLARATION OF COMMON INTERESTS
      We, the undersigned, declare here, at the Riga Summit on the Multilingual Digital Single Market, encouraged by the letter Vice President Andrus Ansip sent to its participants, that we stand united in our goal and interest to:
      - support multilingualism in Europe by employing language technology in business, society and governance, to create a truly Multilingual Digital Single Market,
      - exchange and share information in our efforts to promote our goals and interests at local, national and European levels,
      - raise awareness in society at large using channels available to our associations, alliances and societies.
      In the near future, we foresee the establishment of a Memorandum of Understanding among our organisations towards a “Coalition for a Multilingual Europe”, to better serve our members address the language barrier challenges towards establishing a truly integrated Multilingual Digital Single Market.
      Riga, 29. April 2015
      Signed by (in alphabetical order):
      - BDVA: Laure Le Bars
      - CITIA: Steve Renals
      - CLARIN: Steven Krauwer
      - EFNIL: Sabine Kirchmeier-Andersen, Tamás Váradi
      - ELEN: Davyth Hicks, Claudia Soria
      - ELRA: Nicoletta Calzolari, Khalid Choukri
      - GALA: Laura Brandon, Robert E. Etches, Sergey Gladkov
      - LT Innovate: Jochen Hummel, Philippe Wacker
      - META-NET: Jan Hajic, Josef van Genabith, Georg Rehm, Andrejs Vasiljevs
      - NPLD: Meirion Prys Jones
      - TAUS: Jaap van der Meer
      - W3C: Richard Ishida, Felix Sasaki
      For any questions, please contact Georg.Rehm@dfki.de.
  • 48. Cracking the Language Barrier (http://www.cracker-project.eu, http://www.meta-net.eu)
      - A federation of European projects and organisations working on technologies for a multilingual Europe.
      - Multi-lateral Memorandum of Understanding; 10 organisations and 24 projects on board already (including FP7 and H2020-ICT15).
      - Getting new members on a regular basis.
      - Selected areas of collaboration: data management and repositories, tools, shared tasks, evaluations, events.
      - Goal: provide one umbrella organisation for the whole community.
  • 50. Cracking the Language Barrier: http://www.cracking-the-language-barrier.eu
      - Website: information about the initiative, all projects and organisations
      - Downloadable documents
      - List of events
      - LREC 2016 MT Eval Workshop
      - Several new members will join the initiative soon
  • 51. META-FORUM 2016 AND MDSM SRIA V0.9
  • 52. Andrus Ansip’s Blog Post
      - Posted on 27 May 2016.
      - First public acknowledgment of the EC that the language topic is of very high relevance for the Digital Single Market.
      - “Overcoming language barriers is vital for building the DSM, which is by definition multilingual. It is now time to reduce and remove the language barriers that are holding back its advance, and turn them into competitive advantages.”
  • 53. Reorganisation of DG CONNECT (01/07/2016)
      Organisation chart of DG CONNECT (Communications Networks, Content & Technology), 01/07/2016:
      - Director-General: R. Viola; Assistants: O. Bringer, P. Stuckmann
      - Deputy Director-General in charge of Directorates A, C, E & H: G. Kent (acting); Assistant: E. Mitjana
      - Deputy Director-General in charge of Directorates B, D, F, G & I: C. Bury; Assistant: P. Lamotte
      - Principal Advisers: F. Lupescu, M. Richards
      - Directorate A Digital Industry (K. Rouhana): A.1 Robotics & Artificial Intelligence (J. Heikkilä); A.2 Technologies & Systems for Digitising Industry (M. Lemke); A.3 Competitive Electronics Industry (W. Van Puymbroeck); A.4 Photonics (C. Maloney); A.5 Administration & Finance * (A. Fiala)
      - Directorate B Electronic Communications Networks & Services (A. Whelan): B.1 Electronic Communications Policy (V. Terävä); B.2 Implementation of the Regulatory Framework (W-D. Grussmann); B.3 Markets (R. Krüger); B.4 Radio Spectrum Policy (A. Geiss); B.5 Investment in High-Capacity Networks (A. Krzyżanowska)
      - Directorate C Digital Excellence & Science Infrastructure (Th. Skordas, acting): C.1 eInfrastructure & Science Cloud (A. Burgueño Arjona); C.2 High Performance Computing & Quantum Technology (G. Kalbe); C.3 Future & Emerging Technologies (FET) (V. Peca); C.4 Flagships (Th. Skordas)
      - Directorate D Policy Strategy & Outreach (L. Corugedo Steneberg): D.1 Research Strategy & Programme Coordination (M. Fjalland); D.2 Policy Implementation & Planning (E. Forti); D.3 Policy Outreach & International Affairs (A. Angelova-Krasteva); D.4 Communication (D. Ringrose)
      - Directorate E Future Networks (M. Campolargo): E.1 Future Connectivity Systems (B. Barani, acting); E.2 Cloud & Software (P. O’Donohue); E.3 Next-Generation Internet (J. Villasante); E.4 Internet of Things (M. Rohen)
      - Directorate F Digital Single Market (G. de Graaf): F.1 Digital Policy Development & Coordination (M. Bailey, acting); F.2 E-Commerce & Platforms (P. Agarwal, acting); F.3 Start-ups & Innovation (P. Zilgalvis); F.4 Digital Economy & Skills (L. Sioli)
      - Directorate G Data (J. Hernández-Ros, acting): G.1 Data Policy & Innovation (M. Nagy-Rothengass); G.2 Data Applications & Creativity (J. Hernández-Ros); G.3 Learning, Multilingualism & Accessibility (M. Marsella, acting); G.4 Administration & Finance (G. Kalbe, acting)
      - Directorate H Digital Society, Trust & Cybersecurity (P. Timmers): H.1 Cybersecurity & Digital Privacy (J. Boratynski); H.2 Smart Mobility & Living (E. Hartog); H.3 E-Health, Well-Being & Ageing (M. González-Sancho); H.4 E-Government & Trust (A. Servida); H.5 Administration & Finance ** (G. Van Caenegem, acting)
      - Directorate I Media Policy (G. Abbamonte): I.1 Audiovisual & Media Services Policy (L. Boix Alonso); I.2 Copyright (M. Martin-Prat); I.3 Audiovisual Industry & Media Programme (L. Recalde Langarica); I.4 Media Convergence & Social Media (J. Cotta)
      - Directorate R Resources & Support (G. Kent): R.1 Human Resources & Competences (I. Mariën-Dusak); R.2 Budget & Finance (M-C. Laffineur); R.3 Knowledge Management & Support Systems (F. Accordino); R.4 Compliance & Planning (K. Engelbosch); R.5 Programme Operations & Common Services (I. Malekos)
      - Mirror units: REA.A.5 Fostering Novel Ideas: FET-Open (T. Hallantie); EACEA.B.2 Creative Europe: MEDIA (H. Trettenbrein); REA.C.4 Expert Contracting & Payments (A. Oram)
      - Advisers: Legal & Legislative Issues (Ž. Bahovec); cross-cutting Policy/Research Issues (G. Santucci); International Relations linked to Future Networks (P. Blixt); Societal Issues (N. Dewandre); Organisational Transition (Finance) (vacant); Societal Challenges (vacant); Innovation Systems (B. Salmelin)
      - Reporting lines: R. Viola for Directorate R; G. Kent (acting) for Directorates A, C, E, H; C. Bury for Directorates B, D, F, G, I.
      - Legend: Luxembourg; to be transferred to Luxembourg.
      - * Shared Administration & Finance Unit for Directorates A, B, C, D & F. ** Shared Administration & Finance Unit for Directorates E, H & I.
      Unit G.1 “Data Policy & Innovation”
      - Support the data economy in the Digital Single Market.
      - Policy initiatives addressing new and emerging issues.
      - Advance the Commission open data policy by ensuring the correct implementation of the PSI Directive and the Pan-European Open Data Portal.
      - Promote the emergence of an ecosystem comprising all the players of the data value chain.
      - Steers together with industry the SRIA.
      - Addresses key framework conditions of the data economy.
      - Fund research and innovation in data technologies and applications inter alia by driving the big data PPP.
      Unit G.3 “Learning, Multilingualism & Accessibility”
      - Make the DSM more accessible, secure and inclusive.
      - Support policy, research, innovation and deployment of learning technologies.
      - Support key enabling digital language technologies and services to allow all European consumers and businesses to fully benefit from the Digital Single Market.
      - Responsible for Web Accessibility Directive.
      - Promote a better Internet for children by protecting and empowering children online, and improving the quality of content available to them.
  • 54. Communities & Stakeholders ... and many more research centres, companies, EU projects etc.
  • 56. MDSM SRIA
      - Version 0.5 unveiled at META-FORUM 2015
      - Version 0.9 unveiled at META-FORUM 2016
      - Version 1.0 foreseen for Nov./Dec. 2016
      - Prepared and presented by Cracking the Language Barrier federation (editorial team: 13 colleagues)
      - SRIA addresses how the LT community is going to act united in order to make the DSM multilingual
      - Document available on http://www.cracker-project.eu and also on http://www.cracking-the-language-barrier.eu
      Cover shown: Strategic Agenda for the Multilingual Digital Single Market: Technologies for Overcoming Language Barriers towards a truly integrated European Online Market. DRAFT Version 0.5 – April 22, 2015.
  • 57. MLV Programme q Multilingual Value Programe* § Three-year programme § Requires modest investment q “Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content” q Three components address the main needs of the Multilingual DSM (MDSM) and how to put them into practice: 1. Multilingual Application Areas 2. Multilingual Services 3. Research http://www.meta-net.eu 57 Strategic Research and Innovation Agenda Language as a Data Type and Key Challenge for Big Data Enabling the Multilingual Digital Single Market through technologies for translating, analysing, processing and curating natural language content SRIA Editorial Team Version 0.9 – July 2016 * SRIA V0.9 and MLV Programme devised before re-organisation of DG CONNECT.
  • 58. MDSM: Goals and Needs q Crosslingual communication for SMEs, public institutions, citizens q Crosslingual SME presales communication and aftersales services q Multilingual (big) data, language and knowledge value chains q Multilingual websites, product catalogues, product descriptions q Multilingual knowledge bases and knowledge graphs (and services) q Multilingual conversational interfaces for connected devices (IoT) q Crosslingual business intelligence (e.g., based on UGC) q Crosslingual social media analytics for EU-wide societal issues q Multilingual text and report generation (knowledge/data to text) q All services must be domain-adaptable (no one size fits all) q Translation Centre (Cloud) – HQ automated translation for all
  • 59. MLV Programme [Diagram labels:] Multilingual Digital Single Market Automated Translation E-Commerce Content, Media, Verticals Translation, Language, Knowledge, Data Knowledge and Data Repositories Multilingual Applications Multilingual Services Research Crosslingual Big Data Language Analytics Meaning, Semantics, Knowledge High-Quality Machine Translation SMEs CEF DSIs IT Integrators Research provide innovative applications fills gaps H2020 RIAs H2020 CSAs, IAs, RIAs H2020 CSAs, RAs, national funding Multimodal Interaction Language Processing, Analysis and Production – Language Resources Citizens Public Business interoperable and standardised collaboration with member states Conversational Technologies
  • 60. Application Areas (Selection) q Multilingual E-commerce § Customer-facing vs. back-office facing (after-market, after-sales) § Crosslingual search, CRM, helpdesks, processes, workflows § Semantic, crosslingual product descriptions and catalogues § Online dispute resolution q Multilingual Content, Media, Verticals § Content analytics, curation, generation (incl. authoring support) § Multimodal communication (conversational, written, IoT) § Vertical domains: health, government, mobility, energy, legal. q Translation, Language, Knowledge, Data § Translation Cloud – written/spoken, automatic/human § Crosslingual public and social intelligence, business intelligence § HQ resources, under-resourced languages, domain-specific LRs
  • 61. Setup – Timeframe – Costs q Close collaboration with EC, EP and all other stakeholders (including SMEs, research centres, universities, NGOs etc.). q Mix of funding sources: § Horizon 2020 (WP 2018-2020) for EU projects (RA, RIA, CSA) § National/regional funding sources for work on monolingual LTs and LRs and also to support and grow SMEs in this area § Include, strengthen and broaden role of CEF AT (public services) q Estimated costs for basic MLV implementation: ca. 175-200M€ § Includes set of mission-critical services and applications § Timeframe: 2018, 2019, 2020
  • 63. q There is a lot of traction for the multilingualism/language topic. q The EU should develop a Multilingual Strategy (incl. technology). q Strategy must take into account several stakeholders: citizens, business/innovation, DSM, research (multiple communities). q Most components in place: Communities, SRIAs, STOA Study etc. q We need the political will to establish language policy change to support multilingualism (both member state level, EU level). q Some Member States are ahead (DK, IE, EE, ES, LT, LV, NL, SL). q Coordinate, intensify the push and keep up the pressure from Member States, EP, EC, research community, businesses etc. q Goal: a shared programme (EU/MSs) as a concerted action. http://www.meta-net.eu 63 Conclusions
  • 64. Next Steps q Several tightly interconnected goals: § Multilingual Technologies for Europe § Technologies for the Multilingual Digital Single Market § Multilingual Strategy of the European Union § The Human Language Project 1. Discuss and further shape MLV Programme V0.9 with EC 2. Extend the Cracking the Language Barrier federation 3. LT brainstorming meeting at EC, Unit G.3 (Dec. 2016) 4. EP STOA Workshop on Language Technologies (Jan. 2017) 5. MDSM SRIA V1.0 to be finalised (Q1 2017) http://www.meta-net.eu 64
  • 66.
  • 67. Language Technology Topics q Multilingual Europe – Technologies for all European languages q Machine Translation, Text Analytics, Semantic Web etc. q Healthcare, societal challenges (ageing population, refugees etc.) q IoT, Smart Assistants and Conversational Interaction Technologies q E-Learning – Language Technology for E-Learning q Smart Homes, Cities, Manufacturing q Smart Virtual Assistants q Social Media Analytics q E-Participation q Games q etc.
  • 68. Digital Language Extinction q Many smaller languages are experiencing problems digitally: § Loss of function – other languages take over entire functional areas such as texting, email, search, e-commerce etc. § Loss of prestige – if it’s not on the web, the language doesn’t exist § Loss of competence – can you raise a digital native in your language? q Andras Kornai’s classification – corresponds to the amount of digital communication in that language: 1. digitally thriving languages (comfort zone languages) 2. vital languages 3. heritage languages 4. still/moribund/dead languages q Implications for the European/global multilingual web? [Callout: potentially facing digital extinction …]
  • 69. http://www.meta-net.eu q Pan-European infrastructure, bringing together providers and consumers of language data, tools and services. q LRs are documented, uploaded, stored, catalogued, downloaded, shared – to improve visibility, documentation, identification, availability, interoperability. q Caters for datasets, tools, services for LT research and development (both academic and commercial); META-SHARE includes repository software, a metadata model, licensing kit, statistics. q 29 distributed repositories maintained by 37 organisations in 25 countries. q 2.600+ resources (corpora: 49%, lexical: 38%, tools/services: 12%), covering ca. 100 languages. q 7.000+ downloads in total; ca. 70% of all LRs have been downloaded.
  • 70.
  • 71. Preparation of the SRA q Strategic Research Agendas of other initiatives were screened. q Many suggestions as input from Vision Group members. q We discussed procedures, input and structure of the SRA in four meetings of the META Technology Council. § Brussels, Belgium, November 16, 2010 § Venice, Italy, May 25, 2011 § Berlin, Germany, September 30, 2011 § Brussels, Belgium, June 19, 2012 q Additional input in talks, meetings, workshops, discussions, etc. § Example: Three HLT Expert Meetings organised by the EC (end of 2011) q Almost 200 experts contributed to the SRA (54% from industry; 46% from research; 4% from national/international institutions).
  • 72. • Published in early 2013. • First strategic research agenda for our field. • Complex process of collecting and shaping technology visions. • Hundreds of researchers participated. • Broad topics around multilingual Europe in general.
  • 73. PT1: Translingual Cloud q Europe has a big need for translations of publishable quality. q Focus on high-quality translation. q New research paradigms § Inclusion of professional translators into the research process § Inclusion of technologists into research on human translation processes q Different technological approaches § Stronger emphasis on the properties of individual languages § A central role for semantics q Methods for specific genres & domains
  • 74. Priority Research Theme 1: Translingual Cloud Any device Target groups: European citizen, language professional, organisations, companies, European institutions, software applications Multiple target formats Single access point Automatic translation and interpretation Language checking Post-editing Workbenches for creative translations Novel translation and authoring workflows Quality assurance Computer-supported human translation Multilingual content production and text authoring Trusted service centre (privacy, confidentiality, security of source data) Services and Technologies: Crosslingual communication, translation and search Real-time subtitling, voice-over generation and translating speech from live events Mobile interactive interpretation Multilingual content production (media, web, technical, legal documents) Showcases: translingual spaces for ambient translation Applications: Written (twitter, blog, article, newspaper, text with/without metadata etc.) or spoken input (spontaneous spoken language, video/audio, multiple speakers) Modular combination of analysis, transfer and generation models From very fast but lower quality to slower but very high quality (including instant quality upgrades) Exploiting strong monolingual analysis and generation methods and resources Multiple target formats Domain, task and genre specialisation models Extending translation with semantic data and linked open data
  • 75. PT2: Social Intelligence q Better decisions by monitoring social media q Inclusion of citizens into collective decision processes q Opinion formation, consensus building, decision making q Evolution of new solutions q New forms of democracy: e-democracy, massive participation, transparency q Dialogues and debates across language boundaries and across parties, political alliances, social classes q Better than binary voting q Documented transparent decision processes
  • 76. Priority Research Theme 2: Social Intelligence and e-Participation From shallow to deep, from coarse-grained to detailed processing techniques Making language technologies interoperable with knowledge representation and the semantic web “Semantification” of the web: tight integration with the Semantic Web and Linked Open Data Mapping large, heterogeneous, unstructured volumes of online content to structured, actionable representations Unleashing social intelligence by detecting and monitoring opinions, demands, needs and problems Target groups: European citizen, European institutions, discussion participants, companies Make use of the wisdom of the crowds Improved efficiency and quality of decision processes Understanding influence diffusion across social media especially social media, comments, blogs, forums decision-relevant information support sentiment analysis and opinion mining including the temporal dimension) cues from arbitrary online content visualising discussions and opinion statements Services and Technologies: collective deliberation and e-participation - wide deliberation on pressing issues and processes; modeling evolution of opinions analysis technologies Applications:
  • 77. Priority Research Theme 3: Socially-Aware Interactive Assistants Interacting naturally with and in groups Learning and forgetting information Adaptable to the user’s needs and preferences and the environment Include human-computer, human-artificial agent and computer-mediated human-human communication Proactive, self-aware, user-adaptable Interacts naturally with humans, in any language and modality Can be personalised to individual communication abilities including special needs Can learn incrementally from all interactions and other sources of information recognition and synthesis, providing expressive voices understanding incremental conversational speech models of human communication inter-dependencies priority themes Services and Technologies: Applications: dialogue systems environment modalities (visual, tactile, haptic) verbal/non-verbal behaviour, social context ments, any vocabulary recovery, self-assessment Multilingual capabilities
  • 78.
  • 79. Strategic Agenda for the Multilingual Digital Single Market – Version 0.5 – April 2015. Contents:
    Executive Summary
    1 The Digital Single Market is a Multilingual Challenge
      1.1 Overcoming Language Barriers with Technologies
      1.2 Language Technologies Made for Europe – in Europe
      1.3 Online Use of Languages
      1.4 Multilingual Big Data Text Analytics for the European Data Economy
      1.5 EC and Language Technology – Past and Present
      1.6 The Economic Power of Language Technology and Services
    2 A Strategic Programme for the Multilingual Digital Single Market
      2.1 Layer 1: Innovative Technology Solutions
      2.2 Layer 2: Language Technology Services, Platforms, Infrastructures
      2.3 Layer 3: Priority Research Themes
      2.4 Related Areas, Applications, and Societal Challenges
      2.5 Summary
    3 Layer 1: Innovative Technology Solutions for the Multilingual Digital Single Market
      3.1 Technology Solutions for Businesses
        3.1.1 Unified Customer Experience and Cross-Cultural CRM (E-Commerce)
        3.1.2 Digital Translation Centre
        3.1.3 Content Curation and Content Production
        3.1.4 Virtual and Real Translingual Spaces
        3.1.5 Voice of the Customer
        3.1.6 Business Intelligence using Big Data
        3.1.7 Multimodal User Experience for Connected Devices
        3.1.8 Smart Multilingual Assistants
      3.2 Technology Solutions for Public Services
        3.2.1 Voice of the Citizen – Social Intelligence on Big Data
        3.2.2 Online Dispute Resolution
        3.2.3 E-Participation
        3.2.4 E-Government
        3.2.5 E-Health
        3.2.6 E-Learning
    4 Layer 2: Language Technology Services, Platforms, Infrastructures
    5 Layer 3: Priority Research Themes
    6 Horizontal Framework Aspects
      6.1 Language Policies and Public Procurement
      6.2 Standards and Interoperability
      6.3 Open Source
      6.4 Copyright and Data Protection
    7 Conclusions
      7.1 Expected Economic Impact
      7.2 Relevance to the EC’s Digital Single Market Strategy
      7.3 Potential Funding Sources
      7.4 Next Steps
    Appendix A. Input Documents
    Appendix B. Digital Language Extinction in Europe
  • 80. q European Parliament § Upcoming STOA Study and Workshop (Jan. 2017) q European Commission § DG CONNECT: Horizon 2020 WP 2018-2020 (G1) § DG CONNECT: New Unit “Learning, Multilingualism, Inclusion” (G3) § DG Translation: Connecting Europe Facility, AT q Language Communities: EFNIL and NPLD § Joint position paper META-FORUM 2015, 2016 q EU Member States and Non-Member States § National and regional funding agencies (ES, NL etc.) q Research Communities, especially Big Data community (BDVA SRIA V3.0), Web community and many others (Robotics, IoT etc.) q Standardisation – W3C and others http://www.meta-net.eu 80 Multilingual Europe Stakeholders
  • 81. Multilingual Success Stories q Moses SMT toolkit as well as research and technology ecosystem q CEF AT for public online services – good and timely development q eBay: MT to Russian – 50% increase in sales q Hugo.lv for Latvian public services – better than Google Translate q Hundreds of European startups in Language Technology and AI q Conversational interfaces (Siri, Echo, Cortana): the next big thing q IBM Watson – a billion dollar LT business q Great Neural MT results reported by European researchers (QT21) q Very rapid development – many opportunities for European R&D&I