Celtic Language Technologies in the
Digital Age
John Judge, ADAPT Centre, DCU
www.adaptcentre.ie
Background on Me
• Background: Computational Linguist – research and real world
• Interests in: Natural Language Processing, Text Analytics, Machine
Translation, …
• National Centre for Language Technology
• Research Integration Coordinator for the ADAPT Centre of Excellence
for Digital Content and Media Innovation
• Focus on EU collaborations
• META-NET
• QT LaunchPad
• LT Web
• FALCON
• MLi
• QT21
• TraMOOC
• EXPERT
ADAPT Centre
• ADAPT Science Foundation Ireland Direct Funding over six years
(until 2020)
• Academic/Industry partnership built on top of CNGL
• Five research themes
• Six application areas
• TCD and DCU co-leads; UCD and DIT partners
• Open-ended number of industry partners
ADAPT Centre
E-Commerce
Financial
E-Learning
Life Sciences
ICT Localisation Content & Media
Entertainment
Global Digital Content: Platform Research
Ambitious Metrics for Success
• 13 Spin Out Companies
• €5m Commercialisation Awards
• 1,650 Top Quality Publications
• €110m Won in Total Competitive Research
• 500 Jobs
• €9m From Commercial Sources
• 60 Major EU Initiatives
• 200 Postgraduate Students
• 88 Licence Agreements
Agus Gaeilge…? (And Irish…?)
How much of all of this relates to Irish?
Language Technology
Language Technology and Applications
LT is not…
• Localised Software
• A website in your language
• A static online dictionary
But these are all VERY valuable resources for a language!
…and can form part of a healthy LT ecosystem
What is LT – Where I’m coming from
• Technology for processing information (speech, text, gestures,…) in
a given language
• An enabling technology
• Added intelligence to both content (creation, management, etc.) and
HCI
• Set of tools and resources – part of a bigger picture and a larger
ecosystem
• Interactive
• Not monolithic resources
It’s already right under your noses
• These concepts (and some others) are already being used for a wide
range of applications
• Marketing/Brand awareness
• Customer Sentiment Analysis
• Political barometers (Obama)
• Information analysis and extraction (IBM Watson)
• Offensive content filtering
• Security applications
A look at the Irish LT perspective
LT Landscape in Ireland
• Historically strong in the Translation and Localisation industry
• Home to several internationally recognised research centres
• NCLT
• DERI
• CNGL >>> ADAPT
• INSIGHT
• Government funding for research has been consistent despite
worsening economic conditions
LT for Irish
• Many of the basics are covered
• Spell checker
• Grammar checking
• T9 predictive text, smartphone predictive text (through additional
software)
• Localisation of open source software, and many major applications
• Some of the more advanced stuff
• Speech synthesiser
• Part-of-Speech Tagger
• (Dependency Parser)
LT for Irish
• But there’s not much else
• Availability of text corpora, speech corpora, parallel texts, wordnets
and other LT building blocks is limited or poor
• Some resources exist – small, narrow coverage, restricted
availability
• Lack of basic linguistic resources is stifling development of modern
language processing technologies for Irish
• Yet our own research centres are producing world-leading LT for
other languages
State of LT Support for Irish
Source: META-NET White Paper Series, The Irish Language in the Digital Age
MT
• Excellent: none
• Good: English
• Moderate: French, Spanish
• Fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
• Weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh
Speech
• Excellent: none
• Good: English
• Moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
• Fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
• Weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh
Text Analysis
• Excellent: none
• Good: English
• Moderate: Dutch, French, German, Italian, Spanish
• Fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
• Weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
Resources
• Excellent: none
• Good: English
• Moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
• Fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
• Weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
Europe’s Languages and LT
Language groups, ordered from good support through Language Technology to weak or no support:
• English
• Dutch, French, German, Italian, Spanish
• Catalan, Czech, Finnish, Hungarian, Polish, Portuguese, Swedish
• Basque, Bulgarian, Danish, Galician, Greek, Norwegian, Romanian, Slovak, Slovene
• Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
So What?
• Take a closer look at the least equipped languages
• Only 3 compete with English in their native countries
• Maltese native fluency ~100% (Eurobarometer)
• Irish and Welsh are at risk
• So too are other regional and minority languages (RMLs) which compete with any
better-resourced language on a day-to-day basis
Weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
Neighbouring group: Basque, Bulgarian, Danish, Galician, Greek, Norwegian, Romanian, Slovak, Slovene
Languages at risk in the pre-digital age
Languages at risk in the print age
• Invention of the moveable type printing press
• Improved literacy
• Standardisation
• The Reformation
• The Renaissance
• The Enlightenment
• Death of hundreds of European RMLs that never made it into
print
Languages in the Digital Age
• The leap into the digital age has had profound effects
• Need to equip all languages with digital resources to ensure survival
• Otherwise they are doomed to history
• The Celtic Languages need to address under-resourcing
A High Level Solution - Europe
European Level Action
• Multilingual Europe Technology Alliance
• Bring together Language Technology stakeholders
• Concerted effort to influence EU research programmes for LT
• Strategic Research Agenda for Multilingual Europe
• Success in H2020 Funding calls – specifically in ICT 17 “Cracking
the Language Barrier”
• “.. to facilitate multilingual online communication for the benefit
of the digital single market which is still fragmented by language
barriers that hamper a wide penetration of cross-border
commerce, social communication and exchange of cultural
content.”
• “Special focus is on the 21 EU languages (both as source
and target languages) that have ‘fragmentary’ or ‘weak/no’
machine translation support according to the META-NET
language white papers.”
Addressing the Gap – CRACKER Project
• CRACKER (Feb 2015) – follow-up to META-NET. Stated goals:
• Initiating a programme of ground-breaking actions that will deliver, by
2025, an online EU internal market free of language barriers,
delivering automated translation quality, equal to currently best
performing language pair/direction, in most relevant use situations and
for at least 90% of the EU official languages.
• Significantly improving the quality, coverage and technical maturity of
automatic translation for at least half of the 21 EU languages that
currently have "weak or no support" or "fragmentary support" of
machine translation solutions, according to the META-NET
Language White Papers referenced before.
• Attracting a community of hundreds of contributors of language
resources and language technology tools (from all EU Member
States and Associated Countries) to adopt and support a single
platform for sharing, maintaining and making use of language
resources and tools; establishing widely agreed benchmarks for
machine translation quality and stimulating competition between
methods and systems.
EU Actions Recap
• The EU is calling for improved resources for our languages
• The big players (industry and research) are organising to do
something about it
• Celtic languages can be part of this if we position ourselves to be
there
EU Actions – Getting on board
• Riga Summit 2015, April 27-29
• http://www.rigasummit2015.eu
• Venue for META-FORUM
• Multilingual Technologies for the Digital Single Market
• Language Technologies for the Big Data Challenge and Data
Economy
• High-Quality Machine Translation
• Towards European Language Technology Platforms
• Strategic Agenda for the Multilingual Digital Single Market
Summit Agenda
Opening addresses
H.E. Andris Bērziņš, President of the Republic of Latvia
First session
Setting the Strategic Agenda for the Multilingual Digital Single Market
Coffee break
Second session
Breaking the Language Barrier for Cross-Border Public Services
Lunch
Third session
Language Technology: Enabling European Business
Coffee break
Fourth session
Empowering the Multilingual Data Economy
Closing session
EU Innovation Excellence to Address Multilingual Challenges
National Policy/Funding Agency Round Table
• Roundtable session to discuss where languages and language
technologies currently stand in the different countries and regions
and how to improve the situation
• Goal: Shape a Strategic Research and Innovation Agenda with input
(and buy-in) directly from those responsible for our languages at a
regional level
Towards a Celtic Language Technology
Community
Languages in the Digital Age
• Not all doom and gloom!
• Significant opportunity: LT and language promotion/rejuvenation
• Community effort can provide the basic building blocks
• Techniques can do more with less
• Policy makers can be hard to convince
• We have to start somewhere – Celtic Language Technology
Community Workshop
Celtic Language Technology Workshop
“The Celtic Language Technology Workshop (CLTW) series of
workshops provides a forum for researchers interested in developing
NLP (Natural Language Processing) resources and technologies for
Celtic languages.
As Celtic languages are under-resourced, our goal is to encourage
collaboration and communication between researchers working on
language technologies and resources for Celtic languages.”
First CLTW at COLING 2014
• Held in association with COLING 2014 (top tier CL/LT conference)
• Full day of research presentations (papers and posters)
• Attended by about 30 people
• Published 12 papers
• Representing work on: Irish, Welsh, Scots Gaelic, Breton (and an
invited talk that covered aspects of Manx)
• Including an open forum session to discuss how to move the area
forward
• Endorsed by Irish Government, Ofis Publik ar Brezhoneg (among
others)
CLTW Topics of Interest
• Language resources
• Syntax, semantics, grammar,
lexicons
• Phonology / morphology, tagging
• Morphological analysis
• Part-of-speech taggers
• Computer-Assisted Language
Learning (CALL)
• Translation memory
• Machine translation
• Parsing / chunking
• Ontologies, terminology and
knowledge representation
• Speech processing / generation
• Digital humanities
• Corpus development /
analysis
• Treebanking
• Evaluation methods
• Ontology-lexica
• Metadata
• Linked data resources
• Linguistic linked data
resources
• Semantic annotation
• Information Extraction
Workshop Outcomes
• A great time!
• Community forum
• Momentum
• Ideas for further collaboration
• Possible EU level action to address under-resourcing
Future Directions
Within the LT Community
• Under-resourced languages are a challenge for science
• The best researchers LOVE a challenge
• The Celtic LT community should position itself as a provider of interesting
challenges
• BUT: We still need wider language community help to ensure
adequate data is available to the R&D community
What Can/Should We Do?
• Concerted Community Action
• Data is key
• Collections of digital data in a language
• Appropriate format
• Appropriate annotation
• Appropriate licence
• Appropriately available
• The R&D community will combine to build more sophisticated tools
and solve bigger problems…
• This should not be done in isolation by each RML community
• Band together and also look to EU initiatives
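The data checklist above (format, annotation, licence, availability) could be turned into a small completeness check when cataloguing datasets. The following is only an illustrative Python sketch: `DatasetRecord`, its field names, and the example dataset are hypothetical and are not drawn from any existing catalogue or metadata standard.

```python
from dataclasses import dataclass


# Hypothetical record describing one language dataset. The field names are
# illustrative assumptions, not an established metadata schema.
@dataclass
class DatasetRecord:
    name: str
    language: str
    data_format: str = ""   # e.g. "plain text"
    annotation: str = ""    # e.g. "part-of-speech tags"
    licence: str = ""       # e.g. "CC BY 4.0"
    access_url: str = ""    # where the R&D community can obtain the data


def missing_requirements(record: DatasetRecord) -> list[str]:
    """Return the checklist items that the record leaves unfilled."""
    checks = {
        "format": record.data_format,
        "annotation": record.annotation,
        "licence": record.licence,
        "availability": record.access_url,
    }
    return [item for item, value in checks.items() if not value.strip()]


# Made-up dataset used purely for illustration.
example = DatasetRecord(
    name="Example Irish text collection",
    language="ga",
    data_format="plain text",
    licence="CC BY 4.0",
)
print(missing_requirements(example))  # ['annotation', 'availability']
```

A shared check of this kind is one way several language communities could describe their data consistently instead of working in isolation.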
Celtic LT Community Efforts
• Next CLTW – Proposal for part of LREC 2016
• Semi-formal meet-ups (today)
• Budding Irish LT lobby group CIGILT
• COST (European COoperation in Science and Technology) Action
• Reaching out further to the Humanities
• Needs support from policy makers
• Needs to produce results that generate buy-in from language
communities
The Grass Roots
• Small numbers of speakers
• Typically minority (or marginalised) languages
• Everyone has a role to play
• LT Community needs to speak out more
• Show tangible benefits
Diolch! – Thank You!
Me
jjudge@computing.dcu.ie
http://ie.linkedin.com/in/judgejohn/
http://www.adaptcentre.ie
CLTW
https://groups.google.com/forum/#!forum/celtic-language-technology
META-NET LWPs
http://www.meta-net.eu
http://www.meta-net.eu/whitepapers/e-book/welsh.pdf
http://www.meta-net.eu/whitepapers/e-book/irish.pdf
http://www.meta-net.eu/whitepapers/e-book/basque.pdf
EU initiatives
http://www.cracker-project.eu
http://www.rigasummit2015.eu

More Related Content

What's hot

Language Resources for Multilingual Europe
Language Resources for Multilingual EuropeLanguage Resources for Multilingual Europe
Language Resources for Multilingual Europe
Georg Rehm
 
A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...
A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...
A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...
Georg Rehm
 
Cracking the Language Barrier for a Multilingual Europe
Cracking the Language Barrier for a Multilingual EuropeCracking the Language Barrier for a Multilingual Europe
Cracking the Language Barrier for a Multilingual Europe
Georg Rehm
 
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS - The Language Data Network
 
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana
 
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana
 
The Strategic Agenda for the Multilingual Digital Single Market V0.9
The Strategic Agenda for the Multilingual Digital Single Market V0.9The Strategic Agenda for the Multilingual Digital Single Market V0.9
The Strategic Agenda for the Multilingual Digital Single Market V0.9
Georg Rehm
 
Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Computational Morphology and the META-NET Strategic Research Agenda for Multi...Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Georg Rehm
 
Protecting Minority Languages from Digital Extinction
Protecting Minority Languages from Digital ExtinctionProtecting Minority Languages from Digital Extinction
Protecting Minority Languages from Digital Extinction
Teresa Lynn
 
META-NET: Language Technology for Europe
META-NET: Language Technology for EuropeMETA-NET: Language Technology for Europe
META-NET: Language Technology for Europe
Georg Rehm
 
META-NET and META-SHARE: Language Technology for Europe
META-NET and META-SHARE: Language Technology for EuropeMETA-NET and META-SHARE: Language Technology for Europe
META-NET and META-SHARE: Language Technology for Europe
Georg Rehm
 
AI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual EuropeAI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual Europe
Georg Rehm
 
The META-NET Strategic Research Agenda for Multilingual Europe 2020
The META-NET Strategic Research Agenda for Multilingual Europe 2020The META-NET Strategic Research Agenda for Multilingual Europe 2020
The META-NET Strategic Research Agenda for Multilingual Europe 2020
Georg Rehm
 
Sustainability in OER for less used languages
Sustainability in OER for less used languagesSustainability in OER for less used languages
Sustainability in OER for less used languages
Web2Learn
 
Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...
TAUS - The Language Data Network
 
The Strategic Impact of META-NET on the Regional, National and International ...
The Strategic Impact of META-NET on the Regional, National and International ...The Strategic Impact of META-NET on the Regional, National and International ...
The Strategic Impact of META-NET on the Regional, National and International ...
Georg Rehm
 
Language Technology for Multilingual Europe
Language Technology for Multilingual EuropeLanguage Technology for Multilingual Europe
Language Technology for Multilingual Europe
Georg Rehm
 
The META-NET Language White Paper Series
The META-NET Language White Paper SeriesThe META-NET Language White Paper Series
The META-NET Language White Paper Series
Georg Rehm
 
Achievement And Lessons Learned By An Loc
Achievement And Lessons Learned By An LocAchievement And Lessons Learned By An Loc
Achievement And Lessons Learned By An Loc
EPFL (École polytechnique fédérale de Lausanne)
 
MLi - Project presentation
MLi - Project presentationMLi - Project presentation
MLi - Project presentation
MLi Project
 

What's hot (20)

Language Resources for Multilingual Europe
Language Resources for Multilingual EuropeLanguage Resources for Multilingual Europe
Language Resources for Multilingual Europe
 
A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...
A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...
A Strategic Research and Innovation Agenda for the Multilingual Digital Singl...
 
Cracking the Language Barrier for a Multilingual Europe
Cracking the Language Barrier for a Multilingual EuropeCracking the Language Barrier for a Multilingual Europe
Cracking the Language Barrier for a Multilingual Europe
 
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
TAUS MT Showcace, MT Applications in the EU Public Sector, Adrejs Vasiljevs, ...
 
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 1...
 
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
Europeana meeting under Finland’s Presidency of the Council of the EU - Day 2...
 
The Strategic Agenda for the Multilingual Digital Single Market V0.9
The Strategic Agenda for the Multilingual Digital Single Market V0.9The Strategic Agenda for the Multilingual Digital Single Market V0.9
The Strategic Agenda for the Multilingual Digital Single Market V0.9
 
Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Computational Morphology and the META-NET Strategic Research Agenda for Multi...Computational Morphology and the META-NET Strategic Research Agenda for Multi...
Computational Morphology and the META-NET Strategic Research Agenda for Multi...
 
Protecting Minority Languages from Digital Extinction
Protecting Minority Languages from Digital ExtinctionProtecting Minority Languages from Digital Extinction
Protecting Minority Languages from Digital Extinction
 
META-NET: Language Technology for Europe
META-NET: Language Technology for EuropeMETA-NET: Language Technology for Europe
META-NET: Language Technology for Europe
 
META-NET and META-SHARE: Language Technology for Europe
META-NET and META-SHARE: Language Technology for EuropeMETA-NET and META-SHARE: Language Technology for Europe
META-NET and META-SHARE: Language Technology for Europe
 
AI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual EuropeAI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual Europe
 
The META-NET Strategic Research Agenda for Multilingual Europe 2020
The META-NET Strategic Research Agenda for Multilingual Europe 2020The META-NET Strategic Research Agenda for Multilingual Europe 2020
The META-NET Strategic Research Agenda for Multilingual Europe 2020
 
Sustainability in OER for less used languages
Sustainability in OER for less used languagesSustainability in OER for less used languages
Sustainability in OER for less used languages
 
Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...Why the Baltics are a prime region for driving innovation in language technol...
Why the Baltics are a prime region for driving innovation in language technol...
 
The Strategic Impact of META-NET on the Regional, National and International ...
The Strategic Impact of META-NET on the Regional, National and International ...The Strategic Impact of META-NET on the Regional, National and International ...
The Strategic Impact of META-NET on the Regional, National and International ...
 
Language Technology for Multilingual Europe
Language Technology for Multilingual EuropeLanguage Technology for Multilingual Europe
Language Technology for Multilingual Europe
 
The META-NET Language White Paper Series
The META-NET Language White Paper SeriesThe META-NET Language White Paper Series
The META-NET Language White Paper Series
 
Achievement And Lessons Learned By An Loc
Achievement And Lessons Learned By An LocAchievement And Lessons Learned By An Loc
Achievement And Lessons Learned By An Loc
 
MLi - Project presentation
MLi - Project presentationMLi - Project presentation
MLi - Project presentation
 

Similar to Celtic language technologies in the digital age

The META-NET Strategic Research Agenda and Linked Open Data
The META-NET Strategic Research Agenda and Linked Open DataThe META-NET Strategic Research Agenda and Linked Open Data
The META-NET Strategic Research Agenda and Linked Open Data
Georg Rehm
 
Is MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
Is MT ready for e-Government? The Latvian Story. Indra Samite, TildeIs MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
Is MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
ABBYY Language Serivces
 
EDF2012 Aris Karanikas - PortDial
EDF2012  Aris Karanikas - PortDialEDF2012  Aris Karanikas - PortDial
EDF2012 Aris Karanikas - PortDial
European Data Forum
 
TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...
TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...
TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...
TAUS - The Language Data Network
 
Adnoddau Cymraeg - Welsh Tools
Adnoddau Cymraeg - Welsh ToolsAdnoddau Cymraeg - Welsh Tools
Adnoddau Cymraeg - Welsh Tools
Gareth Morlais
 
ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...
ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...
ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...
PretaLLOD
 
EDF2012 Stefano Bertolo - Future European activities and funding perspectiv...
EDF2012   Stefano Bertolo - Future European activities and funding perspectiv...EDF2012   Stefano Bertolo - Future European activities and funding perspectiv...
EDF2012 Stefano Bertolo - Future European activities and funding perspectiv...
European Data Forum
 
TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS - The Language Data Network
 
Cyflwyniad Bloc
Cyflwyniad BlocCyflwyniad Bloc
Cyflwyniad Bloc
canolfanbedwyr
 
Language technology market and components taxonomy
Language technology market and components taxonomyLanguage technology market and components taxonomy
Language technology market and components taxonomy
PretaLLOD
 
Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...
Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...
Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...
TAUS - The Language Data Network
 
Why language technology resources matter to Welsh and other less-used languages
Why language technology resources matter to Welsh and other less-used languagesWhy language technology resources matter to Welsh and other less-used languages
Why language technology resources matter to Welsh and other less-used languages
Gareth Morlais
 
META-NET: Towards a Strategic Research Agenda for Multilingual Europe
META-NET: Towards a Strategic Research Agenda for Multilingual EuropeMETA-NET: Towards a Strategic Research Agenda for Multilingual Europe
META-NET: Towards a Strategic Research Agenda for Multilingual Europe
Georg Rehm
 
Bne impact co_c
Bne impact co_cBne impact co_c
Centre of Competence in digitisation. Clemens Neudecker
Centre of Competence in digitisation. Clemens NeudeckerCentre of Competence in digitisation. Clemens Neudecker
Centre of Competence in digitisation. Clemens Neudecker
Biblioteca Nacional de España
 

Similar to Celtic language technologies in the digital age (15)

The META-NET Strategic Research Agenda and Linked Open Data
The META-NET Strategic Research Agenda and Linked Open DataThe META-NET Strategic Research Agenda and Linked Open Data
The META-NET Strategic Research Agenda and Linked Open Data
 
Is MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
Is MT ready for e-Government? The Latvian Story. Indra Samite, TildeIs MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
Is MT ready for e-Government? The Latvian Story. Indra Samite, Tilde
 
EDF2012 Aris Karanikas - PortDial
EDF2012  Aris Karanikas - PortDialEDF2012  Aris Karanikas - PortDial
EDF2012 Aris Karanikas - PortDial
 
TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...
TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...
TAUS Roundtable Moscow, Is MT Ready for e-Government, The Latvian Story, Indr...
 
Adnoddau Cymraeg - Welsh Tools
Adnoddau Cymraeg - Welsh ToolsAdnoddau Cymraeg - Welsh Tools
Adnoddau Cymraeg - Welsh Tools
 
ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...
ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...
ELSE IF 2019: Language Technology Market: State-of-the-Art, Trends and Value ...
 
EDF2012 Stefano Bertolo - Future European activities and funding perspectiv...
EDF2012   Stefano Bertolo - Future European activities and funding perspectiv...EDF2012   Stefano Bertolo - Future European activities and funding perspectiv...
EDF2012 Stefano Bertolo - Future European activities and funding perspectiv...
 
TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...
 
Cyflwyniad Bloc
Cyflwyniad BlocCyflwyniad Bloc
Cyflwyniad Bloc
 
Language technology market and components taxonomy
Language technology market and components taxonomyLanguage technology market and components taxonomy
Language technology market and components taxonomy
 
Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...
Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...
Spanish Language Technology Plan. David Pérez Fernández, Cabinet of State Sec...
 
Why language technology resources matter to Welsh and other less-used languages
Why language technology resources matter to Welsh and other less-used languagesWhy language technology resources matter to Welsh and other less-used languages
Why language technology resources matter to Welsh and other less-used languages
 
META-NET: Towards a Strategic Research Agenda for Multilingual Europe
META-NET: Towards a Strategic Research Agenda for Multilingual EuropeMETA-NET: Towards a Strategic Research Agenda for Multilingual Europe
META-NET: Towards a Strategic Research Agenda for Multilingual Europe
 
Bne impact co_c
Bne impact co_cBne impact co_c
Bne impact co_c
 
Centre of Competence in digitisation. Clemens Neudecker
Centre of Competence in digitisation. Clemens NeudeckerCentre of Competence in digitisation. Clemens Neudecker
Centre of Competence in digitisation. Clemens Neudecker
 

Celtic language technologies in the digital age

  • 1. Celtic Language Technologies in the Digital Age
      • John Judge, ADAPT Centre, DCU
  • 2. Background on Me
      • Background: Computational Linguist – research and real world
      • Interests in: Natural Language Processing, Text Analytics, Machine Translation, …
      • National Centre for Language Technology
      • Research Integration Coordinator for the ADAPT Centre of Excellence for Digital Content and Media Innovation
      • Focus on EU collaborations
          • META-NET
          • QT LaunchPad
          • LT Web
          • FALCON
          • Mli
          • QT21
          • TraMOOC
          • EXPERT
  • 3. ADAPT Centre
      • ADAPT Science Foundation Ireland Direct Funding over six years (until 2020)
      • Academic/Industry partnership built on top of CNGL
      • Five research themes
      • Six application areas
      • TCD and DCU co-leads; UCD and DIT partners
      • Open-ended number of industry partners
  • 6. Ambitious Metrics for Success
      • 13 Spin Out Companies
      • €5m Commercialisation Awards
      • 1,650 Top Quality Publications
      • €110m Won in Total Competitive Research
      • 500 Jobs
      • €9m From Commercial Sources
      • 60 Major EU Initiatives
      • 200 Postgraduate Students
      • 88 Licence Agreements
  • 7. Agus Gaeilge…? ("And Irish…?")
      • How much of all of this relates to Irish?
  • 10. LT is not…
      • Localised Software
      • A website in your language
      • A static online dictionary
      • But these are all VERY valuable resources for a language!
      • …and can form part of a healthy LT ecosystem
  • 11. What is LT – Where I’m coming from
      • Technology for processing information (speech, text, gestures, …) in a given language
      • An enabling technology
      • Added intelligence to both content (creation, management, etc.) and HCI
      • Set of tools and resources – part of a bigger picture and a larger ecosystem
      • Interactive
      • Not monolithic resources
  • 12. It’s already right under your noses
      • These concepts (and some others) are already being used for a wide range of applications
          • Marketing/Brand awareness
          • Customer Sentiment Analysis
          • Political barometers (Obama)
          • Information analysis and extraction (IBM Watson)
          • Offensive content filtering
          • Security applications
  • 13. A look at the Irish LT perspective
  • 14. LT Landscape in Ireland
      • Historically strong in Translation and Localisation industry
      • Home to several internationally recognised research centres
          • NCLT
          • DERI
          • CNGL >>> ADAPT
          • INSIGHT
      • Government funding for research has been consistent despite worsening economic conditions
  • 15. LT for Irish
      • Many of the basics are covered
          • Spell checker
          • Grammar checking
          • T9 predictive text, smartphone predictive text (through additional software)
          • Localisation of open source software, and many major applications
      • Some of the more advanced stuff
          • Speech synthesiser
          • Part-of-Speech Tagger
          • (Dependency Parser)
  • 16. LT for Irish
      • But there’s not much else
      • Availability of text corpora, speech corpora, parallel texts, wordnets and other LT building blocks is limited or poor
      • Some resources exist – small, narrow coverage, restricted availability
      • Lack of basic linguistic resources is stifling development of modern language processing technologies for Irish
      • Yet our own research centres are producing world-leading LT for other languages
  • 17. State of LT Support for Irish
      • Source: META-NET White Paper Series, The Irish Language in the Digital Age
  • 18. LT support by area (MT, Speech, Text Analysis, Resources)
      • MT
          • Excellent: (none)
          • Good: English
          • Moderate: French, Spanish
          • Fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
          • Weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish, Welsh
      • Speech
          • Excellent: (none)
          • Good: English
          • Moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
          • Fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
          • Weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian, Welsh
      • Text Analysis
          • Excellent: (none)
          • Good: English
          • Moderate: Dutch, French, German, Italian, Spanish
          • Fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
          • Weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
      • Resources
          • Excellent: (none)
          • Good: English
          • Moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
          • Fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
          • Weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese, Welsh
  • 19. Europe’s Languages and LT
      • Language clusters, on a scale from good support through Language Technology to weak or no support:
          • English
          • Dutch, French, German, Italian, Spanish
          • Catalan, Czech, Finnish, Hungarian, Polish, Portuguese, Swedish
          • Basque, Bulgarian, Danish, Galician, Greek, Norwegian, Romanian, Slovak, Slovene
          • Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh
  • 20. So What?
      • Take a closer look at the least equipped languages
      • Only 3 compete with English in their native countries
      • Maltese native fluency ~100% (Eurobarometer)
      • Irish and Welsh are at risk
      • So too are other RMLs which compete with any better resourced language on a day to day basis
      • Clusters shown: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian, Welsh (weak or no support); Basque, Bulgarian, Danish, Galician, Greek, Norwegian, Romanian, Slovak, Slovene
  • 21. Languages at risk in the pre-digital age
  • 22. Languages at risk in the print age
      • Invention of the moveable type printing press
      • Improved literacy
      • Standardisation
      • The Reformation
      • The Renaissance
      • The Enlightenment
      • Death of hundreds of European RMLs that never made it into print
  • 23. Languages in the Digital Age
      • The leap into the digital age has had profound effects
      • Need to equip all languages with digital resources to ensure survival
      • Otherwise they are doomed to history
      • The Celtic Languages need to address under-resourcing
  • 24. A High Level Solution - Europe
  • 25. European Level Action
      • Multilingual Europe Technology Alliance
      • Bring together Language Technology stakeholders
      • Concerted effort to influence EU research programmes for LT
      • Strategic Research Agenda for Multilingual Europe
      • Success in H2020 Funding calls – specifically in ICT 17 “Cracking the Language Barrier”
          • “… to facilitate multilingual online communication for the benefit of the digital single market which is still fragmented by language barriers that hamper a wide penetration of cross-border commerce, social communication and exchange of cultural content.”
          • “Special focus is on the 21 EU languages (both as source and target languages) that have ‘fragmentary’ or ‘weak/no’ machine translation support according to the META-NET language white papers.”
  • 26. Addressing the Gap – CRACKER Project
      • CRACKER (Feb 2015) – follow-up to META-NET. Stated goals:
          • Initiating a programme of ground-breaking actions that will deliver, by 2025, an online EU internal market free of language barriers, delivering automated translation quality, equal to currently best performing language pair/direction, in most relevant use situations and for at least 90% of the EU official languages.
          • Significantly improving the quality, coverage and technical maturity of automatic translation for at least half of the 21 EU languages that currently have "weak or no support" or "fragmentary support" of machine translation solutions, according to the META-NET Language White Papers referenced before.
          • Attracting a community of hundreds of contributors of language resources and language technology tools (from all EU Member States and Associated Countries) to adopt and support a single platform for sharing, maintaining and making use of language resources and tools; establishing widely agreed benchmarks for machine translation quality and stimulating competition between methods and systems.
  • 27. EU Actions Recap
      • The EU is calling for improved resources for our languages
      • The big players (industry and research) are organising to do something about it
      • Celtic languages can be part of this if we position ourselves to be there
  • 28. EU Actions – Getting on board
      • Riga Summit 2015, April 27-29
      • http://www.rigasummit2015.eu
      • Venue for META-FORUM
      • Multilingual Technologies for the Digital Single Market
      • Language Technologies for the Big Data Challenge and Data Economy
      • High-Quality Machine Translation
      • Towards European Language Technology Platforms
      • Strategic Agenda for the Multilingual Digital Single Market
  • 29. Summit Agenda
      • Opening addresses: H.E. Andris Bērziņš, President of the Republic of Latvia
      • First session: Setting the Strategic Agenda for the Multilingual Digital Single Market
      • Coffee break
      • Second session: Breaking the Language Barrier for Cross-Border Public Services
      • Lunch
      • Third session: Language Technology: Enabling European Business
      • Coffee break
      • Fourth session: Empowering the Multilingual Data Economy
      • Closing session: EU Innovation Excellence to Address Multilingual Challenges
  • 30. National Policy/Funding Agency Round Table
      • Roundtable session to discuss where languages and language technologies currently stand in the different countries and regions and how to improve the situation
      • Goal: Shape a Strategic Research and Innovation Agenda with input (and buy-in) directly from those responsible for our languages at a regional level
  • 31. Towards a Celtic Language Technology Community
  • 32. Languages in the Digital Age
      • Not all doom and gloom!
      • Significant opportunity: LT and language promotion/rejuvenation
      • Community effort can provide the basic building blocks
      • Techniques can do more with less
      • Policy makers can be hard to convince
      • We have to start somewhere – Celtic Language Technology Community Workshop
  • 33. Celtic Language Technology Workshop
      • “The Celtic Language Technology Workshop (CLTW) series of workshops provides a forum for researchers interested in developing NLP (Natural Language Processing) resources and technologies for Celtic languages. As Celtic languages are under-resourced, our goal is to encourage collaboration and communication between researchers working on language technologies and resources for Celtic languages.”
  • 34. First CLTW at COLING 2014
      • Held in association with COLING 2014 (top tier CL/LT conference)
      • Full day of research presentations (papers and posters)
      • Attended by about 30 people
      • Published 12 papers
      • Representing work on: Irish, Welsh, Scots Gaelic, Breton (and an invited talk that covered aspects of Manx)
      • Including an open forum session to discuss how to move the area forward
      • Endorsed by Irish Government, Ofis Publik ar Brezhoneg (among others)
  • 35. CLTW Topics of Interest
      • Language resources
      • Syntax, semantics, grammar, lexicons
      • Phonology / morphology, tagging
      • Morphological analysis
      • Part-of-speech taggers
      • Computer-Assisted Language Learning (CALL)
      • Translation memory
      • Machine translation
      • Parsing / chunking
      • Ontologies, terminology and knowledge representation
      • Speech processing / generation
      • Digital humanities
      • Corpus development / analysis
      • Treebanking
      • Evaluation methods
      • Ontology-lexica
      • Metadata
      • Linked data resources
      • Linguistic linked data resources
      • Semantic annotation
      • Information Extraction
  • 36. Workshop Outcomes
      • A great time!
      • Community forum
      • Momentum
      • Ideas for further collaboration
      • Possible EU level action to address under-resourcing
  • 38. Within the LT Community
      • Under-resourced languages are a challenge for science
      • The best researchers LOVE a challenge
      • The Celtic LT community can position itself as a provider of interesting challenges
      • BUT: We still need wider language community help to ensure adequate data is available to the R&D community
  • 39. What Can/Should We Do?
      • Concerted Community Action
      • Data is key
          • Collections of digital data in a language
              • Appropriate format
              • Appropriate annotation
              • Appropriate licence
              • Appropriately available
      • The R&D community will combine to build more sophisticated tools and solve bigger problems…
      • This should not be done in isolation by each RML community
      • Band together and also look to EU initiatives
  • 40. Celtic LT Community Efforts
      • Next CLTW – Proposal for part of LREC 2016
      • Semi-formal meet-ups (today)
      • Budding Irish LT lobby group CIGILT
      • COST (European COoperation in Science and Technology) Action
      • Reaching out further to the Humanities
      • Needs support from policy makers
      • Needs to produce results that generate buy-in from language communities
  • 41. The Grass Roots
      • Small numbers of speakers
      • Typically minority (or marginalised) languages
      • Everyone has a role to play
      • LT Community needs to speak out more
      • Show tangible benefits
  • 42. Diolch! – Thank You!
      • Me
          • jjudge@computing.dcu.ie
          • http://ie.linkedin.com/in/judgejohn/
          • http://www.adaptcentre.ie
      • CLTW
          • https://groups.google.com/forum/#!forum/celtic-language-technology
      • META-NET LWPs
          • http://www.meta-net.eu
          • http://www.meta-net.eu/whitepapers/e-book/welsh.pdf
          • http://www.meta-net.eu/whitepapers/e-book/irish.pdf
          • http://www.meta-net.eu/whitepapers/e-book/basque.pdf
      • EU initiatives
          • http://www.cracker-project.eu
          • http://www.rigasummit2015.eu

Editor's Notes

  1. If the digital age is already heavily affecting English, the lingua franca of the WORLD (e.g. "selfie" in the OED), and we already have evidence that a similar previous information revolution killed off the lingua franca of Europe and of the Church AND hundreds of RMLs, what chance do languages that are under-resourced digitally have?