TAUS ROUNDTABLE 2014
22 May / Moscow (Russia)
THURSDAY, 22 May / 12:40 – 13:10
Is MT Ready for e-Government? The Latvian Story
Indra Samite, Tilde
Indra Sāmīte
indra.samite@tilde.com
Is MT ready for e-gov?
The Latvian story
• about Tilde
• eGovernment and the
challenge of language
diversity
• European language
landscape
• promise of MT
• MT for eGov in Latvia
• data challenge and
META-NET
• MT for Europe
outline
• Language technology developer
• Localization service provider
• Leadership in smaller languages
• Offices in Riga (Latvia), Tallinn
(Estonia) and Vilnius (Lithuania)
• 130 employees
• Strong R&D team
• 9 PhDs and candidates
• Trusted partner of the EU for
significant research projects
G2C
Government to Citizens
G2B
Government to Businesses
G2E
Government to Employees
G2G
Government to Governments
C2G
Citizens to Governments
eGovernment
transparency
customers on-line, NOT in line
efficiency
increase participation
reach marginalized groups
Goals of
eGovernment
• providing information
regulatory services, public hearing schedules,
issue briefs, notifications, etc.
• two-way communications with the citizen, a
business, or another government agency
• dialogue with agencies to post problems,
comments, or requests
• conducting transactions
lodging tax returns, applying for services and
grants, etc.
• Enabling citizen transition from passive
information access to active participation by:
– informing the citizen
– representing the citizen
– encouraging the citizen to vote
– consulting the citizen
– involving the citizen
eGov activities
80 European languages
23 official languages
New countries and
languages joining soon
Croatia, Iceland
Multilingual
Europe
98%
Luxembourg
95%
Latvia
94%
Netherlands
93%
Malta
92%
Slovenia
Lithuania
91%
Sweden
EU countries
where people
can speak
at least
one language
in addition
to their mother
tongue
65%
Hungary
62%
Italy
61%
United Kingdom
Portugal
60%
Ireland
EU countries
where the majority of
people
cannot
speak any
foreign language
What role does
translation play
in your everyday life?
is translation
important
?
% of EU population:
43% important
16% very important
30% no role
The importance of languages was
emphasized in the Council Resolution
on linguistic diversity of 14 February
2002 on acknowledging the part
played by languages in social,
economic and political integration,
particularly in an enlarged Europe.
Linguistic diversity is one of the
operating principles of the European
institutions. The Treaty on European
Union entitles every citizen to write to
any of the institutions in one of these
languages and to have an answer in
the same language (Article 21).
EU Policy
Preserving the European cultural and
linguistic diversity in the united
information and knowledge society
Securing at affordable costs the free flow
of information and thought across
language boundaries in the resulting
single information space
Providing each language community with
the most advanced technologies for
communication, information and
knowledge management so that
maintaining their mother tongue does not
turn into a disadvantage
Challenge
Credits: Hans Uszkoreit
EU MULTILINGUALITY
IN PRACTICE:
CASE STUDY
In October 2010, a Spanish lawyer turned to the Ombudsman,
complaining that many public consultations are only published
in English, for example, consultations concerning a new
partnership to help small and medium-sized enterprises and
concerning the freedom of movement of workers.
“The Commission should
ensure that all European
citizens are able to
understand its public
consultations,
which should [..]
be published in all the
official languages.
Its failure to do so is an
instance of
maladministration.”
4 October 2012
The European Ombudsman,
P. Nikiforos Diamandouros
[European] Commission [has] to ensure that every EU citizen's right to address the EU institutions in any of
the EU official languages is fully respected and implemented by ensuring that public consultations are
available in all EU official languages,[..] and that there is no language-based discrimination [..]
European Parliament resolution 2012/2676(RSP)
Fulfill the vision of
e-Government
AND the promise of
language diversity
The eGovernment
Challenge
machine translation
Bridge the language
barrier
Speak your citizens’
languages
Promote diversity
Overcome barriers to
communication
Grow business
World peace
MT for eGov
What MT
serves best
short shelf life
immediacy
large volume
multiple languages
Where it works
embedded in web
pages
multilingual online
services
social media
multilingual chat
mobile devices
customizable,
trainable
domain specific
on-demand
in the cloud
real-time
security
privacy
Specific
requirements
• EU Official
languages: 23
• EC procedural
languages: 3 (EN,
FR, DE)
• DGT: 1750 linguists
and 600 support staff
• Where: in Brussels,
Luxembourg and in
local offices in
Member States
DGT Translation
The past: ECMT
• Rule-based machine translation
• Developed between 1975 and 1998
• 28 language pairs available (ten
languages)
• Since 2006 only linguistic maintenance
work on a
couple of systems
• Suspended in 12/2010
The future: MT@EC
• 05/2010 Commission Task Force
confirmed need
for new MT for the Commission
• 06/2010 Action plan approved by
management
• 09/2010 Work started for MT@EC
Machine
Translation
@
European
Commission
• Based on data-driven MT
technology
• Making best use of Commission
language resources
• Making best use of internal
linguistic expertise (1700
translators for 23 languages)
• Open and flexible
• Ensuring technological
independence
• Being built by DG Translation
• Started: summer 2010
• Deploy: summer 2013
MT@EC
Open to the market
Language technology watch (continuous)
Linguistic interventions - demonstration
projects in 2011
Comparison of baseline engines to market
offerings - 2012
… and to research
Using Moses
A major institutional user of MT
Involvement in projects (e.g. Multilingual
web)
Conferences for EU institutions staff (e.g.
EM+ workshop)
Provider of language resources…
MT@EC
Credits: Spyros Pilos, DGT
The DGT Multilingual
Translation Memory
of the
Acquis
Communautaire
http://langtech.jrc.it/DGT-TM.html
JRC-Acquis
The total body of European Union law applicable in the EU Member States
http://langtech.jrc.it/JRC-Acquis.html
Data for SMT training
META-NET Language
Whitepapers
30 European languages
Analyzing language
readiness for the digital
age
21 languages under
long-term threat due to
inadequate technological
support
www.meta-net.eu/whitepapers
Strategic Research Agenda
• Europe-wide social and business
networking in native language
• Mobile and internet services in native
language for e-Commerce, education,
travel, entertainment, etc.
• eGovernment reaching all linguistic
groups and enabling political
discussion across borders
• Unlimited TV/movie cross-language
subtitling/interpretation
• Ever-present Personal Interpreter
• Translingual Spaces: dedicated
locations for ambient interpretation
Vision:
Applications
needed by EU
citizens and
businesses
• A ubiquitous online platform combining automatic
translation, language checking, post-editing, as well
as human creativity and quality assurance
• for generic and special-purpose
services
• free for small volume use and for high-
volume baseline quality
• involve providers of computer
supported HQ human translation
• business opportunities for a wide
range of service and technology
providers
• Assured privacy, confidentiality and security
provided by trusted service centers
• Quality upscale models: instant quality upgrades
• Domain and Task specialization
Vision: Services for
the EU Society and
Citizens
Ubiquitous translation
services for a full range of
quality levels, fast, affordable
Covers written and spoken
language from formal
language to chats and social
networking
Multi-media multi-language
content delivery
• On mobiles, tablets, PCs, etc.
Vision: Services for
the EU Society and
Citizens
Large cooperating projects
Sharing infrastructures: resources, evaluation
Smaller projects – providing building
blocks
National languages (resources, technologies)
Component technologies
Combined funding (EC, national, private)
Inclusion of industry and translation
professionals in the entire research and
innovation process
Solving legal hurdles on using data for
research
Connection to CEF
Infrastructural support (selected areas)
Resources, evaluation suites, organisation of
challenges
Organisation of
Research and
Innovation
Case Study:
LATVIA
The Latvian Story
population 2.1 million
1.6 million native
Latvian speakers
large Russian speaking
population (36%)
the
situation in
Latvia
fewer than 10 million speakers
lack of parallel data
(corpora)
complex language
structure
highly inflected
language components
underdeveloped
terminology
Under-resourced
provide e-services to all the
population / linguistic groups
develop technologies for
supporting Latvian in
information society
facilitate access to the
information of European
Union institutions
integrate in the infrastructure
of EU multilingual services
MT
@
eGov.LV
large corpus of parallel
data
large corpus of
monolingual data
MT core system
Infrastructure
language specific tools
such as morphology tools
What is
necessary to
develop
statistical MT
online translation
service
translation widget for
integration in eGov
service sites
standardized API for
universal integration
Integration
Solution
for
MT
@
eGov.LV
the digitalization of culture
Solution
for
MT
@
eGov.LV
custom
machine
translation
as easy
and
affordable
as
a cup
of coffee
• upload your data
TMX, XLIFF, DOC, PDF, XLZ, TXT
• combine it with the data on the LetsMT
public repository
• generate your custom MT
with a few mouse clicks
• run your MT system
on the LetsMT cloud
• use it in your CAT tool
with LetsMT plug-in
• integrate through LetsMT API
in your online or desktop app
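As an illustration of the first step above (uploading data such as TMX), here is a minimal sketch of how sentence pairs can be read out of a TMX translation memory before they are used as parallel training data. This is not LetsMT code: the function name `read_tmx_pairs` and the sample data are hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(tmx_text, src_lang, tgt_lang):
    """Return (source, target) sentence pairs found in a TMX document.

    Hypothetical helper for illustration; language codes are compared on
    their primary subtag only, so "en-GB" matches "en".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):  # one translation unit per <tu>
        segments = {}
        for tuv in tu.findall("tuv"):  # one language variant per <tuv>
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                text = "".join(seg.itertext()).strip()
                segments[lang.split("-")[0]] = text
        # Keep only units that have non-empty text in both languages.
        if segments.get(src_lang) and segments.get(tgt_lang):
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs


# Made-up sample data, not taken from any real translation memory.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="example" creationtoolversion="1"
          o-tmf="none"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Good morning</seg></tuv>
      <tuv xml:lang="lv"><seg>Labrīt</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en-GB"><seg>Thank you</seg></tuv>
      <tuv xml:lang="lv-LV"><seg>Paldies</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>No translation yet</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = read_tmx_pairs(SAMPLE_TMX, "en", "lv")
for source, target in pairs:
    print(source, "|", target)
```

The third translation unit is skipped because it has no Latvian segment; a real pipeline would typically also deduplicate and filter the pairs before training.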
1.4 billion
parallel sentences
102
languages
129
MT systems trained
*status on 12-10-2012
custom
terminology
incremental
data
custom
MT
cloud-based
terminology services
for
term extraction
and
multilingual term
glossary creation
for
human and
machine translation
Machine translation
bringing governments and citizens closer
tilde.com
Copyright: Stocklib © Robert Wilson
The research within the projects META-NET, LetsMT! and TaaS has received funding from the European Commission ICT
Policy Support Programme (ICT PSP) and FP7 Programme
