Apertium: Free/open-source rule-based machine
translation and language processors
Mikel L. Forcada
Universitat d'Alacant, E-03690 Sant Vicent del Raspeig, Spain
Riga TAUS Roundtable, June 1, 2016
What is Apertium?
Apertium (since 2005) is
a free/open-source platform for shallow-transfer rule-based machine
translation
which is collaboratively developed
and provides:
A configurable, language-independent machine translation engine,
Data (dictionaries, rules) for more than 40 language pairs (in XML
and text-based formats), and
lots of tools for developers and users.
Pipeline architecture
A pipelined architecture allows for easy customization and diagnostics.
[Figure: Apertium pipeline. SL text → deformatter → morph. analyser → morph. disambig. → lexical transfer → lexical selection → structural transfer → morph. generator → post-generator → reformatter → TL text]
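The pipelined design can be sketched as a chain of text-in, text-out stages. The snippet below is only an illustration under loud assumptions: the three stage functions and the two-entry dictionary are toy stand-ins invented here, not Apertium's modules or data. It shows why a pipeline helps with diagnostics: running it only up to a named stage exposes the intermediate output.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[str], str]]

# Hypothetical two-entry bilingual dictionary (Spanish to Catalan), for illustration only.
TOY_BILINGUAL = {"gato": "gat", "negro": "negre"}


def analyse(text: str) -> str:
    # Toy "analysis": put one token per line.
    return "\n".join(text.split())


def transfer(text: str) -> str:
    # Toy lexical transfer: look each token up; flag unknown words with '*'.
    return "\n".join(TOY_BILINGUAL.get(tok, "*" + tok) for tok in text.split("\n"))


def generate(text: str) -> str:
    # Toy generation: join the tokens back into running text.
    return " ".join(text.split("\n"))


PIPELINE: List[Stage] = [
    ("analyser", analyse),
    ("lexical transfer", transfer),
    ("generator", generate),
]


def run_pipeline(stages: List[Stage], text: str, upto: Optional[str] = None) -> str:
    """Run the stages in order; stop after the stage named `upto`, if given."""
    for name, stage in stages:
        text = stage(text)
        if name == upto:
            break
    return text


if __name__ == "__main__":
    print(run_pipeline(PIPELINE, "gato negro"))
    # Inspect what the first stage alone produces:
    print(repr(run_pipeline(PIPELINE, "gato negro", upto="analyser")))
```

Because every stage has the same text-to-text interface, customizing the chain amounts to replacing one entry in the list.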
Languages and language pairs
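[Figure: graph of available language pairs. Languages shown (ISO 639-3 codes): afr, nld, arg, cat, ita, bre, fra, spa, cym, eng, glg, dan, nno, nob, ast, por, ron, epo, eus, hbs, mkd, slv, bul, ind, zsm, isl, swe, kaz, tat, mlt, ara, oci, sme, urd, hin]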
Apertium loves small languages
Some unique MT systems for small languages:
Breton→French
Aragonese↔Spanish
Occitan↔Catalan
Aragonese↔Catalan
Occitan↔Spanish
North Sámi→Norwegian
To love is to give: e.g. provide small languages with
language resources, and
computational-linguistic descriptions of their language.
What is Apertium good for?
Apertium is best suited to translating between related languages. Some
examples in Apertium:
Spanish ↔ Portuguese
Norwegian Nynorsk ↔ Norwegian Bokmål
Slovenian ↔ Croatian
Tatar ↔ Kazakh
Postediting Apertium output in these cases may save time compared to
translation from scratch.
It is also being used for less-related language pairs in gisting applications.
Apertium is collaboratively developed
Apertium licensing: free/open-source
Apertium language data and code are both licensed under the GNU
General Public License:
a free/open-source license allowing free distribution of unmodified and
modified versions
a copylefted license: it avoids private appropriation and encourages
giving improvements back to the project (a commons) → community
Apertium is collaboratively developed
Very active group of hundreds of developers (freelance developers,
researchers, industrial partners).
Wiki documentation (wiki.apertium.org) in addition to formal
documents.
Help available at IRC channel #apertium in freenode.net
Mailing lists: apertium-stuff@lists.sf.net and other
language-specific lists
Research and business with Apertium
Apertium is already an active research and business platform:
Research: 40+ publications, 2 PhD theses, 4 master's theses
Business: companies (Prompsit, Eleka, Imaxin Software, etc.)
offering services to customers such as Autodesk, the Government of
Catalonia, one of the main Basque banks, the daily newspaper La Voz
de Galicia, etc.
The free/open-source model creates a community which effectively
connects researchers, developers, vendors and users.
Becoming an Apertium user
Professional translators can:
use Apertium offline plugins in the OmegaT free/open-source CAT
environment.
(as with any other system) easily align source and MT to generate
machine translation memories to feed into other CAT systems
Muggles can use:
a stand-alone Java application for the desktop: apertium-caffeine
an Android version for handhelds
a stand-alone version (Apertium Simpleton) for Windows and MacOS
a plug-in for the OmegaT CAT platform: apertium-omegat
Becoming an Apertium developer
It's easy to become an Apertium developer. It just takes
reasonable computing skills (XML, shell commands, etc.), which are
not too hard to acquire,
good translation skills.
In no time, developers find themselves contributing to a language pair with
the support of the community.
A nice side effect: monolingual resources
When developing a language pair, monolingual language resources are
developed, such as
morphological dictionaries
morphological disambiguation rules and probabilities
The corresponding monolingual processors are available to help statistical
machine translation deal, for instance, with languages having a challenging
morphology.
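As a rough illustration of how such monolingual output might be consumed downstream, the sketch below parses text in the Apertium stream format, where each lexical unit is written as `^surface/lemma<tag>...$`, alternative analyses are separated by `/`, and unknown words carry a `*`. The sample string is hand-written for this example rather than real analyser output, the helper names are hypothetical, and escaped characters are ignored for brevity. Reducing each word to its lemma is one simple way to lessen data sparsity for a morphologically rich language.

```python
import re
from typing import List, NamedTuple


class Analysis(NamedTuple):
    lemma: str
    tags: List[str]


class LexicalUnit(NamedTuple):
    surface: str
    analyses: List[Analysis]


# One lexical unit: ^surface/analysis1/analysis2$ (escapes not handled here).
LU_RE = re.compile(r"\^([^/$]*)/([^$]*)\$")
TAG_RE = re.compile(r"<([^>]+)>")

# Hand-written sample, not actual analyser output.
SAMPLE = "^cats/cat<n><pl>$ ^run/run<n><sg>/run<vblex><inf>$ ^xyzzy/*xyzzy$"


def parse_stream(stream: str) -> List[LexicalUnit]:
    """Split a stream-format string into lexical units with their analyses."""
    units = []
    for surface, rest in LU_RE.findall(stream):
        analyses = []
        for reading in rest.split("/"):
            if reading.startswith("*"):
                continue  # unknown word: no analysis available
            lemma = reading.split("<", 1)[0]
            analyses.append(Analysis(lemma, TAG_RE.findall(reading)))
        units.append(LexicalUnit(surface, analyses))
    return units


def lemmatise(stream: str) -> List[str]:
    """Take the first lemma of each unit, falling back to the surface form."""
    return [u.analyses[0].lemma if u.analyses else u.surface
            for u in parse_stream(stream)]


if __name__ == "__main__":
    for unit in parse_stream(SAMPLE):
        print(unit)
    print(lemmatise(SAMPLE))
```

Taking the first analysis is a deliberate simplification; in practice a disambiguation step would choose among the alternatives.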
Success cases
Apertium is a mature technology which is used:
in Wikimedia Content Translation to generate Wikipedia content in
other languages,
to produce a Catalan edition of the Valencia daily newspaper
Levante-EMV,
by universities in the Catalan-speaking area to help in the generation
of courseware and academic information,
in PLATA, the Spanish government platform for on-the-fly
machine translation of public-service webpages.
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@
 
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
 
LANDMARKS AND MONUMENTS IN NIGERIA.pptx
LANDMARKS  AND MONUMENTS IN NIGERIA.pptxLANDMARKS  AND MONUMENTS IN NIGERIA.pptx
LANDMARKS AND MONUMENTS IN NIGERIA.pptx
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
 
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrSaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
 
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
Navi Mumbai Call Girls Service Pooja 9892124323 Real Russian Girls Looking Mo...
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
 

Apertium: Free Open-Source Rule-Based MT Platform for 40+ Languages

  • 1. Apertium: Free/open-source rule-based machine translation and language processors. Mikel L. Forcada, Universitat d'Alacant, E-03690 Sant Vicent del Raspeig, Spain. Riga TAUS Roundtable, June 1, 2016.
  • 2. What is Apertium? Apertium (since 2005) is a free/open-source platform for shallow-transfer rule-based machine translation which is collaboratively developed and provides: a configurable, language-independent machine translation engine; data (dictionaries, rules) for more than 40 language pairs (in XML and text-based formats); and lots of tools for developers and users.
  • 3. What is Apertium? Pipeline architecture. A pipelined architecture allows for easy customization and diagnostics. [Diagram] SL text → deformatter → morph. analyser → morph. disambig. → lexical transfer → lexical selection → structural transfer → morph. generator → post-generator → reformatter → TL text.
  • 4. What is Apertium? Languages and language pairs. [Diagram of language pairs; languages shown] afr, nld, arg, cat, ita, bre, fra, spa, cym, eng, glg, dan, nno, nob, ast, por, ron, epo, eus, hbs, mkd, slv, bul, ind, zsm, isl, swe, kaz, tat, mlt, ara, oci, sme, urd, hin.
  • 5. What is Apertium? Apertium loves small languages. Some unique MT systems for small languages: Breton→French, Aragonese↔Spanish, Occitan↔Catalan, Aragonese↔Catalan, Occitan↔Spanish, North Sámi→Norwegian. To love is to give: e.g. provide small languages with language resources, and computational-linguistic descriptions of their language.
  • 6. What is Apertium good for? Apertium is basically good at translating between related languages. Some examples in Apertium: Spanish ↔ Portuguese, Norwegian Nynorsk ↔ Norwegian Bokmål, Slovenian ↔ Croatian, Tatar ↔ Kazakh. Postediting Apertium output in these cases may save time compared to translation from scratch. It is also being used for less-related language pairs in gisting applications.
  • 7. Apertium is collaboratively developed. Apertium licensing: free/open-source. Apertium language data and code are both licensed under the GNU General Public License: a free/open-source license allowing free distribution of unmodified and modified versions; a copylefted license: it avoids private appropriation and encourages giving improvements back to the project (a commons) → community.
  • 8. Apertium is collaboratively developed. Very active group of hundreds of developers (freelance developers, researchers, industrial partners). Wiki documentation (wiki.apertium.org) in addition to formal documents. Help available at IRC channel #apertium in freenode.net. Mailing lists: apertium-stuff@lists.sf.net and other language-specific lists.
  • 9. Apertium is collaboratively developed. Research and business with Apertium. Apertium is already an active research and business platform. Research: 40+ publications, 2 PhD theses, 4 master's theses. Business: companies (Prompsit, Eleka, Imaxin Software, etc.) offering services to customers such as Autodesk, the Government of Catalonia, one of the main Basque banks, the daily newspaper La Voz de Galicia, etc. The free/open-source model creates a community which effectively connects researchers, developers, vendors and users.
  • 10. Becoming an Apertium user. Professional translators can: use Apertium offline plugins in the OmegaT free/open-source CAT environment; (as with any other system) easily align source and MT to generate machine translation memories to feed into other CAT systems. Muggles can use: a stand-alone Java application for the desktop (apertium-caffeine); an Android version for handhelds; a stand-alone version (Apertium Simpleton) for Windows and MacOS; a plug-in for the OmegaT CAT platform (apertium-omegat).
  • 11. Becoming an Apertium developer. It's easy to become an Apertium developer. It just takes reasonable computing skills (XML, shell commands, etc.), which are not too hard to acquire, and good translation skills. In no time, developers find themselves contributing to a language pair with the support of the community.
  • 12. A nice side effect: monolingual resources. When developing a language pair, monolingual language resources are developed, such as morphological dictionaries and morphological disambiguation rules and probabilities. The corresponding monolingual processors are available to help statistical machine translation deal, for instance, with languages having a challenging morphology.
  • 13. Success cases. Apertium is a mature technology which is used: in Wikimedia Content Translation to generate Wikipedia content in other languages; to produce a Catalan edition of the Valencia daily newspaper Levante-EMV; by universities in the Catalan-speaking area to help in the generation of courseware and academic information; in PLATA, the Spanish government platform for on-the-fly machine translation of public-service webpages.
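
The pipeline on slide 3 can be pictured as a chain of small, independent stages, which is what makes it easy to swap one stage or stop midway to inspect intermediate output. The following is only a toy sketch of that idea, not Apertium's actual code or data formats: the dictionaries, the tag strings, the `run` helper and its `upto` parameter are all invented for illustration, and the stages marked as pass-through are left empty on purpose.

```python
from typing import Any, Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[Any], Any]]

# Toy Spanish->Catalan data, invented for illustration only
# (these are NOT real Apertium dictionaries or tag formats).
ANALYSES = {"gatos": ("gato", "n.m.pl"), "negros": ("negro", "adj.m.pl")}
BILINGUAL = {"gato": "gat", "negro": "negre"}
GENERATION = {("gat", "n.m.pl"): "gats", ("negre", "adj.m.pl"): "negres"}


def analyse(words: List[str]) -> List[Tuple[str, str]]:
    # Surface form -> (lemma, tags); unknown words pass through untagged.
    return [ANALYSES.get(w, (w, "")) for w in words]


def transfer(units: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    # Replace each source lemma with its target lemma, keeping the tags.
    return [(BILINGUAL.get(lemma, lemma), tags) for lemma, tags in units]


def generate(units: List[Tuple[str, str]]) -> List[str]:
    # (lemma, tags) -> target surface form; fall back to the bare lemma.
    return [GENERATION.get(unit, unit[0]) for unit in units]


def passthrough(data: Any) -> Any:
    # Placeholder for stages this sketch does not model.
    return data


# Stage names and order follow the slide; bodies are hypothetical.
PIPELINE: List[Stage] = [
    ("deformatter", str.split),
    ("morph. analyser", analyse),
    ("morph. disambig.", passthrough),
    ("lexical transfer", transfer),
    ("lexical selection", passthrough),
    ("structural transfer", passthrough),
    ("morph. generator", generate),
    ("post-generator", passthrough),
    ("reformatter", " ".join),
]


def run(text: str, upto: Optional[str] = None) -> Any:
    """Run the stages in order; stop after the stage named `upto`, if given."""
    data: Any = text
    for name, stage in PIPELINE:
        data = stage(data)
        if name == upto:
            break
    return data


if __name__ == "__main__":
    print(run("gatos negros"))
    print(run("gatos negros", upto="morph. analyser"))
```

Stopping at a named stage with `upto` mimics the diagnostic use mentioned on the slide: the output of any intermediate stage can be examined without running the rest of the chain.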