MMT – Modern, Next Generation
Machine Translation
Achim Ruopp, Director of R&D
achim@taus.net
MMT Project
Horizon 2020 Innovation Action
3M € funding
3 years: 2015-2017
Goal:
deliver a large-scale commercial online machine
translation service based on a new open-source distributed
architecture.
This project has received funding from the European Union's Horizon 2020
research and innovation programme under grant agreement No 645487.
MMT Team
Business Research
Special thanks to Marcello Federico (FBK) and Ulrich Germann
(University of Edinburgh) for many of the slides!
Setting up MT for CAT today
1. Select TMs
2. Collect extra data
3. Train and evaluate engine
4. Doesn’t work? Go back to 2.
5. Analyse/process input documents
6. Apply MT on fake TM
7. Import TMs in CAT tool
8. Start translating
9. Adapt engine to new data - go back to 3.
10. New project? Go back to 1.
The MMT way
1. Drag & drop your private TMs
2. Connect your CAT tool with a key
3. Start translating!
Modern MT in a nutshell
Zero training time
Manages context
Learns from users
Scales with data and users
Prototype (April 2016) - Fast training
Context aware translation
SENTENCE: party
CONTEXT: We are going out.
TRANSLATION: fête
CONTEXT: We approved the law
TRANSLATION: parti
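The deck does not say how MMT turns context into a translation choice. As a toy illustration only, the sketch below scores the context sentence against small domain samples with bag-of-words cosine similarity and then picks the domain-specific translation. The domain names, sample texts, lexicon and function names are all invented for this example and are not part of MMT.

```python
import math
import re
from collections import Counter

# Hypothetical domain samples and lexicon, invented for this illustration.
DOMAIN_SAMPLES = {
    "leisure": "we are going out tonight to dance and celebrate with friends",
    "politics": "the parliament approved the law after the vote of the government",
}
LEXICON = {"party": {"leisure": "fête", "politics": "parti"}}


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score_domains(context):
    """Score the context sentence against every domain sample."""
    ctx = bag_of_words(context)
    return {name: cosine(ctx, bag_of_words(sample))
            for name, sample in DOMAIN_SAMPLES.items()}


def translate(word, context):
    """Pick the translation listed for the best-matching domain."""
    scores = score_domains(context)
    best_domain = max(scores, key=scores.get)
    return LEXICON[word][best_domain]


if __name__ == "__main__":
    for context in ("We are going out.", "We approved the law"):
        print(context, "->", translate("party", context))
```

A real system would weight large corpora rather than two sentences; the point here is only that the same source word can map to different targets once the context is scored.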
Prototype (March 2016)
MS Translator Hub vs Modern MT
MMT vs. Moses core language processing
● More supported languages
● Faster processing
● Simpler to use
● Tags and XML management
● Localisation of expressions
REST API
GET /translate?q=party&context=We+approved+the+law
  "translation": "parti",
  "context": [
    { "id": "europarl", "score": 0.10343984 },
    …
  ]
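A minimal client-side sketch of the request above, using only the Python standard library. It builds the query string shown on the slide and reads the `translation` and `context` fields from a response body. The function names are hypothetical, and the sample body assumes the slide's fragment is a complete JSON object once wrapped in braces; the real response may carry more fields.

```python
import json
from urllib.parse import urlencode


def build_translate_path(q, context):
    """Build the path and query string for GET /translate."""
    # urlencode uses '+' for spaces, matching the slide's example.
    return "/translate?" + urlencode({"q": q, "context": context})


def parse_translation(body):
    """Return (translation, id of the best-scoring context) from a JSON body."""
    data = json.loads(body)
    contexts = data.get("context", [])
    best = max(contexts, key=lambda c: c["score"])["id"] if contexts else None
    return data["translation"], best


# Assumption: the slide's fragment wrapped in braces to form valid JSON.
SAMPLE_BODY = (
    '{"translation": "parti", '
    '"context": [{"id": "europarl", "score": 0.10343984}]}'
)

if __name__ == "__main__":
    print(build_translate_path("party", "We approved the law"))
    print(parse_translation(SAMPLE_BODY))
```

The path would be appended to the base URL of whichever MMT server is being queried; no host is given on the slide, so none is assumed here.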
MMT Architecture
MMT Data Pooling
● Partners’ repositories:
  ○ MyMemory (Translated)
  ○ Data Cloud (TAUS)
● Volume pooled for the English-Italian prototypes:
  ○ ca. 785M words & 423M segments in total
MMT Data Collection from CommonCrawl
● commoncrawl.org – US-based non-profit
  “CommonCrawl is a 501(c)(3) non-profit organization
  dedicated to providing a copy of the internet to internet
  researchers, companies and individuals at no cost for the
  purpose of research and analysis.”
● On average 1.5 billion unique URLs per crawl
  ○ vs. an estimated 50 billion pages in the Google index and 20
    billion pages in the Microsoft Bing index
  ○ What can be considered the “surface web” vs. the “deep
    web”?
● Two questions
  1. What language are these pages in?
  2. Which pages are translations of each other?
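For the second question, one widely used heuristic is to pair URLs that differ only in a language marker (for example `/en/` vs. `/it/`). The sketch below shows that idea; it is an illustration of the heuristic, not necessarily what the MMT data collection code does. The language list, function names and example URLs are assumptions made for this example.

```python
import re
from collections import defaultdict
from itertools import combinations
from urllib.parse import urlsplit

# Illustrative language codes; a real crawl would need a much longer list.
LANGS = ("en", "it", "fr", "de")
_LANG_RE = re.compile(r"(?<![a-z0-9])(" + "|".join(LANGS) + r")(?![a-z0-9])")


def language_key(url):
    """Return (url with the language marker masked, language), or None.

    Only the path and query are inspected, so a country-code domain
    such as .de is not mistaken for a language marker.
    """
    parts = urlsplit(url.lower())
    rest = parts.path + ("?" + parts.query if parts.query else "")
    langs = _LANG_RE.findall(rest)
    if len(langs) != 1:
        return None  # no marker, or ambiguous
    return parts.netloc + _LANG_RE.sub("*", rest), langs[0]


def candidate_pairs(urls):
    """Group URLs by masked key and pair up the different languages."""
    groups = defaultdict(dict)
    for url in urls:
        key = language_key(url)
        if key is not None:
            masked, lang = key
            groups[masked].setdefault(lang, url)
    pairs = []
    for by_lang in groups.values():
        for a, b in combinations(sorted(by_lang), 2):
            pairs.append((by_lang[a], by_lang[b]))
    return pairs


if __name__ == "__main__":
    urls = [
        "http://example.com/en/about.html",
        "http://example.com/it/about.html",
        "http://example.com/contact.html",
        "http://example.org/page?lang=en",
        "http://example.org/page?lang=it",
    ]
    for pair in candidate_pairs(urls):
        print(pair)
```

Pairs found this way are only candidates: the first question (which language a page is actually in) still has to be answered on the page content before a pair can be trusted.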
Monolingual Data Including English
Monolingual Data Excluding English
Parallel Data
Projections from en→it
MMT is Open Source
LGPL/Apache licences
new core technology
github.com/ModernMT/MMT
soon: github.com/ModernMT/DataCollection
email me if you are interested
Roadmap
2015 Q1: development started
2016 Q2: first alpha release (10 langs, fast training, context aware, distributed)
2016 Q4: first beta release (45 langs, incremental learning)
2017 Q4: final release, enterprise ready
This slide may not be used or copied without permission from TAUS
THANK YOU!

More Related Content

Viewers also liked

Next generation engine immobiliser
Next generation engine immobiliserNext generation engine immobiliser
Next generation engine immobilisereSAT Journals
 
20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword UnitsKanji Takahashi
 
2 Stroke Engines
2 Stroke Engines2 Stroke Engines
2 Stroke EnginesSop3303
 
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Universitat Politècnica de Catalunya
 

Viewers also liked (6)

Windows vista
Windows vistaWindows vista
Windows vista
 
Next generation engine immobiliser
Next generation engine immobiliserNext generation engine immobiliser
Next generation engine immobiliser
 
20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units20161215Neural Machine Translation of Rare Words with Subword Units
20161215Neural Machine Translation of Rare Words with Subword Units
 
New two stroke engine
New two stroke engineNew two stroke engine
New two stroke engine
 
2 Stroke Engines
2 Stroke Engines2 Stroke Engines
2 Stroke Engines
 
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
Neural Machine Translation (D3L4 Deep Learning for Speech and Language UPC 2017)
 

Similar to MMT – Modern, Next Generation Machine Translation, Achim Ruopp (TAUS)

European Data Portal - ePSI platform webinar 8 February 2016
European Data Portal - ePSI platform webinar 8 February 2016European Data Portal - ePSI platform webinar 8 February 2016
European Data Portal - ePSI platform webinar 8 February 2016EuropeanDataPortal
 
TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...TAUS - The Language Data Network
 
Building a reliable and scalable IoT platform with MongoDB and HiveMQ
Building a reliable and scalable IoT platform with MongoDB and HiveMQBuilding a reliable and scalable IoT platform with MongoDB and HiveMQ
Building a reliable and scalable IoT platform with MongoDB and HiveMQDominik Obermaier
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?kantanmt
 
Innovate2014 ea 1833
Innovate2014 ea 1833Innovate2014 ea 1833
Innovate2014 ea 1833Paulo Lacerda
 
TiConf Australia 2013
TiConf Australia 2013TiConf Australia 2013
TiConf Australia 2013Jeff Haynie
 
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech ExpoCurrent state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech ExpoKnud Lasse Lueth
 
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech ExpoCurrent state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech ExpoIoTAnalytics
 
5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinar5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinarkantanmt
 
Towards Enterprise Interoperability Service Utilities
Towards Enterprise Interoperability Service UtilitiesTowards Enterprise Interoperability Service Utilities
Towards Enterprise Interoperability Service UtilitiesBrian Elvesæter
 
CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...
CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...
CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...Ignasi Sayol
 
20090327 Software Engineering -- What's in it for me?
20090327 Software Engineering -- What's in it for me?20090327 Software Engineering -- What's in it for me?
20090327 Software Engineering -- What's in it for me?Arian Zwegers
 
CloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture OverviewCloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture OverviewCloudLightning
 
The Semantic Technology Business: Europe
The Semantic Technology Business: EuropeThe Semantic Technology Business: Europe
The Semantic Technology Business: EuropeSaltlux Inc.
 
Rahul internet of things
Rahul internet of thingsRahul internet of things
Rahul internet of thingsRahul Tathod
 
Language Resources for Multilingual Europe
Language Resources for Multilingual EuropeLanguage Resources for Multilingual Europe
Language Resources for Multilingual EuropeGeorg Rehm
 
DTT OIC, OIP IoT platform
DTT OIC, OIP IoT platformDTT OIC, OIP IoT platform
DTT OIC, OIP IoT platformNguyen Trung
 

Similar to MMT – Modern, Next Generation Machine Translation, Achim Ruopp (TAUS) (20)

European Data Portal - ePSI platform webinar 8 February 2016
European Data Portal - ePSI platform webinar 8 February 2016European Data Portal - ePSI platform webinar 8 February 2016
European Data Portal - ePSI platform webinar 8 February 2016
 
TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...TAUS MT Showcase, MT@EC for European public administrations and online servic...
TAUS MT Showcase, MT@EC for European public administrations and online servic...
 
A Multi-agent Approach for Processing Industrial Enterprise Data
A Multi-agent Approach for Processing Industrial Enterprise DataA Multi-agent Approach for Processing Industrial Enterprise Data
A Multi-agent Approach for Processing Industrial Enterprise Data
 
Building a reliable and scalable IoT platform with MongoDB and HiveMQ
Building a reliable and scalable IoT platform with MongoDB and HiveMQBuilding a reliable and scalable IoT platform with MongoDB and HiveMQ
Building a reliable and scalable IoT platform with MongoDB and HiveMQ
 
iadaatpa gala boston
iadaatpa gala bostoniadaatpa gala boston
iadaatpa gala boston
 
What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?What is the Economic Case for Machine Translation?
What is the Economic Case for Machine Translation?
 
Innovate2014 ea 1833
Innovate2014 ea 1833Innovate2014 ea 1833
Innovate2014 ea 1833
 
IBM Think Milano
IBM Think MilanoIBM Think Milano
IBM Think Milano
 
TiConf Australia 2013
TiConf Australia 2013TiConf Australia 2013
TiConf Australia 2013
 
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech ExpoCurrent state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
 
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech ExpoCurrent state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
Current state of industrial IoT / Industrie 4.0 markets - IoT Tech Expo
 
5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinar5 challenges of scaling l10n workflows KantanMT/bmmt webinar
5 challenges of scaling l10n workflows KantanMT/bmmt webinar
 
Towards Enterprise Interoperability Service Utilities
Towards Enterprise Interoperability Service UtilitiesTowards Enterprise Interoperability Service Utilities
Towards Enterprise Interoperability Service Utilities
 
CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...
CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...
CONFERENCIA: El impacto de la Tecnología en la optimización de la cadena de s...
 
20090327 Software Engineering -- What's in it for me?
20090327 Software Engineering -- What's in it for me?20090327 Software Engineering -- What's in it for me?
20090327 Software Engineering -- What's in it for me?
 
CloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture OverviewCloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture Overview
 
The Semantic Technology Business: Europe
The Semantic Technology Business: EuropeThe Semantic Technology Business: Europe
The Semantic Technology Business: Europe
 
Rahul internet of things
Rahul internet of thingsRahul internet of things
Rahul internet of things
 
Language Resources for Multilingual Europe
Language Resources for Multilingual EuropeLanguage Resources for Multilingual Europe
Language Resources for Multilingual Europe
 
DTT OIC, OIP IoT platform
DTT OIC, OIP IoT platformDTT OIC, OIP IoT platform
DTT OIC, OIP IoT platform
 

More from TAUS - The Language Data Network

TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...TAUS - The Language Data Network
 
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...TAUS - The Language Data Network
 
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...TAUS - The Language Data Network
 
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)TAUS - The Language Data Network
 
Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
 Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann... Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...TAUS - The Language Data Network
 
A translation memory P2P trading platform - to make global translation memory...
A translation memory P2P trading platform - to make global translation memory...A translation memory P2P trading platform - to make global translation memory...
A translation memory P2P trading platform - to make global translation memory...TAUS - The Language Data Network
 
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...TAUS - The Language Data Network
 
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...TAUS - The Language Data Network
 
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...TAUS - The Language Data Network
 
The Theory and Practice of Computer Aided Translation Training System, Liu Q...
 The Theory and Practice of Computer Aided Translation Training System, Liu Q... The Theory and Practice of Computer Aided Translation Training System, Liu Q...
The Theory and Practice of Computer Aided Translation Training System, Liu Q...TAUS - The Language Data Network
 
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)TAUS - The Language Data Network
 
A use-case for getting MT into your company, Kerstin Berns (berns language c...
 A use-case for getting MT into your company, Kerstin Berns (berns language c... A use-case for getting MT into your company, Kerstin Berns (berns language c...
A use-case for getting MT into your company, Kerstin Berns (berns language c...TAUS - The Language Data Network
 

More from TAUS - The Language Data Network (20)

TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflec...
 
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
TAUS Global Content Summit Amsterdam 2019 / Measure with DQF, Dace Dzeguze (T...
 
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
TAUS Global Content Summit Amsterdam 2019 / Automatic for the People by Domin...
 
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
TAUS Global Content Summit Amsterdam 2019 / The Quantum Leap: Human Parity, C...
 
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
TAUS Global Content Summit Amsterdam 2019 / Growing Business by Connecting Co...
 
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
Achieving Translation Efficiency and Accuracy for Video Content, Xiao Yuan (P...
 
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
Introduction Innovation Contest Shenzhen by Henri Broekmate (Lionbridge)
 
Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
 Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann... Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
Game Changer for Linguistic Review: Shifting the Paradigm, Klaus Fleischmann...
 
A translation memory P2P trading platform - to make global translation memory...
A translation memory P2P trading platform - to make global translation memory...A translation memory P2P trading platform - to make global translation memory...
A translation memory P2P trading platform - to make global translation memory...
 
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
Shiyibao — The Most Efficient Translation Feedback System Ever, Guanqing Hao ...
 
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
Stepes – Instant Human Translation Services for the Digital World, Carl Yao (...
 
Farmer Lv (TrueTran)
Farmer Lv (TrueTran)Farmer Lv (TrueTran)
Farmer Lv (TrueTran)
 
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
Smart Translation Resource Management: Semantic Matching, Kirk Zhang (Wiitran...
 
The Theory and Practice of Computer Aided Translation Training System, Liu Q...
 The Theory and Practice of Computer Aided Translation Training System, Liu Q... The Theory and Practice of Computer Aided Translation Training System, Liu Q...
The Theory and Practice of Computer Aided Translation Training System, Liu Q...
 
Translation Technology Showcase in Shenzhen
Translation Technology Showcase in ShenzhenTranslation Technology Showcase in Shenzhen
Translation Technology Showcase in Shenzhen
 
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
How to efficiently use large-scale TMs in translation, Jing Zhang (Tmxmall)
 
SDL Trados Studio 2017, Jocelyn He (SDL)
SDL Trados Studio 2017, Jocelyn He (SDL)SDL Trados Studio 2017, Jocelyn He (SDL)
SDL Trados Studio 2017, Jocelyn He (SDL)
 
How we train post-editors - Yongpeng Wei (Lingosail)
How we train post-editors - Yongpeng Wei (Lingosail)How we train post-editors - Yongpeng Wei (Lingosail)
How we train post-editors - Yongpeng Wei (Lingosail)
 
A use-case for getting MT into your company, Kerstin Berns (berns language c...
 A use-case for getting MT into your company, Kerstin Berns (berns language c... A use-case for getting MT into your company, Kerstin Berns (berns language c...
A use-case for getting MT into your company, Kerstin Berns (berns language c...
 
QE integrated in XTM, by Bob Willans (XTM)
QE integrated in XTM, by Bob Willans (XTM)QE integrated in XTM, by Bob Willans (XTM)
QE integrated in XTM, by Bob Willans (XTM)
 

Recently uploaded

Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyPooja Nehwal
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024eCommerce Institute
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxraffaeleoman
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
Air breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animalsAir breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animalsaqsarehman5055
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Kayode Fayemi
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxNikitaBankoti2
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Chameera Dedduwage
 
Mathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMathematics of Finance Presentation.pptx
Mathematics of Finance Presentation.pptxMoumonDas2
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Hasting Chen
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesPooja Nehwal
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...Sheetaleventcompany
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardsticksaastr
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxmohammadalnahdi22
 
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Night 7k Call Girls Noida Sector 128 Call Me: 8448380779
Night 7k Call Girls Noida Sector 128 Call Me: 8448380779Delhi Call girls
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaKayode Fayemi
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfSenaatti-kiinteistöt
 
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 97 Noida Escorts >༒8448380779 Escort ServiceDelhi Call girls
 
Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubssamaasim06
 

Recently uploaded (20)

Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
Andrés Ramírez Gossler, Facundo Schinnea - eCommerce Day Chile 2024
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
Air breathing and respiratory adaptations in diver animals

MMT – Modern, Next Generation Machine Translation, Achim Ruopp (TAUS)

  • 1. MMT – Modern, Next Generation Machine Translation Achim Ruopp, Director of R&D achim@taus.net
  • 2. MMT Project Horizon 2020 Innovation Action 3M € funding 3 years: 2015-2017 Goal: deliver a large-scale commercial online machine translation service based on a new open-source distributed architecture. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 645487.
  • 3. MMT Team Business Research Special thanks to Marcello Federico (FBK) and Ulrich Germann (University of Edinburgh) for many of the slides!
  • 4. Setting up MT for CAT today 1. Select TMs 2. Collect extra data 3. Train and evaluate engine 4. Doesn’t work? back to 2. 5. Analyse/process input documents 6. Apply MT on fake TM 7. Import TMs in CAT tool 8. Start translating 9. Adapt engine to new data - go back to 3. 10. New project? back to 1.
  • 5. The MMT way 1. Drag & drop your private TMs 2. connect your CAT with a key 3. Start translating!
  • 6. Modern MT in a nutshell Zero training time Manages context Learns from users Scales with data and users
  • 7. Prototype (April 2016) - Fast training
  • 8. Context aware translation SENTENCE: party. CONTEXT: We are going out. TRANSLATION: fête. CONTEXT: We approved the law. TRANSLATION: parti.
  • 10. MS Translator Hub vs Modern MT
  • 11. MMT vs. Moses core language processing ● More supported languages ● Faster processing ● Simpler to use ● Tags and XML management ● Localisation of expressions
  • 12. REST API GET /translate?q=party&context=We+approved+the+law "translation": "parti", "context": [ { "id": "europarl", "score": 0.10343984 }, … ]
  • 14. MMT Data Pooling ● Partners' repositories: MyMemory (Translated), Data Cloud (TAUS) ● Volume pooled for the English-Italian prototypes: ca. 785M words & 423M segments in total
  • 15. MMT Data Collection from CommonCrawl ● commoncrawl.org, a US-based non-profit: “CommonCrawl is a 501(c)(3) non-profit organization dedicated to providing a copy of the internet to internet researchers, companies and individuals at no cost for the purpose of research and analysis.” ● On average 1.5 billion unique URLs per crawl ● Vs. an estimated 50 billion pages in Google index and 20 billion pages in Microsoft Bing index ● What can be considered the “surface web” vs. the “deep web”? ● Two questions: 1. What language are these pages in? 2. Which pages are translations of each other?
  • 19. MMT is Open Source: LGPL/Apache licences; new core technology; github.com/ModernMT/MMT; soon: github.com/ModernMT/DataCollection; email me if you are interested
  • 20. Roadmap 2015 Q1: development started. 2016 Q2: first alpha release (10 langs, fast training, context aware, distributed). 2016 Q4: first beta release (45 langs, incremental learning). 2017 Q4: final release, enterprise ready.
  • 21. This slide may not be used or copied without permission from TAUS THANK YOU!
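
The REST API call on slide 12 can be sketched in Python as follows. This is a minimal illustration, not the project's client code. The host and port in `BASE_URL` are hypothetical. The helper names `build_translate_url` and `parse_response` are invented for this example. The sample body assumes that the fragment shown on the slide is wrapped in a single JSON object. Only the `/translate` path, the `q` and `context` parameters, and the `translation`, `context`, `id` and `score` fields come from the slide.

```python
import json
from urllib.parse import urlencode

# Hypothetical host and port; the slide only shows the path and the query string.
BASE_URL = "http://localhost:8045"


def build_translate_url(sentence, context, base_url=BASE_URL):
    """Build the GET URL for /translate with the sentence and its context."""
    # urlencode encodes spaces as '+', matching the query string on the slide.
    query = urlencode({"q": sentence, "context": context})
    return f"{base_url}/translate?{query}"


def parse_response(body):
    """Return the translation and the matched contexts, highest score first."""
    data = json.loads(body)
    contexts = sorted(data.get("context", []),
                      key=lambda c: c["score"], reverse=True)
    return data["translation"], contexts


# Sample body built from the fields on the slide; the enclosing braces are an
# assumption, and the entries elided on the slide are left out.
SAMPLE_BODY = (
    '{"translation": "parti", '
    '"context": [{"id": "europarl", "score": 0.10343984}]}'
)

url = build_translate_url("party", "We approved the law")
translation, contexts = parse_response(SAMPLE_BODY)
print(url)
print(translation, contexts[0]["id"], contexts[0]["score"])
```

Passing the context sentence alongside the query lets the service pick "parti" rather than "fête" for "party", as in the example on slide 8.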