The Web, The Database and
The Neural
Garth Hedenskog, Sales Director
Pangeanic TAUS Girona, 13 June 2017
• National research project CDTI
• Workflow system with built-in crawler
• PM-less tracking and workflow initiation
• Powerful tool incorporating Pangeanic’s new technology: ActivaTM and
PangeaMT Neural
ELASTIC CENTRALIZED TM SYSTEM
• FEATURES:
• CAT tool agnostic
• Integrates with Cor
• Hosting options
• Tag handling capabilities
• API to NMT
• Triangulation
…summary
Our story...
• First translation company in the world to make commercial use of
Moses.
• Won a post-editing contract in 2007 to work for the European
Commission as MT output post-editors.
THAT WAS THEN, THIS IS NOW
• Pangeanic’s consortium, along with KantanMT, Prompsit and Tilde,
was awarded the largest EU contract by CEF (Connecting Europe
Facility) to supply infrastructure services to the European Union in
the field of Digital Service Infrastructures, and particularly machine
translation: IADAATPA (Intelligent, Automatic Domain Adapted
Automated Translation for Public Administrations).
Training corpus
      Sentences   Running words   Vocabulary
EN    4,6M        55,9M           491,6K
JA    4,6M        76,0M           283,8K

Dev corpus
      Sentences   Running words   OOVs
EN    1,9K        24,1K           1,32
JA    1,9K        32,7K           0,86

Test corpus
      Sentences   Running words   OOVs   Average length in characters   Average number of tokens
EN    2K          27,1K           1,80   77                             14,12
JA    2K          37,0K           1,14   59                             19,08

Training data:
• TAUS data for Electronics Computer Hardware (ECH) plus SOFT (IT) 4,6M sentences / 56M words (EN)
• EN and JA tokenized (tokenizer.perl and Mecab respectively)
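
As a rough illustration of where figures like the ones in the corpus tables come from, here is a minimal Python sketch. It assumes the text is already tokenized and whitespace-separated, and it assumes the OOV figure is the percentage of running words in the dev or test set that never occur in the training data; the slides do not state how OOVs were computed. The function names and sample sentences are hypothetical.

```python
from collections import Counter


def corpus_stats(sentences):
    """Count sentences, running words and vocabulary size for tokenized text."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return {
        "sentences": len(sentences),
        "running_words": sum(counts.values()),
        "vocabulary": len(counts),
    }


def oov_rate(train_sentences, eval_sentences):
    """Percentage of running words in eval_sentences unseen in training.

    Assumption: this is one common definition of the OOV rate, not
    necessarily the one used for the slides.
    """
    vocab = {tok for s in train_sentences for tok in s.split()}
    eval_tokens = [tok for s in eval_sentences for tok in s.split()]
    if not eval_tokens:
        return 0.0
    unseen = sum(1 for tok in eval_tokens if tok not in vocab)
    return 100.0 * unseen / len(eval_tokens)


# Hypothetical toy data, not taken from the TAUS corpus.
train = ["press the power button", "connect the usb cable"]
dev = ["press the reset button"]

print(corpus_stats(train))
print(oov_rate(train, dev))
```

On real data the input would be the output of tokenizer.perl (EN) or Mecab (JA), one sentence per line.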
Results EN->JP:
            BLEU    TER        WER
PangeaMT    43,25   0,493174   0,607223
NMT         44,53   0,422858   0,473214
Seemingly... not such a big difference
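
For readers less familiar with the metrics, WER (word error rate) can be sketched as the word-level edit distance between the MT output and the reference, divided by the reference length. The Python below is a minimal sketch under that standard definition, not the scoring tool used for the slides. TER is similar in spirit but also allows block shifts, which this sketch does not implement. Lower is better for both; higher is better for BLEU.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference word
                            curr[j - 1] + 1,      # insert a hypothesis word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate of a hypothesis against a single reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one deletion plus one insertion over 4 reference words.
print(wer("press the power button", "press power button now"))
```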
Results EN->JP by length:
Sentence length   System     BLEU    TER        WER
0-10 words        PangeaMT   44,00   0,42865    0,471268
                  NMT        40,59   0,39868    0,414078
11-15 words       PangeaMT   42,80   0,46528    0,591708
                  NMT        46,00   0,353941   0,393642
16-20 words       PangeaMT   41,08   0,485096   0,617126
                  NMT        43,43   0,392998   0,443898
21-25 words       PangeaMT   39,95   0,491183   0,649891
                  NMT        42,04   0,407965   0,476323
26-30 words       PangeaMT   39,08   0,539768   0,693745
                  NMT        39,86   0,461081   0,529578
31+ words         PangeaMT   35,38   0,565217   0,713226
                  NMT        35,65   0,561833   0,630695

• In shorter sentences (0-10 words), our SMT system scores better in BLEU, but on TER and WER
NMT scores better, which means less post-editing effort.
• In sentences of 11-25 words, NMT scores better on BLEU, TER and WER in every bucket.
• In longer sentences (26+ words), NMT tends toward the same results as PangeaMT.
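
A breakdown by length like the one above can be produced by grouping test sentences into buckets before scoring each bucket separately. The Python below is a minimal sketch of the grouping step only. It assumes length is counted in whitespace-separated source words, which the slides do not specify; the names `BUCKETS`, `bucket_label` and `group_by_length` are hypothetical.

```python
# Length buckets matching the table: 0-10, 11-15, 16-20, 21-25, 26-30, 31+.
BUCKETS = [(0, 10), (11, 15), (16, 20), (21, 25), (26, 30), (31, None)]


def bucket_label(n_words):
    """Return the label of the length bucket that n_words falls into."""
    for low, high in BUCKETS:
        if high is None:
            if n_words >= low:
                return f"{low}+"
        elif low <= n_words <= high:
            return f"{low}-{high}"
    raise ValueError(f"no bucket for length {n_words}")


def group_by_length(sources):
    """Map each bucket label to the indices of the sentences it contains."""
    groups = {}
    for idx, sentence in enumerate(sources):
        label = bucket_label(len(sentence.split()))
        groups.setdefault(label, []).append(idx)
    return groups


# Hypothetical toy input: two short sentences and one 12-word sentence.
sources = ["a b c", " ".join(["w"] * 12), "x y"]
print(group_by_length(sources))
```

The indices in each group can then be used to select the matching hypotheses and references and to score BLEU, TER and WER per bucket.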
Evaluation: Translation companies and professional freelance translators
Tests in FR/IT/DE/ES, RU and PT point to a very strong preference for NMT (results available on our blog).
On average, around 85%-90% of a random set of 250 sentences were rated good or very good (A or B).
ES/PT/IT results are similar to FR.

EN-DE set of 250 sentences
         NMT          SMT
A        132   53%    34    14%
B        98    39%    95    38%
C        14    6%     97    39%
D        6     2%     24    10%
Total    250          250

EN-FR set of 250 sentences
         NMT          SMT
A        150   60%    39    16%
B        76    30%    126   50%
C        21    8%     71    28%
D        3     1%     14    6%
Total    250          250

EN-RU set of 250 sentences
         NMT          SMT
A        128   51%    39    16%
B        84    34%    43    17%
C        22    9%     60    24%
D        16    6%     108   43%
Total    250          250

EN-JP set of 250 sentences
         NMT          SMT
A        83    33%    17    7%
B        71    28%    14    6%
C        56    22%    95    38%
D        40    16%    124   50%
Total    250          250
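
The "good or very good" share can be recomputed directly from the rating counts. The Python below is a small sketch that sums the A and B counts from the four tables and divides by the set size; the counts are copied from the slides, while the names `RATINGS` and `good_share` are hypothetical.

```python
# Rating counts [A, B, C, D] per language pair and system, copied from the slides.
RATINGS = {
    "EN-DE": {"NMT": [132, 98, 14, 6], "SMT": [34, 95, 97, 24]},
    "EN-FR": {"NMT": [150, 76, 21, 3], "SMT": [39, 126, 71, 14]},
    "EN-RU": {"NMT": [128, 84, 22, 16], "SMT": [39, 43, 60, 108]},
    "EN-JP": {"NMT": [83, 71, 56, 40], "SMT": [17, 14, 95, 124]},
}


def good_share(counts):
    """Percentage of sentences rated A or B out of all rated sentences."""
    return 100.0 * (counts[0] + counts[1]) / sum(counts)


for pair, systems in RATINGS.items():
    shares = {name: round(good_share(c), 1) for name, c in systems.items()}
    print(pair, shares)
```

With these counts, the A+B share for NMT comes to 92.0% (EN-DE), 90.4% (EN-FR), 84.8% (EN-RU) and 61.6% (EN-JP), against 51.6%, 66.0%, 32.8% and 12.4% for SMT.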
Conclusion
• NN does not produce miracles yet, but the initial results are very exciting.
• The shift is remarkable in all languages, especially JP, which has leapt from
the usual average-to-bad results to pretty acceptable quality.
Thank you!
garth@pangeanic.com
