Applying Machine Learning
11/17/2016
Samad Echihabi
Machine Learning
Machine
data
model
input output
Machine Learning: French to English Translation
data → model
Input (French): Méditerranée: 3200 personnes secourues en cinq jours
Output (English): Mediterranean: 3200 people rescued in five days
Machine Translation Models
Language Model
Translation Model
MT: Translation Model
Source    Target         Good translation?
bonjour   hello          ✔
bonjour   blue           ✗
bonjour   morning        ~
bonjour   good morning   ✔
bonjour   hi             ✔
MT: Language Model
Target                                               Good language?
Be the change that you wish to see in the world.     ✔
Be the world that you wish to see in the change.     ✗
The be change which you wish to see on the world.    ✗✗
Be that the you world to in wish see change . the    ✗✗✗✗✗
MT: Training
Parallel text (French and English) → Statistical Analysis → Translation Model P(s|t)
Monolingual text (English) → Statistical Analysis → Language Model P(t)

Translation Model
Source     Target             Probability
la         the                80%
la         a                  12%
la         (no target word)   8%
capitale   capital            70%
capitale   death              30%
de         of                 53%
de         from               47%
france     france             100%
est        is                 75%
est        was                25%
paris      paris              100%

Language Model
Phrase                 Probability
the death of           54%
the capital of         34%
a capital of           11%
capital of france      41%
capital from france    9%
of france is           45%
of the france          2%
france is paris        23%
france was paris       22%
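The "Statistical Analysis" step for the language model can be illustrated with relative-frequency counting over monolingual text. This is a minimal sketch, not the system described in the talk: the three-sentence corpus, the function name `trigram_model`, and the choice of plain maximum-likelihood estimates without smoothing are all assumptions made for illustration.

```python
from collections import Counter

def trigram_model(sentences):
    """Estimate P(w3 | w1, w2) by relative frequency (no smoothing)."""
    trigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - 2):
            trigrams[tuple(words[i:i + 3])] += 1
            # Count the two-word context only where a third word follows it.
            contexts[tuple(words[i:i + 2])] += 1
    return {tri: n / contexts[tri[:2]] for tri, n in trigrams.items()}

# Hypothetical toy corpus standing in for the monolingual English data.
corpus = [
    "the capital of france is paris",
    "the capital of spain is madrid",
    "the capital city is large",
]
model = trigram_model(corpus)
print(model[("the", "capital", "of")])
```

Here "the capital" is followed by "of" in two of its three occurrences, so the estimate for that trigram is 2/3. A production language model would add smoothing so that unseen trigrams do not receive zero probability.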
MT: Decoding
Input: la capitale de la france est paris
Translation Model + Language Model → Statistical Search → scored translations
(The Translation Model and Language Model tables are the same as on the MT: Training slide.)

Translation                        Score
the capital of france is paris     94%
capital of france is paris         71%
a capital of france is paris       65%
...                                …
a death from france was paris      3%
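The decoding step can be sketched as a brute-force search that scores every word-by-word candidate with the product P(s|t) · P(t), the usual noisy-channel combination of the two models. The table entries below are copied from the slides, read as French → English. Everything else is an assumption for illustration: the names `score` and `decode`, the exhaustive enumeration (real decoders use beam search and also reorder words), and the floor probability `UNSEEN` given to trigrams missing from the table. The raw products are not on the same scale as the percentage scores on the slide.

```python
from itertools import product
from math import prod

# Word translation probabilities from the slide (French -> English).
# The empty string stands for "la" translating to no target word.
TM = {
    "la": {"the": 0.80, "a": 0.12, "": 0.08},
    "capitale": {"capital": 0.70, "death": 0.30},
    "de": {"of": 0.53, "from": 0.47},
    "france": {"france": 1.00},
    "est": {"is": 0.75, "was": 0.25},
    "paris": {"paris": 1.00},
}

# Trigram probabilities from the slide.
LM = {
    ("the", "death", "of"): 0.54,
    ("the", "capital", "of"): 0.34,
    ("a", "capital", "of"): 0.11,
    ("capital", "of", "france"): 0.41,
    ("capital", "from", "france"): 0.09,
    ("of", "france", "is"): 0.45,
    ("of", "the", "france"): 0.02,
    ("france", "is", "paris"): 0.23,
    ("france", "was", "paris"): 0.22,
}
UNSEEN = 0.01  # assumed floor for trigrams that are not in the table

def score(source_words, choice):
    """Translation model score times language model score for one candidate."""
    tm = prod(TM[s][t] for s, t in zip(source_words, choice))
    words = [t for t in choice if t]  # drop deleted words
    lm = prod(LM.get(tuple(words[i:i + 3]), UNSEEN)
              for i in range(len(words) - 2))
    return tm * lm

def decode(sentence):
    """Return (score, translation) pairs for all candidates, best first."""
    source_words = sentence.split()
    options = [list(TM[s]) for s in source_words]
    scored = [(score(source_words, c), " ".join(t for t in c if t))
              for c in product(*options)]
    return sorted(scored, reverse=True)

ranked = decode("la capitale de la france est paris")
print(ranked[0][1])
```

With these tables the top candidate is "the capital of france is paris", matching the best-scoring line on the slide. The translation model alone would prefer keeping the second "la" as "the"; the language model is what penalises "of the france".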
SMT Models
Adaptive Models
Neural Models
• Translation Model P(s/t)
• Language Model P(t)
• Distortion
• Alignment
• Phrase
• POS
• Syntactic Translation
• Syntactic Language
• Reordering
• Lexicalized Reordering
• Preordering
• Word Deletion
• Lexicalized Smoothing
• Capitalization
• Morphology
• Transliteration
• Semantic
• Informal Models
• Social Media Components
Applying Machine Learning – Use cases
Social Media Translation
Character Repetition
Spelling Errors
Dialect
Morphology
Romanization
Metadata
Social Media Translation Challenges
Normalization: أااااااحسن → احسن; الخلييييييج → الخليج
Spelling Correction: نزيفه → candidates نظيفة, وظيفة, نظيف, نزيف, نزيفه
Social Metadata: #كرة_القدم → #soccer
Morphological Segmentation: المرفهين → ال+ مرفه +ين
Deromanization: bessa7a wel3afya habibi → بالصحة والعافية حبيبي
    (also shown on the slide: habibi بساحة)
+62% Improvement
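The character-repetition part of normalization can be approximated with a single regular expression. This is a hedged sketch rather than the component used in the talk: the function name `collapse_repeats` and the rule "collapse any run of three or more identical characters to one" are assumptions, and the rule does not handle mixed forms such as a hamza-carrying alef followed by plain alefs.

```python
import re

# Assumed rule: a character repeated three or more times is elongation.
_REPEATS = re.compile(r"(.)\1{2,}")

def collapse_repeats(text: str) -> str:
    """Collapse each run of 3+ identical characters to a single character."""
    return _REPEATS.sub(r"\1", text)

# The elongated word from the slide, built explicitly to keep the run visible.
elongated = "الخل" + "ي" * 6 + "ج"
print(collapse_repeats(elongated))
```

Leaving double letters untouched keeps legitimate words such as "good" intact, at the cost of missing two-character elongations. A real system would likely check the collapsed candidates against a lexicon or language model.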
Social Media Translation

Source:
• la2a hia katir fi lakhbar.
• ma 3ajbanish kida. Lazim t3'iyyer l3ounouane
• Enty habla ?
• Kalemni lama t3raf ezay tebatal teshtemni
• 3andy soda3 fi rassi... 5oshy namy badal chat. a7san lik Ah sa7

Generic MT:
• La2a hia katir Fi lakhbar.
• Ma 3ajbanish kida. lazim T3 (iyyer L3ounouane
• enty habla?
• kalemni Lama T3RAF ezay tebatal teshtemni
• 3Andy soda3 Fi rassi ... 5oshy namy badal Chat. A7San lik Ah SA7
Social Media Translation

Source:
• la2a hia katir fi lakhbar.
• ma 3ajbanish kida. Lazim t3'iyyer l3ounouane
• Enty habla ?
• Kalemni lama t3raf ezay tebatal teshtemni
• 3andy soda3 fi rassi... 5oshy namy badal chat. a7san lik Ah sa7

Social Media MT:
• No, it is very much in the news.
• I don't like this. We must change the title
• Are you an idiot?
• Talk to me when you know how to stop insulting me
• I have a headache in my head. Go to sleep, instead of chat. It is better for you, Yes, sa7
Broadcast News Translations
Broadcast News Translation
Audio Channels, Video Channels → Speech Recognition → Machine Translation → Distillation → Actionable Information
Broadcast News Translation
Source: Reçu mardi à Varsovie par Bronislaw Komorowski, Barack Obama a participé aux cérémonies marquant le vingt-cinquième anniversaire des premières élections démocratiques en Pologne.
MT output: Received Tuesday in Warsaw by Bronislaw Komorowski, Barack Obama has participated in ceremonies marking the twenty-fifth anniversary of the first democratic elections in Poland ✗
Travel Reviews
Travel User Reviews Translation
Translated User Reviews → Automatic Quality Prediction
    good translation → published
    bad translation → post-edited
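The routing shown on this slide can be sketched as a threshold on a predicted quality score. The slide does not say how quality is predicted or where the cut-off lies, so the `Review` type, the `quality` field, the `route` function, and the 0.7 threshold are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Review:
    translation: str
    quality: float  # hypothetical predicted quality in [0, 1]

def route(reviews, threshold=0.7):
    """Split translated reviews into publish and post-edit queues."""
    publish = [r for r in reviews if r.quality >= threshold]
    post_edit = [r for r in reviews if r.quality < threshold]
    return publish, post_edit

# Made-up example data; the scores would come from a quality prediction model.
reviews = [
    Review("Great hotel, very close to the beach.", 0.92),
    Review("Room the was clean not very.", 0.35),
]
publish, post_edit = route(reviews)
print(len(publish), len(post_edit))
```

Raising the threshold sends more reviews to human post-editing, trading cost for quality.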
Post-Editing Machine Translation
Post-Editing Adaptive Machine Translation
Applying Machine Learning
Volume
Quality
Data Domain
Models
Delivery
Security Speed
Privacy
Evaluation
Integration
Adaptation
QUESTIONS & ANSWERS
Copyright © 2008-2017 SDL plc. All rights reserved. All company names, brand names,
trademarks, service marks, images and logos are the property of their respective owners.
This presentation and its content are SDL confidential unless otherwise specified, and may
not be copied, used or distributed except as authorised by SDL.
Software and Services for Human Understanding

Applying Machine Learning - Abdessamad Echihabi at SDL Connect 16
