@mathew_jaya
Machine translation is the task of automatically converting source
text in one language to text in another language.
Top 10 languages*: 81%
* English, Chinese, Japanese, French, German + Spanish, Portuguese, Russian, Italian, Korean
One language: 95%
Two or more languages: 5%
*https://towardsdatascience.com/evolution-of-machine-translation-5524f1c88b25
Statistical machine translation (SMT) is the use of statistical
models that learn to translate text from a source language to
a target language, given a large corpus of examples.
Neural machine translation (NMT) is the use of neural network
models to learn a statistical model for machine translation.
Rule-based machine translation (RBMT) relies on translation rules
written by linguistics experts.
SMT: word-based, phrase-based, or syntax-based translation, with word alignment between the source and target text
(Bayes/HMM).
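The Bayes formulation mentioned here is commonly written as the noisy-channel model. The notation is an assumption for illustration and does not come from the slides: $f$ is the source sentence and $e$ a candidate translation.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f) \;=\; \arg\max_{e} P(f \mid e)\, P(e)
```

Here $P(f \mid e)$ is the translation model estimated from the aligned corpus and $P(e)$ is a language model of the target language; $P(f)$ drops out because it does not depend on $e$.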
NMT: an approach that models the entire MT process with one large artificial neural network (end-to-end encoder/decoder
models built on RNN, GRU, or LSTM units).
Automatically detect the input language of the text sent to
the API
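As a rough sketch of what a detection call might look like, the snippet below builds (but does not send) a request for the `detect` operation of the Microsoft Translator Text API v3, the product this deck points to. The endpoint, header names, and placeholder key and region are assumptions to verify against the current documentation; `build_detect_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed global endpoint for Translator Text API v3; verify against the docs.
ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_detect_request(text, key="<your-key>", region="<your-region>"):
    """Build a POST request asking the service to detect the language of `text`."""
    body = json.dumps([{"Text": text}]).encode("utf-8")
    return urllib.request.Request(
        url=f"{ENDPOINT}/detect?api-version=3.0",
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Ocp-Apim-Subscription-Region": region,
            "Content-Type": "application/json",
        },
    )


req = build_detect_request("Salut")
# Sending requires a real key, e.g. urllib.request.urlopen(req).
```

The request is only constructed here so the example runs without credentials or network access.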
Detect
Bilingual dictionary
Transliteration
Multiple languages
Translate input text into multiple languages with a single
API call
English: Hello
French: Salut
German: Hallo
Arabic: مرحبا
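A minimal sketch of the single-call, multi-target idea, assuming the Translator Text API v3 convention of repeating the `to` query parameter once per target language. The endpoint and the response shape are assumptions, the helper names are hypothetical, and the sample response is hand-written from the slide's example rather than real service output.

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://api.cognitive.microsofttranslator.com"  # assumed endpoint


def build_translate_url(targets, source=None):
    """Return a translate URL with one `to` parameter per target language."""
    params = [("api-version", "3.0")]
    if source:
        params.append(("from", source))
    params += [("to", lang) for lang in targets]
    return f"{ENDPOINT}/translate?{urlencode(params)}"


def translations_by_language(response_json):
    """Map language code to translated text for the first input item."""
    return {t["to"]: t["text"] for t in response_json[0]["translations"]}


url = build_translate_url(["fr", "de", "ar"], source="en")
body = json.dumps([{"Text": "Hello"}])

# Hand-written sample in the assumed response shape (not real service output).
sample = [{"translations": [
    {"text": "Salut", "to": "fr"},
    {"text": "Hallo", "to": "de"},
    {"text": "مرحبا", "to": "ar"},
]}]
print(translations_by_language(sample))
```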
Convert words and sentences from one script into another
English: Where is the closest train station?
Chinese Simplified: 最近的火车站在哪里?
Chinese Simplified (PinYin): zuì jìn de huǒ chē zhàn zài nǎ li?
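The script conversion above could be requested roughly as follows. This sketch assumes the `transliterate` operation of Translator Text API v3 and its `language`, `fromScript`, and `toScript` parameters; `Hans` and `Latn` are ISO 15924 script codes. The helper name is hypothetical, and the URL is only built, not called.

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.cognitive.microsofttranslator.com"  # assumed endpoint


def build_transliterate_url(language, from_script, to_script):
    """Return a transliterate URL for converting text between two scripts."""
    params = {
        "api-version": "3.0",
        "language": language,
        "fromScript": from_script,
        "toScript": to_script,
    }
    return f"{ENDPOINT}/transliterate?{urlencode(params)}"


# Simplified Chinese characters to Latin script (Pinyin), as in the example above.
url = build_transliterate_url("zh-Hans", "Hans", "Latn")
print(url)
```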
See examples of human translated sentences using the input
word and find alternative translations for the input word
Llamando solo para comprobar.
I need you to check some files for me.
Necesito comprobar unos archivos.
Alternatives:
VERB
comprobar: check, verify, test, prove, ascertain
revisar: check, review, inspect
verificar: verify, check, verification
chequear: check
NOUN
cheque: check, paycheck, certificate
verificación: verification, check, checking, verifying, credentials
comprobación: check, checking, test, verification, physical, testing, verifying
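The grouped list above could be produced from a dictionary lookup result along these lines. The field names (`translations`, `posTag`, `displayTarget`, `backTranslations`, `displayText`) are assumed to match the Translator v3 dictionary lookup response and should be checked against the documentation; the sample item is hand-written from the slide, and `group_by_part_of_speech` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_part_of_speech(lookup_item):
    """Group alternative translations by part-of-speech tag."""
    grouped = defaultdict(list)
    for t in lookup_item["translations"]:
        back = [b["displayText"] for b in t["backTranslations"]]
        grouped[t["posTag"]].append((t["displayTarget"], back))
    return dict(grouped)


# Hand-written sample in the assumed response shape (not real service output).
sample_item = {
    "displaySource": "check",
    "translations": [
        {"displayTarget": "comprobar", "posTag": "VERB",
         "backTranslations": [{"displayText": "check"}, {"displayText": "verify"}]},
        {"displayTarget": "cheque", "posTag": "NOUN",
         "backTranslations": [{"displayText": "check"}, {"displayText": "paycheck"}]},
    ],
}

grouped = group_by_part_of_speech(sample_item)
for pos, entries in grouped.items():
    print(pos)
    for target, back in entries:
        print(f"  {target}: {', '.join(back)}")
```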
[Figure: the words of the example sentence pair (The, blue, house, is… / La, maison, bleue…), each shown with a vector such as [0.1, -0.3,…,0.5], alongside the labels "Attention algorithm" and "Final input matrix".]
Model layer 1
Each word is modeled in context of the full sentence.
Model layer 2 to N
Multiple layers allow for better contextualization of
a given word as part of a whole sentence.
Attention layer
The “attention” layer (algorithm) determines the order in which words
are translated, based on context.
Decoder
Final layer, the decoder, translates words with contextual
awareness for this particular language pair.
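To make the attention step more concrete, here is a minimal pure-Python sketch of dot-product attention, one common scoring variant (it appears among those in the Luong et al. paper listed in the references). The slides do not say which variant the service uses, and the 3-dimensional vectors are made-up toy numbers, far smaller than the 500 to 1000 dimensions mentioned in the deck.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Dot-product attention: weight each value by how well its key matches the query."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context


# Toy encoder states for "The", "blue", "house" (made-up numbers).
encoder_states = [[0.1, -0.3, 0.5], [0.02, 0.4, 0.91], [0.1, 0.7, 0.4]]
decoder_state = [0.1, 0.7, 0.5]  # made-up state while producing the next target word

weights, context = attend(decoder_state, encoder_states, encoder_states)
print(weights, context)
```

The weights indicate how strongly each source word contributes to the next target word, and the context vector is the weighted mix of source representations passed on to the decoder.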
During training, the NN creates a 500-dimensional
model of each word for a
given language pair:
• Word type (noun…)
• Singular/plural
• Gender
• Formality, ...
Note: these examples are for illustration only.
The actual “dimensions” can be anything the NN
derives during training.
The NN creates a 1000-dimensional
model of each word given its context.
The translation layer of the NN has
learned word translations based on this
1000-dimensional sentence context.
[Figure: a client app connected to the web API; the API returns translated text as partial transcripts, final transcripts, partial translations, and final translations.]
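A client typically treats partial and final results differently: partials overwrite the line currently being shown, while a final result commits it. The sketch below illustrates that idea only. The message shape (`type` and `translation` keys) is hypothetical and not taken from the slides, and no real connection is opened.

```python
def apply_message(display, message):
    """Update the on-screen state: partials replace the pending line, finals commit it."""
    if message["type"] == "partial":
        display["pending"] = message["translation"]
    elif message["type"] == "final":
        display["committed"].append(message["translation"])
        display["pending"] = ""
    return display


display = {"committed": [], "pending": ""}
# Hypothetical stream of messages as they might arrive while someone is speaking.
for msg in [
    {"type": "partial", "translation": "Where is"},
    {"type": "partial", "translation": "Where is the closest"},
    {"type": "final", "translation": "Where is the closest train station?"},
]:
    apply_message(display, msg)

print(display)
```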
50-75
Français → English
Let go of listening before going to the next
1. Upload
2. Train
3. Test
4. Deploy
+4 BLEU score
General Model: BLEU Score = 22
Custom Model: BLEU Score = 26
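For readers unfamiliar with the metric, the sketch below computes a simplified single-reference BLEU on whitespace tokens: the geometric mean of modified n-gram precisions times a brevity penalty, scaled to 0-100. It is an illustration under stated assumptions, not the scoring used to produce the numbers above. It has no smoothing, and the reference sentence in the usage lines is assumed purely for demonstration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU on whitespace tokens, scaled to 0-100."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)


# Assumed reference, for illustration only.
reference = "Let go of the sheet before going on the wind"
general = "Let go of listening before going to the next"
print(bleu(reference, reference))
print(bleu(general, reference, max_n=2))
```

Sentence-level scores on short sentences often collapse to zero without smoothing, which is why BLEU is normally reported over a whole test corpus, as in the general versus custom model comparison.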
CatID
Let go of the sheet before going on the wind
Let go of listening before going to the next
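Once a custom model is deployed, its category ID is passed with each translate request. The sketch below assumes the `category` query parameter of Translator Text API v3. The endpoint is an assumption, the helper name is hypothetical, and `<your-CatID>` is a placeholder rather than a real ID.

```python
from urllib.parse import urlencode

ENDPOINT = "https://api.cognitive.microsofttranslator.com"  # assumed endpoint


def build_custom_translate_url(source, target, category_id):
    """Return a translate URL that routes the request to a custom model."""
    params = {
        "api-version": "3.0",
        "from": source,
        "to": target,
        "category": category_id,  # the CatID of the deployed custom model
    }
    return f"{ENDPOINT}/translate?{urlencode(params)}"


url = build_custom_translate_url("fr", "en", "<your-CatID>")
print(url)
```

Omitting the category falls back to the general model, which is how the two outputs above can be compared side by side.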
• https://machinelearningmastery.com/introduction-neural-machine-translation/
• https://towardsdatascience.com/evolution-of-machine-translation-5524f1c88b25
• https://en.wikipedia.org/wiki/Long_short-term_memory
• https://arxiv.org/pdf/1508.04025.pdf
• https://www.youtube.com/watch?v=IxQtK2SjWWM
• Product information:
  • www.microsoft.com/translator
  • http://translate.it
  • www.aka.ms/TranslatorForum

Breaking the language barrier: how do we quickly add multilanguage support in our AI application?