3. WHAT IS NEURAL MACHINE TRANSLATION?
Neural MT = A particular application of Neural Networks
[Diagram: Neural Networks and their applications: MT, Self-Driving Cars, Script Recognition, Price Prediction, etc.]
4. So what are Neural Networks?
[Diagram labels: Neural Networks, Decision Trees, Linear Regressions; Knowledge Representation, Machine Learning, Evolutionary Computation, EM Algorithms, etc.; Fuzzy Systems, etc.]
5. Some Definitions
AI: A branch of computer science dealing with the simulation of intelligent behavior in computers.
Machine Learning: A type of AI that provides computers with the ability to learn without being explicitly programmed.
Neural Networks: An ML approach consisting of a large number of simple, highly interconnected processing elements (artificial neurons) in an architecture inspired by the structure of the cerebral cortex of the brain.
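To make "simple processing element" concrete, here is a minimal sketch of one artificial neuron: a weighted sum of its inputs passed through an activation function. The sigmoid activation and all numeric values are assumptions chosen for illustration; they do not come from the slides.

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, then a sigmoid activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical inputs, weights and bias, picked only to show the mechanics.
output = neuron(inputs=[1.0, 0.0], weights=[0.5, -0.5], bias=-0.5)
print(output)
```

A neural network connects many such units in layers; training adjusts the weights and biases rather than the code.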
7. There are Many Types of ML…
ML Algorithm mind map from http://machinelearningmastery.com/
8. …But they all have the Same Principles
From “Machine Learning with Apache Spark”, by David Taieb
The Machine Learning Flow
9. …And They Deliver Great Results
Deep Learning (ML’s newest wave) can detect patterns and make
predictions at a vastly deeper level than old-fashioned statistics
Example: Deep Learning vs Traditional Statistics
10. If ML is so great, why do we only hear about it now?
- Increased computational power (GPUs, TPUs)
- Flood of available data
- Better algorithms
- Private companies embracing it, and open-sourcing their tools
11. So… Machine Learning is Everywhere
- News Summarization
- Speech Understanding
- Face Recognition
- Fraud Detection
- Price Prediction
- Machine Translation
- Drug Development
- Spam Filtering
- Buyer Recommendations
- Product Demand Forecast
- Self-Driving Cars
- Chatbots
14. Rule-Based Machine Translation (RBMT) – 1970s-2000s
The linguistic approach: mapping one language to another through
rules and dictionaries.
[Diagram: Source Text -> Translation (Lexicographic Analysis, Syntactic Analysis, Morphological Analysis) -> Target Text]
15. Statistical Machine Translation (SMT) – 1990s-Today
Forget linguistics – let’s look for statistical patterns in bilingual texts.
[Diagram: Data -> Training -> Translation Model and Language Model; Text (input) -> Translation (search for best possible translation) -> Text (output)]
How?
16. SMT – Translation Model – the “Translator”
Translation rules are learnt by finding patterns in parallel text documents
These rules are used to translate new texts (“crack the code”)
[Diagram: “car” in the English text is matched with “Auto” (and sometimes “Wagen”) in the German text; the learnt rules are then used to decode “Mein Auto ist rot.” into “My car is red.”]
src -> trg | prob
car -> Auto | 0.9
car -> Wagen | 0.1
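The probabilities in the table above can be read as relative frequencies. The sketch below shows one simple way such a table could be estimated, assuming word alignments are already available. The `aligned_pairs` data (9 and 1 occurrences) and the function name `translation_table` are hypothetical, chosen so the result matches the slide. Real SMT systems learn the alignments themselves and work with phrases as well as single words.

```python
from collections import Counter, defaultdict

# Hypothetical word alignments (source word, target word) taken from parallel text.
aligned_pairs = [("car", "Auto")] * 9 + [("car", "Wagen")] * 1

def translation_table(pairs):
    """Estimate p(target | source) as count(source, target) / count(source)."""
    pair_counts = Counter(pairs)
    src_counts = Counter(src for src, _ in pairs)
    table = defaultdict(dict)
    for (src, trg), n in pair_counts.items():
        table[src][trg] = n / src_counts[src]
    return table

table = translation_table(aligned_pairs)
for trg, prob in sorted(table["car"].items(), key=lambda kv: -kv[1]):
    print(f"car -> {trg} | {prob}")
```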
17. SMT – Language Model – the “Proofreader”
The Translation Model’s results are “tuned” by the Language Model to produce
more fluent sentences
English text:
My car is red.
My car drives fast
You drive my car
I drive my car
N-gram | count
my | 4
car | 4
is | 1
… | …
my car | 4
… | …
drive my car | 2
… | …
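The N-gram counts above can be reproduced from the four example sentences. The sketch below assumes lowercasing and removal of the full stop as preprocessing. The slide does not state this, but it is what the counts imply ("My" and "my" are counted together).

```python
from collections import Counter

corpus = [
    "My car is red.",
    "My car drives fast",
    "You drive my car",
    "I drive my car",
]

def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

counts = Counter()
for sentence in corpus:
    # Assumed preprocessing: lowercase and drop the full stop.
    tokens = sentence.lower().replace(".", "").split()
    for n in (1, 2, 3):
        counts.update(ngrams(tokens, n))

for key in ("my", "car", "is", "my car", "drive my car"):
    print(key, counts[key])
```

A word sequence that was seen often ("my car", count 4) is preferred over one that was never seen ("car my", count 0), which is how the language model steers the output towards fluent sentences.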
19. BASIC STRUCTURE
Source words are converted to numbers and added up (encoded) to produce a final score for the whole sentence, which is then decoded into the target language.
2 Parts: Encoder and Decoder
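As a toy illustration of the encoding step only, the sketch below turns each word into a small vector of numbers and adds the vectors up into one vector for the whole sentence. The two-dimensional `embeddings` values are hypothetical; real systems learn much larger vectors from data and usually combine them with recurrent or attention layers rather than a plain sum. The decoder half is not shown.

```python
# Hypothetical 2-dimensional word vectors (real systems learn these from data).
embeddings = {
    "my":  [0.1, 0.3],
    "car": [0.7, 0.2],
    "is":  [0.0, 0.1],
    "red": [0.4, 0.9],
}

def encode(sentence):
    """Look up each word's vector and add them up, dimension by dimension."""
    vectors = [embeddings[word] for word in sentence.lower().split()]
    return [sum(dim) for dim in zip(*vectors)]

sentence_vector = encode("my car is red")
print(sentence_vector)
```

A plain sum ignores word order, which is one reason real encoders are more elaborate. The point here is only that the sentence ends up as numbers.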
23. Difference SMT - NMT
SMT is a White Box technology: Translation Model, Language Model, Alignment, Others
NMT is a Black Box technology: if you look inside you will only see numbers (matrices/word embeddings)
24. NEURAL MACHINE TRANSLATION (2015-…)
…Just a sea of numbers scientists cannot manipulate. If it doesn’t work, they have these choices:
Change the NN:
- Hyper-parameters
- Topology (# of nodes and layers)
- Architecture
Re-train to focus on a domain:
- Replacing the training data
- Continuing with a subset of the training data
25. Does this change linguists’ work?
Essentially: No.
Remember the Machine Learning workflow?
[Figure: The Machine Learning Flow – Always]
26. NMT vs SMT: Performance Comparison
Aspect | Neural MT vs SMT
Handling complex (Chinese) or morphologically rich (Russian) languages | NMT better
Word reordering | NMT better
Fluency | NMT better
Adequacy | NMT may give problems (more deviations that make sense, rare words, whole information bits missing)
Volume of training data needed | Generally, NMT needs more data, but this depends on processing capacity
Unpredictable errors | More likely with NMT (especially if training data is insufficient)