2. Key Idea
• [Diagram: vector space of language A mapped onto vector space of language B]
• Represent languages as vector spaces
• Find the linear transformation that maps one space to the other
3. Google Translate
• Statistical Machine Translation (SMT)
• "A machine translation paradigm where translations are generated on the basis of statistical models whose parameters are derived from the analysis of bilingual text corpora." (Wikipedia)
6. Vector Space – why do we need it?
Problems…
• Creating parallel corpora
takes human effort
• Parallel corpora are scarce
for some language pairs
• Translation quality is
language-dependent
New Approach
• Automates the process of
generating and expanding
dictionaries and phrase
tables
• Makes few assumptions about the languages; works for any language pair
7. How does it work?
• Step 1 (Construct Language Spaces)
• Build monolingual models of languages using large amounts
of monolingual texts
• Step 2 (Find a Translation Matrix)
• Learn a linear transformation between the vector spaces of
languages using a small bilingual dictionary
8. Step 1: How to Represent Languages?
• Simple neural network architectures that aim to predict the neighbors of a word
• Continuous Bag-of-Words (CBOW)
• Skip-gram (SG)
• Represent languages as vector spaces using the
relationship between words
9. CBOW vs. Skip-gram
CBOW
• Predicts current word based
on the context
Skip-gram
• Predicts the context based
on current word
• E.g. for “I hit the tennis ball”:
- Trigrams: “I hit the”, “hit the tennis”, “the tennis ball”
- Skip-grams also include “hit the ball” (skipping “tennis”)
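The trigram/skip-gram distinction above can be sketched in code. The `skip_ngrams` helper below is an illustrative implementation of k-skip-n-grams (n-grams allowing up to k skipped tokens in total), not code from the papers:

```python
from itertools import combinations

def skip_ngrams(tokens, n=3, k=1):
    """All n-grams over tokens, allowing up to k skipped tokens in total."""
    grams = set()
    for i in range(len(tokens)):
        # pick the remaining n-1 words from the next n-1+k positions,
        # so at most k tokens are skipped overall
        tail = range(i + 1, min(len(tokens), i + n + k))
        for combo in combinations(tail, n - 1):
            grams.add((tokens[i],) + tuple(tokens[j] for j in combo))
    return grams

sentence = "I hit the tennis ball".split()
grams = skip_ngrams(sentence, n=3, k=1)
# grams contains the plain trigrams plus e.g. ("hit", "the", "ball"),
# which skips "tennis"
```

With k=0 this reduces to ordinary trigram extraction; increasing k grows the training data per sentence, which is the point of skip-gram modelling.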
10. Some Great Results…
• Vectors of similar words are close in the vector space
• Capture semantic information and concept relation
• vec(“king”) – vec(“man”) + vec(“woman”) ≈ vec(“queen”)
• vec(“Madrid”) – vec(“Spain”) + vec(“France”) ≈ vec(“Paris”)
• Can be trained on a large corpus in a short time due
to low computational complexity
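The analogy arithmetic can be demonstrated with a toy nearest-neighbor search. The 3-dimensional vectors below are hand-picked stand-ins so the example stays self-contained; real word2vec embeddings are learned from text and have hundreds of dimensions:

```python
import numpy as np

# Hand-picked toy "embeddings" (NOT trained vectors) chosen so the
# king - man + woman analogy resolves to queen.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.0, 0.5, 0.0]),   # distractor word
}

def analogy(a, b, c):
    """Word closest (by cosine similarity) to vec(a) - vec(b) + vec(c)."""
    target = emb[a] - emb[b] + emb[c]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates,
               key=lambda w: np.dot(emb[w], target)
                             / (np.linalg.norm(emb[w]) * np.linalg.norm(target)))

print(analogy("king", "man", "woman"))   # prints "queen"
```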
11. Step 2: Why does it work?
• All languages have words that describe a similar set
of ideas; words are used in similar ways
• E.g. “A cat is an animal that is smaller than a dog.”
“猫是一种比狗小的动物” (the same sentence in Chinese)
• Strong similarities of geometric arrangements
between different language spaces
12. Step 2: Translation Matrix
• Given a small bilingual dictionary of word pairs {(uᵢ, vᵢ)}
• uᵢ ∈ language space A
• vᵢ ∈ language space B
• Learn a translation matrix W such that
• W uᵢ ≈ vᵢ
• Works for words that are not in the dictionary
• automatically expands the dictionary
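Step 2 can be sketched with NumPy's least-squares solver. The random vectors and dimensions below are stand-ins for the word2vec embeddings described earlier; the paper's actual training uses stochastic gradient descent, but the objective is the same:

```python
import numpy as np

rng = np.random.default_rng(0)
d_a, d_b, n_pairs = 5, 4, 50

U = rng.normal(size=(n_pairs, d_a))   # source-language vectors u_i (one per row)
W_true = rng.normal(size=(d_b, d_a))  # hidden "true" mapping, for this demo only
V = U @ W_true.T                      # target-language vectors v_i (one per row)

# Solve min_W sum_i ||W u_i - v_i||^2 : lstsq finds X with U @ X ≈ V,
# so W = X.T satisfies W @ u_i ≈ v_i.
X, *_ = np.linalg.lstsq(U, V, rcond=None)
W = X.T

# A "word" NOT in the dictionary still translates correctly,
# because the same linear map covers the whole space.
new_u = rng.normal(size=d_a)
print(np.allclose(W @ new_u, W_true @ new_u))
```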
13. Performance And Applications
• 90% precision@5 between English and Spanish
• Expand and refine existing dictionaries
• Correct errors in the English-Czech dictionary
• Improve translation quality for distant language pairs
• English-Vietnamese
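The precision@5 metric used above can be sketched as follows. The identity-matrix "embeddings" and mapping are toy stand-ins chosen to make the demo deterministic; a real evaluation would rank learned target-language vectors by cosine similarity:

```python
import numpy as np

def precision_at_k(W, test_pairs, target_vecs, k=5):
    """Fraction of test words whose gold translation is among the
    k nearest target words to W @ u."""
    hits = 0
    for u, gold_idx in test_pairs:
        scores = target_vecs @ (W @ u)        # similarity of W u to every target word
        topk = np.argsort(scores)[::-1][:k]   # indices of the k best-scoring words
        hits += int(gold_idx in topk)
    return hits / len(test_pairs)

# Toy setup: 6 orthogonal target "embeddings", an identity mapping, and
# test words that land exactly on their gold translations.
target_vecs = np.eye(6)
W = np.eye(6)
test_pairs = [(target_vecs[i], i) for i in range(6)]
print(precision_at_k(W, test_pairs, target_vecs))   # prints 1.0
```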
14. Comments
• A step forward in multilingual communication
• Still a long way to go…
• Sentence structure
• Precision and in-context translation
• Idioms
15. References
• Haghighi, Aria, et al. "Learning Bilingual Lexicons from Monolingual Corpora." Proceedings of ACL-08: HLT. 2008.
• Guthrie, David, et al. "A closer look at skip-gram modelling." Proceedings of the 5th
international Conference on Language Resources and Evaluation (LREC-2006).
2006.
• Mikolov, Tomas, Quoc V. Le, and Ilya Sutskever. "Exploiting similarities among
languages for machine translation." arXiv preprint arXiv:1309.4168 (2013).