Learning bilingual word embeddings with (almost) no bilingual data
Mikel Artetxe, Gorka Labaka, Eneko Agirre
IXA NLP group – University of the Basque Country (UPV/EHU)
Who cares?
Word embeddings are useful!
- inherently crosslingual tasks
- crosslingual transfer learning
→ bilingual signal for training

Previous work
- parallel corpora
- comparable corpora
- (big) dictionaries

This talk
- 25 word dictionary
- numerals (1, 2, 3…)
Bilingual embedding mappings
Basque embeddings $X$, English embeddings $Z$
Seed dictionary: Txakur–Dog, Sagar–Apple, …, Egutegi–Calendar
Learn a mapping $W$ that takes the source space into the target space, $X \mapsto XW$, so that each seed pair matches: $X_{i,*} W \approx Z_{i,*}$ for $i = 1, \dots, n$
$W^* = \arg\min_{W} \sum_i \lVert X_{i*} W - Z_{i*} \rVert^2 \quad \text{s.t. } W W^T = W^T W = I$ (orthogonality constraint)
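As a rough illustration (not the authors' exact implementation), this mapping step has a closed-form solution via SVD, known as orthogonal Procrustes. The sketch below assumes $X$ and $Z$ are length-normalized numpy matrices whose first rows are aligned by the seed dictionary; all names and shapes are assumptions for the example.

import numpy as np

def learn_orthogonal_mapping(X_dict, Z_dict):
    # Closed-form solution of min_W ||X_dict W - Z_dict||^2 with W orthogonal
    # (orthogonal Procrustes): the SVD of X_dict^T Z_dict gives W = U V^T.
    u, _, vt = np.linalg.svd(X_dict.T @ Z_dict)
    return u @ vt

# Toy usage with random, length-normalized "embeddings" (shapes are assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 300)); X /= np.linalg.norm(X, axis=1, keepdims=True)
Z = rng.normal(size=(5000, 300)); Z /= np.linalg.norm(Z, axis=1, keepdims=True)
W = learn_orthogonal_mapping(X[:25], Z[:25])  # e.g. a 25-entry seed dictionary
mapped = X @ W                                # XW now lives in the target space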
Bilingual embedding mappings: self-learning
Monolingual embeddings + seed dictionary → mapping
Mapping → induced dictionary (better!)
Better dictionary → new mapping → new dictionary (even better!) → …
Proposed self-learning method: iterate between learning the mapping and inducing the dictionary
(formalization and implementation details in the paper; based on the mapping method of Artetxe et al. (2016))
Too good to be true?
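To make the loop above concrete, here is a minimal, hypothetical sketch of the self-learning idea (mapping step plus nearest-neighbour dictionary induction). It is not the implementation released with the paper: among other simplifications, it uses a fixed iteration count and builds the full similarity matrix in memory, which would be computed in blocks in practice.

import numpy as np

def learn_orthogonal_mapping(Xd, Zd):
    u, _, vt = np.linalg.svd(Xd.T @ Zd)   # orthogonal Procrustes solution
    return u @ vt

def induce_dictionary(X, Z, W):
    # Pair every source word with its nearest target word by dot product
    # (equivalent to cosine similarity when the rows are length-normalized).
    sims = (X @ W) @ Z.T
    return np.arange(X.shape[0]), sims.argmax(axis=1)

def self_learning(X, Z, seed_src, seed_trg, iterations=10):
    # Alternate between (1) learning a mapping from the current dictionary and
    # (2) inducing a new dictionary from the current mapping. The paper iterates
    # until a convergence criterion is met; a fixed count keeps this sketch simple.
    src, trg = np.asarray(seed_src), np.asarray(seed_trg)
    for _ in range(iterations):
        W = learn_orthogonal_mapping(X[src], Z[trg])
        src, trg = induce_dictionary(X, Z, W)
    return W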
Experiments: word translation induction
• Dataset by Dinu et al. (2015) extended to German and Finnish
  ⇒ Monolingual embeddings (CBOW + negative sampling)
  ⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals
  ⇒ Test dictionary: 1,500 word pairs

                         English-Italian           English-German            English-Finnish
                         5,000    25      num.     5,000    25      num.     5,000    25      num.
Mikolov et al. (2013a)   34.93%   0.00%   0.00%    35.00%   0.00%   0.07%    25.91%   0.00%   0.00%
Xing et al. (2015)       36.87%   0.00%   0.13%    41.27%   0.07%   0.53%    28.23%   0.07%   0.56%
Zhang et al. (2016)      36.73%   0.07%   0.27%    40.80%   0.13%   0.87%    28.16%   0.14%   0.42%
Artetxe et al. (2016)    39.27%   0.07%   0.40%    41.87%   0.13%   0.73%    30.62%   0.21%   0.77%
Our method               39.67%   37.27%  39.40%   40.87%   39.60%  40.27%   28.72%   28.16%  26.47%
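The accuracies above correspond to precision@1 of nearest-neighbour retrieval over the test dictionary. A hypothetical scoring sketch follows; the index arrays and the single-gold-translation assumption are mine, not taken from the dataset release.

import numpy as np

def translation_precision_at_1(X, Z, W, test_src, test_trg):
    # For each test source word, retrieve the nearest target word under the
    # mapping W and compare it with the gold translation. Assumes a single
    # gold translation per source word for simplicity.
    sims = (X[np.asarray(test_src)] @ W) @ Z.T
    predictions = sims.argmax(axis=1)
    return float(np.mean(predictions == np.asarray(test_trg)))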
Experiments: crosslingual word similarity
• Dataset by Dinu et al. (2015) extended to German and Finnish
  ⇒ Monolingual embeddings (CBOW + negative sampling)
  ⇒ Seed dictionary: 5,000 word pairs / 25 word pairs / numerals

                         Bi. data    EN-IT WS   EN-IT RG   EN-DE WS
Luong et al. (2015)      Europarl    33.1%      33.5%      35.6%
Mikolov et al. (2013a)   5k dict     62.7%      64.3%      52.8%
Xing et al. (2015)       5k dict     61.4%      70.0%      59.5%
Zhang et al. (2016)      5k dict     61.6%      70.4%      59.6%
Artetxe et al. (2016)    5k dict     61.7%      71.6%      59.7%
Our method               5k dict     62.4%      74.2%      61.6%
Our method               25 dict     62.6%      74.9%      61.2%
Our method               num.        62.8%      73.9%      60.4%
Why does it work?
The seed dictionary is small but error-free; the dictionary induced from the mapping is large but contains errors.
Re-learning the mapping from that noisy induced dictionary: better or worse?
And the next iteration: even better or even worse?
Why does it work?
Implicit objective:
$W^* = \arg\max_{W} \sum_i \max_j \, (X_{i*} W) \cdot Z_{j*} \quad \text{s.t. } W W^T = W^T W = I$
→ Independent from the seed dictionary!
So why do we need a seed dictionary? To avoid poor local optima!
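As a sanity check of what that implicit objective measures, here is a hypothetical snippet that evaluates it for a given orthogonal $W$: the summed similarity of every source word to its single best-matching target word (assuming length-normalized numpy matrices X and Z, as in the earlier sketches).

import numpy as np

def implicit_objective(X, Z, W):
    # Value of the implicit objective for a fixed W: sum over source words of
    # the maximum dot-product similarity to any target word.
    sims = (X @ W) @ Z.T
    return float(sims.max(axis=1).sum())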
Conclusions
• Simple self-learning method to train bilingual embedding mappings
• High quality results with almost no supervision (25 words, numerals)
• Implicit optimization objective independent from the seed dictionary
• Seed dictionary necessary to avoid poor local optima
• Future work: fully unsupervised training
One more thing…
> git clone https://github.com/artetxem/vecmap.git
> python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB SRC_OUTPUT.EMB TRG_OUTPUT.EMB
> vecmap/reproduce_acl2017.sh
--------------------------------------------------------------------------------
ENGLISH-ITALIAN
--------------------------------------------------------------------------------
5,000 WORD DICTIONARY
- Mikolov et al. (2013a) | Translation: 34.93% MWS353: 62.66%
- Xing et al. (2015) | Translation: 36.87% MWS353: 61.41%
- Zhang et al. (2016) | Translation: 36.73% MWS353: 61.62%
- Artetxe et al. (2016) | Translation: 39.27% MWS353: 61.74%
- Proposed method | Translation: 39.67% MWS353: 62.35%
25 WORD DICTIONARY
- Mikolov et al. (2013a) | Translation: 0.00% MWS353: -6.42%
- Xing et al. (2015) | Translation: 0.00% MWS353: 19.49%
- Zhang et al. (2016) | Translation: 0.07% MWS353: 15.52%
- Artetxe et al. (2016) | Translation: 0.07% MWS353: 17.45%
- Proposed method | Translation: 37.27% MWS353: 62.64%
NUMERAL DICTIONARY
- Mikolov et al. (2013a) | Translation: 0.00% MWS353: 28.75%
- Xing et al. (2015) | Translation: 0.13% MWS353: 27.75%
- Zhang et al. (2016) | Translation: 0.27% MWS353: 27.38%
- Artetxe et al. (2016) | Translation: 0.40% MWS353: 24.85%
- Proposed method | Translation: 39.40% MWS353: 62.82%
Thank you!
https://github.com/artetxem/vecmap
Mikel Artetxe - 2017 - Learning bilingual word embeddings with (almost) no bilingual data

  • 1. Learning bilingual word embeddings with (almost) no bilingual data Mikel Artetxe, Gorka Labaka, Eneko Agirre IXA NLP group – University of the Basque Country (UPV/EHU)
  • 7–18. Who cares? Word embeddings are useful, both for inherently crosslingual tasks and for crosslingual transfer learning, but they need a bilingual signal for training. Previous work: parallel corpora, comparable corpora, (big) dictionaries. This talk: a 25-word dictionary, or just numerals (1, 2, 3…).
  • 20–26. Bilingual embedding mappings: we have monolingual embedding matrices X (Basque) and Z (English) plus a seed dictionary (Txakur–Dog, Sagar–Apple, …, Egutegi–Calendar), and we learn a linear mapping W that takes X into the English space (X → XW), so that X_{i,∗} W ≈ Z_{j,∗} for every dictionary pair (i, j).
  • 27–30. The mapping is learned as the best orthogonal transformation over the dictionary pairs: W* = arg min_{W ∈ O(n)} Σ_i ‖X_{i,∗} W − Z_{j,∗}‖² (a closed-form sketch follows).
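With W constrained to be orthogonal, this is the orthogonal Procrustes problem and has a closed-form solution via SVD. A minimal sketch, not the authors' vecmap code: it omits the length normalization and mean centering of Artetxe et al. (2016), and assumes X_d and Z_d already hold the embedding rows aligned by the dictionary.

    import numpy as np

    def learn_orthogonal_mapping(X_d, Z_d):
        """Return the orthogonal W minimizing sum_i ||X_d[i] @ W - Z_d[i]||^2."""
        # Orthogonal Procrustes: with U S Vt = SVD(X_d^T Z_d), the optimum is W = U Vt.
        u, _, vt = np.linalg.svd(X_d.T @ Z_d)
        return u @ vt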
  • 37–48. Bilingual embedding mappings (iterative idea): start from the monolingual embeddings and a seed dictionary, learn a mapping, use that mapping to induce a better dictionary, re-learn the mapping from it, induce an even better dictionary, and so on.
  • 49–53. This mapping ↔ dictionary loop is the proposed self-learning method; the formalization and implementation details are in the paper, built on the mapping method of Artetxe et al. (2016). Too good to be true? (A sketch of the loop follows.)
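To make the loop concrete, here is a minimal sketch of the self-learning idea, not the authors' vecmap implementation: it assumes unit-length embedding rows, greedy full-vocabulary nearest-neighbour dictionary induction, and a fixed number of iterations, skipping the convergence criterion and implementation details given in the paper.

    import numpy as np

    def self_learning(X, Z, seed_pairs, n_iter=10):
        """X: source embeddings, Z: target embeddings (rows length-normalized),
        seed_pairs: list of (source_index, target_index) seed dictionary entries."""
        pairs = list(seed_pairs)
        for _ in range(n_iter):
            # 1) Learn the best orthogonal mapping for the current dictionary.
            src = np.array([i for i, _ in pairs])
            trg = np.array([j for _, j in pairs])
            u, _, vt = np.linalg.svd(X[src].T @ Z[trg])
            W = u @ vt
            # 2) Induce a new, larger dictionary: nearest target neighbour
            #    of every mapped source word (dot product = cosine here).
            nn = (X @ W @ Z.T).argmax(axis=1)
            pairs = list(enumerate(nn))
        return W, pairs

In practice the induction step can also be restricted or run in both directions; those choices belong to the paper, not to this sketch.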
  • 55–66. Experiments: dataset by Dinu et al. (2015), extended to German and Finnish (English-Italian, English-German, English-Finnish) ⇒ monolingual embeddings (CBOW + negative sampling) ⇒ seed dictionary: 5,000 word pairs / 25 word pairs / numerals ⇒ test dictionary: 1,500 word pairs. Task: word translation induction, comparing Mikolov et al. (2013a), Xing et al. (2015), Zhang et al. (2016), Artetxe et al. (2016) and our method.
  • 67–71. Word translation induction results (seed dictionary of 5,000 pairs / 25 pairs / numerals):

                                English-Italian          English-German           English-Finnish
                               5,000    25     num.     5,000    25     num.     5,000    25     num.
      Mikolov et al. (2013a)   34.93%   0.00%  0.00%    35.00%   0.00%  0.07%    25.91%   0.00%  0.00%
      Xing et al. (2015)       36.87%   0.00%  0.13%    41.27%   0.07%  0.53%    28.23%   0.07%  0.56%
      Zhang et al. (2016)      36.73%   0.07%  0.27%    40.80%   0.13%  0.87%    28.16%   0.14%  0.42%
      Artetxe et al. (2016)    39.27%   0.07%  0.40%    41.87%   0.13%  0.73%    30.62%   0.21%  0.77%
      Our method               39.67%  37.27% 39.40%    40.87%  39.60% 40.27%    28.72%  28.16% 26.47%
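The translation induction numbers above are precision@1 over the 1,500-pair test dictionary. A minimal sketch of that metric, assuming cosine nearest-neighbour retrieval and a single gold translation per source word (the real test dictionaries may list several valid translations):

    import numpy as np

    def precision_at_1(XW, Z, test_pairs):
        """XW: mapped source embeddings, Z: target embeddings,
        test_pairs: list of (source_index, gold_target_index)."""
        XWn = XW / np.linalg.norm(XW, axis=1, keepdims=True)
        Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
        hits = sum(int((Zn @ XWn[i]).argmax() == gold) for i, gold in test_pairs)
        return hits / len(test_pairs)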
  • 73–82. Crosslingual word similarity (same dataset, monolingual embeddings and seed dictionaries as above):

                               Bilingual data   EN-IT WS   EN-IT RG   EN-DE WS
      Luong et al. (2015)      Europarl           33.1%      33.5%      35.6%
      Mikolov et al. (2013a)   5k dict            62.7%      64.3%      52.8%
      Xing et al. (2015)       5k dict            61.4%      70.0%      59.5%
      Zhang et al. (2016)      5k dict            61.6%      70.4%      59.6%
      Artetxe et al. (2016)    5k dict            61.7%      71.6%      59.7%
      Our method               5k dict            62.4%      74.2%      61.6%
                               25 dict            62.6%      74.9%      61.2%
                               num.               62.8%      73.9%      60.4%
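The similarity scores are obtained by taking the cosine between the mapped source embedding and the target embedding of each test word pair, and correlating those cosines with the human judgements. A minimal sketch, using Spearman correlation as an illustrative choice; the function and variable names here are assumptions, not the evaluation code behind the table.

    import numpy as np
    from scipy.stats import spearmanr

    def crosslingual_similarity(XW, Z, pairs, gold_scores):
        """pairs: list of (source_index, target_index); gold_scores: human ratings."""
        sims = [XW[i] @ Z[j] / (np.linalg.norm(XW[i]) * np.linalg.norm(Z[j]))
                for i, j in pairs]
        rho, _ = spearmanr(sims, gold_scores)
        return rho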
  • 83. Why does it work?
  • 84–94. Why does it work? The seed dictionary is small but error-free; each induced dictionary is large but contains errors. So does re-learning the mapping from it make things better or worse? And the next induced dictionary, even better or even worse?
  • 95–114. Why does it work? The self-learning loop implicitly optimizes
          W* = arg max_W Σ_i max_j ( X_{i,∗} W · Z_{j,∗} )   s.t.  W Wᵀ = Wᵀ W = I
      (illustrated over the mapped source embeddings XW and the target embeddings Z; a sketch of evaluating this objective follows).
  • 115–118. This implicit objective is independent from the seed dictionary! So why do we need a seed dictionary at all? To avoid poor local optima!
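That implicit objective can be written down directly; a minimal sketch of evaluating it for a candidate W, assuming the rows of X and Z are length-normalized so the dot product is cosine similarity:

    import numpy as np

    def implicit_objective(W, X, Z):
        """Sum over source words of the best dot product between the mapped
        source row (X @ W) and any target row of Z."""
        return float(((X @ W) @ Z.T).max(axis=1).sum())

Under these assumptions, both self-learning steps (re-inducing the dictionary and re-learning W) can only keep or increase this value, which is the intuition for why the loop converges and why the seed dictionary mainly serves to steer it away from poor local optima.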
  • 120–124. Conclusions: a simple self-learning method to train bilingual embedding mappings; high-quality results with almost no supervision (25 words, numerals); the implicit optimization objective is independent from the seed dictionary; the seed dictionary is necessary to avoid poor local optima; future work: fully unsupervised training.
  • 126–133. One more thing…
      > git clone https://github.com/artetxem/vecmap.git
      > python3 vecmap/map_embeddings.py --self_learning --numerals SRC_INPUT.EMB TRG_INPUT.EMB SRC_OUTPUT.EMB TRG_OUTPUT.EMB
  • 134.
      > vecmap/reproduce_acl2017.sh
      --------------------------------------------------------------------------------
      ENGLISH-ITALIAN
      --------------------------------------------------------------------------------
      5,000 WORD DICTIONARY
      - Mikolov et al. (2013a)   | Translation: 34.93%   MWS353: 62.66%
      - Xing et al. (2015)       | Translation: 36.87%   MWS353: 61.41%
      - Zhang et al. (2016)      | Translation: 36.73%   MWS353: 61.62%
      - Artetxe et al. (2016)    | Translation: 39.27%   MWS353: 61.74%
      - Proposed method          | Translation: 39.67%   MWS353: 62.35%
      25 WORD DICTIONARY
      - Mikolov et al. (2013a)   | Translation: 0.00%    MWS353: -6.42%
      - Xing et al. (2015)       | Translation: 0.00%    MWS353: 19.49%
      - Zhang et al. (2016)      | Translation: 0.07%    MWS353: 15.52%
      - Artetxe et al. (2016)    | Translation: 0.07%    MWS353: 17.45%
      - Proposed method          | Translation: 37.27%   MWS353: 62.64%
      NUMERAL DICTIONARY
      - Mikolov et al. (2013a)   | Translation: 0.00%    MWS353: 28.75%
      - Xing et al. (2015)       | Translation: 0.13%    MWS353: 27.75%
      - Zhang et al. (2016)      | Translation: 0.27%    MWS353: 27.38%
      - Artetxe et al. (2016)    | Translation: 0.40%    MWS353: 24.85%
      - Proposed method          | Translation: 39.40%   MWS353: 62.82%