Be the first to like this
Scientific article: We propose a novel unsupervised semantic method for distinguishing cognates from false friends. The basic intuition is that if two words are cognates, then most of the words in their respective local contexts should be translations of each other. The idea is formalised using the Web as a corpus, a glossary of known word translations used as cross-linguistic âbridgesâ, and the vector space model. Unlike traditional orthographic similarity measures, our method can easily handle words with identical spelling. The evaluation on 200 Bulgarian-Russian word pairs
shows this is a very promising approach.