“LOST IN TRANSLATION”
– an introduction to electronic language translation:
challenges, methods, and future.
by Gabriel Emanuel Borlean
Syddanskerhvervsskole – IT og Data afdeling, Odense
Autumn, 26 October 2011
Table of Contents:
1. Introduction
2. Linguistics: Concepts and Terminology
3. History of Electronic Language Translation
4. Challenges: Linguistic
5. Challenges: Technical
6. Statistical Machine Translation - Theory
7. Electronic Translators - Products and Services
8. SYSTRAN in a nutshell
9. Google Translate
10. Product Comparison
11. Rules and Regulations, Terms and Conditions
12. The Present Future
13. Conclusion
14. References
15. Appendix A
16. Appendix B
17. Extra Resources
NOTE: Image on cover page, Hodson, 2009.
Introduction
Since prehistoric times people have used various ways to communicate with each other. Gesticulation
and body language are the most rudimentary of these methods, especially among people with
limited foreign language knowledge. In our globalized and internet-connected world, one can use more
than standard printed dictionaries as translation aids; a variety of electronic and
online language translation tools exist. This paper examines the concepts involved in
language translation, the history of machine-assisted translation, and the challenges of producing a
faithful translation. Furthermore, this paper analyzes the most popular electronic
solutions (products and services) and considers where the future development of language
translation is heading.
Linguistics: Concepts and Terminology
The old and famous Italian adage “Traduttore, traditore” (“Translator, traitor” in English) captures the
essence and challenge of language translation. In Italian the verb “to translate”
(“tradurre”) and the verb “to betray” (“tradire”) sound very similar. Thus the birth
of the saying “To translate is to betray.” The important question raised by this proverb is “How can
language translation not lose the fidelity of the original language?” (Wooden, 2011). Let’s take an
overview of linguistics, the field that deals with languages, and define some terms that are
fundamental to any discussion of faithful language translation.
Linguistics is the scientific study of human languages (Fromkin et al., 2000). This field can be
broken down into three distinct areas: 1) language structure (meaning grammar, morphology, syntax
and phonology), 2) language meaning (which includes semantics and pragmatics, and also the resolution of
ambiguity) and 3) language in its context (from evolutionary linguistics to historical linguistics, or from
neurolinguistics to language acquisition and others; http://en.wikipedia.org/wiki/Linguistics).
For the scope of this research paper the most important terms are grammar, syntax, lexicon,
and semantics and pragmatics. Grammar is the “set of structural rules that govern the composition
of clauses, phrases, and words in any given natural language” (http://en.wikipedia.org/wiki/Grammar).
Syntax is defined as “the combination of words into sentences” or the rules that govern the sentence
structure of any individual language (http://www.ielanguages.com/linguist.html). A lexicon comprises
all the words and expressions of a particular language. In other words a lexicon is the word stock of a
language, or the mental vocabulary in a speaker’s mind. Semantics is concerned with the meaning of
words and larger syntactic units and the relationship among words or these larger units
(http://www.ielanguages.com/linguist.html). The area of pragmatics is simply the subfield that studies
the way linguistic contexts affect meaning.
To make these terms more concrete, here are some simple examples.
“I saw that she a cookie ate” is an example of incorrect syntax in English.
Shakespeare’s writings are estimated to contain a lexicon of between 20,000 and 30,000 words
(http://blog.oxforddictionaries.com/2011/04/shakespeare-language/). Dictionary.com states that
“semantics is commonly used to refer to a trivial point or distinction that revolves around mere words
rather than significant issues: ‘To argue whether the medication killed the patient or contributed to her
death is to argue over semantics’ ”. Sentences such as “Flying planes can be dangerous” or “The
missionaries are ready to eat” show why pragmatics matters: each sentence is ambiguous
because it can mean either one or the other of two things, and only the context tells the reader which
(http://grammar.about.com/od/pq/g/pragmaticsterm.htm).
As described earlier, linguistics is a vast field that encompasses a multitude of subfields
and is interconnected with other fields (such as neurology, psychology, human development, or
history, among others). A fascinating aspect of language translation is how technology has been used to
assist in this important task. Let’s look next at the history of technology in the quest for
language translation.
History of Electronic Language Translation
In the world of language translation the Rosetta Stone stands out as the most critical archeological find:
it helped 19th century linguists understand the relationship between Egyptian hieroglyphics, ancient
Greek and pre-Coptic Egyptian (Demotic) (“Rosetta Stone”; “Rosetta Stone - Egypt, Ptolemaic Period, 196 BC,
The”). Between the 13th and 18th centuries there were various attempts to overcome the barriers
created by natural languages by designing a universal language (Hausser, pg. 1). One prime
example from this period is René Descartes, who tried to “convey philosophical ideas and
propositions as unambiguously as mathematics is consistent” by the use of “phonetic regularity and
grammatical rationality” (Niemetz, 2003).
With the close of World War II and the beginning of the Cold War, computational
machines became crucial in the intelligence work of language translation. The most widely publicized
and successful attempt was the January 7, 1954 demonstration by Georgetown University and
IBM of Russian-to-English translation using computational linguistics (Hutchins,
2004; “701 Translator – IBM press release”, 1954; Hutchins, 2006). This demonstration, along with the
work on the APEXC machine at Birkbeck College, ushered in a new era of optimism and high
expectations for machine translation (http://en.wikipedia.org/wiki/Machine_translation).
Challenges: Linguistic
The linguistic challenge in language translation arises from ambiguities in the source
language: lexical, grammatical, and semantic ambiguity, as well as ambiguous pronoun reference
(http://fr.wikipedia.org/wiki/Traduction_automatique).
Another difficult aspect of language translation is reconciling the two ideals of fidelity and transparency.
These two ideals often pull in opposite directions and pose a great challenge, especially to
translators of ancient languages. There are so many Bible translations (from the ancient
Hebrew and Koine Greek texts) because it is very difficult to both “accurately render the meaning of
the source text, without distortion” (textual fidelity) and at the same time provide a translation which
“appears to a native speaker of the target language to have originally been written in that language, and
conforms to its grammar, syntax and idiom” (transparency; http://en.wikipedia.org/wiki/Translation).
One can also argue that the diversity of languages provides a benefit - “the confusion of
language was in fact a benefit. A universal language without cultural diversity would actually stunt the
evolution of unique ideas, because everyone would have a similar framework in which they thought”
(Niemetz, 2003). Now let us consider the technical aspects and models involved in translation.
Challenges: Technical
The translation process can be broken down into just two steps: a) decode the meaning of the
source and then b) recode that meaning in the target language
(http://fr.wikipedia.org/wiki/Traduction_automatique). These two steps are the most basic
processes that a machine translation system has to perform. But their simplicity does not make machine
translation easy or very accurate. One has to decide on a model or approach for decoding and
recoding a language.
One model is Rule-Based Machine Translation (RBMT). This model is based on
word-for-word, transfer, and pivot techniques (http://en.wikipedia.org/wiki/Rule-based_machine_translation).
A machine translation system based on this approach requires dictionary entries and linguistic rules.
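The two ingredients named above, dictionary entries and linguistic rules, can be sketched in a few lines of Python. This is only a toy illustration: the five-word English-to-French lexicon, the part-of-speech tags, and the single adjective-noun reordering rule are all invented for this example, and real RBMT systems use far richer grammars and handle agreement, which this sketch ignores.

```python
# Toy rule-based translation: a bilingual dictionary plus one transfer rule.
# The lexicon and the rule are invented for illustration only.
LEXICON = {
    "the": ("la", "DET"),
    "red": ("rouge", "ADJ"),
    "green": ("verte", "ADJ"),
    "car": ("voiture", "NOUN"),
    "house": ("maison", "NOUN"),
}


def translate_rule_based(sentence):
    # Step 1 (dictionary lookup): replace each source word by its target
    # word and remember its part of speech. Unknown words pass through.
    tagged = []
    for word in sentence.lower().split():
        target, pos = LEXICON.get(word, (word, "UNK"))
        tagged.append((target, pos))

    # Step 2 (transfer rule): English ADJ + NOUN becomes French NOUN + ADJ.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2
        else:
            i += 1

    return " ".join(word for word, _ in tagged)


print(translate_rule_based("the red car"))  # la voiture rouge
```

Every new word needs a dictionary entry and every new construction needs a rule, which is why this approach is labor-intensive to extend.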
A second approach is translation by example, also known as Example-Based Machine
Translation – EBMT (http://en.wikipedia.org/wiki/Example-based_machine_translation). This
approach works by analogy: it extracts and recombines phrases or other
short parts of text. An Example-Based Machine Translation system is “given a set of sentences in the
source language (from which one is translating) and their corresponding translations in the target
language, and uses those examples to translate other, similar source-language sentences into the target
language. The basic premise is that, if a previously translated sentence occurs again, the same
translation is likely to be correct again” (Ralf Brown, 2004).
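The basic premise in the quotation can be illustrated with a minimal Python sketch. It assumes a tiny, invented English-to-Danish example base and uses the standard library's `difflib.get_close_matches` to find the most similar stored sentence. It only retrieves whole sentences; the extraction and recombination of fragments that a real EBMT system performs is left out.

```python
import difflib

# Invented example base: previously translated English -> Danish sentences.
EXAMPLES = {
    "good morning": "godmorgen",
    "thank you very much": "mange tak",
    "where is the station": "hvor er stationen",
}


def translate_by_example(sentence, cutoff=0.6):
    """Reuse the translation of the most similar stored example, if any."""
    key = sentence.lower().strip()
    # get_close_matches returns the stored sentences whose similarity
    # ratio to `key` is at least `cutoff`, best match first.
    matches = difflib.get_close_matches(key, list(EXAMPLES), n=1, cutoff=cutoff)
    if not matches:
        return None  # nothing similar enough has been translated before
    return EXAMPLES[matches[0]]


print(translate_by_example("Where is the station?"))  # hvor er stationen
```

The `cutoff` value of 0.6 is simply the library default, not a recommendation; a lower value reuses examples more aggressively at the cost of more wrong matches.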
A third approach to machine translation is Statistical Machine Translation (SMT). A simple
definition of this approach is “the translation of text from one human language to another by a
computer that learned how to translate from vast amounts of translated text” (Koehn).
A novel attempt has also been made to redefine the translation process, from a
two-step process using two languages to a different two-step process using an intermediate language
(IL). Thus, instead of decoding the meaning of the source and then recoding the meaning in the target
language, one translates the source language into an IL and then from the IL into the target
language (http://en.wikipedia.org/wiki/Distributed_Language_Translation; “Esperanto as an
interlingua”; http://en.wikipedia.org/wiki/Esperanto). This approach was attempted, with
inconclusive results, by a Dutch company in the 1980s using the constructed language Esperanto.
Interlingua is another international auxiliary language that can be used as an IL
(http://en.wikipedia.org/wiki/Comparison_between_Esperanto_and_Interlingua; “Esperanto vs.
Interlingua” forum; “Interlingua vs. Esperanto” forum). The Dutch project that used Esperanto as an IL was known as
Distributed Language Translation – DLT (“Esperanto as an interlingua.”).
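A commonly cited argument for an intermediate language is simple arithmetic: translating directly between every ordered pair of n languages needs n(n-1) translation modules, while going through an IL needs only 2n (one into the IL and one out of it per language). The Python sketch below shows this count together with a toy two-step translation. The word lists are invented for illustration, with Esperanto words standing in for the IL; nothing here is taken from the DLT system itself.

```python
def direct_modules(n):
    """Modules needed to translate directly between every ordered pair."""
    return n * (n - 1)


def pivot_modules(n):
    """Modules needed when every language only translates to and from the IL."""
    return 2 * n


# Invented mini-dictionaries: English -> IL (Esperanto) and IL -> Danish.
EN_TO_IL = {"dog": "hundo", "house": "domo"}
IL_TO_DA = {"hundo": "hund", "domo": "hus"}


def translate_via_pivot(word):
    # Step 1: source language -> intermediate language.
    il_word = EN_TO_IL.get(word)
    if il_word is None:
        return None
    # Step 2: intermediate language -> target language.
    return IL_TO_DA.get(il_word)


print(direct_modules(10), pivot_modules(10))  # 90 20
print(translate_via_pivot("dog"))             # hund
```

The saving grows with the number of languages, but every translation now passes through two lossy steps instead of one, so errors and ambiguities in the IL affect every language pair.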
The ideal Machine Translation model can be represented graphically as:
[Figure: ideal Machine Translation model; image not available]
Unfortunately, the current best performing machine
translation systems are crawling at the bottom and are represented by the following model (Knight,
Koehn, pg. 10):
[Figure: Machine Translation model of current systems; image not available]
While there is no computational system that provides “the holy grail of fully automatic high-
quality machine translation of unrestricted text,” there are many fully automated systems (see
Electronic Translators: Products and Services section) that produce very reasonable output (see
SYSTRAN and Google Translate sections). When the language domain is restricted and controlled the
quality of machine translation is also greatly improved (“The electronic interpreter”, 2000).
Statistical Machine Translation - Theory
Statistical Machine Translation is based primarily on information theory and on probability distributions
related by Bayes’ Theorem (Brown, Pietra, Pietra and Mercer, pg. 264):
$$\Pr(e \mid f) = \frac{\Pr(e)\,\Pr(f \mid e)}{\Pr(f)}$$
Here e represents a string of English words, which can be
translated in various ways into a string of French words, f. The notation (e, f) represents a pair
consisting of an English word string and one of its French equivalents. Pr(f|e) represents the
probability that a translator will produce f when presented with e. Pr(e) is called the language
model and Pr(f|e) the translation model (the roles of e and f are swapped when translating in the other direction).
The fundamental equation of machine translation maximizes the numerator of the equation
in Bayes’ Theorem. The goal of the translation system is to find the string e that the native speaker had
in mind when he produced f. To minimize the chance of error one has to choose the specific English
string ê (“e with hat”) for which Pr(e|f) is the greatest. Because the denominator Pr(f) does not depend on e,
this is the same as choosing
$$\hat{e} = \operatorname*{argmax}_{e} \Pr(e)\,\Pr(f \mid e)$$
(Chen, page 1; Knight, Koehn page 5).
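The choice of the best English string can be sketched as a toy decoder in Python. This is a minimal sketch under stated assumptions: the three candidate translations and all probability values are invented for illustration, whereas a real system estimates Pr(e) and Pr(f|e) from large text collections and searches a vastly larger candidate space.

```python
# Toy noisy-channel decoder. All probabilities are invented for illustration.
F = "hastværk er lastværk"  # the Danish source sentence f

# Language model Pr(e): how plausible each candidate is as English.
LANGUAGE_MODEL = {
    "haste makes waste": 0.020,
    "haste is waste work": 0.001,
    "hurry is load work": 0.0005,
}

# Translation model Pr(f | e): how likely F is as a rendering of e.
TRANSLATION_MODEL = {
    "haste makes waste": 0.30,
    "haste is waste work": 0.50,
    "hurry is load work": 0.40,
}


def decode(candidates):
    """Return the candidate e that maximizes Pr(e) * Pr(f | e)."""
    return max(candidates, key=lambda e: LANGUAGE_MODEL[e] * TRANSLATION_MODEL[e])


print(decode(LANGUAGE_MODEL))  # haste makes waste
```

In this made-up example the most literal candidate has the highest translation-model score, yet the fluent proverb wins because the language model strongly prefers it. That interplay between the two models is the point of the factorization.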
Electronic Translators - Products and Services
One finds a multitude of electronic translators and online translation tools and programs. One way to
categorize these products is by their roles and capabilities. One category is the “static”
translation tool, such as online dictionaries and electronic translator gadgets. The other category
is what the writer calls “dynamic” translation products, because of their use of
Example-Based Machine Translation or Statistical Machine Translation models. It is important to note
that as translation tools and gadgets get more sophisticated, the line between “static” and “dynamic”
translators becomes fuzzier.
In the American market, the Franklin Electronic Publishers dictionary and translator is a classic
example of a static translation product. Such gadgets, which provide electronic versions of dictionaries,
now also incorporate voice translation and are becoming more prevalent in every national market.
In the Danish market, Ordbogen.com is a prime example of a “static” online (internet)
dictionary. Ordbogen.com is Denmark’s largest online dictionary company with over 30 products,
some of them in specialized fields such as law, medicine, or music. While Ordbogen.com is
primarily a static online translation tool, it also incorporates some dynamic aspects:
daily updates, the addition of new words, and responses to customers’ unsuccessful
searches. Ordbogen.com also offers machine translation tools and traditional personal translation
services to private enterprises, individual consumers and the educational field (Peter Sepstrup, Product
Manager Ordbogen.com, personal communication, 7 Oct. 2011). Since its inception 10 years ago
Ordbogen.com has proven to be a successful, cutting-edge company, and has in recent years
entered the global market with Lemma.com.
A list of other Danish online dictionaries and translators is available in Appendix A. Now
let’s look at some successful implementations of Machine Translation.
SYSTRAN in a nutshell
One of the oldest and most significant machine translation companies is SYSTRAN. Founded in 1968,
SYSTRAN has been used by the US government and the EU. SYSTRAN provides the engine for
Yahoo! Babel Fish translation, and was used by Google’s language translation tools until 2007.
SYSTRAN is also used by the Dashboard Translation widget in the latest Mac OS X version.
“Commercial versions of SYSTRAN can run on Microsoft Windows (including Windows Mobile),
Linux, and Solaris. Historically, SYSTRAN systems used Rule-Based Machine Translation (RBMT)
technology. With the release of Systran Server 7 in 2010, SYSTRAN implemented a hybrid Rule-
Based/Statistical Machine Translation (SMT) technology which was the first of its kind in the
marketplace” (http://en.wikipedia.org/wiki/SYSTRAN ).
Google Translate
In recent years Google Translate has become a popular online translation tool. Instead of using
SYSTRAN or any traditional rule-based method, Google cast its lot with statistical analysis and a fresh
approach. As the “How Google Translate Works” video explains, Google has undertaken the task
of using already available translated books, UN documents (the UN has six official languages) and foreign
websites in order to build a large linguistic database of equivalent phrases and words. Thus
the computer looks for “patterns between the translation and the original text that are unlikely to occur
by chance. Once the computer finds a pattern, it can use this pattern to translate similar texts in the
future. When you repeat this process billions of times you end up with billions of patterns and one very
smart computer program” (http://googlesystem.blogspot.com/2010/08/how-google-translate-works.html).
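The idea of finding patterns “that are unlikely to occur by chance” can be illustrated with a small Python sketch. It is not Google's actual method, which the source does not describe in detail; it simply scores word pairs in a tiny, invented English-French parallel corpus with the Dice coefficient, a basic co-occurrence measure, as a stand-in for the statistical pattern finding described above.

```python
from collections import Counter
from itertools import product

# Invented parallel corpus: (English sentence, French sentence) pairs.
PARALLEL = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]


def best_partner(word, corpus):
    """Return the target word that co-occurs most consistently with `word`."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in corpus:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)                     # sentences containing each source word
        tgt_count.update(t_words)                     # sentences containing each target word
        pair_count.update(product(s_words, t_words))  # sentence pairs containing both

    def dice(t):
        # Dice coefficient: 1.0 when the two words always appear together.
        return 2 * pair_count[(word, t)] / (src_count[word] + tgt_count[t])

    return max(tgt_count, key=dice)


print(best_partner("house", PARALLEL))  # maison
```

With only four sentence pairs the associations are already unambiguous here; with real text, rare words and idioms have few examples, which is one reason why quality depends on the amount of translated material available.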
It is important to note that translation quality by Google can vary from language to language
and this quality depends on the number of translated documents available for each particular language.
Another pattern seen with this translation model is that technical or diplomatic words and phrases
are more likely to appear in the translation database than street language, idioms, or
colloquialisms. An example of this discrepancy is the mistranslation from English to Danish of the
adage (saying, proverb) “Haste makes waste.” The Google Translate engine result is “Hastværk gør
affald” (literally “haste makes garbage”). Similarly, taking the equivalent Danish expression “Hastværk er lastværk” and
translating it with Google Translate one gets the confusing result of “Haste, less speed.”
A competitor of Google Translate to watch in the future is Science Applications International
Corporation (SAIC Language Services; http://www.saic.com/natsec/language-services.html ).
Product Comparison
The plethora of translation tools and applications, both online and for the mobile market makes a
thorough product comparison impossible. The features that distinguish one product from another are
important to consumers and help to “push the technological envelope”.
For the “static” translators, important features include pronunciation, use of term in the form of
a sentence example, multiple language support and ease of use (user-friendliness). For the “dynamic”
translators, the additional features of voice recognition software integration, multi-touch, display
technology, voice-to-text translation, e-mail/chat transfer support are what make some products stand
out over others. A list of online and mobile-device translators is available in Appendix B.
Rules and Regulations, Terms and Conditions
The translation products available on the market are either license-based/copyright-protected or
freeware/open-source. The machine translation models and algorithms are public knowledge, but the special
tweaks that each individual company makes remain industry secrets. A product
protected by copyright, trademark, purchase license, or open-source license cannot be used outside the terms
and conditions that its owner has set. Most electronic products, including freeware, have
restricted usage rights. One such restriction is giving due credit to the creator/owner
company of the product when referencing it.
The Present Future
"The Internet is rapidly evolving from a collection of stationary devices to a fluid network of mobile
devices" (“Accessing the WAN – CCNA Exploration,” section 7.3.1.3). Voice recognition software
has reached maturity and can be easily integrated into applications such as language
translators. One can easily imagine a world where Siri, Apple’s recent speech-recognition "personal
assistant", provides language translation and is used in an international corporate setting. With
more and more e-books and cross-cultural documents available online, and with more people interacting
digitally across the world, Google Translate and SYSTRAN
products will become more accurate and more robust. Technology has proven to be a major aid in the difficult task of
language translation.
Technology and statistical machine translation have also recently been employed
to solve centuries-old puzzles and mysteries, like the Copiale Cipher (Mark Brown, 2011). These are
exciting times to live in, when the future does not seem so distant.
Conclusion
Isaac Asimov stated that “the brain is the next frontier in science” and is “more complex than the
universe” (Asimov, pg. 12, 45). The human brain is the creator of languages and probably the best
machine for handling all the complex and intricate nuances of a language. While machine
translation systems are becoming very sophisticated in handling grammar, syntax, lexicon, semantics and
pragmatics, human languages are always in a state of flux. Changing fads, shifting cultural attitudes,
morphing philosophies, current world events and colloquialisms all keep human
languages in a perpetual state of change. In the future, human translators
will most likely not find themselves without a job or facing a dying career; rather, technology will enhance their
work as human languages evolve. The symbiotic work between humans and machines will definitely
result in more “traduttore” and less “traditore.”
References:
“Accessing the WAN – CCNA Exploration” version 4. Cisco Network Academy. Web. 25 Sep. 2011. URL:
http://slpl.cse.nsysu.edu.tw/cpchen/publication/aclclp_mt.pdf
Asimov, Isaac. The Human Brain: Its Capacities and Functions. 1994. Signet. ISBN 978-0451628671, 298 pages.
Brown, Mark. “Modern Algorithms Crack 18th Century Secret Code.” Wired UK. 26 Oct. 2011. Web. 26 Oct. 2011.
URL: http://www.wired.com/wiredscience/2011/10/copiale-cipher-crack/
Brown, Ralf. “Example-Based Machine Translation.” Nov. 2004. Web. 6 Oct. 2011. URL:
http://www.cs.cmu.edu/~ralf/ebmt/intro.html
Brown, Peter F., Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. “The Mathematics of Statistical
Machine Translation: Parameter Estimation.” Computational Linguistics, Vol. 19, Nr. 2, June 1993. Web. 14 Oct. 2011.
URL: http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf (p. 264)
Chen, Chia-Ping. “Machine Translation: A Score Years Ago.” ACLCLP Newsletter, 21(4). October 2010. Web. 14
Oct. 2011. URL: http://slpl.cse.nsysu.edu.tw/cpchen/publication/aclclp_mt.pdf
“Electronic interpreter, The.” The Guardian, 27 January 2000. Web. 7 Oct. 2011. URL:
http://www.guardian.co.uk/technology/2000/jan/27/onlinesupplement5/print
“Esperanto as an interlingua.” Examples of Machine Translation. May 2000. San Diego State University. Web. 7 Oct.
2011. URL: http://www-rohan.sdsu.edu/~ling354/MT-eg.html#Esperanto_as_an_interlingua
“Esperanto vs. Interlingua” forum. How-To-Learn-Any-Language.com. Jan 2010. Web. 7 Oct. 2011. URL:
http://how-to-learn-any-language.com/forum/forum_posts.asp?TID=18712&PN=1
Fromkin, Victoria, Bruce Hayes, Susan Curtiss, Anna Szabolcsi, Tim Stowell and Donca Steriade (2000). Linguistics:
An Introduction to Linguistic Theory. Oxford: Blackwell. p. 3. ISBN 0631197117.
Hausser, Roland. “Overcoming Language Barriers by Means of Computers.” Web. 6 Oct. 2011. URL:
http://www.linguistik.uni-erlangen.de/~rrh/papers/sejong.pdf
Hodson, Steven. “Traduttore, traditore”, WinExtra, November 16, 2009. Web: Multi_flag_thumb.png image. 6 Oct.
2011. URL: http://www.winextra.com/archives/traduttore-traditore/
Hutchins, John. “The first public demonstration of machine translation: the Georgetown-IBM system, 7th January
1954.” March 2006. Web. 5 Oct. 2011. URL: http://www.hutchinsweb.me.uk/GU-IBM-2005.pdf
Hutchins, John. “The Georgetown-IBM experiment demonstrated in January 1954.” AMTA conference in September
2004. Web. 5 Oct. 2011. URL: http://www.hutchinsweb.me.uk/AMTA-2004-ppt.pdf
“Interlingua vs. Esperanto” forum. AntiMoon.com – How to learn English effectively. Aug 2008. Web. 7 Oct. 2011.
URL: http://www.antimoon.com/forum/t11408.htm
Knight, Kevin, and Philipp Koehn. “What’s New in Statistical Machine Translation.” Tutorial at HLT/NAACL.
2004. Web. 7 Oct. 2011. URL: http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
Koehn, Philipp. “Statistical Machine Translation” front page. Web. 7 Oct. 2011. URL: http://www.statmt.org/
Niemetz, Anne. “A Universal Language: The Myth, Search and Experiments”, March 2003. Web. 6 Oct. 2011. URL:
http://users.design.ucla.edu/~aniemetz/utm/index.html
“Rosetta Stone.” Encyclopædia Britannica. Encyclopædia Britannica Online. Encyclopædia Britannica Inc., 2011.
Web. 6 Oct. 2011. URL: http://www.britannica.com/EBchecked/topic/509988/Rosetta-Stone
“Rosetta Stone - Egypt, Ptolemaic Period, 196 BC, The.” The British Museum. Web. 6 Oct. 2011. URL:
http://www.britishmuseum.org/explore/highlights/highlight_objects/aes/t/the_rosetta_stone.aspx
Wooden, Cindy. “Mass translations are a challenge in every language, officials say.” Catholic News Service. 9
August 2011. Web. 6 Oct. 2011. URL: http://www.catholicnews.com/data/stories/cns/1103160.htm
“701 Translator – IBM press release.” January 8, 1954. Web. 5 Oct. 2011. URL:
http://www-03.ibm.com/ibm/history/exhibits/701/701_translator.html
APPENDIX A
Online dictionaries and language translators in Denmark:
• Ordbogen.com http://www.ordbogen.com/
• Gyldendals Røde Ordbøger http://ordbog.gyldendal.dk/
• Den Danske Ordbog – moderne dansk sprog http://ordnet.dk/ddo
• Den Danske Online Ordbog http://www.ddoo.dk/
• Dansk Parlør http://www.parlor.dk/
• Kryds & Tværs http://kryds.onlineordbog.dk/
• Rimordbog http://www.rimordbog.dk/
• Engelsk Ordbog http://www.onlineordbog.dk/wordnet/da/
• Danish Word http://www.danishword.com/
• Dialekt.dk – Københavns Universitet http://dialekt.ku.dk/
APPENDIX B
Online Translators:
• SYSTRAN (FREE) http://www.systran.co.uk/
• Yahoo! Babel Fish http://babelfish.yahoo.com/
• Google Translate http://translate.google.com/
• WorldLingo http://www.worldlingo.com/en/products/
• SDL FreeTranslation http://www.freetranslation.com/
iPhone/Android Apps:
• Google Translate from App Store or Android Market
http://www.businessinsider.com/best-translation-apps-2011-3#google-translate-translates-your-words-as-you-speak-into-your-devices-mic-1
• SAIC Linguistics – Google Translate competitor
http://www.informationweek.com/thebrainyard/news/workgrouping_team_collaboration_workspaces/231602149/saic-takes-on-google-with-speech-translation-apps ,
http://www.saic.com/natsec/language-services.html
• iSpeak app from Acapela Group:
http://www.acapela-group.com/translator-for-iphone-acapela-speech-empowers-ispeak-advanced-tool-designed-by-future-apps-in-9-languages--2040-speech-synthesis.html
• Vocre app from myLanguage
http://www.reuters.com/article/2011/09/26/uk-app-language-idUSLNE78P02520110926 ,
http://www.gottabemobile.com/2011/09/16/vocre-translation-app/ ,
http://news.yourolivebranch.org/2011/09/30/app-helps-travelers-speak-in-foreign-languages/
• SmartTrans http://news.yourolivebranch.org/2011/09/30/app-helps-travelers-speak-in-foreign-languages/
• Jibbigo app http://www.jibbigo.com/website/index.php ,
http://news.yourolivebranch.org/2011/09/30/app-helps-travelers-speak-in-foreign-languages/
• Ultralingua app http://www.ultralingua.com/products
• iLinguist app from Apps Dev Team http://www.appsdevteam.com/
• Linguo app from Edovia Inc. from iTunes Store
http://www.brighthub.com/mobile/iphone/articles/86375.aspx
• Interpret from iTunes Store http://www.brighthub.com/mobile/iphone/articles/86375.aspx
• Tap-Translate or Tap-Dictionary for Android
http://www.businessinsider.com/best-translation-apps-2011-3#tap-translate-lets-you-translate-as-you-read-in-safari-5 ,
http://iostranslate.appspot.com/tap-translate-support.html
• Odyssey Translator from App Store
http://www.businessinsider.com/best-translation-apps-2011-3#odyssey-translator-pro-lets-you-build-sentences-youd-actually-use-in-real-life-8
• Word Lens from App Store
http://www.businessinsider.com/best-translation-apps-2011-3#word-lens-is-a-great-bet-for-translating-signs-and-other-text-you-stumble-upon-9
• SayHi Translate from Nuance Communications Inc.
http://www.marketwatch.com/story/nuance-gives-sayhi-translate-app-a-voice-2011-09-15
• ABBYY TextGrabber + Translator
http://www.ipadnewstracker.com/mobile/2011/09/abbyy-textgrabber-translator-for-the-iphone-just-grab-and-translate/
• Voice-to-text translation application (upcoming) from Ortsbo
http://www.mobilemag.com/2011/09/21/ortsbo-to-release-voice-to-text-translation-and-crowd-sourcing-wiki/
• Translating an app into different foreign languages and aiding with localization, from Tethras
http://www.guardian.co.uk/technology/appsblog/2011/sep/29/tethras-translation-localisation-apps?newsfeed=true
Extra Resources:
Statistical Machine Translation Theory (extra links):
• http://en.wikipedia.org/wiki/Statistical_machine_translation
• http://en.wikipedia.org/wiki/Language_model
• What's new in Statistical Machine Translation?
• EMNLP 2011 SIXTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION
• Papers presented
SYSTRAN in a nutshell (extra links):
• http://en.wikipedia.org/wiki/SYSTRAN
• Online Systran Free translation http://www.systran.co.uk/
• Guardian article in 2000 about SysTran ...
http://www.guardian.co.uk/technology/2000/jan/27/onlinesupplement5
Lost in Translation - Gabriel Emanuel Borlean

  • 1.
    “LOST IN TRANSLATION” –an introduction to electronic language translation: challenges, methods, and future. by Gabriel Emanuel Borlean Syddanskerhvervsskole – IT og Data afdeling, Odense Efteråret, 26 October, 2011
  • 2.
    1 Table of Contents:Page: 1. Introduction 2. 2. Linguistics: Concepts and Terminology 2. 3. History of Electronic Language Translation 4. 4. Challenges: Linguistic 5. 5. Challenges: Technical 5. 6. Statistical Machine Translation – Theory 8. 7. Electronic Translators - Products and Services 9. 8. SYSTRAN in a nutshell 10. 9. Google Translate 10. 10. Product Comparison 11. 11. Rules and Regulations, Terms and Conditions 12. 12. The Present Future? 12. 13. Conclusion 13. 14. References 14. 15. Appendix A 16. 16. Appendix B 16. 17. Extra Resources 17. NOTE: Image on cover page, Hodson, 2009.
  • 3.
    2 Introduction Since prehistoric timespeople have used various ways to communicate with each other. Gesticulation and body language have been the rudimentary methods to communicate, especially among folks with limited foreign language knowledge. In our globalized and internet-connected world, one can use more than standard printed dictionaries as aids in the translation work; there exist various electronic and online language translation tools. This paper attempts to understand what are the concepts involved in language translation, the history of machine assisted translation and the challenges involved in a faithful translation. Furthermore this paper will analyze the most popular and known electronic solutions (products and services) and consider what the future development roadmap of language translation is. Linguistics: Concepts and Terminology The old and famous Italian adage – “Tradutore, Traditore” (“Translator, Traitor” in Italian) captures the essence and challenge of language translation. In the Italian language the verb “to translate” (“tradure”) and the verb “to betray” (“tradire”) are very similar phonetically speaking. Thus the birth of the saying “To translate is to betray.” The important question raised by this proverb is “How can language translation not lose the fidelity of the original language?” (Wooden, 2011). Let’s take an overview of the field dealing with languages – linguistics, and define some important terms that are fundamental in any discussion on faithful language translation. Linguistics is the scientific study of human languages (Fromkin et al., 2000). This field can be broken down into three distinct areas: 1) language structure (meaning grammar, morphology, syntax and phonology), 2) language meaning (which includes semantics and pragmatics, and also resolved
  • 4.
    ambiguity) and 3) language in its context (from evolutionary linguistics to historical linguistics, or from neurolinguistics to language acquisition, among others; http://en.wikipedia.org/wiki/Linguistics). For the scope of this research paper the most important terms are grammar, syntax, lexicon, semantics, and pragmatics.
    Grammar is the “set of structural rules that govern the composition of clauses, phrases, and words in any given natural language” (http://en.wikipedia.org/wiki/Grammar). Syntax is defined as “the combination of words into sentences”, that is, the rules that govern the sentence structure of an individual language (http://www.ielanguages.com/linguist.html). A lexicon comprises all the words and expressions of a particular language. In other words, a lexicon is the vocabulary of a language, or the mental vocabulary in a speaker’s mind. Semantics is concerned with the meaning of words and larger syntactic units and the relationships among words or these larger units (http://www.ielanguages.com/linguist.html). Pragmatics is the subfield that studies the way context affects meaning.
    Some simple examples make these terms easier to grasp. “I saw that she a cookie ate” is an example of incorrect syntax in English. Shakespeare’s writings are supposed to have contained a lexicon of between 20,000 and 30,000 words (http://blog.oxforddictionaries.com/2011/04/shakespeare-language/). Dictionary.com states that “semantics is commonly used to refer to a trivial point or distinction that revolves around mere words rather than significant issues: ‘To argue whether the medication killed the patient or contributed to her death is to argue over semantics’ ”.
    Sentences such as “Flying planes can be dangerous” or “The missionaries are ready to eat” show why pragmatics matters: taken on their own they are ambiguous, because each can mean one of two things, and only the context tells the listener which meaning is intended (http://grammar.about.com/od/pq/g/pragmaticsterm.htm).
  • 5.
    As defined earlier, linguistics is a vast field encompassing a multitude of subfields, and it is interconnected with other fields (such as neurology, psychology, human development, or history, among others). A fascinating aspect of language translation is how technology has been used to assist in this important task. Let’s look next at the history of human technology in the quest for language translation.
    History of Electronic Language Translation
    In the world of language translation the Rosetta Stone stands out as the most critical archeological find, one that helped 19th-century linguists understand the relationship between Egyptian hieroglyphics, ancient Greek and pre-Coptic Egyptian (Demotic) (“Rosetta Stone”; “Rosetta Stone - Egypt, Ptolemaic Period, 196 BC, The”). Between the 13th and 18th centuries there were various attempts to overcome the barriers created by natural languages by designing a universal language (Hausser, pg. 1). One prime example from this historical period is René Descartes, who tried to “convey philosophical ideas and propositions as unambiguously as mathematics is consistent” by the use of “phonetic regularity and grammatical rationality” (Niemetz, 2003).
    With the close of World War II and the beginning of the Cold War, computational machines became crucial in the intelligence work of language translation. The most widely publicized and successful attempt was the January 7, 1954 demonstration by Georgetown University and IBM of translation from Russian to English using computational linguistics (Hutchins, 2004; “701 Translator – IBM press release”, 1954; Hutchins, 2006). This demonstration, along with the work on the APEXC machine at Birkbeck College, ushered in a new era of optimism and high expectations for machine translation (http://en.wikipedia.org/wiki/Machine_translation).
  • 6.
    Challenges: Linguistic
    The linguistic challenge in language translation comes from the ambiguities in the source language: lexical, grammatical, semantic, or pronoun reference (http://fr.wikipedia.org/wiki/Traduction_automatique). Other difficult aspects of language translation are the two ideals of fidelity and transparency. These two ideals often pull in opposite directions and pose a great challenge, especially to translators of ancient languages. The reason there are so many Bible translations (from the ancient Hebrew and Koine Greek texts) is that it is very difficult both to “accurately render the meaning of the source text, without distortion” (textual fidelity) and at the same time to provide a translation which “appears to a native speaker of the target language to have originally been written in that language, and conforms to its grammar, syntax and idiom” (transparency; http://en.wikipedia.org/wiki/Translation).
    One can also argue that the diversity of languages provides a benefit: “the confusion of language was in fact a benefit. A universal language without cultural diversity would actually stunt the evolution of unique ideas, because everyone would have a similar framework in which they thought” (Niemetz, 2003). Now let us consider the technical aspects and models involved in translation.
    Challenges: Technical
    The translation process can be broken down into just two steps: a) decode the meaning of the source and then b) recode the meaning in the target language (http://fr.wikipedia.org/wiki/Traduction_automatique). These two steps are the most basic and rudimentary processes that a machine translation system has to perform. But this does not make the machine
  • 7.
    translation work easy or very accurate. One has to decide on a model or approach for decoding and recoding a language.
    One model is Rule-Based Machine Translation (RBMT). This model is based on word-for-word, transfer, and pivot techniques (http://en.wikipedia.org/wiki/Rule-based_machine_translation). A machine translation system based on this approach requires dictionary entries and linguistic rules.
    A second approach is translation by example, also known as Example-Based Machine Translation or EBMT (http://en.wikipedia.org/wiki/Example-based_machine_translation). This approach works by analogy, extracting and combining phrases or other short pieces of text. An Example-Based Machine Translation system is “given a set of sentences in the source language (from which one is translating) and their corresponding translations in the target language, and uses those examples to translate other, similar source-language sentences into the target language. The basic premise is that, if a previously translated sentence occurs again, the same translation is likely to be correct again” (Ralf Brown, 2004).
    A third approach to machine translation is Statistical Machine Translation (SMT). The simple definition of this approach is “the translation of text from one human language to another by a computer that learned how to translate from vast amounts of translated text” (Koehn).
    A novel attempt has also been made to redefine the translation process, from a two-step process using two languages to a different two-step process using an intermediate language (IL). Thus, instead of decoding the meaning of the source and then recoding the meaning in the target language, one translates from the source language into an IL and then from the IL into the target language (http://en.wikipedia.org/wiki/Distributed_Language_Translation; “Esperanto as an
  • 8.
    interlingua”; http://en.wikipedia.org/wiki/Esperanto). This novel approach was attempted, with inconclusive results, by a Dutch company in the 1980s using the universal Esperanto language. Interlingua is another international auxiliary language that can be used as an IL (http://en.wikipedia.org/wiki/Comparison_between_Esperanto_and_Interlingua; “Esperanto vs. Interlingua” forum; “Interlingua vs. Esperanto” forum). The Dutch project that used Esperanto in this way was called Distributed Language Translation, or DLT (“Esperanto as an interlingua.”).
    The ideal Machine Translation model can be represented graphically as:
    [Figure: the ideal machine translation model]
    Unfortunately, the current best-performing machine translation systems are crawling at the bottom and are represented by the following model (Knight, Koehn, pg. 10):
    [Figure: Machine Translation, the model of current systems]
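The basic premise of the example-based approach described above, that a previously translated sentence can be reused, can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the tiny English-to-Danish translation memory, the function name translate_by_example, and the similarity cutoff are all hypothetical and chosen for illustration. A real EBMT system would also extract and recombine fragments of sentences, which this sketch does not attempt.

```python
import difflib

# Hypothetical toy translation memory (English -> Danish), for illustration only.
EXAMPLES = {
    "good morning": "godmorgen",
    "thank you very much": "mange tak",
    "where is the train station": "hvor er togstationen",
}


def translate_by_example(sentence, examples=EXAMPLES, cutoff=0.6):
    """Return the stored translation of the most similar known sentence, or None."""
    # Normalize case and strip surrounding spaces and end punctuation.
    key = sentence.lower().strip(" ?!.")
    # Find the single closest stored source sentence above the similarity cutoff.
    matches = difflib.get_close_matches(key, list(examples), n=1, cutoff=cutoff)
    return examples[matches[0]] if matches else None


if __name__ == "__main__":
    print(translate_by_example("Good morning!"))
    print(translate_by_example("Where is the train station?"))
    print(translate_by_example("xyzzy"))
```

The cutoff decides how different a new sentence may be from a stored one before the system declines to answer; setting it lower returns more translations but also more wrong ones.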
  • 9.
    While there is no computational system that provides “the holy grail of fully automatic high-quality machine translation of unrestricted text,” there are many fully automated systems (see the Electronic Translators: Products and Services section) that produce very reasonable output (see the SYSTRAN and Google Translate sections). When the language domain is restricted and controlled, the quality of machine translation is also greatly improved (“The electronic interpreter”, 2000).
    Statistical Machine Translation - Theory
    Statistical Machine Translation is based primarily on information theory and on probability distributions related by Bayes’ Theorem (Brown, Pietra, Pietra and Mercer, pg. 264):
    Pr(e|f) = Pr(e) Pr(f|e) / Pr(f)
    Here e represents a string of English words, which can be translated into a string of French words, f, in various ways. The notation (e,f) represents the pair formed by an English word string and one of its French equivalents. Pr(f|e) represents the probability that a translator will produce f when presented with e. Pr(e) is called the language model, and Pr(f|e) is called the translation model. The goal of the translation system is to find the string e that the speaker had in mind when producing f. To minimize the chance of error one has to choose the specific English string, “e with hat”, for which Pr(e|f) is the greatest. Because the denominator Pr(f) does not depend on e, this amounts to maximizing the numerator of Bayes’ Theorem, Pr(e) Pr(f|e), which is the fundamental equation of machine translation (Chen, page 1; Knight, Koehn page 5).
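The decision rule just described, choosing the English string e for which Pr(e) multiplied by Pr(f|e) is greatest, can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the French input, the three candidate strings, and all probability values are invented, and the function name decode is hypothetical. A real system estimates these probabilities from large bodies of translated text and searches over far more candidates.

```python
# Candidate English strings e for the French input f = "le chat noir".
# All numbers below are invented for illustration only.
LANGUAGE_MODEL = {        # Pr(e): how plausible each string is as English
    "the black cat": 0.7,
    "the cat black": 0.2,
    "black the cat": 0.1,
}
TRANSLATION_MODEL = {     # Pr(f | e): how likely f is as a translation of e
    "the black cat": 0.4,
    "the cat black": 0.5,
    "black the cat": 0.1,
}


def decode(candidates, language_model, translation_model):
    """Return the candidate e that maximizes Pr(e) * Pr(f | e)."""
    return max(candidates, key=lambda e: language_model[e] * translation_model[e])


if __name__ == "__main__":
    best = decode(list(LANGUAGE_MODEL), LANGUAGE_MODEL, TRANSLATION_MODEL)
    print(best)  # prints "the black cat"
```

In this toy setting the translation model alone would prefer the word-for-word rendering “the cat black”, but the language model penalizes it as poor English, so the product favors “the black cat”. This interplay is the reason the two models are kept separate.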
  • 10.
    Electronic Translators - Products and Services
    One finds a multitude of electronic translators and online translation tools and programs. One way to categorize these products is by their roles and capabilities. One category comprises “static” translation tools, such as online dictionaries and electronic translator gadgets. The other category consists of what the writer calls “dynamic” translation products, because they use Example-Based Machine Translation or Statistical Machine Translation models. It is important to note that as translation tools and gadgets get more sophisticated, the line between “static” and “dynamic” translators becomes fuzzier.
    In the American market, the Franklin Electronic Publishers dictionary and translator is a classic example of a static translation product. These gadgets, which provide electronic versions of dictionaries, also incorporate voice translation and are becoming more prevalent in every national market.
    In the Danish market, Ordbogen.com is a prime example of a “static” online dictionary. Ordbogen.com is Denmark’s largest online dictionary company with over 30 products, some of them in specialized fields such as law, medicine, or music. While Ordbogen.com is primarily a static online translation tool, it also incorporates some dynamic aspects: continuous updates, the daily addition of new words, and responses to customers’ unsuccessful search queries. Ordbogen.com also offers machine translation tools and traditional personal translation services to private enterprises, individual consumers and the educational field (Peter Sepstrup, Product Manager Ordbogen.com, personal communication, 7 Oct. 2011). Since its inception 10 years ago Ordbogen.com has proven to be a successful, cutting-edge company, and in recent years it has expanded and entered the global market with Lemma.com.
  • 11.
    A list of other Danish online dictionaries and translators is available in Appendix A. Now let’s look at some successful implementations of Machine Translation.
    SYSTRAN in a nutshell
    The oldest and most significant machine translation company is SYSTRAN. Founded in 1968, SYSTRAN has been used by the US government and the EU. SYSTRAN provides the engine for Yahoo! Babel Fish translation, and was used by Google’s language translation tools until 2007. SYSTRAN is also used by the Dashboard Translation widget in the latest Mac OS X version. “Commercial versions of SYSTRAN can run on Microsoft Windows (including Windows Mobile), Linux, and Solaris. Historically, SYSTRAN systems used Rule-Based Machine Translation (RBMT) technology. With the release of Systran Server 7 in 2010, SYSTRAN implemented a hybrid Rule-Based/Statistical Machine Translation (SMT) technology which was the first of its kind in the marketplace” (http://en.wikipedia.org/wiki/SYSTRAN).
    Google Translate
    In recent years Google Translate has become a popular online translation tool. Instead of using SYSTRAN or any traditional rule-based method, Google cast its lot with statistical analysis and a fresh approach. As the “How Google Translate Works” video explains, Google has undertaken the task of using already available translated books, UN documents (which are published in the UN’s six official languages) and foreign websites in order to build a large linguistic database of comparable and equivalent phrases and words. Thus the computer looks for “patterns between the translation and the original text that are unlikely to occur by chance. Once the computer finds a pattern, it can use this pattern to translate similar texts in the future. When you repeat this process billions of times you end up with billions of patterns and one very
  • 12.
    smart computer program” (http://googlesystem.blogspot.com/2010/08/how-google-translate-works.html). It is important to note that Google’s translation quality can vary from language to language, and that this quality depends on the number of translated documents available for each particular language. Another pattern seen with this translation model is that technical or diplomatic words and phrases are more likely to appear in the translation database than street language, idioms, or colloquialisms. An example of this discrepancy is the mistranslation from English to Danish of the adage “Haste makes waste.” The Google Translate engine result is “Hastværk gør affald” (literally, “haste makes garbage”). Similarly, taking the equivalent Danish phrase “Hastværk er lastværk” and translating it with Google Translate, one gets the confusing result “Haste, less speed.” A competitor of Google Translate to watch in the future is Science Applications International Corporation (SAIC Language Services; http://www.saic.com/natsec/language-services.html).
    Product Comparison
    The plethora of translation tools and applications, both online and for the mobile market, makes a thorough product comparison impossible. The features that distinguish one product from another are important to consumers and help to “push the technological envelope”. For the “static” translators, important features include pronunciation, example sentences showing a term in use, multiple language support and ease of use (user-friendliness). For the “dynamic” translators, the additional features of voice recognition software integration, multi-touch, display technology, voice-to-text translation, and e-mail/chat transfer support are what make some products stand out over others. A list of online and mobile-device translators is available in Appendix B.
  • 13.
    Rules and Regulations, Terms and Conditions
    The translation products available on the market are either license-based/copyright-protected or freeware/open-source. The machine translation algorithm models are public knowledge, while the special tweaks that each individual company makes remain industry secrets. A product covered by copyright, a trademark, a purchase license, or an open-source license cannot be used outside the terms and conditions that its owner has set. Most electronic products, including freeware, have restricted usage rights. One such restriction is the requirement to give due credit to the company that created or owns the product when referencing it.
    The Present Future
    "The Internet is rapidly evolving from a collection of stationary devices to a fluid network of mobile devices" (“Accessing the WAN – CCNA Exploration,” section 7.3.1.3). Voice recognition software has reached maturity and can be easily integrated into applications such as language translators. One can easily imagine a world where Siri, Apple’s recent speech-recognition "personal assistant", will provide language translation and be used in an international corporate setting. As more e-books and cross-cultural documents become available online, and as more people interact digitally across the world, Google Translate and SYSTRAN products will become more accurate and more robust.
    Technology has proven to be a major aid in the difficult task of language translation. Statistical machine translation has also recently been employed to solve centuries-old puzzles and mysteries, like the Copiale Cipher (Mark Brown, 2011). These are exciting times to live in, times in which the future does not seem so distant.
  • 14.
    Conclusion
    Isaac Asimov stated that “the brain is the next frontier in science” and is “more complex than the universe” (Asimov, pg. 12, 45). The human brain is the creator of languages and probably the best machine for handling all the complex and intricate nuances of a language. While machine translation systems are becoming very sophisticated in handling grammar, syntax, lexicon, semantics and pragmatics, human languages are always in a state of flux. Changing fads, shifting cultural attitudes, morphing philosophies, current world events and colloquialisms all keep human languages in a perpetual state of change. In the future, human translators and translation services will most likely never be out of work or face a dying career; rather, technology will enhance their work as human languages evolve. The symbiotic work between humans and machines will definitely result in more “traduttore” and less “traditore.”
  • 15.
    References:
    “Accessing the WAN – CCNA Exploration” version 4. Cisco Network Academy. Web. 25 Sep. 2011. URL: http://slpl.cse.nsysu.edu.tw/cpchen/publication/aclclp_mt.pdf
    Asimov, Isaac. The Human Brain: Its Capacities and Functions. 1994. Signet. ISBN 978-0451628671, 298 pages.
    Brown, Mark. “Modern Algorithms Crack 18th Century Secret Code.” Wired UK. 26 Oct. 2011. Web. 26 Oct. 2011. URL: http://www.wired.com/wiredscience/2011/10/copiale-cipher-crack/
    Brown, Ralf. “Example-Based Machine Translation.” Nov. 2004. Web. 6 Oct. 2011. URL: http://www.cs.cmu.edu/~ralf/ebmt/intro.html
    Brown, Peter F., Stephen A. Della Pietra, Vincent J. Della Pietra and Robert L. Mercer. “The Mathematics of Statistical Machine Translation: Parameter Estimation.” Computational Linguistics, Vol. 19, No. 2, June 1993, p. 264. Web. 14 Oct. 2011. URL: http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf
    Chen, Chia-Ping. “Machine Translation: A Score Years Ago.” ACLCLP Newsletter, 21(4). October 2010. Web. 14 Oct. 2011. URL: http://slpl.cse.nsysu.edu.tw/cpchen/publication/aclclp_mt.pdf
    “Electronic interpreter, The.” The Guardian, 27 January 2000. Web. 7 Oct. 2011. URL: http://www.guardian.co.uk/technology/2000/jan/27/onlinesupplement5/print
    “Esperanto as an interlingua.” Examples of Machine Translation. May 2000. San Diego State University. Web. 7 Oct. 2011. URL: http://www-rohan.sdsu.edu/~ling354/MT-eg.html#Esperanto_as_an_interlingua
    “Esperanto vs. Interlingua” forum. How-To-Learn-Any-Language.com. Jan 2010. Web. 7 Oct. 2011. URL: http://how-to-learn-any-language.com/forum/forum_posts.asp?TID=18712&PN=1
    Fromkin, Victoria, Bruce Hayes, Susan Curtiss, Anna Szabolcsi, Tim Stowell and Donca Steriade (2000). Linguistics: An Introduction to Linguistic Theory. Oxford: Blackwell. p. 3. ISBN 0631197117.
    Hausser, Roland. “Overcoming Language Barriers by Means of Computers.” Web. 6 Oct. 2011. URL: http://www.linguistik.uni-erlangen.de/~rrh/papers/sejong.pdf
    Hodson, Steven. “Traduttore, traditore”, WinExtra, November 16, 2009. Web: Multi_flag_thumb.png image. 6 Oct. 2011. URL: http://www.winextra.com/archives/traduttore-traditore/
    Hutchins, John. “The first public demonstration of machine translation: the Georgetown-IBM system, 7th January 1954.” March 2006. Web. 5 Oct. 2011. URL: http://www.hutchinsweb.me.uk/GU-IBM-2005.pdf
    Hutchins, John. “The Georgetown-IBM experiment demonstrated in January 1954.” AMTA conference in September 2004. Web. 5 Oct. 2011. URL: http://www.hutchinsweb.me.uk/AMTA-2004-ppt.pdf
    “Interlingua vs. Esperanto” forum. AntiMoon.com – How to learn English effectively. Aug 2008. Web. 7 Oct. 2011. URL: http://www.antimoon.com/forum/t11408.htm
  • 16.
    Knight, Kevin and Philipp Koehn. “What’s New in Statistical Machine Translation.” Tutorial at HLT/NAACL. 2004. Web. 7 Oct. 2011. URL: http://people.csail.mit.edu/koehn/publications/tutorial2003.pdf
    Koehn, Philipp. “Statistical Machine Translation” front page. Web. 7 Oct. 2011. URL: http://www.statmt.org/
    Niemetz, Anne. “A Universal Language: The Myth, Search and Experiments”, March 2003. Web. 6 Oct. 2011. URL: http://users.design.ucla.edu/~aniemetz/utm/index.html
    “Rosetta Stone.” Encyclopædia Britannica. Encyclopædia Britannica Online. Encyclopædia Britannica Inc., 2011. Web. 6 Oct. 2011. URL: http://www.britannica.com/EBchecked/topic/509988/Rosetta-Stone
    “Rosetta Stone - Egypt, Ptolemaic Period, 196 BC, The.” The British Museum. Web. 6 Oct. 2011. URL: http://www.britishmuseum.org/explore/highlights/highlight_objects/aes/t/the_rosetta_stone.aspx
    Wooden, Cindy. “Mass translations are a challenge in every language, officials says”. Catholic News Services. 9th August, 2011. Web. 6 Oct. 2011. URL: http://www.catholicnews.com/data/stories/cns/1103160.htm
    “701 Translator – IBM press release.” January 8, 1954. Web. 5 Oct. 2011. URL: http://www-03.ibm.com/ibm/history/exhibits/701/701_translator.html
  • 17.
    APPENDIX A
    Online dictionaries and language translators in Denmark:
    • Ordbogen.com http://www.ordbogen.com/
    • Gyldendals Røde Ordbøger http://ordbog.gyldendal.dk/
    • Den Danske Ordbog – moderne dansk sprog http://ordnet.dk/ddo
    • Den Danske Online Ordbog http://www.ddoo.dk/
    • Dansk Parlør http://www.parlor.dk/
    • Kryds & Tværs http://kryds.onlineordbog.dk/
    • Rimordbog http://www.rimordbog.dk/
    • Engelsk Ordbog http://www.onlineordbog.dk/wordnet/da/
    • Danish Word http://www.danishword.com/
    • Dialekt.dk – Københavns Universitet http://dialekt.ku.dk/
    APPENDIX B
    Online Translators:
    • SYSTRAN (FREE) http://www.systran.co.uk/
    • Yahoo! Babel Fish http://babelfish.yahoo.com/
    • Google Translate http://translate.google.com/
    • WorldLingo http://www.worldlingo.com/en/products/
    • SDL FreeTranslation http://www.freetranslation.com/
    iPhone/Android Apps:
    • Google Translate from App Store or Android Market http://www.businessinsider.com/best-translation-apps-2011-3#google-translate-translates-your-words-as-you-speak-into-your-devices-mic-1
    • SAIC Linguistics – Google Translate competitor http://www.informationweek.com/thebrainyard/news/workgrouping_team_collaboration_workspaces/231602149/saic-takes-on-google-with-speech-translation-apps , http://www.saic.com/natsec/language-services.html
    • iSpeak app from Acapela Group: http://www.acapela-group.com/translator-for-iphone-acapela-speech-empowers-ispeak-advanced-tool-designed-by-future-apps-in-9-languages--2040-speech-synthesis.html
    • Vocre app from myLanguage http://www.reuters.com/article/2011/09/26/uk-app-language-idUSLNE78P02520110926 , http://www.gottabemobile.com/2011/09/16/vocre-translation-app/ , http://news.yourolivebranch.org/2011/09/30/app-helps-travelers-speak-in-foreign-languages/
  • 18.
    • SmartTrans http://news.yourolivebranch.org/2011/09/30/app-helps-travelers-speak-in-foreign-languages/
    • Jibbigo app http://www.jibbigo.com/website/index.php , http://news.yourolivebranch.org/2011/09/30/app-helps-travelers-speak-in-foreign-languages/
    • Ultralingua app http://www.ultralingua.com/products
    • iLinguist app from Apps Dev Team http://www.appsdevteam.com/
    • Linguo app from Edovia Inc. from iTunes Store http://www.brighthub.com/mobile/iphone/articles/86375.aspx
    • Interpret from iTunes Store http://www.brighthub.com/mobile/iphone/articles/86375.aspx
    • Tap-Translate or Tap-Dictionary for Android http://www.businessinsider.com/best-translation-apps-2011-3#tap-translate-lets-you-translate-as-you-read-in-safari-5 , http://iostranslate.appspot.com/tap-translate-support.html
    • Odyssey Translator from App Store http://www.businessinsider.com/best-translation-apps-2011-3#odyssey-translator-pro-lets-you-build-sentences-youd-actually-use-in-real-life-8
    • Word Lens from App Store http://www.businessinsider.com/best-translation-apps-2011-3#word-lens-is-a-great-bet-for-translating-signs-and-other-text-you-stumble-upon-9
    • SayHi Translate from Nuance Communications Inc. http://www.marketwatch.com/story/nuance-gives-sayhi-translate-app-a-voice-2011-09-15
    • ABBYY TextGrabber + Translator http://www.ipadnewstracker.com/mobile/2011/09/abbyy-textgrabber-translator-for-the-iphone-just-grab-and-translate/
    • Voice-to-text translation application (upcoming) from Ortsbo http://www.mobilemag.com/2011/09/21/ortsbo-to-release-voice-to-text-translation-and-crowd-sourcing-wiki/
    • Translating an app into different foreign languages and aiding with localization, from Tethras http://www.guardian.co.uk/technology/appsblog/2011/sep/29/tethras-translation-localisation-apps?newsfeed=true
    Extra Resources:
    Statistical Machine Translation Theory (extra links):
    • http://en.wikipedia.org/wiki/Statistical_machine_translation
    • http://en.wikipedia.org/wiki/Language_model
    • What's new in Statistical Machine Translation?
    • EMNLP 2011 Sixth Workshop on Statistical Machine Translation
    • Papers presented
    SYSTRAN in a nutshell (extra links):
    • http://en.wikipedia.org/wiki/SYSTRAN
    • Online Systran Free translation http://www.systran.co.uk/
    • Guardian article in 2000 about SysTran ... http://www.guardian.co.uk/technology/2000/jan/27/onlinesupplement5