Translation

Presentation on translation and crowdsourcing at University of Maryland on 6/11/2010.

  1. 1. Translation and Cloud Computing!!
        Martin Kay, Stanford University and The University of the Saarland
        What is Translation?
        A text that is based on a text in another language and which
        —has the same meaning
        —conveys the same information
        —has the same effect on its readers
        —gives the gist of the original
        —explains the original
        • It depends what you want
        • Generally a mixture of several of these
        Vauquois Triangle
        [Figure: triangle with levels, bottom to top: Phonology / Orthography, Morphology, Syntax, Semantics, with Interlingua at the apex. Labels: Analysis, Transfer, Abstraction, Reduce dimensions.]
  2. 2. Vauquois Triangle
        [Figure: triangle with levels Semantics, Syntax, Morphology, Phonology / Orthography.]
        loves = love+pl, love+3sg
        children = child+pl
        tying = tie+ing
        loved = love + ed
        Brutus killed Caesar => Caesar is dead
        The dog chased the cat = the cat was chased by the dog
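The analyses on slide 2 (loves = love+pl, love+3sg; loved = love + ed; tying = tie+ing; children = child+pl) can be illustrated with a small sketch. This is not the author's analyzer: the stem list, the irregular-form table, and the three spelling rules are assumptions chosen only to reproduce the slide's examples.

```python
# Toy lexicon; both tables are assumptions made for illustration.
STEMS = {"love", "tie", "child"}
IRREGULAR = {"children": ["child+pl"]}


def analyze(word):
    """Return every stem+tag analysis this toy analyzer finds for a surface form."""
    if word in IRREGULAR:
        return list(IRREGULAR[word])
    analyses = []
    # A final -s is ambiguous: plural noun or third-person-singular verb.
    if word.endswith("s") and word[:-1] in STEMS:
        analyses += [f"{word[:-1]}+pl", f"{word[:-1]}+3sg"]
    # Spelling rule: a stem ending in e takes only d (love + ed -> loved).
    if word.endswith("d") and word[:-1] in STEMS and word[:-1].endswith("e"):
        analyses.append(f"{word[:-1]}+ed")
    # Spelling rule: final ie becomes y before ing (tie + ing -> tying).
    if word.endswith("ying") and word[:-4] + "ie" in STEMS:
        analyses.append(f"{word[:-4]}ie+ing")
    return analyses


for form in ["loves", "loved", "tying", "children"]:
    print(form, "=", ", ".join(analyze(form)))
```

The point of the sketch is that one surface form can have several analyses ("loves"), and that spelling rules sit between the stem+suffix level and the written form.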
  3. 3. Statistical MT
        [Figure: Vauquois Triangle with Analysis and Transfer arrows across the levels Semantics, Syntax, Morphology, Phonology / Orthography.]
        Pragmatic MT
        I left nothing on the train = I had all my belongings when I got off the train
        Semantic Processing
  4. 4. Semantic Processing
        Replace words and phrases by words and phrases that mean the same
        Reorder
        • Translation in Language teaching
        • St. Jerome: Vulgate
        • Popular perception of translation
        • 20th Century
        • Machine translation
        • Translator is anonymous
        Versichern Sie sich daß Sie nichts in dem Zug vergessen haben
        Assurez-vous que vous n'avez rien oublié dans le train
  5. 5. Versichern Sie sich daß Sie nichts in dem Zug vergessen haben
        Assurez-vous que vous n'avez rien oublié dans le train
  6. 6. Versichern Sie sich daß Sie nichts in dem Zug vergessen haben
        Assurez-vous que vous n'avez rien oublié dans le train
  7. 7. Versichern Sie sich daß Sie nichts in dem Zug vergessen haben
        Assurez-vous que vous n'avez rien oublié dans le train
        Make sure that you have forgotten nothing on the train
        I forgot to call you on the train
        I forgot my password on the train
        Make sure that you have left nothing on the train
        I left the trash on the train
        The stuff on the car is still at home, but I left nothing on the train
        We all know the Lost and Found scenario and that this is what they are talking about
        Make sure that you take all your belongings with you
        I didn't bring all my belongings with me!
        I left your briefcase, but none of my belongings.
        Pragmatic Semantic Translation
  8. 8. Situated Language
        Semantic Processing
        Replace words and phrases by words and phrases that mean the same
        Reorder
        Pragmatic Processing
        Replace words and phrases by words and phrases that will effect the same changes in the hearer's head
        • Mr President, I respond to an invitation yesterday afternoon by the President of the House to speak on behalf of my group on a matter referred to in the Minutes.
        • Monsieur le Président, je réponds à une invitation lancée hier par la Présidente qui m'a demandé de prendre la parole au nom de mon groupe sur un sujet mentionné dans le procès-verbal.
  9. 9. • Mr President, I respond to an invitation yesterday afternoon by the President of the House to speak on behalf of my group on a matter referred to in the Minutes.
        • Monsieur le Président, je réponds à une invitation lancée hier par la Présidente qui m'a demandé de prendre la parole au nom de mon groupe sur un sujet mentionné dans le procès-verbal.
        • I refer to item 11 on the order of business.
        • Je veux parler du point 11 de l'ordre des travaux.
        • Je souhaite exprimer ce point de vue, même si je désapprouve la proposition du président du groupe des socialistes, tout en la respectant, et même si j'ai voté contre.
        • I wish to express that view even if I respectfully disagreed and voted against the proposal of the President of the Socialist Group.
        • I want to express that view, even if I disapprove of the proposal of the president of the socialist group, while I respect it, and even though I voted against it
  10. 10. • J'apprécie fortement cette attitude.
        • That is something for which I have a deep appreciation.
        • Pragmatic translation often involves addition or removal of substantive information from what is explicitly expressed.
        You cannot avoid it
        Two no trumps, short stop, goal keeper, end run
        Happy hour, a hair of the dog
        Alimony, juge d'instruction
        value-added tax, home owner's policy
        nut, hot tea, café/espresso
        on n-th floor, n pièces
        2-piece, 2-seater, deux roues, 6-pack
        Second reading. Do I have a second?
        I usually go to work in the bus!
  11. 11. Does this train go to Perpignan? No, it stops in Béziers.
        Est-ce que ce train va à Perpignan?
        Fährt dieser Zug nach Perpignan? Nein, er hält/endet in Béziers
        Est-ce que c’est ta cousine? Non, je n’ai pas de cousine.
        Is that your ^cousin? No. I don’t have a ^cousin. (inserted at ^: female; alternative: girl)
        Is that woman your cousin?
  12. 12. Facts about translation … are not all reflected in emergent properties of translations
        People see everything
        —in a context (Situated)
        —from a point of view (Embodied)
        Does this train go to Endville?
        Est-ce que c’est ta cousine?
        I just got back from Texas/Utah. I had forgotten how good beer tastes.
        Ich hatte vergessen, wie gut[es] Bier schmeckt.
        It may be necessary to reduce condenser steam side pressure
        pression latérale de la vapeur
        pression côté vapeur
        [Figure: chart with axes Source Difficulty and Target Quality. Plotted labels: Google, Shakespeare, Coca Cola, Boeing, Caterpillar, Haiti, Weather reports. Region label: The Human Zone.]
  13. 13. [Figure: chart with axes Source Difficulty and Target Quality. Plotted labels: Google, Shakespeare, Coca Cola, Boeing, Caterpillar, Haiti, Weather reports. Region label: The Amateur Zone.]
        • Literary translators are usually amateurs.
        • Legal translators are professionals.
        • Scientific and technical translators of material to be disseminated are professionals.
        • Scientific and technical translators of internal material may be either.
        The human contribution
        • Post-editing
        • Pre-editing
        • Consulting
        • Triangulation and Reflective editing
        Fin
  14. 14. The Prevailing View
        Information about language
        —resides in records of it in use—in texts;
        —Has something important to do with the frequency of short sequences of words (n-grams);
        —occurs mostly in very weak dilution.
        ∴ Learning about language requires
        —computers, and
        —vast corpora.
        Translation is all about relations between words and phrases in two languages.
        The Traditional View
        Information about language
        —resides in people's heads;
        —Has something important to do with the recursive grammatical structure of sentences;
        —occurs mostly in quite strong dilution.
        ∴ Learning about language requires
        —computers, and
        —vast corpora.
        Translation is all about relations between words and phrases in two languages.
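The "frequency of short sequences of words (n-grams)" mentioned on slide 14 can be made concrete with a minimal counting sketch. The function name `ngram_counts` is a hypothetical helper, and the sample text simply reuses example sentences from the earlier slides, lowercased and split on whitespace.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens in the list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


text = ("make sure that you have left nothing on the train "
        "i left the trash on the train")
bigrams = ngram_counts(text.split(), 2)

# Print the bigrams that occur more than once in this tiny sample.
for gram, count in bigrams.items():
    if count > 1:
        print(" ".join(gram), count)
```

On a sample this small almost every bigram occurs once, which is the "very weak dilution" point in miniature: useful frequencies only appear with much more text.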
  15. 15. Dictionary-and-Rearrangement Translation
        Used exclusively in:
        —Translation in foreign-language teaching.
        —Machine translation.
        Preferred in
        —Religious translation
        —Much legal translation
        —Much post-1900 translation
        Because
        —It's easiest
        —Makes the translator anonymous
        Deprecated in
        —Scientific and technical translation
        —Translation of Belles Lettres
        Because
        —What the text refers to is more important than what the words mean.
        Dictionary-and-Rearrangement Translation
        • α' can translate α if α : α' is in the bilingual dictionary.
        • α'β' can translate αβ if α' can translate α and β' can translate β
        • β'α' can translate αβ if α' can translate α and β' can translate β
        • The right alternative is the one that looks best.
        Semantic Translation
        Understand and Re-express Translation
        • The ultimate basis of all translation
        • Je ne veux pas sans cesse remettre sur le tapis les problèmes de ce bâtiment mais cela constitue un sujet de préoccupation sérieux.
        • I do not want to ever put on the table the issue of this building but this is a serious concern. (G)
        • I do not want to drag up the issue of this building endlessly, but this is a serious problem.
        Pragmatic Translation
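Slide 15's dictionary-and-rearrangement rules can be sketched as a small recursive enumerator. The sketch assumes one reading of the rules: a dictionary entry translates a phrase, and the translations of two adjacent parts may be joined either in the same order or swapped. That reading, the toy French-English dictionary, and the bigram-based `looks_good` scorer (standing in for "the one that looks best") are all assumptions, not the author's system.

```python
from functools import lru_cache

# Hypothetical toy bilingual dictionary: source phrase -> target phrases.
DICTIONARY = {
    ("le",): [("the",)],
    ("chat",): [("cat",)],
    ("noir",): [("black",)],
}

# Hypothetical stand-in for "looks best": reward familiar target bigrams.
GOOD_BIGRAMS = {("the", "black"), ("black", "cat")}


@lru_cache(maxsize=None)
def candidates(source):
    """All target phrases that can translate `source` (a tuple of words)."""
    results = set(DICTIONARY.get(source, []))      # rule 1: dictionary lookup
    for i in range(1, len(source)):
        for left in candidates(source[:i]):
            for right in candidates(source[i:]):
                results.add(left + right)           # rule 2: same order
                results.add(right + left)           # rule 3: rearranged
    return frozenset(results)


def looks_good(target):
    """Score a candidate by how many of its bigrams are familiar."""
    return sum(pair in GOOD_BIGRAMS for pair in zip(target, target[1:]))


def translate(source):
    """Pick the best-scoring candidate; raises ValueError if none exists."""
    return max(candidates(source), key=looks_good)


print(" ".join(translate(("le", "chat", "noir"))))
```

Even three words produce six candidate orderings here, so all the real work falls on the scoring step, which is why the slide's last rule matters more than the first three.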
  16. 16. • Toutefois, nous discutons également aujourd'hui du rapport sur les aides d'État et du rapport général sur la politique de concurrence pour 1998, ma contribution à cette discussion commune touchant au dernier rapport.
        • However, we are also discussing the aid report today and the general competition report for 1998, and my contribution to this joint debate relates to the latter.
        • However, we also discussed today the report on state aid and the general report on competition policy for 1998, my contribution to this joint debate concerning the final report.
        • Il s'agit de l'exigence de modernisation afin d'assurer l'avenir de la politique européenne de concurrence.
        • It is all about the need for modernisation and the future viability of the European competition policy.
        • Nous appelons le Conseil et la Conférence intergouvernementale à introduire la procédure de codécision en matière de droit de la concurrence.
        • Therefore, I would urge the Council and the Intergovernmental Conference to introduce the codecision procedure into legislation in this area
        • Les rudiments de l'économie nous apprennent que
        • At the end of the day, one of the fundamental tenets of economic theory is that
  17. 17. Cloud Computing
        When you need computing resources
        —In large amounts.
        —In bursts
        —With simple parallelization—MapReduce
        When you could use people
        —irregularly
        —of just the right kind
        Cloud Computing
        • Client
        —A way to rent out time on computers you are not using.
        —A way to get other people to work for you for free
        • Customer
        —A way to get huge bursts of computation done.
        • A core team of about 100 employees has been developing n.Fluent over the past four years. The software went live for internal IBM use in August 2008. Since then, about 3,000 employee volunteers have collectively contributed more than 36 million words to extend and improve it. To encourage further participation and raise awareness of the project, IBM held its first crowdsourcing event last summer.
        • No specific plans have been announced for a commercial product or service. "For right now, we're focusing on building and perfecting the tool," said Ari Fishkind, an IBM spokesman.
        • Salim Roukos, computer science researcher at IBM's T.J. Watson Labs, believes this sort of technology could play a big role in localizing global operations. Companies currently spend about $13 billion a year to translate documentation, which is all done using human labor. With n.Fluent, companies could automate the first translation and then let humans focus on correcting any mistakes.
        CIA Google
  18. 18. • Philip Resnik, associate professor of computer science at the University of Maryland, has been researching crowdsourcing machine-translation techniques. Resnik said statistical techniques made a revolutionary leap by turning a labor-intensive, expert-driven development process into a machine-learning problem.
        • To address this gap, Resnik is working with Ben Bederson, associate professor of computer science at the University of Maryland, to develop a framework for human-machine interaction that pairs volunteers, one of whom knows only the source language and the other, only the target language.
        MapReduce
        • It is clear that the needs of machine translation researchers have outgrown the capabilities of individual computers.
        • The services themselves have long been referred to as Software as a Service (SaaS)
        • Like many other NLP problems, output quality of statistical machine translation (SMT) systems increases with the amount of training data.
        • Last month, for example, it said it was working to combine its translation tool with image analysis, allowing a person to, say, take a cellphone photo of a menu in German and get an instant English translation.
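The MapReduce pattern named on slides 17 and 18 can be illustrated on a single machine. The sketch below counts how often a source word and a target word appear in the same sentence pair, a typical raw statistic for statistical MT; the function names and the two-pair corpus are hypothetical, and a real job would run the map and reduce steps on many machines rather than in one loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(sentence_pair):
    """Emit ((source_word, target_word), 1) for every co-occurring word pair."""
    source, target = sentence_pair
    for s in source.split():
        for t in target.split():
            yield (s, t), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def cooccurrence_counts(corpus):
    mapped = chain.from_iterable(map_phase(pair) for pair in corpus)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical two-pair parallel corpus.
corpus = [("le chat", "the cat"), ("le chien", "the dog")]
counts = cooccurrence_counts(corpus)
print(counts[("le", "the")], counts[("chat", "cat")])
```

Because each sentence pair is mapped independently and each key is reduced independently, both steps parallelize without coordination, which is the "simple parallelization" the slide refers to.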
  19. 19. Translation Models
        • Classroom model
        • Find lexical pairs (bilingual dictionary)
        • Do morphology
        • Reorder
        • Makes translator anonymous, less accountable
        • Good for
        —religious translation
        —legal text?
        The delphi method
        • The name "Delphi" derives from the Oracle of Delphi. The authors of the method were not happy with this name, because it implies "something oracular, something smacking a little of the occult"
        • The opening anecdote relates Francis Galton's surprise that the crowd at a county fair accurately guessed the weight of an ox when their individual guesses were averaged (the average was closer to the ox's true butchered weight than the estimates of most crowd members, and also closer than any of the separate estimates made by cattle experts).
        Pragmatic Translation
        • Assurez-vous que vous n'avez rien oublié dans le train
        • Versichern Sie sich daß Sie nichts in dem Zug vergessen haben
        • Make sure you have not forgotten the train.
        —Google translate 5/29/10 5.55pm CET.
        • English
        —Make sure that you have not forgotten (left) anything on the train.
        —Make sure that you take all your belongings with you.
        • Coordination of behavior includes optimizing the utilization of a popular bar and not colliding in moving traffic flows. The book is replete with examples from experimental economics, but this section relies more on naturally occurring experiments such as pedestrians optimizing the pavement flow or the extent of crowding in popular restaurants. He examines how common understanding within a culture allows remarkably accurate judgments about specific reactions of other members of the culture.
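The averaging effect in the Galton anecdote on slide 19 can be shown with a few lines of arithmetic. The true weight and the six guesses below are invented for illustration and are not Galton's data, so the sketch demonstrates the mechanism (individual errors in opposite directions cancel in the mean) rather than any empirical result.

```python
from statistics import mean

# Invented numbers for illustration only; not Galton's data.
TRUE_WEIGHT = 1200
GUESSES = [1050, 1320, 1180, 1260, 1110, 1290]


def crowd_error(guesses, truth):
    """Absolute error of the averaged guess."""
    return abs(mean(guesses) - truth)


def share_beaten_by_crowd(guesses, truth):
    """Fraction of individuals whose own error exceeds the crowd's error."""
    crowd = crowd_error(guesses, truth)
    return sum(abs(g - truth) > crowd for g in guesses) / len(guesses)


print(round(crowd_error(GUESSES, TRUE_WEIGHT), 2))
print(share_beaten_by_crowd(GUESSES, TRUE_WEIGHT))
```

The cancellation only works when the individual errors are not all biased the same way, which is one reason crowdsourced translation needs contributors with independent judgments.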
  20. 20. Translation Types
        • Outbound
        • Inbound
        • Indicative
        • Informative
        The European Union
        Danish, Dutch, English, Finnish, French, German, Greek, Italian, Portuguese, Spanish, Swedish
        Czech, Estonian, Hungarian, Lithuanian, Latvian, Maltese, Polish, Slovene, Slovak
        11 languages
        2,500 (12.5%) of 20,000 staff
        1% of the annual budget
        40% of administration costs.
        The perception
        Linguistics has failed technology
        Linguistics is not about communication
        It focuses on fringe phenomena
        It is not robust
        It luxuriates in ambiguities but is not interested in resolving them
        It never gets beyond the sentence
  21. 21. Generative Linguistics
        The generative vein in linguistics has run out because most problems have
        —been solved
        —turned out to belong to a wider domain
        A new paradigm is required based on acknowledging that language is about communication
        Translation is about communication
        In many ways Linguistics has been a phenomenal success
        Where problems have competing solutions, fringe phenomena can decide the issue
        The strength and flexibility of language comes in large measure from its openness to ambiguity. Resolving ambiguity is not a linguistic enterprise
        Much of what there is to study about language is within the sentence
        Lexical semantics
        Unfortunately we have …
        Triangulation
        Early binding
        Zipf’s law
        Locality
        Emergent Properties
        AI
        Bleu score
  22. 22. [Figure: chart with axes Source Difficulty (Easy to Hard) and Target Quality (Low to High). Plotted labels: Google, Shakespeare, Belles Lettres, Advertising, Scientific Papers, Manuals, Weather reports. Annotations: Assimilation / Dissemination; Indicative / Informative; "There is a lot of stuff in this corner".]
