L1 l2 l3 introduction to machine translation


Published in: Technology, Education

  1. 1. Drop me a mail: Visit me at: http://rushdishams.googlepages.com Rushdi Shams, Lecturer, Dept of CSE, KUET, Bangladesh
  2. 2. O OO O V
  3. 3. • Peter mentioned the book I sent to Mary
  4. 4. • We will give medicines to pregnant women and children
  5. 5. • I saw the boy with the telescope
  6. 6. • The painter put on another coat
  7. 7. • We like flying planes
  8. 8. • The judge threw the book at him
  9. 9. • Visiting relatives can be tiresome
  10. 10. • Da Vinci liked to paint his models nude.
  11. 11. • He wrote the note yesterday
  12. 12. • You mean you carried the information by a bus?
  13. 13. • Connecting wires are tiring in DLD lab
  14. 14. • Squad helps dog bite victim
  15. 15. Why use computers in translation? • Too much translation for humans • Technical materials too boring for humans • Greater consistency required • Need results more quickly • Not everything needs to be top quality • Reduce costs • any one of these may justify machine translation or computer aids
  16. 16. Components of a Language • There are three components of a language: 1. Lexicon 2. Categorization 3. Grammar Rules
  17. 17. Lexicon
      stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | …
      is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | …
      right | left | east | south | back | smelly | …
      here | there | nearby | ahead | right | left | east | south | back | …
      me | you | I | it | S=HE | Y’ALL | …
      John | Mary | Boston | UCB | PAJC | …
      the | a | an | …
      to | in | on | near | …
      and | or | but | …
      0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
  18. 18. Categorization
      Noun >> stench | breeze | glitter | nothing | wumpus | pit | pits | gold | east | …
      Verb >> is | see | smell | shoot | feel | stinks | go | grab | carry | kill | turn | …
      Adjective >> right | left | east | south | back | smelly | …
      Adverb >> here | there | nearby | ahead | right | left | east | south | back | …
      Pronoun >> me | you | I | it | S=HE | Y’ALL | …
      Name >> John | Mary | Boston | UCB | PAJC | …
      Article >> the | a | an | …
      Preposition >> to | in | on | near | …
      Conjunction >> and | or | but | …
      Digit >> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
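The categorized lexicon above can be sketched as a small lookup table. This is only an illustration in Python: the names `LEXICON` and `categories_of` are made up for this sketch, and the Pronoun list is abridged. It shows that one word (here `east`) can sit in several categories at once.

```python
# Toy lexicon from the slides, grouped by category.
# LEXICON and categories_of are hypothetical names; the Pronoun list is abridged.
LEXICON = {
    "Noun": ["stench", "breeze", "glitter", "nothing", "wumpus",
             "pit", "pits", "gold", "east"],
    "Verb": ["is", "see", "smell", "shoot", "feel", "stinks",
             "go", "grab", "carry", "kill", "turn"],
    "Adjective": ["right", "left", "east", "south", "back", "smelly"],
    "Adverb": ["here", "there", "nearby", "ahead", "right", "left",
               "east", "south", "back"],
    "Pronoun": ["me", "you", "I", "it"],
    "Name": ["John", "Mary", "Boston", "UCB", "PAJC"],
    "Article": ["the", "a", "an"],
    "Preposition": ["to", "in", "on", "near"],
    "Conjunction": ["and", "or", "but"],
    "Digit": list("0123456789"),
}


def categories_of(word):
    """Return every category that lists the word (a word may have several)."""
    return [category for category, words in LEXICON.items() if word in words]


print(categories_of("east"))    # ['Noun', 'Adjective', 'Adverb']
print(categories_of("wumpus"))  # ['Noun']
```

A word that returns more than one category is a first hint of the category ambiguity discussed later in the slides.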
  19. 19. Grammar Structure • Attending this lecture and the one following it carefully does not mean you will know all of the English language • That would require studying NLP as a subject for 4 years! ☺ • We will learn how to define the basic grammar structure for NLP systems • We will also learn what you need to keep in mind while devising such systems
  20. 20. Syntactic Tree • Humans recognize the organization of words according to their POS in a sentence with trees. • Do you deny it? • Well, you can, because you didn’t learn it this way in your childhood. • No one did! • But it has been proved that our brain draws a tree-like structure when we first develop our language skills • That research is beyond this lecture
  21. 21. Syntactic Tree • So, if you really do that unintentionally, then why not learn it on pen and paper so that you can understand how you will teach machines to learn languages? • The tree structure that humans contemplate is called a syntactic tree
  22. 22. Parsing a Syntactic Tree • Parsing is the process of using grammar rules to determine whether a sentence is legal, and to obtain its syntactic structure • ‘The large cat eats the small rat’
  23. 23. Parsing • The large cat eats the small rat
  24. 24. Parsing • Article adjective noun | Verb | Article adjective noun • The large cat eats the small rat
  25. 25. Parsing • Article adjective noun | Verb | noun phrase (Article adjective noun) • The large cat eats the small rat
  26. 26. Parsing • Noun phrase (Article adjective noun) | verb phrase (Verb, noun phrase (Article adjective noun)) • The large cat eats the small rat
  27. 27. Parsing • sentence (Noun phrase (Article adjective noun), verb phrase (Verb, noun phrase (Article adjective noun))) • The large cat eats the small rat
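The bottom-up steps on the parsing slides can be sketched in a few lines of Python. This is a minimal sketch, not a general parser: the three grammar rules are inferred from the slides' tree, the names `WORD_TAGS`, `RULES` and `parse` are made up for illustration, and the greedy reduction order may differ slightly from the order in which the slides draw the phrases.

```python
# Word categories and grammar rules inferred from the slides' example.
# WORD_TAGS, RULES and parse are hypothetical names for this sketch.
WORD_TAGS = {"the": "Article", "large": "Adjective", "small": "Adjective",
             "cat": "Noun", "rat": "Noun", "eats": "Verb"}

RULES = [("NP", ["Article", "Adjective", "Noun"]),   # noun phrase
         ("VP", ["Verb", "NP"]),                     # verb phrase
         ("S", ["NP", "VP"])]                        # sentence


def parse(sentence):
    """Return the syntactic tree as nested tuples, or None if not legal."""
    # Step 1: label every word with its category.
    nodes = [(WORD_TAGS.get(w, "Unknown"), w) for w in sentence.lower().split()]
    # Step 2: keep replacing a run of nodes that matches a rule's right-hand
    # side with a single node labelled by the rule's left-hand side.
    reduced = True
    while reduced:
        reduced = False
        for lhs, rhs in RULES:
            for i in range(len(nodes) - len(rhs) + 1):
                window = nodes[i:i + len(rhs)]
                if [node[0] for node in window] == rhs:
                    nodes[i:i + len(rhs)] = [(lhs, *window)]
                    reduced = True
                    break
            if reduced:
                break
    # The sentence is legal only if everything collapsed into one S node.
    return nodes[0] if len(nodes) == 1 and nodes[0][0] == "S" else None


tree = parse("The large cat eats the small rat")
print(tree)
print(parse("cat the eats"))  # None
```

The function answers both questions from the definition of parsing: a `None` result means the sentence is not legal under these rules, and a tuple result is its syntactic structure.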
  28. 28. Syntactic Tree • The point where lines begin or end is called a node • Each node has a label like S, PP or chased • If 2 nodes are connected by a line, the upper node is the immediate dominator of the lower node. D is the immediate dominator of "the" • Upper nodes in a branch are called dominators. NP is the dominator of D, N, the, dog
  29. 29. Syntactic Tree • Two nodes are sisters if they are immediately dominated by the same node. D and N are sisters. • Their immediate dominator is called their mother. NP is the mother of D and N. Similarly, D and N are daughters of NP • Immediate dominators are also called parents
  30. 30. Syntactic Tree • Constituents are the terminal nodes that are all dominated by a single non‐terminal node. "Chased a cat into the garden" is a constituent, as its words are all dominated by VP
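The tree vocabulary above (dominator, mother, daughters, sisters, constituent) can be tried out on a small data structure. The sketch below assumes a tree shape for "the dog chased a cat into the garden", since the slide figure itself is not part of this transcript; `TREE` and the helper names are hypothetical.

```python
# Assumed tree for "the dog chased a cat into the garden".
# A node is (label, daughter, daughter, ...); a word is a plain string.
TREE = ("S",
        ("NP", ("D", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("D", "a"), ("N", "cat")),
               ("PP", ("P", "into"),
                      ("NP", ("D", "the"), ("N", "garden")))))


def label(node):
    """Label of a node; for a word, the word itself."""
    return node if isinstance(node, str) else node[0]


def daughters(node):
    """Nodes immediately dominated by this node (they are sisters)."""
    return [] if isinstance(node, str) else list(node[1:])


def leaves(node):
    """All terminal nodes (words) dominated by this node, left to right."""
    if isinstance(node, str):
        return [node]
    return [word for d in daughters(node) for word in leaves(d)]


def find(node, wanted):
    """First node (top-down, left to right) with the wanted label, or None."""
    if label(node) == wanted:
        return node
    for d in daughters(node):
        hit = find(d, wanted)
        if hit is not None:
            return hit
    return None


np = find(TREE, "NP")
print([label(d) for d in daughters(np)])  # ['D', 'N']
print(leaves(find(TREE, "VP")))
```

Here NP is the mother of D and N, and the words returned for VP are exactly the constituent named on the slide.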
  31. 31. Label Bracketing • It is a process of representing the syntactic tree in another way.
  32. 32. Do yourself: Label Bracket the tree
  33. 33. Remember, you may have to practise the reverse: constructing a syntactic tree from label bracketing ☺
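Both directions of the exercise, tree to label bracketing and back, can be sketched in Python. The square-bracket notation `[NP [D the] [N dog]]` is one common convention and is an assumption here, as are the function names `to_brackets` and `from_brackets`; the slides do not show the exact notation they use.

```python
# A node is (label, daughter, ...); a word is a plain string.
# to_brackets and from_brackets are hypothetical names for this sketch.

def to_brackets(node):
    """Write a tree as a labelled bracketing string."""
    if isinstance(node, str):
        return node
    parts = [node[0]] + [to_brackets(child) for child in node[1:]]
    return "[" + " ".join(parts) + "]"


def from_brackets(text):
    """Rebuild the tree from a labelled bracketing string."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token != "[":
            return token                 # a word (terminal node)
        node_label = tokens[pos]         # first token after "[" is the label
        pos += 1
        children = []
        while tokens[pos] != "]":
            children.append(read())
        pos += 1                         # consume the closing "]"
        return (node_label, *children)

    return read()


noun_phrase = ("NP", ("D", "the"), ("N", "dog"))
print(to_brackets(noun_phrase))  # [NP [D the] [N dog]]
print(from_brackets("[NP [D the] [N dog]]") == noun_phrase)  # True
```

Round-tripping a tree through both functions is a quick way to check a hand-written bracketing.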
  34. 34. Constituents and Categories • Tree structure provides two kinds of information: 1. It divides the sentence into constituents (in English, these are called phrases) 2. It puts them into categories (NP, VP, etc)
  35. 35. Constituents and Categories • How do we know the right way to group words into the right category? • How do we know "into the garden" is a category, but "a cat into" is not? • Any words that can be moved as a group are probably constituents: compare the meaning of "the dog chased a cat into the garden" and "into the garden, the dog chased a cat". • Which one did you move? "Into the garden", right? • And the meaning did not change • That’s probably our constituent
  36. 36. Constituents and Categories • Any string of words that can be deleted is probably a constituent • If you omit "into the garden" from the sentence, nothing changes grammatically. • Usually, a unit of words has a meaning that makes sense. "Into the garden" is much more meaningful than "a cat into"
  37. 37. Constituents and Categories • However, we are only talking about syntactic structure, not the semantic one. • The dog, the cat and the garden: their grammar structure says they are all noun phrases. • It means they can be used interchangeably; no linguist can deny that • Then what about “The garden chased the cat into the dog”? ☺ • We will not focus on semantics, as said before!
  38. 38. Ambiguity • There are 2 types of ambiguity: 1. Lexical Ambiguity: the sentence contains an idiom/word/term that has more than one meaning. "Glasses" means both drinking glasses and spectacles 2. Structural Ambiguity: the sentence has more than one syntactic tree. "I saw the boy with the telescope": did you use a telescope to see the boy? Or did you see the boy who had a telescope?
  39. 39. Structural Ambiguity
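That a structurally ambiguous sentence has more than one syntactic tree can be checked mechanically. The sketch below uses a chart (CKY-style) parser over a tiny hand-written grammar; the grammar, the lexicon and the names `BINARY`, `LEX` and `all_parses` are assumptions made for this illustration and cover only the example sentence.

```python
from collections import defaultdict

# Assumed toy grammar: (parent, left daughter, right daughter).
BINARY = [("S", "NP", "VP"),
          ("VP", "V", "NP"), ("VP", "VP", "PP"),   # PP attached to the verb
          ("NP", "D", "N"), ("NP", "NP", "PP"),    # PP attached to the noun
          ("PP", "P", "NP")]

# Assumed lexicon; "I" is treated directly as a noun phrase.
LEX = {"i": "NP", "saw": "V", "the": "D",
       "boy": "N", "telescope": "N", "with": "P"}


def all_parses(sentence):
    """Return every syntactic tree of the sentence under the toy grammar."""
    words = sentence.lower().split()
    n = len(words)
    chart = defaultdict(list)  # (start, end, label) -> list of trees
    for i, word in enumerate(words):
        chart[i, i + 1, LEX[word]].append((LEX[word], word))
    # Build longer spans from pairs of shorter, already finished spans.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for parent, left, right in BINARY:
                    for left_tree in chart[start, mid, left]:
                        for right_tree in chart[mid, end, right]:
                            chart[start, end, parent].append(
                                (parent, left_tree, right_tree))
    return chart[0, n, "S"]


trees = all_parses("I saw the boy with the telescope")
print(len(trees))  # 2
for tree in trees:
    print(tree)
```

One tree attaches "with the telescope" to the boy (NP PP) and the other attaches it to the seeing (VP PP), which matches the two readings on the slide.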
  40. 40. Difficulties with Natural Language: Anaphora • Using pronouns to refer back to entities already introduced in the text • After Mary proposed to John, they found a preacher and got married. • For the honeymoon, they went to Hawaii • Mary saw a ring through the window and asked John for it • Mary threw a rock at the window and broke it
  41. 41. Difficulties with Natural Language: Indexicality • Indexical sentences refer to the utterance situation (place, time, S/H, etc.) • I am over here • Why did you do that?
  42. 42. Difficulties with Natural Language: Metonymy • Using one noun phrase to stand for another • I've read Shakespeare • Chrysler announced record profits • The ham sandwich on Table 4 wants another beer
  43. 43. Difficulties with Natural Language: Metaphor • “Non‐literal” usage of words and phrases, often systematic. • I've tried killing the process but it won't die. Its parent keeps it alive.
  44. 44. Semantics in NL • I can't untie that knot with one hand. – The sentence is about the abilities of whoever spoke or wrote it. (Call this person the speaker.) – It's also about a knot, maybe one that the speaker is pointing at – The sentence denies that the speaker has a certain ability. (This is the contribution of the word `can't'.) – Untying is a way of making something not tied. – The sentence doesn't mean that the knot has one hand; it has to do with how many hands are used to do the untying.
  45. 45. Problems in Semantics in NL • If you do not understand certain characteristics of linguistics, you will not be able to understand the semantics. • If you do understand them, you need to feel them • If you do feel them, you need to see the context • If you see the context, you are dealing with both semantics and pragmatics in NL • ☺
  46. 46. Synonymy • Synonyms are different words (or sometimes phrases) with identical or very similar meanings. • Words that are synonyms are said to be synonymous, and the state of being a synonym is called synonymy
  47. 47. Synonymy • student and pupil (noun) • buy and purchase (verb) • sick and ill (adjective) • quickly and speedily (adverb) • on and upon (preposition)
  48. 48. Synonymy • Note that synonyms are defined with respect to certain senses of words • pupil as the "aperture in the iris of the eye" is not synonymous with student. • Similarly, "he expired" means the same as "he died", yet "my passport has expired" cannot be replaced by "my passport has died".
  49. 49. Antonymy • Antonyms are words with opposite or nearly opposite meanings. For example: • short and tall • dead and alive • increase and decrease
  50. 50. Homonymy • A homonym is one of a group of words that share the same spelling and the same pronunciation but have different meanings, usually as a result of the two words having different origins. • The state of being a homonym is called homonymy. • bark (the sound of a dog) and bark (the skin of a tree).
  51. 51. Heteronymy • Heteronyms (also known as heterophones) are words with identical spellings (or characters) but different pronunciations and meanings.
  52. 52. Monolingual ambiguity • morphological ambiguity: – German -en: noun plural, dative plural, weak noun non-nominative, adjective masculine non-nominative, etc. • compound nouns: – coincide -> coin+cide, cooperate -> cooper+ate • category ambiguity: – round: the first round (noun), to round up cattle (verb), the round table (adjective), go on a voyage round the Mediterranean (preposition), it measures three feet round (adverb), etc. • homographs and polysemes: – branch: ‘of a tree’, ‘of a bank’; crane (a bird or lifting machine) – ball: The ball rolled down the hill, The ball lasted until midnight
  53. 53. Bilingual lexical ambiguity • English wall: German Mauer (outside) or Wand (inside) • English river: French fleuve (major) or rivière (general term) • English leg: French jambe (human), patte (animal, insect), pied (table), étape (journey) • English blue: Russian goluboi (pale blue) or sinii (dark blue) • French louer: English hire or rent • German leihen: English borrow or lend • English wear: Japanese haoru (coat/jacket), haku (shoes/trousers), kaburu (hat), hameru (ring/gloves), shimeru (belt/tie/scarf), tsukeru (brooch/clip), kakeru (glasses/necklace) • resolvable by: – rules (indicating allowable or usual categories or types of subjects, objects, verbs, etc.) – collocations (specifying particular adjacent words) – frequencies (most probable adjacent or dependent words)
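Resolution by rules about the allowable object can be sketched with the English "wear" example from this slide. The table below only restates the slide's Japanese verbs and object types; the names `WEAR_BY_OBJECT` and `translate_wear` are hypothetical, and a real system would need far more than a lookup on one object noun.

```python
# Japanese verb for English "wear", chosen by the object being worn.
# Data restated from the slide; names are hypothetical.
WEAR_BY_OBJECT = {
    "haoru": {"coat", "jacket"},
    "haku": {"shoes", "trousers"},
    "kaburu": {"hat"},
    "hameru": {"ring", "gloves"},
    "shimeru": {"belt", "tie", "scarf"},
    "tsukeru": {"brooch", "clip"},
    "kakeru": {"glasses", "necklace"},
}


def translate_wear(worn_object):
    """Pick the verb whose allowable objects include the given noun."""
    for verb, objects in WEAR_BY_OBJECT.items():
        if worn_object.lower() in objects:
            return verb
    return None  # no rule applies; a fuller system might fall back on frequencies


print(translate_wear("hat"))       # kaburu
print(translate_wear("scarf"))     # shimeru
print(translate_wear("umbrella"))  # None
```

When no rule matches, the function returns `None` rather than guessing, which is the point where the slide's other two strategies (collocations and frequencies) would take over.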
  54. 54. Structural ambiguity • Flying planes can be dangerous • The man saw the girl with a telescope • John mentioned the book I sent to Mary • I told everyone concerned about the strike – everyone concerned/involved/relevant, or: everyone disturbed/worried • He noticed her shaking hands – either which were shaking from cold, or which were shaking other hands • They complained to the guide that they could not hear – that as relative pronoun (‘whom they could not hear’) or as complementizer (‘that they could not hear him’) • The mathematics students sat their examinations • The mathematics students study today is very complex – difficulty of identifying noun compound vs. relative clause • Gas pump prices rose last time oil stocks fell – each word potentially noun or verb
  55. 55. Reference • Richard Thomson ents/general/what‐is‐semantics.html
  56. 56. Reference • NLP for Prolog Programmers by Michael A. Covington, Chapter 4
  57. 57. Reference • Wikipedia, ns