
Language Modeling in Turner&Charniak (2007)

Class presentation for a seminar on computational approaches to functional elements.

Published in: Technology

  1. Language Modeling in Turner&Charniak (2007)
     Kilian Evang, 2009-11-30
     Outline: Language Models, N-gram LMs, Charniak’s LM, Determiner Selection, Method, Results, Reasons for Success, References
  2. Recap: Language Models
     ◮ LMs assign probabilities to sentences
     ◮ a sentence is a complex event
     ◮ LMs break it up into a sequence of “atomic” events
     ◮ each “atomic” event is conditioned on certain previous events
     ◮ conditional probabilities are approximated by counting and smoothing
  3. N-gram LMs
     n-gram LMs (the Charniak’s LM column is still empty on this slide):
     ◮ sequence represents: sentence
     ◮ p(sent) = p(seq)
     ◮ events are: words, end symbols
     ◮ conditioned on: the n − 1 previous events
  4. A Sentence – a Sequence of Events
     Sentence: put the ball in the box
     Event sequence: put, the, ball, in, the, box, ∆
  5. A Sentence – a Sequence of Events
     Sentence: put the ball in the box
     Conditional probability: p(w_i = the | w_{i−2} = ball, w_{i−1} = in)
  6. N-gram LMs vs. Charniak’s Parsing LM
                           n-gram LMs                  Charniak’s LM
     sequence represents   sentence                    parse tree
     p(sent) =             p(seq)                      Σ_seq p(seq)
     events are            words, end symbols          pre-terminals, terminals, constituents, end symbols
     conditioned on        the n − 1 previous events   certain previous events, depending on type
  7. A Parse Tree – a Sequence of Events
     Parse tree: [vp [verb put] [np [det the] [noun ball]] [pp [prep in] [np [det the] [noun box]]]]
  8. A Parse Tree – a Sequence of Events
     Parse tree: [vp [verb put] [np [det the] [noun ball]] [pp [prep in] [np [det the] [noun box]]]]
     Event sequence: verb, put, M, ∆, M, np, pp, ∆, noun, ball, M, det, ∆, M, ∆, the, prep, in, M, ∆, M, np, ∆, noun, box, M, det, ∆, M, ∆, the
  9. Digression: Non-head Constituents
     Tree fragment: a constituent l with children L_m ... L_1 t R_1 ... R_n, where the head pre-terminal t dominates the head terminal h
     Event sequence fragment: M, L_1, ..., L_m, ∆, M, R_1, ..., R_n, ∆
  10. A Parse Tree – a Sequence of Events
      Parse tree: [vp [verb put] [np [det the] [noun ball]] [pp [prep in] [np [det the] [noun box]]]]
      Conditional probability for a head pre-terminal: p(t = noun | l = np, m = vp, u = verb, i = put)
  11. A Parse Tree – a Sequence of Events
      Parse tree: [vp [verb put] [np [det the] [noun ball]] [pp [prep in] [np [det the] [noun box]]]]
      Conditional probability for a head terminal: p(h = ball | t = noun, l = np, m = vp, u = verb, i = put)
  12. A Parse Tree – a Sequence of Events
      Parse tree: [vp [verb put] [np [det the] [noun ball]] [pp [prep in] [np [det the] [noun box]]]]
      Conditional probability for a non-head constituent: p(L_i = det | L_{i−1} = M, h = ball, t = noun, l = np, m = vp, u = verb)
  13. Overview: Conditioning
      ◮ event type: head pre-terminal t
        conditioned on: constituent label l, mother constituent label m, mother constituent head pre-terminal u, mother constituent head terminal i
      ◮ event type: head terminal h
        conditioned on: head pre-terminal t, constituent label l, mother constituent label m, mother constituent head pre-terminal u, mother constituent head terminal i
      ◮ event type: non-head constituent label L_i (R_i), end symbol ∆
        conditioned on: (part of) L_{1...i−1} (L_{1...m}, R_{1...i−1}), head terminal h, head pre-terminal t, constituent label l, mother constituent label m, mother constituent head pre-terminal u
  14. Determiner Selection
      ◮ for each NP,
        ◮ for each possible determiner (the, a/an, null),
          ◮ determine probability of NP
        ◮ choose determiner resulting in highest probability
      ◮ note: sufficient to determine probabilities for events that differ
  15. Determiner Selection – Example
      ◮ “put [NP the ball] in the box”
        p(L_1 = det | m = vp, u = verb, l = np, t = noun, h = ball)
        × p(L_2 = ∆ | m = vp, u = verb, l = np, t = noun, h = ball, L_1 = det)
        × p(det → the | m = vp, u = verb, l = np, t = noun, h = ball, L_1 = det)
      ◮ “put [NP a/an ball] in the box”
        p(L_1 = det | m = vp, u = verb, l = np, t = noun, h = ball)
        × p(L_2 = ∆ | m = vp, u = verb, l = np, t = noun, h = ball, L_1 = det)
        × (p(det → a | m = vp, u = verb, l = np, t = noun, h = ball, L_1 = det)
           + p(det → an | m = vp, u = verb, l = np, t = noun, h = ball, L_1 = det))
      ◮ “put [NP ball] in the box”
        p(L_1 = ∆ | m = vp, u = verb, l = np, t = noun, h = ball)
  16. Results
  17. Reasons for Success
      ◮ syntactic structure allows for long-distance conditioning, e.g.
        ◮ he [VP gave [NP the sultan of Brunei] [NP a cactus]]
      ◮ constituent head enforces selectional preferences, reflected in head-first strategy
      ◮ ...
  18. References
      ◮ Eugene Charniak (2000). A Maximum-Entropy-Inspired Parser. Proceedings of the First Meeting of the North American Chapter of the Association for Computational Linguistics.
      ◮ Eugene Charniak (2001). Immediate-Head Parsing for Language Models. Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics.
      ◮ Jenine Turner & Eugene Charniak (2007). Language Modeling for Determiner Selection. Proceedings of NAACL HLT 2007, Companion Volume.
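
The n-gram slides (3 to 5) describe a sentence as a sequence of word events plus an end symbol, each conditioned on the n − 1 previous events, with probabilities obtained by counting and smoothing. A minimal trigram sketch of that idea follows. The `<s>` padding symbols, the add-one smoothing, and the two-sentence toy corpus are assumptions made for illustration; the slides do not say which smoothing method or corpus is used.

```python
from collections import Counter

START, END = "<s>", "∆"  # "<s>" padding is an assumption; "∆" is the end symbol


def train(sentences):
    """Count trigrams and their two-word contexts in whitespace-tokenized sentences."""
    trigrams, contexts, vocab = Counter(), Counter(), {END}
    for sentence in sentences:
        words = [START, START] + sentence.split() + [END]
        vocab.update(words[2:])
        for a, b, c in zip(words, words[1:], words[2:]):
            trigrams[a, b, c] += 1
            contexts[a, b] += 1
    return trigrams, contexts, vocab


def cond_prob(model, word, prev2, prev1):
    """p(w_i = word | w_{i-2} = prev2, w_{i-1} = prev1) with add-one smoothing."""
    trigrams, contexts, vocab = model
    return (trigrams[prev2, prev1, word] + 1) / (contexts[prev2, prev1] + len(vocab))


def sentence_prob(model, sentence):
    """Multiply the conditional probabilities of all events, including the end symbol."""
    words = [START, START] + sentence.split() + [END]
    prob = 1.0
    for a, b, c in zip(words, words[1:], words[2:]):
        prob *= cond_prob(model, c, a, b)
    return prob


# Toy corpus (made up for this sketch).
model = train(["put the ball in the box", "put the box in the ball"])
p_the = cond_prob(model, "the", "ball", "in")  # seen after "ball in"
p_box = cond_prob(model, "box", "ball", "in")  # never seen after "ball in"
```

In this toy model `p_the` is larger than `p_box` because "the" was observed after "ball in" and "box" was not; smoothing only keeps the unseen event from getting probability zero.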
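
The parse-tree slides (7 to 9) turn a tree into a head-first event sequence: head pre-terminal, head terminal, then the left and right non-head labels, each side opened by M and closed by ∆. The sketch below reads that order off the slide 8 example. The `Leaf` and `Node` classes, the explicit `head` index, and the order in which non-head children are expanded are assumptions of this sketch, and it only handles constituents whose head child is a pre-terminal, as in the example tree.

```python
from dataclasses import dataclass
from typing import List, Union

M, DELTA = "M", "∆"  # start marker and end symbol for each side of the head


@dataclass
class Leaf:
    tag: str   # pre-terminal, e.g. "noun"
    word: str  # terminal, e.g. "ball"


@dataclass
class Node:
    label: str
    children: List[Union["Node", Leaf]]
    head: int  # index of the head child; assumed to be given and to be a Leaf


def label(child):
    return child.tag if isinstance(child, Leaf) else child.label


def events(tree):
    """Head-first event sequence for a constituent (or a non-head pre-terminal)."""
    if isinstance(tree, Leaf):
        # The tag was already generated as a non-head label; only the word remains.
        return [tree.word]
    head = tree.children[tree.head]
    assert isinstance(head, Leaf), "sketch only handles pre-terminal heads"
    left = tree.children[:tree.head][::-1]  # L_1 is the child closest to the head
    right = tree.children[tree.head + 1:]
    seq = [head.tag, head.word, M, *map(label, left), DELTA, M, *map(label, right), DELTA]
    for child in left + right:  # expansion order is an assumption
        seq += events(child)
    return seq


def np(det, noun):
    return Node("np", [Leaf("det", det), Leaf("noun", noun)], head=1)


# The example tree for "put the ball in the box".
tree = Node("vp", [Leaf("verb", "put"), np("the", "ball"),
                   Node("pp", [Leaf("prep", "in"), np("the", "box")], head=0)], head=0)
seq = events(tree)
```

For the example tree, `seq` matches the event sequence listed on slide 8.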
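
The determiner selection procedure (slides 14 and 15) compares, for one NP, only the events that differ between the three candidates and picks the candidate with the highest product. A minimal sketch of that comparison follows. The function names and the probability values are invented for illustration; in the actual method the probabilities would come from the parsing language model, conditioned on m, u, l, t and h as on slide 15.

```python
def score_determiners(p_l1, p_l2_end_given_det, p_det_word):
    """Score the three candidates using only the events that differ.

    p_l1:               probabilities of the first left label ("det" or "∆")
    p_l2_end_given_det: probability that the left side ends after the determiner
    p_det_word:         probabilities of the words "the", "a", "an" under det
    """
    with_det = p_l1["det"] * p_l2_end_given_det
    return {
        "the": with_det * p_det_word["the"],
        "a/an": with_det * (p_det_word["a"] + p_det_word["an"]),
        "null": p_l1["∆"],
    }


def select_determiner(scores):
    """Choose the candidate with the highest probability."""
    return max(scores, key=scores.get)


# Made-up numbers for "put [NP ? ball] in the box"; not results from the paper.
scores = score_determiners(
    p_l1={"det": 0.7, "∆": 0.2},
    p_l2_end_given_det=0.9,
    p_det_word={"the": 0.6, "a": 0.3, "an": 0.01},
)
choice = select_determiner(scores)
```

Summing the "a" and "an" probabilities treats a/an as a single candidate, as on slide 15, so the choice between the two surface forms is left out of the comparison.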
