Language Modeling in Turner&Charniak (2007)

Kilian Evang

2009-11-30

Language Models
    N-gram LMs
    Charniak’s LM
Determiner Selection
    Method
    Results
    Reasons for Success
References

Recap: Language Models

◮ LMs assign probabilities to sentences
◮ a sentence is a complex event
◮ LMs break it up into a sequence of “atomic” events
◮ each “atomic” event conditioned on certain previous events
◮ conditional probabilities approximated by counting and smoothing
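
The decomposition in this recap can be written as a chain rule. The symbols $e_1, \ldots, e_k$ for the atomic events and $c(e_j)$ for the subset of previous events used as context are notation assumed here, not taken from the slides.

```latex
p(\text{sentence}) = p(e_1, \ldots, e_k)
  = \prod_{j=1}^{k} p(e_j \mid e_1, \ldots, e_{j-1})
  \approx \prod_{j=1}^{k} p\bigl(e_j \mid c(e_j)\bigr),
  \qquad c(e_j) \subseteq \{e_1, \ldots, e_{j-1}\}
```

The first equality is exact; the approximation comes from restricting each event's context to $c(e_j)$, and each factor is then estimated by counting and smoothing.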

N-gram LMs

                       n-gram LMs
sequence represents    sentence
p(sent) =              p(seq)
events are             words, end symbols
conditioned on         the n − 1 previous events

A Sentence – a Sequence of Events

Sentence
put the ball in the box

Event sequence
put, the, ball, in, the, box, ∆

A Sentence – a Sequence of Events

Sentence
put the ball in the box

Conditional probability
$p(w_i = \text{the} \mid w_{i-2} = \text{ball},\ w_{i-1} = \text{in})$
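
A trigram probability like the one above could be estimated roughly as follows. This is a minimal sketch: the two-sentence corpus, the `<START>` padding symbol, the function names, and the choice of add-one smoothing are all assumptions for illustration, since the slides only say "counting and smoothing".

```python
from collections import Counter

START, END = "<START>", "<END>"  # padding and end symbols (names are assumptions)

def train_trigram_lm(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    trigram_counts, context_counts, vocab = Counter(), Counter(), {END}
    for words in sentences:
        events = [START, START] + list(words) + [END]
        vocab.update(words)
        for i in range(2, len(events)):
            context = (events[i - 2], events[i - 1])
            trigram_counts[context + (events[i],)] += 1
            context_counts[context] += 1
    return trigram_counts, context_counts, vocab

def trigram_prob(word, prev2, prev1, model):
    """Add-one smoothed estimate of p(w_i = word | w_{i-2} = prev2, w_{i-1} = prev1)."""
    trigram_counts, context_counts, vocab = model
    return (trigram_counts[(prev2, prev1, word)] + 1) / (context_counts[(prev2, prev1)] + len(vocab))

def sentence_prob(words, model):
    """Probability of a sentence as the product of its trigram events."""
    events = [START, START] + list(words) + [END]
    prob = 1.0
    for i in range(2, len(events)):
        prob *= trigram_prob(events[i], events[i - 2], events[i - 1], model)
    return prob

# Toy corpus, invented for this example.
corpus = ["put the ball in the box".split(), "put the box in the box".split()]
model = train_trigram_lm(corpus)

# One observed trigram, one observed context, six event types: (1 + 1) / (1 + 6)
p_the = trigram_prob("the", "ball", "in", model)
```

With add-one smoothing, the probabilities of all event types after a given context sum to one, so unseen continuations still receive some probability mass.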

N-gram LMs vs. Charniak’s Parsing LM

                       n-gram LMs                   Charniak’s LM
sequence represents    sentence                     parse tree
p(sent) =              p(seq)                       $\sum_{seq} p(seq)$
events are             words, end symbols           pre-terminals, terminals, constituents, end symbols
conditioned on         the n − 1 previous events    certain previous events, depending on type

A Parse Tree – a Sequence of Events

Parse tree
[vp [verb put]
    [np [det the] [noun ball]]
    [pp [prep in]
        [np [det the] [noun box]]]]

Event sequence
verb, put, M, ∆, M, np, pp, ∆, noun, ball, M, det, ∆, M, ∆, the,
prep, in, M, ∆, M, np, ∆, noun, box, M, det, ∆, M, ∆, the

Digression: Non-head Constituents

Tree fragment
[l $L_m$ ... $L_1$ [t h] $R_1$ ... $R_n$]

Event sequence fragment
M, $L_1$, ..., $L_m$, ∆, M, $R_1$, ..., $R_n$, ∆

Conditional probability for a head pre-terminal
$p(t = \text{noun} \mid l = \text{np},\ m = \text{vp},\ u = \text{verb},\ i = \text{put})$

Conditional probability for a head terminal
$p(h = \text{ball} \mid t = \text{noun},\ l = \text{np},\ m = \text{vp},\ u = \text{verb},\ i = \text{put})$

Conditional probability for a non-head constituent
$p(L_i = \text{det} \mid L_{i-1} = M,\ h = \text{ball},\ t = \text{noun},\ l = \text{np},\ m = \text{vp},\ u = \text{verb})$

Overview: Conditioning

event type                     conditioned on
head pre-terminal t            constituent label l,
                               mother constituent label m,
                               mother constituent head pre-terminal u,
                               mother constituent head terminal i
head terminal h                head pre-terminal t,
                               constituent label l,
                               mother constituent label m,
                               mother constituent head pre-terminal u,
                               mother constituent head terminal i
non-head constituent label     (part of) $L_{1 \ldots i-1}$ ($L_{1 \ldots m}$, $R_{1 \ldots i-1}$),
$L_i$ ($R_i$), end symbol ∆    head terminal h,
                               head pre-terminal t,
                               constituent label l,
                               mother constituent label m,
                               mother constituent head pre-terminal u

Determiner Selection

◮ for each NP,
    ◮ for each possible determiner (the, a/an, null),
        ◮ determine probability of NP
    ◮ choose determiner resulting in highest probability
◮ note: sufficient to determine probabilities for events that differ
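
The selection loop above might be sketched as follows. The function `select_determiners`, the `np_probability` callback, and the toy probability table are hypothetical stand-ins for the language model's NP probabilities; the numbers are invented and not taken from the paper.

```python
CANDIDATES = ("the", "a/an", None)  # None stands for the null determiner

def select_determiners(noun_phrases, np_probability):
    """For each NP, pick the candidate determiner that gives the NP the highest probability."""
    return {
        noun_phrase: max(CANDIDATES, key=lambda det: np_probability(noun_phrase, det))
        for noun_phrase in noun_phrases
    }

# Invented probabilities, keyed by (NP head, determiner).
toy_probabilities = {
    ("ball", "the"): 0.012, ("ball", "a/an"): 0.004, ("ball", None): 0.001,
    ("water", "the"): 0.003, ("water", "a/an"): 0.0005, ("water", None): 0.009,
}

chosen = select_determiners(
    ["ball", "water"],
    lambda noun_phrase, det: toy_probabilities[(noun_phrase, det)],
)
```

Since only the events that differ between candidates affect the comparison, `np_probability` does not need to score the whole sentence.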

Determiner Selection – Example

◮ “put [NP the ball] in the box”
    ◮ $p(L_1 = \text{det} \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball})$
      $\times\ p(L_2 = \Delta \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball}, L_1 = \text{det})$
      $\times\ p(\text{det} \to \text{the} \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball}, L_1 = \text{det})$
◮ “put [NP a/an ball] in the box”
    ◮ $p(L_1 = \text{det} \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball})$
      $\times\ p(L_2 = \Delta \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball}, L_1 = \text{det})$
      $\times\ \bigl(p(\text{det} \to \text{a} \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball}, L_1 = \text{det})$
      $+\ p(\text{det} \to \text{an} \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball}, L_1 = \text{det})\bigr)$
◮ “put [NP ball] in the box”
    ◮ $p(L_1 = \Delta \mid m = \text{vp}, u = \text{verb}, l = \text{np}, t = \text{noun}, h = \text{ball})$
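
To make the three products concrete, here is a small worked computation. All probability values are invented placeholders, not estimates from the paper, and the shared context (m = vp, u = verb, l = np, t = noun, h = ball) is left implicit in the variable names.

```python
# Invented conditional probabilities for the NP headed by "ball".
p_L1_det = 0.8   # p(L1 = det | context)
p_L2_end = 0.9   # p(L2 = ∆ | context, L1 = det)
p_L1_end = 0.1   # p(L1 = ∆ | context), i.e. no determiner at all
p_rewrite = {"the": 0.5, "a": 0.3, "an": 0.001}  # p(det → word | context, L1 = det)

scores = {
    # "the": a determiner is generated, the left side ends, det rewrites to "the"
    "the": p_L1_det * p_L2_end * p_rewrite["the"],
    # "a/an": same first two factors, with the two rewrite probabilities summed
    "a/an": p_L1_det * p_L2_end * (p_rewrite["a"] + p_rewrite["an"]),
    # null: the left side ends immediately
    "null": p_L1_end,
}

best = max(scores, key=scores.get)
print(best, scores)
```

The first two factors are identical for "the" and "a/an", so the choice between those two depends only on the rewrite probabilities; the null case competes through a single event.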

Results

Reasons for Success

◮ syntactic structure allows for long-distance conditioning, e.g.
    ◮ he [VP gave [NP the sultan of Brunei] [NP a cactus]]
◮ constituent head enforces selectional preferences, reflected in head-first strategy
◮ ...

References

Eugene Charniak (2000)
A Maximum-Entropy-Inspired Parser
Proceedings of the First Meeting of the North American Chapter of the Association for Computational Linguistics

Eugene Charniak (2001)
Immediate-Head Parsing for Language Models
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

Jenine Turner & Eugene Charniak (2007)
Language Modeling for Determiner Selection
Proceedings of NAACL HLT 2007, Companion Volume
