Unsupervised Partial Parsing: Thesis defense
1. Unsupervised Partial Parsing
Elias Ponvert
Department of Linguistics
The University of Texas at Austin
Dissertation Defense
July 27, 2011
Elias Ponvert (UT Austin)
Unsupervised Partial Parsing
Dissertation Defense
1 / 62
2. Outline
1 Goals and contributions
2 Unsupervised partial parsing
  Main results
  Discussion
3 Cascaded parsing
  Main results
  Discussion
4 Concluding remarks
4. Research goals
Specifically:
Learn to predict constituent structure from raw text
the cat saw the red dog run
⇓
5. Why unsupervised parsing?
1 Less reliance on annotated training
Hello!
2 Apply to new languages and domains
Særær man
annær man
mæþæn
6. Assumptions made in parser learning
Getting these labels right AS WELL AS the structure
of the tree is hard
[S [PP [P on] [NP [N Sunday]]] [, ,] [NP [Det the] [A brown] [N bear]] [VP [V sleeps]]]
7. Assumptions made in parser learning
So the task is to identify the structure alone
[Figure: the same tree over "on Sunday , the brown bear sleeps" with the phrase labels removed; only the brackets and the POS tags P N , Det A N V remain]
8. Assumptions made in parser learning
Learning operates from gold-standard parts-of-speech
(POS) rather than raw text
P N , Det A N V
on Sunday , the brown bear sleeps
[Figure: unlabeled tree over "on Sunday , the brown bear sleeps" and its POS tags]
Klein & Manning 2003 CCM
Bod 2006a, 2006b
Klein & Manning 2005 DMV
Successors to DMV: Smith 2006, Smith & Cohen 2009, Headden et al 2009, Spitkovsky et al 2010ab, &c
J. Gao et al 2003, 2004
Seginer 2007
this work
9. Unsupervised parsing: desiderata
Raw text
Standard NLP / extensible
Scalable and fast
10. Contributions
• Unsupervised parsing satisfying these
desiderata is possible
• Unsupervised partial parsing: predicting local
constituents with high accuracy
• Cascaded models: building constituent structure
bottom up
12. A new approach: start from the bottom
Unsupervised Partial Parsing =
segmentation of (non-overlapping) multiword constituents
13. Unsupervised segmentation of constituents
leaves some room for interpretation
Possible segmentations
• ( the cat ) in ( the hat ) knows ( a lot ) about that
• ( the cat ) ( in the hat ) knows ( a lot ) ( about that )
• ( the cat in the hat ) knows ( a lot about that )
• ( the cat in the hat ) ( knows a lot about that )
• ( the cat in the hat ) ( knows a lot ) ( about that )
14. Defining UPP by evaluation
1. Constituent chunks:
non-hierarchical multiword constituents
[S [NP [D The] [N cat] [PP [P in] [NP [D the] [N hat]]]] [VP [V knows] [NP [D a] [N lot]] [PP [P about] [NP [N that]]]]]
15. Defining UPP by evaluation
2. Base NPs:
non-recursive noun phrases
[S [NP [D The] [N cat] [PP [P in] [NP [D the] [N hat]]]] [VP [V knows] [NP [D a] [N lot]] [PP [P about] [NP [N that]]]]]
16. Multilingual data for direct evaluation
Language  Corpus  Sentences  Types  Tokens
English   WSJ     49K        44K    1M
German    Negra   21K        49K    300K
Chinese   CTB     19K        37K    430K
WSJ: Penn Treebank
Negra: Negra German Corpus
CTB: Penn Chinese Treebank
17. Constituent chunks and NPs in the data
        Chunks  NPs   Chunks ∩ NPs
WSJ     203K    172K  161K
Negra   59K     33K   23K
CTB     92K     56K   43K
18. The benchmark: CCL parser
Example sentence: the cat saw the red dog run
[Figure: the sentence shown as a constituency tree and in the Common Cover Links representation]
Seginer (2007 ACL; 2007 PhD UvA)
19. Hypothesis
Segmentation can be learned by
generalizing on phrasal boundaries
20. UPP as a tagging problem
words: the cat in  the hat
tags:  B   I   O   B   I
B Beginning of a constituent
I Inside a constituent
O Not inside a constituent
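The tag scheme above can be turned back into a bracketed segmentation with a few lines of code. The following is a minimal sketch, not taken from the talk: the function names `bio_to_spans` and `bracket` are hypothetical, and the code does not enforce that chunks are multiword.

```python
def bio_to_spans(tags):
    """Return (start, end) spans, end exclusive, for each B I* run."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        # Any tag other than I closes the chunk that is currently open.
        if tag != "I" and start is not None:
            spans.append((start, i))
            start = None
        if tag == "B":
            start = i
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def bracket(words, tags):
    """Render the segmentation in the ( ... ) notation used on the slides."""
    opens = dict(bio_to_spans(tags))
    closes = set(opens.values())
    out = []
    for i, word in enumerate(words):
        if i in closes:
            out.append(")")
        if i in opens:
            out.append("(")
        out.append(word)
    if len(words) in closes:
        out.append(")")
    return " ".join(out)


print(bracket("the cat in the hat".split(), ["B", "I", "O", "B", "I"]))
```

For the slide's example this prints `( the cat ) in ( the hat )`.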
22. Unsupervised learning tag model for UPP
[Figure: lattice of candidate tags (STOP, B, I, O) over the sequence "# the cat in the hat #"]
28. Decoding the tag model for UPP
tags:  STOP B   I   O  B   I   STOP
words: #    the cat in the hat #
31. UPP: Models
Hidden Markov Model
tags:  B   I   O   B   I
words: the cat in  the hat
P( the, I | B ) ≈ P( I | B ) P( the | B )
Probabilistic right linear grammar
rules rewrite a tag as a word followed by the next tag, e.g. B → the I
P( B → the I ) = P( I | B, the ) P( the | B )
Learning: expectation maximization (EM) via
forward-backward (run to convergence)
32. UPP: Models
Decoding: Viterbi
Smoothing: additive smoothing on emissions
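As an illustration of Viterbi decoding with additively smoothed emissions, here is a small HMM sketch. All parameters (`START`, `TRANS`, `EMIT_COUNTS`, the vocabulary and `alpha`) are hypothetical, hand-set values chosen for the example; they are not learned by EM and do not come from the talk.

```python
import math
from collections import Counter

STATES = ["B", "I", "O"]
VOCAB = ["the", "a", "cat", "hat", "in", "of"]

# Hypothetical, hand-set parameters for illustration only.
START = {"B": 0.6, "I": 0.0, "O": 0.4}
TRANS = {
    "B": {"B": 0.0, "I": 1.0, "O": 0.0},
    "I": {"B": 0.3, "I": 0.2, "O": 0.5},
    "O": {"B": 0.6, "I": 0.0, "O": 0.4},
}
EMIT_COUNTS = {
    "B": Counter({"the": 8, "a": 2}),
    "I": Counter({"cat": 4, "hat": 4}),
    "O": Counter({"in": 5, "of": 5}),
}


def log(p):
    """Log probability, mapping zero to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def emission(state, word, alpha=0.1):
    """Additively smoothed estimate of P(word | state)."""
    counts = EMIT_COUNTS[state]
    return (counts[word] + alpha) / (sum(counts.values()) + alpha * len(VOCAB))


def viterbi(words):
    """Return the highest-scoring tag sequence for a non-empty word list."""
    best = [{s: log(START[s]) + log(emission(s, words[0])) for s in STATES}]
    back = []
    for word in words[1:]:
        scores, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[-1][p] + log(TRANS[p][s]))
            scores[s] = best[-1][prev] + log(TRANS[prev][s]) + log(emission(s, word))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow back-pointers from the best final state.
    path = [max(STATES, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


print(viterbi("the cat in the hat".split()))
```

Working in log space avoids underflow on long sentences, and smoothing keeps unseen word and tag pairs from receiving zero probability.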
33. UPP: Constraints on sequences
tags:  STOP B   I   O  B   I   STOP
words: #    the cat in the hat #
[Figure: transition diagram over the states STOP, O, B and I]
34. UPP evaluation: Setup
• Evaluation by comparison to treebank data
• Standard train / development / test splits
• Precision and recall on matched constituents
• Benchmark: CCL
• Both get tokenization, punctuation,
sentence boundaries
37. PRLG example output
(the seeds) already are in (the script)
(little chance) that (shane longman) is going
to recoup today
it would have (severe implications) for
(farmers ’ policy) holders
(thames ’s u.s. marketing agent)
(donald taffner) is preparing to do just that
and all (the while) (the bonds) are in
(the baby ’s diaper)
(mr. rustin) is (senior correspondent) in
(the journal ’s london bureau)
38. UPP: Review
• Sequence models can generalize on indicators
for phrasal boundaries
• Leads to improved unsupervised segmentation
• Learn to predict NPs with high accuracy
  (English and German especially)
40. Question
How do UPP models capture
noun phrase structure?
41. What UPP models learn
B     100 · P(w|B)    I        100 · P(w|I)    O     100 · P(w|O)
the   21.0            %        1.8             of    5.8
a     8.7             million  1.6             and   4.0
to    6.5             be       1.3             in    3.7
’s    2.8             company  0.9             that  2.2
in    1.9             year     0.8             to    2.1
mr.   1.8             market   0.7             for   2.0
its   1.6             billion  0.6             is    2.0
of    1.4             share    0.5             it    1.7
an    1.4             new      0.5             said  1.7
and   1.4             than     0.5             on    1.5
HMM Emissions: WSJ
42. What UPP models learn
B           100 · P(w|B)    I                       100 · P(w|I)    O               100 · P(w|O)
der (the)   13.0            uhr (o’clock)           0.8             in (in)         3.4
die (the)   12.2            juni (June)             0.6             und (and)       2.7
den (the)   4.4             jahren (years)          0.4             mit (with)      1.7
und (and)   3.3             prozent (percent)       0.4             für (for)       1.6
im (in)     3.2             mark (currency)         0.3             auf (on)        1.5
das (the)   2.9             stadt (city)            0.3             zu (to)         1.4
des (the)   2.7             000                     0.3             von (of)        1.3
dem (the)   2.4             millionen (millions)    0.3             sich (oneself)  1.3
eine (a)    2.1             jahre (year)            0.3             ist (is)        1.3
ein (a)     2.0             frankfurter (Frankfurt) 0.3             nicht (not)     1.2
HMM Emissions: Negra
43. What UPP models learn
B               100 · P(w|B)    I                100 · P(w|I)    O               100 · P(w|O)
的 (de, of)      14.3            的 (de)           3.9             在 (at, in)      3.4
一 (one)         3.1             了 (perf. asp.)   2.2             是 (is)          2.4
和 (and)         1.1             个 (ge, measure)  1.5             中国 (China)     1.4
两 (two)         0.9             年 (year)         1.3             也 (also)        1.2
这 (this)        0.8             说 (say)          1.0             不 (no)          1.2
有 (have)        0.8             中 (middle)       0.9             对 (pair)        1.1
经济 (economy)   0.7             上 (on, above)    0.9             和 (and)         1.0
各 (each)        0.7             人 (person)       0.7             的 (de)          1.0
全 (all)         0.7             大 (big)          0.7             将 (fut. tns.)   1.0
不 (no)          0.6             国 (country)      0.6             有 (have)        1.0
HMM Emissions: CTB
44. Question
What about the PRLG? Why does it do so
much better than the HMM?
45. Question
Hidden Markov Model
tags:  B   I   O   B   I
words: the cat in  the hat
P( the, I | B ) ≈ P( I | B ) P( the | B )
Probabilistic right linear grammar
rules rewrite a tag as a word followed by the next tag, e.g. B → the I
P( B → the I ) = P( I | B, the ) P( the | B )
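One way to see the difference between the two models on this slide is to estimate both from the same counts. The sketch below assumes the reading that the HMM treats the emitted word and the next tag as independent given the current tag, while the PRLG scores the rule (tag → word next-tag) jointly. The counts are hypothetical and the function name `estimate` is illustrative only.

```python
from collections import Counter


def estimate(triples):
    """Relative-frequency estimates from (tag, word, next_tag) counts."""
    tag_total, emit, trans = Counter(), Counter(), Counter()
    for (tag, word, nxt), count in triples.items():
        tag_total[tag] += count
        emit[tag, word] += count
        trans[tag, nxt] += count

    def prlg(tag, word, nxt):
        # Joint rule probability P(tag -> word nxt).
        return triples[tag, word, nxt] / tag_total[tag]

    def hmm(tag, word, nxt):
        # Independence approximation P(nxt | tag) * P(word | tag).
        return (trans[tag, nxt] / tag_total[tag]) * (emit[tag, word] / tag_total[tag])

    return prlg, hmm


# Hypothetical counts: inside a chunk, "'s" always continues the chunk
# and "%" always ends it.
counts = Counter({("I", "'s", "I"): 6, ("I", "%", "O"): 4})
prlg, hmm = estimate(counts)

print(prlg("I", "'s", "O"), hmm("I", "'s", "O"))
```

With these toy counts the PRLG gives zero probability to "'s" ending a chunk, while the HMM still assigns it 0.4 × 0.6 = 0.24, because it cannot tie the next tag to the word just emitted.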
47. What’s wrong with this picture?
B     100 · P(w|B)    I        100 · P(w|I)    O     100 · P(w|O)
the   21.0            %        1.8             of    5.8
a     8.7             million  1.6             and   4.0
to    6.5             be       1.3             in    3.7
’s    2.8             company  0.9             that  2.2
in    1.9             year     0.8             to    2.1
mr.   1.8             market   0.7             for   2.0
its   1.6             billion  0.6             is    2.0
of    1.4             share    0.5             it    1.7
an    1.4             new      0.5             said  1.7
and   1.4             than     0.5             on    1.5
• ’s occurs (immediately) before several terms that
appear after B
48. PRLG rule probabilities
100 · P(B → w q)      100 · P(I → w q)            100 · P(O → w q)
B → the I    28.2     I → ’s I           2.6      O → of B    3.8
B → a I      11.7     I → and I          1.3      O → to O    3.6
B → mr. I    2.4      I → % O            1.1      O → in B    2.5
B → its I    2.2      I → million O      0.6      O → and O   1.7
B → an I     1.9      I → new I          0.5      O → to B    1.7
B → his I    1.0      I → million STOP   0.5      O → of O    1.6
B → this I   1.0      I → company O      0.5      O → in O    1.5
B → their I  1.0      I → year O         0.4      O → and B   1.4
B → some I   0.7      I → [word lost] I  0.4      O → for B   1.3
B → new I    0.6      I → million I      0.4      O → it O    1.3
53. Learning curves: Base NPs
[Figure: base NPs F-score of the PRLG chunking model on WSJ, plotted against training sentences (10K to 40K) and EM iterations (0 to 100)]
54. Learning curves: Base NPs
[Figure: base NPs F-score of the PRLG chunking model on Negra, plotted against training sentences (5K to 15K) and EM iterations (0 to 150)]
55. Learning curves: Base NPs
[Figure: base NPs F-score of the PRLG chunking model on CTB, plotted against training sentences (5K to 15K) and EM iterations (0 to 100)]
56. Question
How much can these models learn?
57. Against a supervised benchmark
[Figure: base NPs F-score (20 to 80) of supervised and unsupervised PRLG against WSJ training sentences (up to 40K); the x-axis marks ∼4500]
58. Against a supervised benchmark
[Figure: base NPs F-score (10 to 50) of supervised and unsupervised PRLG against Negra training sentences (up to 15K); the x-axis marks ∼2200]
59. Against a supervised benchmark
[Figure: base NPs F-score (10 to 50) of supervised and unsupervised PRLG against CTB training sentences (up to 15K)]
60. Negra/CTB training much smaller than WSJ
[Figure: base NPs F-score (20 to 80) against training sentences (10K to 40K) for WSJ PRLG, Negra PRLG and CTB PRLG]
61. Treebank precision
[S [NP [D The] [N cat] [PP [P in] [NP [D the] [N hat]]]] [VP [V knows] [NP [D a] [N lot]] [PP [P about] [NP [N that]]]]]
(the cat in the hat) knows (a lot) (about that)
• Constituent chunks: Prec = 2/3, Rec = 2/3, F = 2/3
• Base NPs: Prec = 1/3, Rec = 1/2
• Treebank precision: 3/3
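The three figures above can be reproduced with set operations over word spans. In this sketch the gold span sets are read off the example tree and are an assumption: spans are (start, end) word offsets with end exclusive, chunks are taken to be the lowest multiword constituents, and the function names are hypothetical.

```python
from fractions import Fraction

# Word offsets: the0 cat1 in2 the3 hat4 knows5 a6 lot7 about8 that9
predicted = {(0, 5), (6, 8), (8, 10)}       # (the cat in the hat) (a lot) (about that)
gold_chunks = {(3, 5), (6, 8), (8, 10)}     # the hat, a lot, about that
gold_base_nps = {(3, 5), (6, 8)}            # the hat, a lot
all_constituents = {(0, 10), (0, 5), (2, 5), (3, 5), (5, 10), (6, 8), (8, 10)}


def precision(pred, gold):
    """Share of predicted spans that appear in the gold set."""
    return Fraction(len(pred & gold), len(pred))


def recall(pred, gold):
    """Share of gold spans that were predicted."""
    return Fraction(len(pred & gold), len(gold))


def f_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else Fraction(0)


chunk_p, chunk_r = precision(predicted, gold_chunks), recall(predicted, gold_chunks)
print("chunks:", chunk_p, chunk_r, f_score(chunk_p, chunk_r))
print("base NPs:", precision(predicted, gold_base_nps), recall(predicted, gold_base_nps))
print("treebank precision:", precision(predicted, all_constituents))
```

Treebank precision scores a prediction against every constituent in the tree, so (the cat in the hat) counts as correct there even though it is neither a chunk nor a base NP.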
62. On chunking the CTB
[Figure: treebank precision, base NPs F-score and constituent chunk F-score on CTB, plotted against EM iterations]
63. Question.
Do these models scale?
64. Chunking with training from Gigaword NYT
[Figure: treebank precision, base NPs F and constituent chunks F (y-axis 50 to 90) as NYT sentences are added to the WSJ training data: WSJ, +160K, +320K, +480K, +640K]
67. Question
Are we limited to segmentation?
68. Hypothesis
Identification of higher level constituents
can also be learned by generalizing on
phrasal boundaries
69. Cascaded UPP: 1 Segment raw text
Example sentence: there is no asbestos in our products now
[Figure: chunks identified in the raw sentence]
70. Cascaded UPP: 2 Choose stand-ins for phrases
[Figure: each chunk replaced by a stand-in word]
71. Cascaded UPP: 3 Segment text + phrasal stand-ins
[Figure: chunks identified in the sentence with stand-ins]
72. Cascaded UPP: 4 Choose stand-ins and repeat steps 3–4
[Figure: new chunks replaced by stand-ins in turn]
73. Cascaded UPP: 5 Unwind to output tree
[Figure: stand-ins expanded back into their chunks to give the output tree]
74. Cascaded UPP: Review
• Separate models learned at each cascade level
• Models share hyper-parameters (smoothing etc)
• Choice of pseudowords as phrasal stand-ins
• Pseudoword-identification: corpus frequency
• Cascade run to convergence
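The cascade can be sketched as a loop that chunks, substitutes stand-ins and repeats until no new chunks appear. This is a rough illustration under stated assumptions, not the talk's implementation: `toy_chunker` is a hypothetical rule-based stub standing in for a learned model at each level, the frequencies in `FREQ` are invented, and the stand-in is assumed to be the most frequent word in the chunk. The resulting tree is for illustration and is not meant to reproduce the parse on the slides.

```python
from collections import Counter

# Hypothetical word classes and corpus frequencies for the toy example.
DET, NOUN, PREP = {"no", "our"}, {"asbestos", "products"}, {"in"}
FREQ = Counter({"in": 50, "is": 40, "our": 20, "no": 15,
                "there": 10, "now": 8, "products": 5, "asbestos": 1})


def toy_chunker(words):
    """Stub chunker: try DET NOUN first, then PREP DET; return (start, end) spans."""
    for first, second in ((DET, NOUN), (PREP, DET)):
        spans, i = [], 0
        while i < len(words) - 1:
            if words[i] in first and words[i + 1] in second:
                spans.append((i, i + 2))
                i += 2
            else:
                i += 1
        if spans:
            return spans
    return []


def cascade(tokens, chunker, freq, max_levels=10):
    """Build a nested-list tree bottom-up from a flat token list."""
    # Each node pairs a stand-in word with the subtree it stands for.
    nodes = [(word, word) for word in tokens]
    for _ in range(max_levels):
        spans = chunker([head for head, _ in nodes])
        if not spans:
            break  # converged: no new chunks at this level
        new_nodes, i = [], 0
        for start, end in sorted(spans):
            new_nodes.extend(nodes[i:start])
            group = nodes[start:end]
            head = max((h for h, _ in group), key=lambda w: freq[w])
            new_nodes.append((head, [tree for _, tree in group]))
            i = end
        new_nodes.extend(nodes[i:])
        nodes = new_nodes
    # Unwind: drop the stand-ins and keep the subtrees.
    return [tree for _, tree in nodes]


print(cascade("there is no asbestos in our products now".split(), toy_chunker, FREQ))
```

Keeping the subtree next to its stand-in at every level is what makes the final unwinding step a simple projection.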
75. Right-branching baseline
the quick brown fox jumped over the lazy dog
[Figure: right-branching tree over the sentence]
76. Right-branching baseline
a Lorillard spokeswoman said , this is an old story
[Figure: right-branching structure over the sentence]
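A right-branching baseline needs no learning: every word attaches to a constituent covering everything to its right. The sketch below builds that structure as nested lists; the function name `right_branching` is hypothetical, and any special treatment of punctuation is left out.

```python
def right_branching(words):
    """Return a nested-list tree in which each word is a sister to the rest."""
    if len(words) <= 2:
        return list(words)
    return [words[0], right_branching(words[1:])]


print(right_branching("the quick brown fox".split()))
```

For the four-word prefix this yields `['the', ['quick', ['brown', 'fox']]]`; longer sentences nest in the same way.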
78. Another benchmark: CCM
Constituent-context model (Klein & Manning, 2002)
• Generative probabilistic model
• Gold-standard POS
• Short sentences
79. Evaluation on ≤10 word sentences
[Figure: bar chart of constituents F-score (0 to 70) on WSJ, Negra and CTB for Baseline, CCM, CCL, Cascaded HMM and Cascaded PRLG]
80. Example parses
Sentence: two share a house almost devoid of furniture
[Figure: gold-standard tree and Cascaded PRLG (WSJ) parse of the sentence, with brackets marked correct or incorrect]
83. Example parses
Sentence: bei den windsors bleibt alles in der familie
Gloss: with the Windsors stays everything in the family
Translation: With the Windsors everything stays in the family.
[Figure: gold-standard tree and Cascaded PRLG (Negra) parse of the sentence, with brackets marked correct or incorrect]
95. What we’ve learned
• Unsupervised identification of base NPs and
local constituents is possible
• A cascade of chunking models for raw text
parsing has state-of-the-art results
96. Future directions
• Improvements to the sequence models
• Better phrasal stand-in (pseudoword)
construction
• Learning joint models rather than a cascade
97. Historical note
First known computational natural language parser
Transformations and Discourse Analysis Project
Zellig Harris and colleagues, UPenn, 1950s to 1960s
98. Historical note
To the best of our knowledge, this is the first
application of FSTs to parsing. The program
consisted of the following phases:
1. Dictionary look-up.
2. Replacement of some ‘grammatical idioms’ by a
single part of speech.
3. Rule based part of speech disambiguation.
4. A right to left FST composed with a left to right
FST for computing ‘simple noun phrases’.
Joshi & Hopely 1997
99. Historical note
(continued)
5. A left to right FST for computing ‘simple
adjuncts’ such as prepositional phrases and
adverbial phrases.
6. A left to right FST for computing simple verb
clusters.
7. A left to right ‘FST’ for computing clauses.
Joshi & Hopely 1997