Chap. 12 [Probabilistic Parsing]
HATTORI
Univ. of Tokyo, CS
April 25, 2014

1 12.1 Some Concepts
12.1.1 Parsing for disambiguation
12.1.2 Treebanks
12.1.3 Parsing models vs. language models
12.1.4 Lexicalization
12.1.5 Tree probability and derivation probability
12.1.6 There's more than one way to do it
12.1.7 Phrase structure grammars and dependency grammars
12.1.8 Evaluation
2 12.2 Some Approaches

12.1 Some Concepts
The practice of parsing is an implementation of chunking
Chunking: recognizing higher-level units of structure
The overall goal is to produce a system that can place a probably useful structure over arbitrary sentences, that is, to build a parser

12.1 Some Concepts
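Three kinds of probability come up when we think about a parser
This chapter focuses on the third
determining the sentence:
When the parser serves as a language model over a word lattice (Figure 12.1, which I did not understand), e.g. in speech recognition, it treats a sentence as a sequence of words and determines the one that maximizes the probability
speedier parsing:
Probabilities used to prune the parser's search space
choosing between parses:
The parser selects the most plausible parse from among many candidates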

12.1 Some Concepts
Example.
Several parses are possible for the following example sentence:
The post office will hold out discounts and service concessions as incentives

12.1 Some Concepts
Parse (a)
(S (NP The post office) (Aux will)
   (VP (V hold out)
       (NP (NP discounts)
           (Conj and)
           (NP service concessions))
       (PP as incentives)))
Parse (b)
(S (NP The post office) (Aux will)
   (VP (VP (V hold out)
           (NP discounts))
       (Conj and)
       (VP (V service)
           (NP concessions)
           (PP as incentives))))

12.1 Some Concepts
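Parse (d)
(S (NP The post office) (Aux will)
   (VP (V hold)
       (PP (P out)
           (NP (NP discounts)
               (Conj and)
               (NP service concessions)))
       (PP as incentives)))
This parse, for example, can be ruled out as clearly ungrammatical ((V hold) has no object NP). However, only some of the candidate parses can be ruled out this way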

12.1 Some Concepts
12.1.2 Treebanks
Treebank
A collection of correctly parsed syntax trees is called a treebank
It serves as training data for machine learning
The most widely used treebank is the Penn Treebank
The Penn Treebank manual is 300 pages long

12.1 Some Concepts
12.1.2 Treebanks
Example. A Penn Treebank tree
((S (NP-SBJ The move)
    (VP followed
        (NP (NP a round)
            (PP of
                (NP (NP similar increases)
                    (PP by
                        (NP other lenders))
                    (PP against
                        (NP Arizona real estate loans)))))
        (S-ADV (NP-SBJ *)
               (VP reflecting
                   (NP (NP a continuing decline)
                       (PP-LOC in
                               (NP that market))))))
    .))
An empty node, one with no word as its child, is written with * as its child
The tree ends with a period
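
The bracketed notation is easy to read mechanically. The following is a minimal sketch, not the Penn Treebank tooling: the helper names `read_tree` and `tree_yield` are made up for this example, and the input is only a fragment of the tree above. It parses the brackets into nested tuples and collects the words while skipping the empty element `*`.

```python
def read_tree(text):
    """Parse a bracketed tree into (label, children) tuples; leaves are strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                          # skip "("
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]           # an outermost wrapper bracket may have no label
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                          # skip ")"
        return (label, children)

    return read_node()


def tree_yield(node):
    """Return the words under a node, left to right, skipping the empty element *."""
    _, children = node
    words = []
    for child in children:
        if isinstance(child, str):
            if child != "*":
                words.append(child)
        else:
            words.extend(tree_yield(child))
    return words


fragment = """
(S-ADV (NP-SBJ *)
       (VP reflecting
           (NP (NP a continuing decline)
               (PP-LOC in (NP that market)))))
"""
tree = read_tree(fragment)
print(tree[0], tree_yield(tree))
```

The empty subject node contributes nothing to the word sequence, which is why the sketch filters `*` when it collects the leaves.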

12.1 Some Concepts
We compare parsing models with language models

12.1 Some Concepts
Parsing model
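Parsing means building a tree t for a sentence s according to a grammar G
A parsing model assigns the following probability:
$$P(t \mid s, G) \quad \text{where} \quad \sum_{t} P(t \mid s, G) = 1$$
A parser that uses this model performs the following search:
$$t' = \arg\max_{t} P(t \mid s, G)$$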

12.1 Some Concepts
Language model
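For every tree t that the grammar G can generate, the joint probability P(t, s | G) is defined as
$$P(t, s \mid G) = \begin{cases} P(t \mid G) & \text{if } \mathrm{yield}(t) = s \\ 0 & \text{otherwise} \end{cases}$$
A language model is what supplies this probability
Definition of yield (p. 383)
$$N \stackrel{*}{\Rightarrow} w_a \cdots w_b \iff \mathrm{yield}(N) = w_a \cdots w_b$$
The parser then only needs to perform the following search, because P(s | G) does not depend on t:
$$t' = \arg\max_{t} P(t \mid s, G) = \arg\max_{t} \frac{P(t, s \mid G)}{P(s \mid G)} = \arg\max_{t} P(t, s \mid G)$$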
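
As a small numeric check of this equivalence, the sketch below normalizes joint probabilities into conditional ones and compares the two argmax results. The three probabilities are invented for illustration, and the sketch assumes that these three trees are the only ones whose yield is the sentence s.

```python
# Hypothetical joint probabilities P(t, s | G) for every tree t whose yield is s.
joint = {"parse_a": 0.0006, "parse_b": 0.0003, "parse_d": 0.0001}

# P(s | G) is the sum of P(t, s | G) over all trees t.
sentence_prob = sum(joint.values())

# P(t | s, G) = P(t, s | G) / P(s | G); the values now sum to 1.
conditional = {t: p / sentence_prob for t, p in joint.items()}

best_by_joint = max(joint, key=joint.get)
best_by_conditional = max(conditional, key=conditional.get)
print(best_by_joint, best_by_conditional)
```

Dividing every candidate by the same constant cannot change their ranking, so a language model is enough to pick the best parse.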

12.1 Some Concepts
Lexicalization
The weakness of PCFGs is their probabilistic independence assumptions

12.1 Some Concepts
Lexicalization
For example, for a VP the expansion probabilities are the same no matter which word the V below it turns out to be
The ditransitive verb tell is naturally followed by two noun phrases,
so VP → V NP NP would seem to be its most probable expansion
Table 12.2 shows how much the probabilities differ from verb to verb (for come, VP → V PP accounts for about 30%, but for want its probability is almost 0)
We therefore lexicalize the PCFG

12.1 Some Concepts
The most straightforward and common way to lexicalize
Given a tree such as
(PP (P into)
    (NP (DT the)
        (NN store)))
annotate each node with its “head word”:
(PP-into (P-into into)
         (NP-store (DT-the the)
                   (NN-store store)))
This is not always sufficient
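
The annotation step can be sketched as a recursive pass over the tree. This is an illustrative sketch only: the `HEAD_CHILD` table is a hypothetical head rule covering just this example, the function name `lexicalize` is made up, and every word is assumed to sit under its own preterminal node.

```python
# Hypothetical head rules: which child category supplies the head word of a phrase.
HEAD_CHILD = {"PP": "P", "NP": "NN"}


def lexicalize(node):
    """Return (annotated_tree, head_word) for a (label, children) tree."""
    label, children = node
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]                # preterminal: the word is its own head
        return (f"{label}-{word}", [word]), word
    annotated, head = [], None
    for child in children:
        new_child, child_head = lexicalize(child)
        annotated.append(new_child)
        if child[0] == HEAD_CHILD[label]:  # this child passes its head word upward
            head = child_head
    return (f"{label}-{head}", annotated), head


tree = ("PP", [("P", ["into"]),
               ("NP", [("DT", ["the"]), ("NN", ["store"])])])
lexicalized, head = lexicalize(tree)
print(lexicalized)
```

A category missing from `HEAD_CHILD` raises a `KeyError` here; a fuller head table would be needed for real treebank trees.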

12.1 Some Concepts
Probabilities on structural context
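PCFGs assume context-freeness, but in practice this assumption is wrong
For example, Table 12.3 (p. 420) shows how the expansion probabilities of an NP differ depending on whether it is a subject or an object
Expansion     as subject   as object
NP → PRP      14.7 %        2.1 %
NP → NNP       3.5 %        0.9 %
NP → NP PP     5.6 %       14.1 %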

12.1 Some Concepts
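Tree probability and derivation probability
$$P(t) = \sum_{d \,:\, d \text{ is a derivation of } t} P(d)$$
In most cases the derivations of a tree differ only in the order in which the rules are applied, so we can fix the order, choose a canonical derivation d' of t (e.g. the leftmost derivation), and set
$$P(t) = P(d')$$
[Hopcroft and Ullman 1979]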

12.1 Some Concepts
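A derivation d can be written as a sequence of rewrite rules r_i:
$$d = S \xrightarrow{r_1} \alpha_1 \xrightarrow{r_2} \alpha_2 \xrightarrow{r_3} \cdots \xrightarrow{r_m} \alpha_m = s$$
Its probability is
$$P(d) = \prod_{i=1}^{m} P(r_i \mid r_1, \ldots, r_{i-1})$$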
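
For a PCFG, whose independence assumptions make each rule probability independent of the rules applied before it, the product reduces to a product of plain rule probabilities. The sketch below lists the rules of the leftmost derivation of a small tree and multiplies them. The probabilities in `RULE_PROB` are invented for illustration, and the function names are made up.

```python
import math

# Hypothetical rule probabilities, for illustration only.
RULE_PROB = {
    ("PP", ("P", "NP")): 1.0,
    ("NP", ("DT", "NN")): 0.6,
    ("P", ("into",)): 0.2,
    ("DT", ("the",)): 0.5,
    ("NN", ("store",)): 0.1,
}


def leftmost_rules(node):
    """List the rules r_1 ... r_m of the leftmost derivation (pre-order traversal)."""
    label, children = node
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules = [(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            rules.extend(leftmost_rules(child))
    return rules


def derivation_prob(node):
    """P(d') under the PCFG assumption that P(r_i | r_1 ... r_{i-1}) = P(r_i)."""
    return math.prod(RULE_PROB[rule] for rule in leftmost_rules(node))


tree = ("PP", [("P", ["into"]),
               ("NP", [("DT", ["the"]), ("NN", ["store"])])])
print(leftmost_rules(tree))
print(derivation_prob(tree))
```

A model that conditions on the derivation history would replace the table lookup with a function of the rules already applied.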

12.1 Some Concepts
Methods other than CFGs
HMM
Probabilistic Left-Corner Grammars (PLCGs)

12.1 Some Concepts
Probabilistic left-corner grammars
Whereas a PCFG works top-down, a PLCG combines bottom-up and top-down processing

12.1 Some Concepts
LC parsing
initial:
  input = sentence
  stack = [pred(S)]    (pred(X) marks a predicted X)
loop:
  while an action is possible
    do one of the actions
actions are
  Shift    pop the next word from the input and push it onto the stack
  Attach   if a is on top of the stack with pred(a) directly below it, remove both
  Project  if a is on top of the stack and A → a b, replace a by A with pred(b) on top of it
termination:
  if isEmpty input and isEmpty stack
  then exit success
  else exit failure
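
Because several actions can apply at the same point, a runnable version has to search over the choices. The following is a minimal non-probabilistic recognizer sketch, assuming a depth-first search over (stack, input position) configurations. The name `lc_recognize`, the "found"/"pred" tags for recognized and predicted symbols, and the toy grammar are all choices made for this example.

```python
def lc_recognize(words, rules, start="S"):
    """Left-corner recognition by exhaustive search over Shift/Attach/Project."""
    initial = ((("pred", start),), 0)      # stack holds the predicted start symbol
    agenda, seen = [initial], {initial}
    while agenda:
        stack, pos = agenda.pop()
        if pos == len(words) and not stack:
            return True                    # termination: input and stack both empty
        successors = []
        if pos < len(words):               # Shift: move the next word onto the stack
            successors.append((stack + (("found", words[pos]),), pos + 1))
        if stack and stack[-1][0] == "found":
            top = stack[-1][1]
            # Attach: a found symbol meets its own prediction, remove both
            if len(stack) >= 2 and stack[-2] == ("pred", top):
                successors.append((stack[:-2], pos))
            # Project: top is the left corner of A -> top b..., so replace it
            # by A with predictions for the remaining right-hand side on top
            for lhs, rhs in rules:
                if rhs[0] == top:
                    rest = tuple(("pred", sym) for sym in reversed(rhs[1:]))
                    successors.append((stack[:-1] + (("found", lhs),) + rest, pos))
        for config in successors:
            if config not in seen:
                seen.add(config)
                agenda.append(config)
    return False


# Toy grammar, hypothetical and for illustration only.
RULES = [
    ("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("VP", ("V", "NP")),
    ("DT", ("the",)), ("NN", ("dog",)), ("NN", ("cat",)), ("V", ("saw",)),
]
print(lc_recognize("the dog saw the cat".split(), RULES))
```

A probabilistic left-corner grammar would attach a probability to each action choice instead of trying all of them; this sketch only decides whether some action sequence succeeds.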

12.1 Some Concepts
12.1.7 Phrase structure grammars and dependency grammars
Phrase structure grammars and dependency grammars
Arrows represent dependencies between words: the tip of an arrow is the “head word”, and the other words depend on it

12.1 Some Concepts
12.1.7 Phrase structure grammars and dependency grammars
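compound noun
The compound noun “phrase structure model” has two possible dependency structures, and each can be drawn as a tree
Under a lexicalized PCFG, $P(N^x) = P(N^v)$ holds, so comparing the probabilities of the two trees comes down to comparing $P(N^y)$ with $P(N^u)$
This in turn is equivalent to comparing the two dependencies phrase → structure and phrase → model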

12.1 Some Concepts
12.1.8 Evaluation
Evaluation
Parser performance is evaluated with the PARSEVAL measures
They check how many brackets are correct (excluding the root)
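
Bracket precision and recall can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: every non-root node counts as a labeled bracket (label, start, end), parse (a) of the earlier example sentence is treated as the reference purely for illustration, and parse (b) is the candidate. The names `brackets` and `parseval` are made up, and evaluation tools may differ in details such as which nodes they count.

```python
def brackets(tree):
    """Collect (label, start, end) for every node except the root."""
    spans = set()

    def walk(node, start, is_root):
        label, children = node
        end = start
        for child in children:
            if isinstance(child, str):
                end += 1                   # a word advances the position by one
            else:
                end = walk(child, end, False)
        if not is_root:
            spans.add((label, start, end))
        return end

    walk(tree, 0, True)
    return spans


def parseval(candidate, reference):
    """Return (precision, recall) over labeled brackets."""
    cand, ref = brackets(candidate), brackets(reference)
    matched = len(cand & ref)
    return matched / len(cand), matched / len(ref)


subject = ("NP", ["The", "post", "office"])
aux = ("Aux", ["will"])
pp = ("PP", ["as", "incentives"])
parse_a = ("S", [subject, aux, ("VP", [
    ("V", ["hold", "out"]),
    ("NP", [("NP", ["discounts"]), ("Conj", ["and"]),
            ("NP", ["service", "concessions"])]),
    pp])])
parse_b = ("S", [subject, aux, ("VP", [
    ("VP", [("V", ["hold", "out"]), ("NP", ["discounts"])]),
    ("Conj", ["and"]),
    ("VP", [("V", ["service"]), ("NP", ["concessions"]), pp])])])

precision, recall = parseval(parse_b, parse_a)
print(precision, recall)
```

Precision divides the matched brackets by the number of brackets in the candidate, and recall divides them by the number in the reference, so a candidate with extra structure loses precision before it loses recall.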

12.2 Some Approaches
Examples that put the concepts above into practice

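12.2 Some Approaches
PCFG estimation from a treebank
[Charniak, 1996]
PCFG from Penn Treebank
POS and phrasal categories
no smoothing
Penn Treebank trees are especially flat, so individual expansions are rare and one would normally smooth; with maximum likelihood estimation, however, skipping smoothing does not do much harm
Recall (%)   Precision (%)
80.4         78.8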
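
Unsmoothed maximum likelihood estimation amounts to relative-frequency counting: the probability of a rule is its count divided by the count of all rules with the same left-hand side. The sketch below shows this on a three-tree toy treebank. The trees and the names `count_rules` and `estimate_pcfg` are invented for illustration and are not Charniak's setup, and unlike a grammar over POS tags the sketch also counts lexical rules.

```python
from collections import Counter


def count_rules(node, counts):
    """Count every rule LHS -> RHS used in a (label, children) tree."""
    label, children = node
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, counts)


def estimate_pcfg(treebank):
    """Relative-frequency (maximum likelihood) estimate, with no smoothing."""
    rule_counts = Counter()
    for tree in treebank:
        count_rules(tree, rule_counts)
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


# Toy treebank, hypothetical and for illustration only.
treebank = [
    ("NP", [("DT", ["the"]), ("NN", ["store"])]),
    ("NP", [("NN", ["store"])]),
    ("NP", [("DT", ["the"]), ("NN", ["move"])]),
]
pcfg = estimate_pcfg(treebank)
print(pcfg[("NP", ("DT", "NN"))], pcfg[("NN", ("store",))])
```

Any rule that never occurs in the treebank gets probability zero under this estimate, which is the situation smoothing would normally address.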

12.2 Some Approaches
Partially unsupervised learning [Pereira and Schabes, 1992]
Chomsky normal form
15 nonterminals
45 POS tags as terminals

12.2 Some Approaches
[Collins, 1996]
Recall (%)   Precision (%)
88.1         88.6
