4. Without the context of a premise sentence…
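Even for one and the same hypothesis sentence ℎ, the entailment label changes depending on which premise sentence is given.
In principle, the relation between two sentences cannot be determined without the context supplied by the premise sentence.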
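    Sentence                                    Label
𝑠1  Two boys are swimming in the pool.          E
𝑠2  Two girls are playing the basketball.       N
𝑠3  Two women are swimming in the pool.         C
ℎ   Two children are swimming in the pool.
(E = entailment, N = neutral, C = contradiction; each label is the relation of the premise 𝑠i to the hypothesis ℎ.)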
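The point of this slide can be sketched as a small data structure. The sketch below is only an illustration using the slide's own four sentences; the variable names are hypothetical. It shows that the label belongs to the (premise, hypothesis) pair, so a lookup keyed by the hypothesis alone would have to return three different labels.

```python
# Labels from the slide: E = entailment, N = neutral, C = contradiction.
HYPOTHESIS = "Two children are swimming in the pool."

# The label is attached to the (premise, hypothesis) pair, not to the hypothesis.
labels = {
    ("Two boys are swimming in the pool.", HYPOTHESIS): "E",
    ("Two girls are playing the basketball.", HYPOTHESIS): "N",
    ("Two women are swimming in the pool.", HYPOTHESIS): "C",
}

# Collect every label observed for this single hypothesis.
labels_for_hypothesis = {
    label for (_, hypothesis), label in labels.items() if hypothesis == HYPOTHESIS
}
print(sorted(labels_for_hypothesis))  # ['C', 'E', 'N']
```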
12. Basic statistics of the SNLI and SICK corpora
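                                                                    SNLI     SICK
Training set examples                                               55K      4500
Development set examples                                            10K      500
Test set examples                                                   10K      4927
Training set vocabulary size                                        36427    2178
Test set unknown-word rate (vs. own training set)                   0.24%    0.29%
Test set unknown-word rate (vs. the other corpus's training set)    10.3%    0.15%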
• The SNLI training set is large enough to cover the SICK test set as well as the SNLI test set.
• The SICK training set covers its own test set, but does not cover the SNLI test set.
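The unknown-word rates in the table can be computed along the following lines. This is a minimal sketch under stated assumptions: the slide does not say how sentences were tokenized or whether the rate is counted over tokens or over distinct word types, so the code assumes lowercased whitespace tokens and a token-level rate. The function names and the toy sentences are hypothetical.

```python
def vocabulary(sentences):
    """Return the set of lowercased whitespace tokens in the sentences."""
    return {token for s in sentences for token in s.lower().split()}


def unknown_word_rate(test_sentences, train_sentences):
    """Fraction of test tokens that never occur in the training sentences."""
    train_vocab = vocabulary(train_sentences)
    test_tokens = [token for s in test_sentences for token in s.lower().split()]
    if not test_tokens:
        return 0.0
    unknown = sum(1 for token in test_tokens if token not in train_vocab)
    return unknown / len(test_tokens)


# Toy data only; these are not the corpus statistics from the table.
train = ["two boys are swimming in the pool"]
test = ["two girls are swimming"]
rate = unknown_word_rate(test, train)
print(f"{rate:.2%}")  # 25.00% ("girls" is the only unseen token out of four)
```

Calling the function with one corpus's test set and the other corpus's training set gives the cross-corpus figure in the last row of the table.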
14. Construction procedure of each corpus
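SNLI corpus
① Sentences from the Flickr corpus are shown to workers as premise sentences.
② For each premise, the worker writes three sentences: one that is entailed, one that is neutral, and one that contradicts it.
SICK corpus
① Sentences from the Flickr corpus plus another corpus are automatically simplified.
② Sentence pairs are created.
③ Workers classify each presented sentence pair into one of three classes.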
20. Examples of "nobody", a word with a high probability of "contradiction"
Contradiction
  Premise: A woman is walking across the street eating a banana, while a man is following with his briefcase.
  Hypothesis: Nobody has food.
  Premise: A man and a woman are standing next to sculptures, talking while another man looks at other sculptures.
  Hypothesis: Nobody is standing.
  Premise: A group of young girls playing jump rope in the street.
  Hypothesis: Nobody is playing jump rope.
Neutral
  Premise: Three young girls posing for a picture in an outdoor amphitheater, surrounded by adults watching a conference.
  Hypothesis: Nobody is wearing a hat.
Entailment
  Premise: Lacrosse players struggling for control of the ball.
  Hypothesis: Nobody is in control of the ball.
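Creating a "contradiction" example by negating the premise (for example, by inserting "not") is prohibited by the SNLI annotation guidelines, but this prohibition is not sufficient.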
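The slide title refers to the probability of a label given a word. One plausible reading, assumed here because the slide does not give the definition, is the relative frequency of each label among the hypotheses that contain the word. The sketch below estimates that quantity; the function name is hypothetical, and the input is just the five hypotheses shown on this slide, which were hand-picked as illustrations, so the resulting numbers are not corpus statistics.

```python
from collections import Counter, defaultdict


def label_distribution_by_word(examples):
    """Estimate P(label | word occurs in the hypothesis) by relative frequency."""
    counts = defaultdict(Counter)
    for hypothesis, label in examples:
        # Count each word at most once per hypothesis.
        for word in set(hypothesis.lower().rstrip(".").split()):
            counts[word][label] += 1
    return {
        word: {label: n / sum(c.values()) for label, n in c.items()}
        for word, c in counts.items()
    }


# (hypothesis, label) pairs copied from the slide; toy input only.
examples = [
    ("Nobody has food.", "contradiction"),
    ("Nobody is standing.", "contradiction"),
    ("Nobody is playing jump rope.", "contradiction"),
    ("Nobody is wearing a hat.", "neutral"),
    ("Nobody is in control of the ball.", "entailment"),
]
dist = label_distribution_by_word(examples)
print(dist["nobody"])
```

Run over a whole training set, a table like `dist` would let one rank words by how strongly they predict a label from the hypothesis alone, which is the kind of bias the "nobody" examples illustrate.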
22. Examples of "proximity", a word with a high probability of "entailment"
Entailment
  Premise: A bride and groom dance surrounded by people at the reception.
  Hypothesis: A married couple is in the proximity of other humans.
  Premise: Many people are dunking to support special olympics.
  Hypothesis: Several people are in close proximity to each other.
  Premise: A bull charges at a man within a stadium while an audience watches.
  Hypothesis: Onlookers view a person and an animal in close proximity to each other.
Neutral
  Premise: Child playing in waves with sun on the horizon.
  Hypothesis: A child is playing in the water with her mother in close proximity.
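When given a premise describing a scene with several people, do human workers tend to write "entailment" examples based on the spatial relationship between those people?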
24. Examples of "championship", a word with a high probability of "neutral"
Neutral
  Premise: Two soccer teams are competing on a soccer field.
  Hypothesis: Two skilled soccer teams are competing against one another for the championship.
  Premise: A soccer match between a team with white jerseys, and a team with yellow jerseys.
  Hypothesis: The teams are in a championship match.
  Premise: There is a baseball player standing at home plate, the catcher behind him has his hand up in the air with his glove, and the umpire is standing behind him, and many people in the stands.
  Hypothesis: The final game of the championship is being played while many fans are in the stands.
When given a premise describing a sports match, do human workers tend to use "championship" to write a "neutral" (unrelated) sentence?
26. Examples of "higher", a word with a low probability of "contradiction"
Entailment
  Premise: A speed boat pulling a waterskier along a jump.
  Hypothesis: The skier is going higher in the water.
  Premise: Top of the stands looking down at the baseball stadium.
  Hypothesis: The baseball stadium seats are higher than the field.
Neutral
  Premise: Two men, one in a circuit city t-shirt, the other in an M&Ms t-shirt, operate video game guns.
  Hypothesis: One man has a higher score than the other.
  Premise: A young smiling woman is having fun on a rustic looking swing.
  Hypothesis: A woman is trying to swing higher than her friend.
Contradiction
  Premise: Red objects fall on men standing behind a red wall.
  Hypothesis: The men are higher than the wall.
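When given a premise containing several people (or objects), human workers can write entailment or neutral examples by comparing them. In contradiction examples, however, "higher" is used extremely rarely.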
29. Neural network approaches to recognizing textual entailment
Encoder-decoder model (Tim Rocktäschel et al., ICLR 2016)
  An LSTM encoder converts the premise sentence into a vector representation.
  An LSTM decoder performs inference based on that vector representation and the hypothesis sentence.
Attention-Based Convolutional NN (Wenpeng Yin et al., TACL 2016)
Tree-based convolution model (Lili Mou et al., ACL 2016)
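The data flow of the encoder-decoder description above can be sketched in plain Python. This is not the implementation from any of the cited papers: the weights are random and untrained, biases and attention are omitted, the hidden size is arbitrary, and handing both the hidden and cell state of the encoder to the decoder is an assumption of this sketch. All names are hypothetical. It only shows how a premise is folded into a vector that then conditions the reading of the hypothesis.

```python
import math
import random

HIDDEN = 4  # hidden and embedding size, chosen arbitrarily for this sketch
LABELS = ["entailment", "neutral", "contradiction"]
rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm():
    # One weight matrix per gate (input, forget, output, candidate) over [x; h].
    return {gate: rand_matrix(HIDDEN, 2 * HIDDEN) for gate in "ifog"}


def lstm_step(params, x, h, c):
    z = x + h  # list concatenation, i.e. [x; h]
    i = [sigmoid(v) for v in matvec(params["i"], z)]
    f = [sigmoid(v) for v in matvec(params["f"], z)]
    o = [sigmoid(v) for v in matvec(params["o"], z)]
    g = [math.tanh(v) for v in matvec(params["g"], z)]
    c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


embeddings = {}


def embed(word):
    # Random, untrained word vectors created on first use.
    if word not in embeddings:
        embeddings[word] = [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)]
    return embeddings[word]


def run_lstm(params, sentence, h, c):
    for word in sentence.lower().split():
        h, c = lstm_step(params, embed(word), h, c)
    return h, c


encoder, decoder = make_lstm(), make_lstm()
output_layer = rand_matrix(len(LABELS), HIDDEN)


def predict(premise, hypothesis):
    zeros = [0.0] * HIDDEN
    h, c = run_lstm(encoder, premise, zeros, zeros)  # premise -> vector
    h, _ = run_lstm(decoder, hypothesis, h, c)       # read hypothesis given premise
    scores = matvec(output_layer, h)
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]      # numerically stable softmax
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}


probs = predict("Two boys are swimming in the pool.",
                "Two children are swimming in the pool.")
print(probs)
```

Because the weights are untrained, the printed probabilities are meaningless; a real system would learn the embeddings, both LSTMs, and the output layer from the labeled sentence pairs.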