A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction

The 50th Annual Meeting of the Association for Computational Linguistics
(ACL 2012)
July 11th, 2012, Jeju

Seokhwan Kim (Institute for Infocomm Research)
Gary Geunbae Lee (POSTECH)

Contents
• Introduction
• Methods
 Cross-lingual Annotation Projection for Relation Extraction
 Graph-based Projection Approach
• Evaluation
• Conclusions


Problem Definition
• Relation Extraction
 To identify semantic relations between a pair of entities

(Figure: example sentence annotated with entity types and the relation label Birthplace)
Barack Obama [PER] was born in Honolulu [LOC] , Hawaii [LOC] .

 Considered as a classification problem
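
The classification framing can be sketched as follows: every entity pair in a sentence becomes a candidate instance, and each candidate receives a binary label for a given relation. The `Entity` type, the `candidate_pairs` helper, and the `gold` set are hypothetical names introduced for illustration, and pairing every PER with every LOC is an assumption rather than something the slides specify.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Entity:
    text: str
    type: str  # entity type tag, e.g. "PER" or "LOC"


def candidate_pairs(entities, arg1_type="PER", arg2_type="LOC"):
    """Pair every arg1-typed entity with every arg2-typed entity (assumed scheme)."""
    arg1s = [e for e in entities if e.type == arg1_type]
    arg2s = [e for e in entities if e.type == arg2_type]
    return list(product(arg1s, arg2s))


entities = [
    Entity("Barack Obama", "PER"),
    Entity("Honolulu", "LOC"),
    Entity("Hawaii", "LOC"),
]

# Hypothetical gold annotation for the Birthplace relation.
gold = {("Barack Obama", "Honolulu")}

# Each candidate pair is one instance with a binary label (1 = related).
labels = {
    (a.text, b.text): int((a.text, b.text) in gold)
    for a, b in candidate_pairs(entities)
}
```

A classifier would then be trained to predict these labels from features of each candidate pair.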


Related Work (1)
• Supervised Learning
 Many supervised machine learning approaches have been
successfully applied
• (Kambhatla, 2004; Zhou et al., 2005; Zelenko et al., 2003; Culotta and
Sorensen, 2004; Bunescu and Mooney, 2005; Zhang et al., 2006)

• Semi-supervised Learning
 To obtain the annotations of unlabeled instances from the seed
information
• (Brin, 1999; Riloff and Jones, 1999; Agichtein and Gravano, 2000;
Sudo et al., 2003; Yangarber, 2003; Stevenson and Greenwood, 2006;
Zhang, 2004; Chen et al., 2006; Zhou et al., 2009)


Motivation
• Resources for Relation Extraction
 Supervised/Semi-supervised Approaches
• Labeled corpora for supervised learning
• Seed instances for semi-supervised learning
• Available for only a few languages
 ACE 2003 Multilingual Training Dataset
• English (252 articles)
• Chinese (221 articles)
• Arabic (206 articles)
• No resources for other languages
 Korean


Related Work (2)
• Self-supervised Learning
 To obtain the annotated dataset without any human effort
 Using the information obtained from external resources
• Heuristic-based Method (Banko et al., 2007; Banko et al., 2008)
• Wikipedia-based Methods (Wu and Weld, 2010)

• Cross-lingual Annotation Projection
 To leverage parallel corpora to project relation annotations from the
resource-rich source language onto the resource-poor target
language (Kim et al., 2010; Kim et al., 2011)


Overall Architecture
(Figure: two pipelines over a parallel corpus)
• Annotation (source language Ls)
 Sentences in Ls → Preprocessing (POS Tagging, Parsing) → NER → Relation Extraction → Annotated Sentences in Ls
• Projection (target language Lt)
 Sentences in Lt → Preprocessing (POS Tagging, Parsing) → Word Alignment → Projection → Annotated Sentences in Lt

Direct Projection
(Kim et al., 2010)
• Annotation (source sentence)
 Barack Obama was born in Honolulu , Hawaii .
 f_E(<Barack Obama, Honolulu>) = 1
• Projection (aligned target sentence)
 버락 오바마 는 하와이 의 호놀룰루 에서 태어났다
(beo-rak-o-ba-ma) (neun) (ha-wa-i) (ui) (ho-nol-rul-ru) (e-seo) (tae-eo-nat-da)
 f_K(<버락 오바마, 호놀룰루>) = 1
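
The projection step above can be sketched as follows, assuming whitespace tokenization, token-level alignments given as (source index, target index) pairs, and target entity candidates given as token spans. The helpers `project_span` and `project_relation` and the hand-written alignment are hypothetical; the slides do not describe the actual projection rules of Kim et al. (2010).

```python
def project_span(src_span, alignments):
    """Return the target token indices aligned to a source token span."""
    start, end = src_span  # start inclusive, end exclusive
    return {t for s, t in alignments if start <= s < end}


def project_relation(src_pair, label, alignments, tgt_entities):
    """Project a source-side label onto a pair of target entity candidates.

    tgt_entities maps a target entity string to its (start, end) token span.
    Returns ((arg1, arg2), label), or None if an argument cannot be mapped.
    """
    projected = []
    for src_span in src_pair:
        tgt_tokens = project_span(src_span, alignments)
        match = None
        for name, (start, end) in tgt_entities.items():
            # Assumed rule: all aligned tokens must fall inside one candidate.
            if tgt_tokens and tgt_tokens <= set(range(start, end)):
                match = name
                break
        if match is None:
            return None
        projected.append(match)
    return (tuple(projected), label)


# English: Barack(0) Obama(1) was(2) born(3) in(4) Honolulu(5) ,(6) Hawaii(7) .(8)
# Korean:  버락(0) 오바마(1) 는(2) 하와이(3) 의(4) 호놀룰루(5) 에서(6) 태어났다(7)
alignments = [(0, 0), (1, 1), (2, 7), (3, 7), (4, 6), (5, 5), (7, 3)]  # hand-written
tgt_entities = {"버락 오바마": (0, 2), "하와이": (3, 4), "호놀룰루": (5, 6)}

# f_E(<Barack Obama, Honolulu>) = 1 on the source side.
result = project_relation(((0, 2), (5, 6)), 1, alignments, tgt_entities)
```

Because the label depends only on the alignment of the two entity mentions, a single alignment or NER error is enough to drop or corrupt the projected instance.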

Limitations of Direct Projection
• The direct projection approach remains vulnerable to
erroneous inputs generated by the preprocessors
• Main causes of this limitation
 It considers only the alignments between entity candidates, not any
contextual information
 It is performed in a single pass


Graph-based Learning
• Semi-supervised learning algorithm
• Defining a graph
 The nodes represent labeled and unlabeled examples in a dataset
 The edges reflect the similarity of examples
• Learning a labeling function in an iterative manner
 It should be close to the given labels on the similar labeled nodes
 It should be smooth on the whole graph
• Related Work
 Graph-based Learning for Relation Extraction (Chen et al., 2006)
 Bilingual projection of POS tagging (Das and Petrov, 2011)


Graph Construction
• Graph Nodes
 Instance Nodes
• Defined for all pairs of entity candidates in both languages
• Each instance node has a soft label vector Y = [y+ y-]
 Context Nodes
• For identifying the relation descriptors of the positive instances
• Defined for each trigram located between the two entities of a
semantically related pair
• Each context node has a soft label vector Y = [y+ y-]

(Figure: the context "<ARG1> was born in <ARG2>" yields three trigram context nodes)
<ARG1> was born | was born in | born in <ARG2>
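
The trigram context nodes in this example can be reproduced with a short sketch. The function name `context_trigrams` is hypothetical, and the code assumes whitespace tokens, token spans with an exclusive end index, and that the first argument precedes the second in the sentence.

```python
def context_trigrams(tokens, arg1_span, arg2_span):
    """Return the trigrams over the tokens between two entity mentions.

    The entity mentions themselves are replaced by the placeholders
    <ARG1> and <ARG2>. Assumes arg1 precedes arg2 in the sentence.
    """
    _, end1 = arg1_span
    start2, _ = arg2_span
    sequence = ["<ARG1>"] + tokens[end1:start2] + ["<ARG2>"]
    return [" ".join(sequence[i:i + 3]) for i in range(len(sequence) - 2)]


tokens = "Barack Obama was born in Honolulu , Hawaii .".split()
trigrams = context_trigrams(tokens, arg1_span=(0, 2), arg2_span=(5, 6))
# trigrams == ["<ARG1> was born", "was born in", "born in <ARG2>"]
```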

Graph Construction
• Edge Weights
 Between an instance node and a context node in the same language
$$w(v_{i,j}, u^k) = \begin{cases} 1 & \text{if } v_{i,j} \text{ has } u^k \text{ as a contextual subsequence,} \\ 0 & \text{otherwise.} \end{cases}$$
 Between context nodes in a language
$$w(u^k, u^l) = J(u^k, u^l) = \frac{|u^k \cap u^l|}{|u^k \cup u^l|}$$
 Between context nodes in the source and target languages
$$w(u_s^k, u_t^l) = \frac{\mathrm{count}(u_s^k, u_t^l)}{\sum_{u_t^m} \mathrm{count}(u_s^k, u_t^m)}$$
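
The two non-trivial edge weights can be sketched as below. Treating each trigram as a set of tokens for the Jaccard coefficient is an assumption, as is the way aligned context pairs are collected from the parallel corpus; the function names and the placeholder target contexts `t1` and `t2` are hypothetical.

```python
from collections import Counter, defaultdict


def jaccard(u_k, u_l):
    """Jaccard coefficient between two context trigrams, viewed as token sets."""
    a, b = set(u_k.split()), set(u_l.split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cross_lingual_weights(aligned_pairs):
    """Normalize co-occurrence counts of (source context, target context) pairs.

    For each source context, the weights over its aligned target contexts
    sum to 1, matching count(u_s, u_t) / sum_m count(u_s, u_t^m).
    """
    counts = Counter(aligned_pairs)
    totals = defaultdict(int)
    for (src, _tgt), c in counts.items():
        totals[src] += c
    return {(src, tgt): c / totals[src] for (src, tgt), c in counts.items()}


same_language = jaccard("<ARG1> was born", "was born in")  # 2 shared of 4 tokens

# Hypothetical aligned context pairs; t1 and t2 stand in for target trigrams.
pairs = [
    ("was born in", "t1"),
    ("was born in", "t1"),
    ("was born in", "t2"),
    ("born in <ARG2>", "t2"),
]
weights = cross_lingual_weights(pairs)
```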

Graph Construction
• Example


Label Propagation
• Algorithm
 Input
• A transition matrix T
• An initial label matrix Y0
 Output
• The updated label matrix Yt
(Figure: flowchart) Initialize T → Normalize T → Initialize Y → Update Y
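
A minimal sketch of the four steps (initialize T, normalize T, initialize Y, update Y) is given below in plain Python. It assumes row normalization of the weight matrix and clamping of the seed labels after every update, which is one common form of label propagation; the slides do not state which variant was used, and the names `normalize_rows` and `propagate` are hypothetical.

```python
def normalize_rows(W):
    """Turn a weight matrix into a transition matrix whose rows sum to 1."""
    T = []
    for row in W:
        total = sum(row)
        T.append([w / total if total else 0.0 for w in row])
    return T


def propagate(W, Y0, labeled, iterations=50):
    """Iteratively propagate soft labels [y+, y-] over the graph.

    W: symmetric edge-weight matrix, Y0: initial label matrix,
    labeled: indices of seed nodes whose labels are clamped (assumption).
    """
    T = normalize_rows(W)
    Y = [row[:] for row in Y0]
    n, n_classes = len(Y), len(Y[0])
    for _ in range(iterations):
        # Update step: Y <- T Y
        new_Y = [
            [sum(T[i][j] * Y[j][c] for j in range(n)) for c in range(n_classes)]
            for i in range(n)
        ]
        # Keep the seed labels fixed at their initial values.
        for i in labeled:
            new_Y[i] = Y0[i][:]
        Y = new_Y
    return Y


# Toy graph: node 0 is a positive seed, node 2 a negative seed,
# node 1 is unlabeled and more strongly tied to node 0.
W = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.5], [0.0, 0.5, 0.0]]
Y0 = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
Y = propagate(W, Y0, labeled=[0, 2])
```

In this toy graph the unlabeled node ends up with a higher positive score than negative score because its edge to the positive seed carries more weight.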

Label Propagation
• Executed in three phases

(Figure: panels labeled 1st phase, 2nd phase, and 3rd phase)


Implementation
• Dataset
 English-Korean parallel corpus
• 266,982 bi-sentence pairs in English and Korean
• Aligned by GIZA++
• Annotation
 ReVerb (Fader et al., 2011)
• English Open IE system
• Label Propagation
 Junto Label Propagation Toolkit
• Learning
 Tree kernel-based SVM classifier
• Shortest path dependency kernel (Bunescu and Mooney, 2005)
• SVM-Light (Joachims, 1998)


Evaluation
• Dataset
 Manually annotated Korean dataset
• Obtained from the Web, following Bunescu and Mooney (2007)
• 500 sentences with manual annotations for four relation types
 Acquisition
 Birthplace
 Inventor Of
 Won Prize

• Evaluation Metrics
 Precision/Recall/F-measure
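
For reference, the three metrics can be computed as in the sketch below. It assumes the balanced F-measure (the harmonic mean of precision and recall), since the slides do not name a specific weighting; the function names and the example counts are illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f(tp, fp, fn):
    """Compute precision, recall, and F from true/false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f_measure(precision, recall)


# Illustrative counts: 8 correct extractions, 2 spurious, 4 missed.
p, r, f = precision_recall_f(tp=8, fp=2, fn=4)
```

As a consistency check, `f_measure(67.69, 87.41)` gives about 76.30, which agrees with the F value reported for the projection-based approach in the results.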


Experimental Results
• Direct Projection vs. Graph-based Projection

Type          Direct Projection       Graph-based Projection
              P      R      F         P      R      F
Acquisition   51.6   87.7   64.9      55.3   91.2   68.9
Birthplace    69.8   84.5   76.4      73.8   87.3   80.0
Inventor Of   62.4   85.3   72.1      66.3   89.7   76.3
Won Prize     73.3   80.5   76.7      76.4   82.9   79.5
Total         63.9   84.2   72.7      67.7   87.4   76.3

Experimental Results
• Comparisons to other self-supervised approaches
 Heuristic-based Approach (Banko et al., 2007; Banko et al., 2008)
• Korean Treebank and Syntactic Heuristics
 Wikipedia-based Approach (Wu and Weld, 2010)
• Korean Wikipedia articles and Infoboxes

Approach           P       R       F
Heuristic-based    92.31   17.27   29.09
Wikipedia-based    66.67   66.91   66.79
Projection-based   67.69   87.41   76.30


Conclusion
• Summary
 A graph-based projection approach for relation extraction
• Label propagation algorithm
• On a graph that represents the instance and context features of both
the source and target languages
 Experimental results show that our approach improves relation
extraction performance over the other approaches
• Future work
 To reduce the high complexity of the approach
 To extend the graph structure to further improve
extraction performance
