1. Latent Relational Model for Relation Extraction
Gaetano Rossiello¹, Alfio Gliozzo², Nicolas Fauceglia², Giovanni Semeraro¹
¹ Department of Computer Science - University of Bari, Italy
² IBM Research AI - Yorktown Heights, NY, USA
gaetano.rossiello@uniba.it · github.com/gaetangate · /in/gaetano-rossiello · @tanoross
2. Goal: from Text to Knowledge
Unstructured Textual Data Structured Data Knowledge & Insights
● Information Extraction
○ Entity Recognition
○ Relation Extraction
● Frame Parsing
● Semantic Parsing
○ FOL
○ Lambda Calculus
○ AMR
● Deductive Reasoning
● Inductive Logic Programming
● Probabilistic (Logic) Programming
● Relational Embeddings
● ...
3. Why Relation Extraction?
● Automatic Knowledge Base Population (AKBP)
○ Lexical resources: add words to WordNet thesaurus
○ Fact bases: add facts to Wikidata or DBpedia
● Automatic Knowledge Base Construction (AKBC)
● Sample application: Question Answering (QA)
○ Who are the actors younger than Tom Hanks?
(isA ?x actor) (birthDate ?x ?y) (birthDate “Tom_Hanks” ?z) (> ?y ?z)
4. Relation Extraction Approaches
● Pattern-based [Hearst, 1992]
○ Hand-crafted rules
● Bootstrapping [Agichtein, 2000]
○ Semantic drift
● OpenIE [Banko, 2007; Fader, 2011; Mausam, 2012]
○ Lexicalized relations not in a canonical form
● Supervised [Jiang, 2007; Sun, 2014; Nguyen, 2015]
○ Manually annotated training examples
● Distant Supervision [Mintz, 2009; Lin, 2016; Glass, 2018]
○ An existing KB is used to generate training examples
○ Advantages from both bootstrapping and supervised RE
5. ISWC Semantic Web Challenge 2017
Glass, M., Gliozzo, A., Hassanzadeh, O., Mihindukulasooriya, N., Rossiello, G.
Inducing implicit relations from text using distantly supervised deep nets. ISWC 2018.
PCNN-KI: Piecewise Convolutional Neural Network for Distantly Supervised RE
PermID KG
6. Distantly Supervised RE: Limitations
● Distant supervision does not fit vertical domains or long-tailed relation
types well, where only a few seed examples are available
● Its generalization capability is limited to the relation types seen
during the training phase
Distantly supervised RE cannot be applied to new domains with new relation types
7. Use Case: Knowledge Base Population in Cold Start
Research Question:
How to design a method able to identify new
relation types in a (small) collection of
documents using a few examples?
8. Relation Extraction as Analogy Problem
● Given a corpus D and an entity pair (a, b)
● Find the set R = {(x, y) ∊ D | a : b = x : y}
Watson : IBM (query pair) = Pixel : Google (result pair)
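The retrieval of the set R can be sketched as a nearest-neighbour search over entity-pair vectors. The pair vectors below are hypothetical placeholders; in the actual model they come from the latent relational space introduced on later slides.

```python
import numpy as np

# Hypothetical relation vectors for entity pairs (illustration only;
# the real vectors are rows of the LRM built later in the talk).
pair_vecs = {
    ("Watson", "IBM"):   np.array([0.9, 0.1, 0.2]),
    ("Pixel", "Google"): np.array([0.8, 0.2, 0.1]),
    ("Rome", "Italy"):   np.array([0.1, 0.9, 0.1]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogous_pairs(query, k=1):
    """Approximate R = {(x, y) | a : b = x : y}: the k pairs whose
    relation vectors are most similar to the query pair's vector."""
    qv = pair_vecs[query]
    others = [p for p in pair_vecs if p != query]
    return sorted(others, key=lambda p: cosine(pair_vecs[p], qv), reverse=True)[:k]

print(analogous_pairs(("Watson", "IBM")))  # → [('Pixel', 'Google')]
```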
9. Word Analogy using Distributional Semantic Models
Vector offset with Word Embeddings
man : king = woman : ?
vec(king) - vec(man) + vec(woman) ≈ vec(queen)
vec(king) - vec(man) ≈ vec(queen) - vec(woman)
Mikolov, T., Chen, K., Corrado, G., & Dean, J. Efficient estimation of word representations in vector space. ICLR 2013.
Pennington, J., Socher, R., & Manning, C. D. Glove: Global vectors for word representation. EMNLP 2014.
Levy, O., Goldberg, Y., & Dagan, I. Improving distributional similarity with lessons learned from word embeddings. TACL 2015.
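The vector-offset method above can be sketched with toy embeddings. The 3-d vectors here are invented for illustration; real models (word2vec, GloVe) use hundreds of dimensions learned from a corpus.

```python
import numpy as np

# Toy 3-d embeddings, hand-picked so the offset works; not from a trained model.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
    "apple": np.array([0.1, 0.9, 0.2]),
}

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(a, b, c):
    """Solve a : b = c : ? via the offset vec(b) - vec(a) + vec(c)."""
    target = emb[b] - emb[a] + emb[c]
    # Exclude the three query words, as Mikolov et al. do.
    candidates = [w for w in emb if w not in (a, b, c)]
    return max(candidates, key=lambda w: cos(emb[w], target))

print(analogy("man", "king", "woman"))  # → queen
```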
Limitations:
● Handling multi-word expressions (e.g. Tom Hanks) with pre-trained word embedding models
● Handling unseen words/entities
● Not effective on SAT Analogy Questions [Church, 2017]
10. SAT Analogy Questions Dataset
● SAT = Scholastic Aptitude Test [Turney, 2003]
● 374 multiple-choice analogy questions; 5 choices per question
● Human performance: 81.5%
● SOTA - Latent Relational Analysis (LRA): 56.1%
Turney, P.D., and Littman, M.L. Corpus-based learning of analogies and semantic relations. Machine Learning. 2005.
Turney, P.D. Similarity of semantic relations. Computational Linguistics. 2006.
LRA
r1 = vec(mason:stone)
r2 = vec(carpenter:wood)
sim = cosine(r1, r2)
11. Latent Relational Model for RE
Entity-Entity Vocabulary
V = {(X1, Y1),..., (Xn, Yn)}
Entity-Entity Contexts
1. The entity types provided by the NER
2. The sequence of words between the two entities
3. The part-of-speech tags of these words
4. A flag indicating which entity came first
5. An n-gram to the left of the first entity
6. An n-gram to the right of the second entity
7. A dependency path between the two entities
Entity-pair × context binary matrix M_{n,m} (n rows = entity pairs, m columns = contexts):
1 0 0 ... 1
0 1 1 ... 0
1 1 0 ... 0
0 0 0 ... 1
Singular Value Decomposition (SVD): M_{n,m} ≈ U_{n,k} Σ_{k,k} V_{k,m}
Relational Vector Space Model: LRM_{n,k} = (U_k Σ_k)_{n,k}
Rome is the capital of Italy. → (Rome, Italy)
David Gilmour was the guitarist of Pink Floyd. → (David Gilmour, Pink Floyd)
Pac-Man is an arcade game developed by Namco. → (Pac-Man, Namco)
...
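The construction above can be sketched end to end with NumPy: build the binary entity-pair × context matrix, factor it with SVD, and keep the top-k left singular vectors scaled by their singular values. The matrix entries are toy values; the real matrix is built from the seven context features listed on this slide, with k = 2000.

```python
import numpy as np

# Toy binary matrix M (n entity pairs x m contexts): M[i, j] = 1 iff
# pair i co-occurs with context feature j (NER types, words between
# the entities, POS tags, n-grams, dependency path, ...).
pairs = [("Rome", "Italy"), ("David Gilmour", "Pink Floyd"), ("Pac-Man", "Namco")]
M = np.array([
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 1, 0, 1],
], dtype=float)

# SVD: M = U Sigma V^T; truncate to the top-k singular values.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
k = 2  # the slides use k = 2000 on the real matrix

# LRM_{n,k} = U_k Sigma_k: one k-dimensional relation vector per entity pair.
LRM = U[:, :k] * s[:k]

for p, v in zip(pairs, LRM):
    print(p, np.round(v, 3))
```

Scaling U_k by the singular values weights each latent dimension by how much context variance it explains, which is the standard LRA/LSA construction.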
12. Use Case: Knowledge Base Population in Cold Start
Rossiello, G., Gliozzo, A., Fauceglia, N. Relation Extraction from a Corpus Using an Information Retrieval Based Procedure. Patent ID P201706307.
14. Geometric Interpretation of Relations
“A semantic relation R is a region in a relational vector space LRM_{n,k} that outlines the boundaries among those entity-pair vectors that are analogous to each other.”
Dataset: NYT-FB [Riedel, 2010]
New York, Brooklyn
Bill Gates, Microsoft
A : B = C : D ⇔ dist(r_(A,B), r_(C,D)) < t
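The decision rule above (two pairs instantiate the same relation iff their vectors are within distance t) can be sketched as follows; the 2-d pair vectors and the threshold value are invented for illustration.

```python
import numpy as np

def euclidean(u, v):
    return float(np.linalg.norm(u - v))

def is_analogous(r_ab, r_cd, t=0.5):
    """A : B = C : D holds iff dist(r_(A,B), r_(C,D)) < t."""
    return euclidean(r_ab, r_cd) < t

# Hypothetical 2-d vectors in the relational space.
r_ny_brooklyn = np.array([0.2, 0.9])  # (New York, Brooklyn)
r_gates_msft  = np.array([0.9, 0.1])  # (Bill Gates, Microsoft)

print(is_analogous(r_ny_brooklyn, r_ny_brooklyn))  # → True (same region)
print(is_analogous(r_ny_brooklyn, r_gates_msft))   # → False (different relations)
```

The threshold t carves the space into regions, which is exactly the geometric reading of a relation given in the quote above.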
15. LRM for Distantly Supervised Relation Extraction
Dataset: NYT-FB [Riedel, 2010]
Corpus: New York Times (2005-2007)
KG: Freebase
Relations/classes: 51
Training positive: 4700
Training negative: 63569
Test positive: 1950
Test negative: 94917
LRM: SVD [Halko, 2011] k=2000
Classifier: SVM one-vs-rest
ARES (Ours) = LRM + SVM
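The ARES setup (LRM vectors fed to a one-vs-rest SVM) can be sketched with scikit-learn. The data below is synthetic stand-in for the LRM pair vectors; the real experiment uses k = 2000 dimensions and 51 NYT-FB relation classes.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for LRM pair vectors: 3 relation classes as
# well-separated Gaussian clusters in a small latent space.
n_train, k, n_classes = 300, 20, 3
centers = rng.normal(size=(n_classes, k))
y_train = rng.integers(0, n_classes, size=n_train)
X_train = centers[y_train] + 0.1 * rng.normal(size=(n_train, k))

# One binary SVM per relation type, as in the one-vs-rest scheme.
clf = OneVsRestClassifier(LinearSVC())
clf.fit(X_train, y_train)

X_test = centers[[0, 1, 2]] + 0.1 * rng.normal(size=(3, k))
print(clf.predict(X_test))  # should recover classes [0, 1, 2]
```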
16. Conclusion
● Relation Extraction (RE) as Analogy Problem
(two sides of the same coin)
● Latent Relational Model (LRM) for RE
● Geometric Interpretation of Relations
● LRM for Unsupervised RE
● LRM for Semi-supervised RE
● LRM for Supervised RE
17. Limitations of LRM / Future Work
● The NLP pipeline and SVD do not scale to very large corpora
○ Learning Relational Representations by Analogy
using Hierarchical Siamese Networks [Rossiello et al, NAACL 2019]
○ Variational Autoencoders
● LRM is not able to model the directionality of relations
○ founder(Person, Company) - OK
○ competitor(Company, Company) - OK
○ supplyTo(Company, Company) - KO!
● One entity-entity embedding encodes many relations
○ Contextual relational embeddings, like ELMo [Peters, 2018] and BERT [Devlin, 2018]
○ Lookup tensor: [entity-entity, mention, vector]
● Extract n-ary Relations
○ Towards Unsupervised Semantic/Frame Parsing