SlideShare a Scribd company logo
1 of 90
Download to read offline
Encoding Linguistic Structures with
Graph Convolutional Networks
Diego Marcheggiani
Joint work with IvanTitov and Joost Bastings
University of Amsterdam
University of Edinburgh
@South England NLP Meetup
Structured (Linguistic) Priors
Sequa makes and repairs jet engines.
creator
creation
entity repaired
repairer
SBJ COORD
OBJ
CONJ NMOD
ROOT
“I voted for Palpatine because he was
most aligned with my values,” she said.
2
Sequence to Sequence
3
[Sutskever et al., 2014]
the black cat
le chat noire <s>
<s> le chat noire
Sequence to Sequence
} Language is not (only) a sequence of words
} We have linguistic knowledge
4
[Sutskever et al., 2014]
the black cat
le chat noire <s>
<s> le chat noire
Sequence to Sequence
} Language is not (only) a sequence of words
} We have linguistic knowledge
Encode structured linguistic knowledge into NN using
Graph Convolutional Networks
5
the black cat
le chat noire <s>
<s> le chat noire
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Diego Marcheggiani,IvanTitov. In Proceedings of EMNLP, 2017.
Exploiting Semantics in Neural MachineTranslation with Graph Convolutional Networks
Diego Marcheggiani,Joost Bastings,IvanTitov. In Proceedings of NAACL-HLT, 2018.
6
Semantic Role Labeling
} Predicting the predicate-argument structure of a sentence
Sequa makes and repairs jet engines.
Sequa makes and repairs jet engines.
7
Semantic Role Labeling
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
8
Sequa makes and repairs jet engines.
make.01 repair.01
Sequa makes and repairs jet engines.
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Semantic Role Labeling
9
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Creation
Semantic Role Labeling
10
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Creation
Entity repaired
Repairer
Semantic Role Labeling
11
Semantic Role Labeling
} Only the head of an argument is labeled
} Sequence labeling task for each predicate
} Focus on argument identification and labeling
12
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Creation
Entity repaired
Repairer
Semantic Role Labeling
13
Question answering
Narayanan and Harabagiu 2004
Shen and Lapata 2007
Khashabi et al. 2018
Machine translation
Wu and Fung 2009
Aziz et al. 2011
Information extraction
Surdeanu et al. 2003
Christensen et al. 2010
Related work
14
Tutorial on Semantic Role
Labeling at EMNLP 2017
Related work
} SRL systems that use syntax with simple NN architectures
} [FitzGerald et al., 2015]
} [Roth and Lapata,2016]
} Recent models ignore linguistic bias
} [Zhou and Xu, 2014]
} [He et al., 2017]
} [Marcheggiani et al., 2017]
15
Tutorial on Semantic Role
Labeling at EMNLP 2017
Motivations
} Some semantic dependencies are mirrored in the syntactic graph
Sequa makes and repairs jet engines.
creator
creation
SBJ COORD
OBJ
CONJ NMOD
ROOT
16
Sequa makes and repairs jet engines.
creator
creation
entity repaired
repairer
SBJ COORD
OBJ
CONJ NMOD
ROOT
Motivations
} Some semantic dependencies are mirrored in the syntactic graph
} Not all of them – syntax-semantics interface is not trivial
17
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
18
Graph Convolutional Networks (message passing)
Undirected graph
[Gori et al. 2005
Scarselli et al. 2009
Kipf and Welling,2016]
19
Graph Convolutional Networks (message passing)
Undirected graph Update of the blue node
[Gori et al. 2005
Scarselli et al. 2009
Kipf and Welling,2016]
20
Graph Convolutional Networks (message passing)
Undirected graph Update of the blue node
[Kipf and Welling,2016]
21
hi = ReLU
0
@W0hi +
X
j2N (v)
W1hj
1
A
<latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit>
Neighborhood
Self loop
GCNs Pipeline
Hidden layer Hidden layer
Input Output
X = H(0)
H(1) H(2)
Z = H(n)
Initial feature
representation of
nodes
Representation
informed by nodes’
neighborhood
[Kipf and Welling,2016]
…
…
…
22
GCNs Pipeline
Hidden layer Hidden layer
Input Output
X = H(0)
H(1) H(2)
Z = H(n)
[Kipf and Welling,2016]
…
…
…
Extend GCNs for syntactic dependency trees
Initial feature
representation of
nodes
Representation
informed by nodes’
neighborhood
23
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
24
Example
Lane disputed those estimates
NMOD
SBJ OBJ
[Marcheggiani andTitov, 2017]
25
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
[Marcheggiani andTitov, 2017]
26
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥
W
(1)
subj
⇥
W
(1)
nm
od
⇥W
(1)
obj
[Marcheggiani andTitov, 2017]
27
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥
W
(1)
subj
⇥
W
(1)
nm
od
⇥W
(1)
obj
⇥W (1)
obj 0
⇥
W(1)nm
od0
⇥
W(1)subj0
[Marcheggiani andTitov, 2017]
28
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥
W
(1)
subj
⇥
W
(1)
nm
od
⇥W
(1)
obj
⇥W (1)
obj 0
⇥
W(1)nm
od0
⇥
W(1)subj0
[Marcheggiani andTitov, 2017]
29
Example
⇥W
(1)
self
Lane disputed those estimates
NMOD
SBJ OBJ
⇥
W
(1)
subj
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W (1)
obj 0
⇥
W
(1)
nm
od
⇥
W(1)nm
od0
⇥W
(1)
obj
⇥
W(1)subj0
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
[Marcheggiani andTitov, 2017]
30
Example
⇥W
(1)
self
Lane disputed those estimates
NMOD
SBJ OBJ
⇥
W
(1)
subj
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W (1)
obj 0
⇥
W
(1)
nm
od
⇥
W(1)nm
od0
⇥W
(1)
obj
⇥
W(1)subj0
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥W
(2)
self
⇥W
(2)
self
⇥W
(2)
self
⇥W
(2)
self
⇥
W
(2)
subj
⇥
W(2)subj0
⇥W (2)
obj 0
⇥W
(2)
obj
⇥
W (2)nm
od
⇥
W
(2)
nm
od
0
Stacking GCNs widens the
syntactic neighborhood
[Marcheggiani andTitov, 2017]
31
Syntactic GCNs
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
[Marcheggiani andTitov, 2017]
32
Syntactic GCNs
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Syntactic neighborhood
[Marcheggiani andTitov, 2017]
33
Syntactic GCNs
Syntactic neighborhood
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Message
[Marcheggiani andTitov, 2017]
34
Syntactic GCNs
Syntactic neighborhood Self-loop is included in N
Messages are direction and
label specific
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Message
[Marcheggiani andTitov, 2017]
35
} Overparametrized: one matrix for each label-direction pair
}
Syntactic GCNs
Syntactic neighborhood
W
(k)
L(u,v) = V
(k)
dir(u,v)
Self-loop is included in N
Messages are direction and
label specific
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Message
[Marcheggiani andTitov, 2017]
36
Edge-wise Gates
} Not all edges are equally important for the final task
[Marcheggiani andTitov, 2017]
37
Edge-wise Gates
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
[Marcheggiani andTitov, 2017]
38
Edge-wise Gates
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
} Gates decide the“importance” of each message
Lane disputed those estimates
NMOD
SBJ
OBJ
ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
g g g g g g g g g g
[Marcheggiani andTitov, 2017]
39
Edge-wise Gates
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
} Gates decide the“importance” of each message
Gates depend on
nodes and edges Lane disputed those estimates
NMOD
SBJ
OBJ
ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
g g g g g g g g g g
[Marcheggiani andTitov, 2017]
40
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
41
Our Model
} Word representation
} Bidirectional LSTM encoder
} GCN Encoder
} Local role classifier
[Marcheggiani andTitov, 2017]
42
Word Representation
} Pretrained word embeddings
} Word embeddings
} POS tag embeddings
} Predicate lemma embeddings
Lane disputed those estimates
word
representation
[Marcheggiani andTitov, 2017]
43
BiLSTM Encoder
} Encode each word with its left and right context
} Stacked BiLSTM
Lane disputed those estimates
word
representation
J layers
BiLSTM
[Marcheggiani andTitov, 2017]
44
GCNs Encoder
} Syntactic GCNs after BiLSTM encoder
} Add syntactic information
} Skip connections
} Longer dependencies are captured
Lane disputed those estimates
word
representation
J layers
BiLSTM
dobj
nmodnsubj
K layers
GCN
[Marcheggiani andTitov, 2017]
45
Semantic Role Classifier
Lane disputed those estimates
word
representation
J layers
BiLSTM
dobj
nmodnsubj
K layers
GCN
A1
Classifier
predicate
representation
candidate argument
representation
} Local log-linear classifier
p(r|ti, tp, l) / exp(Wl,r(ti tp))
46
Experiments
} Data
} CoNLL-2009 dataset - English and Chinese
} F1 evaluation measure
} Model
} Hyperparameters tuned on English development set
} State-of-the-art predicate disambiguation models
[Marcheggiani andTitov, 2017]
47
Ablation Experiments (Dev set)
82.7
83.3
81
82
83
84
85
English SRL w/o predicate disambiguation
BiLSTM GCN
[Marcheggiani andTitov, 2017]
48
75.2
77.1
73
74
75
76
77
78
Chinese SRL w/o predicate disambiguation
BiLSTM GCN
English Test Set
87.3
87.7 87.7
88
86
87
88
89
FitzGerald et al. (2015)
(global)
Roth and Lapata (2016)
(global)
Marcheggiani et al. (2017,
CoNLL) (local)
Ours (Bi-LSTM + GCN)
(local)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
49
English Out of Domain
75.2
76.1
77.7
77.2
74
75
76
77
78
FitzGerald et al. (2015)
(global)
Roth and Lapata (2016)
(global)
Marcheggiani et al. (2017,
CoNLL) (local)
Ours (Bi-LSTM + GCN)
(local)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
50
English Test Set (Ensemble)
87.7
87.9
89.1
86
87
88
89
90
FitzGerald et al. (2015) (ensemble) Roth and Lapata (2016) (ensemble) Ours (Bi-LSTM + GCN) (ensemble)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
51
Chinese Test Set
77.7
78.6
79.4
82.5
76
77
78
79
80
81
82
83
Zhao et al. (2009) (global) Bjö̈rkelund et al. (2009)
(global)
Roth and Lapata (2016)
(global)
Ours (Bi-LSTM + GCN)
(local)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
52
Syntactic Graph Convolutional Networks
53
} Fast and simple
} Can be seamlessly applied to other tasks
Syntactic Graph Convolutional Networks
54
} Fast and simple
} Can be seamlessly applied to other tasks
Graph Convolutional Encoders for Syntax-aware Machine Translation
Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an.
In Proceedings of EMNLP, 2017.
Syntactic Graph Convolutional Networks
55
} Fast and simple
} Can be seamlessly applied to other tasks
Graph Convolutional Encoders for Syntax-aware Machine Translation
Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an.
In Proceedings of EMNLP, 2017.
Improvements on
English to German and
English to Czech translations
Multi-document Question Answering
56
[De Cao et al., 2018]
• Nodes are entities and edges are co-reference links
• Inference on a graph representing the documents collection
Multi-document Question Answering
57
[De Cao et al., 2018]
Syntactic Graph Convolutional Networks
58
Syntactic Graph Convolutional Networks
59
Syntactic Graph Convolutional Networks
60
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
61
Motivations [Marcheggiani at al., 2018]
62
John gave his wonderful wife a nice present .
Giver
Thing given
Entity given to
John gave a nice present to his wonderful wife .
Giver
Entity given to
Thing given
Motivations
SRL helps to generalize over different surface realizations
of the same underlying “meaning”.
[Marcheggiani at al., 2018]
63
John gave his wonderful wife a nice present .
Giver
Thing given
Entity given to
John gave a nice present to his wonderful wife .
Giver
Entity given to
Thing given
Motivations
64
Motivations
65
Lost in translation
Related work
} Semantics in statistical MT
} [Wu and Fung,2009]
} [Liu and Gildea, 2010]
} [Aziz et al., 2011]
} ...
} Syntax in neural MT
} [Sennrich and Haddow,2016]
} [Aharoni and Goldberg,2017 ]
} [Bastings et al., 2017]
} …
} Semantics in neural MT
} ???
[Marcheggiani at al., 2018]
66
Predicate-argument encoding
67
John gave his wonderful wife a nice present
WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Semantic
GCN
Semantic
GCN WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Giver
Thing given
Entity given to
Our Model
} Standard sequence2sequence with attention
} Semantic GCN encoder on top of a bidirectional RNN
} RNN decoder
[Marcheggiani at al., 2018]
68
Our model
John gave his wonderful wife a nice present
WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
BiRNN/
CNN
Semantic
GCN
Semantic
GCN WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
<bos> John
John
+
RNN
DECODER
ATTENTION
MECHANISM
[Marcheggiani at al., 2018]
69
Our model
John gave his wonderful wife a nice present
WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
BiRNN/
CNN
Semantic
GCN
Semantic
GCN WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
<bos> John
John
+
RNN
DECODER
ATTENTION
MECHANISM
[Marcheggiani at al., 2018]
70
Experiments
} Data
} WMT‘16 English-German dataset (~4.5 million sentence pairs)
} BLEU as evaluation measure
} Model
} Hyperparameters tuned on News Commentary En-De (~226K sentence pairs)
} GRU as RNN
[Marcheggiani at al., 2018]
71
Results
23.3
23.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
72
Results
23.3
23.9
24.5
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
73
Results
23.3
23.9
24.5
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
74
+ 1.2 BLEU
Results
23.3
23.9
24.5
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
Semantics is helpful
[Marcheggiani at al., 2018]
75
+ 1.2 BLEU
Results
23.3
23.9
24.5
24.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
76
Results
23.3
23.9
24.5
24.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
77
+ 1.6 BLEU
Results
23.3
23.9
24.5
24.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
Syntax and
semantics are
complementary
[Marcheggiani at al., 2018]
78
+ 1.6 BLEU
Analysis
John sold the car to Mark .
Seller Thing sold Buyer
The boy walking down the dusty road is drinking a beer
Walker AM-DIR
Drinker Liquid
SOURCE
SEM GCN
BiRNN John verkaufte das Auto nach Mark .
John verkaufte das Auto an Mark .
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SOURCE
[Marcheggiani at al., 2018]
79
BiRNN mistranslates “to” as “nach” (directionality)
Analysis
John sold the car to Mark .
Seller Thing sold Buyer
The boy walking down the dusty road is drinking a beer
Walker AM-DIR
Drinker Liquid
SOURCE
SEM GCN
BiRNN John verkaufte das Auto nach Mark .
John verkaufte das Auto an Mark .
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SOURCE
[Marcheggiani at al., 2018]
80
BiRNN mistranslates “to” as “nach” (directionality)
John sold the car to Mark .
Seller Thing sold Buyer
The boy walking down the dusty road is drinking a beer
Walker AM-DIR
Drinker Liquid
SOURCE
SEM GCN
BiRNN John verkaufte das Auto nach Mark .
John verkaufte das Auto an Mark .
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SOURCE
81
BiRNN mistranslates “to” as “nach” (directionality)
Analysis [Marcheggiani at al., 2018]
The boy sitting on a bench in the park plays chess .
Thing sitting Location Player Game
AM-LOC
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SEM GCN
BiRNN Der Junge auf einer Bank im Park spielt Schach .
Der Junge sitzt auf einer Bank im Park Schach .
SOURCE
Analysis [Marcheggiani at al., 2018]
82
Both translations are wrong,
but the BiRNN’s one is grammatically correct
The boy sitting on a bench in the park plays chess .
Thing sitting Location Player Game
AM-LOC
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SEM GCN
BiRNN Der Junge auf einer Bank im Park spielt Schach .
Der Junge sitzt auf einer Bank im Park Schach .
SOURCE
Analysis [Marcheggiani at al., 2018]
83
Both translations are wrong,
but the BiRNN’s one is grammatically correct
The boy sitting on a bench in the park plays chess .
Thing sitting Location Player Game
AM-LOC
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SEM GCN
BiRNN Der Junge auf einer Bank im Park spielt Schach .
Der Junge sitzt auf einer Bank im Park Schach .
SOURCE
Analysis [Marcheggiani at al., 2018]
84
Both translations are wrong,
but the BiRNN’s one is grammatically correct
Conclusion
} GCNs for encoding linguistic structures into NN
} Semantics, coreference, discourse
} Fast
} Cheap
} State-of-the-art model for dependency-based SRL
} First to exploit semantics in NMT
85
Roadmap
86
Including structured bias into neural NLP models
Roadmap
87
Including structured bias into neural NLP models
Low-resource setting
Roadmap
88
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Roadmap
89
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Integrating external knowledge
i.e., knowledge graphs
Roadmap
90
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Integrating external knowledge
i.e., knowledge graphs
Thanks for your attention!

More Related Content

What's hot

Learning to rankの評価手法
Learning to rankの評価手法Learning to rankの評価手法
Learning to rankの評価手法Kensuke Mitsuzawa
 
Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Tomonari Masada
 
Sparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSean Moran
 
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency ParsingGraph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency ParsingAlireza Mohammadshahi
 
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Cemal Ardil
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisMehwish Alam
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScaleGoDataDriven
 
2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in KotlinGiovanni Ciatto
 
Looking for Invariant Operators in Argumentation
Looking for Invariant Operators in ArgumentationLooking for Invariant Operators in Argumentation
Looking for Invariant Operators in ArgumentationCarlo Taticchi
 
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsLink Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsHolistic Benchmarking of Big Linked Data
 
Probabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible WorldsProbabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible WorldsFulvio Rotella
 
Learning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesLearning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesUKM university
 
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman ProblemA lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman ProblemCSCJournals
 
Extending OWL with Integrity Constraints
Extending OWL with Integrity ConstraintsExtending OWL with Integrity Constraints
Extending OWL with Integrity ConstraintsJie Bao
 

What's hot (19)

Learning to rankの評価手法
Learning to rankの評価手法Learning to rankの評価手法
Learning to rankの評価手法
 
Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...
 
Sparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image Annotation
 
Sara el hassad
Sara el hassadSara el hassad
Sara el hassad
 
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency ParsingGraph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency Parsing
 
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept Analysis
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at Scale
 
Link Discovery Tutorial Part II: Accuracy
Link Discovery Tutorial Part II: AccuracyLink Discovery Tutorial Part II: Accuracy
Link Discovery Tutorial Part II: Accuracy
 
2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin
 
Looking for Invariant Operators in Argumentation
Looking for Invariant Operators in ArgumentationLooking for Invariant Operators in Argumentation
Looking for Invariant Operators in Argumentation
 
Link Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: EfficiencyLink Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: Efficiency
 
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsLink Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
 
Probabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible WorldsProbabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible Worlds
 
Learning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesLearning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniques
 
SEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial NetworkSEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial Network
 
Link Discovery Tutorial Part V: Hands-On
Link Discovery Tutorial Part V: Hands-OnLink Discovery Tutorial Part V: Hands-On
Link Discovery Tutorial Part V: Hands-On
 
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman ProblemA lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
 
Extending OWL with Integrity Constraints
Extending OWL with Integrity ConstraintsExtending OWL with Integrity Constraints
Extending OWL with Integrity Constraints
 

Similar to Encoding Linguistic Structures with Graph Convolutional Networks

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingNesreen K. Ahmed
 
On the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSEOn the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSEJianfeng Chen
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...cvpaper. challenge
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJordan Open Source Association
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...Taeksoo Kim
 
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...GUANGYUAN PIAO
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesMartin Szomszor
 
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion DubaiChaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion DubaiCodemotion Dubai
 
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...GreeceJS
 
IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptxKtonNguyn2
 
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingBootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingNanjing University
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for GraphsJean Ihm
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent SystemsMiguel Rebollo
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfConnorShorten2
 
Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Marius Georgescu
 
Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074Paul Sandoz
 
Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...ijcsit
 

Similar to Encoding Linguistic Structures with Graph Convolutional Networks (20)

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
On the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSEOn the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSE
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured Data
 
15 unionfind
15 unionfind15 unionfind
15 unionfind
 
Algorithms, Union Find
Algorithms, Union FindAlgorithms, Union Find
Algorithms, Union Find
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
 
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service Architectures
 
DSA Report.pdf
DSA Report.pdfDSA Report.pdf
DSA Report.pdf
 
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion DubaiChaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
 
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
 
IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptx
 
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingBootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph Embedding
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
 
Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014
 
Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074
 
Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...
 

Recently uploaded

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Recently uploaded (20)

Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Encoding Linguistic Structures with Graph Convolutional Networks

  • 1. Encoding Linguistic Structures with Graph Convolutional Networks Diego Marcheggiani Joint work with IvanTitov and Joost Bastings University of Amsterdam University of Edinburgh @South England NLP Meetup
  • 2. Structured (Linguistic) Priors Sequa makes and repairs jet engines. creator creation entity repaired repairer SBJ COORD OBJ CONJ NMOD ROOT “I voted for Palpatine because he was most aligned with my values,” she said. 2
  • 3. Sequence to Sequence 3 [Sutskever et al., 2014] the black cat le chat noire <s> <s> le chat noire
  • 4. Sequence to Sequence } Language is not (only) a sequence of words } We have linguistic knowledge 4 [Sutskever et al., 2014] the black cat le chat noire <s> <s> le chat noire
  • 5. Sequence to Sequence } Language is not (only) a sequence of words } We have linguistic knowledge Encode structured linguistic knowledge into NN using Graph Convolutional Networks 5 the black cat le chat noire <s> <s> le chat noire
  • 6. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling Diego Marcheggiani,IvanTitov. In Proceedings of EMNLP, 2017. Exploiting Semantics in Neural MachineTranslation with Graph Convolutional Networks Diego Marcheggiani,Joost Bastings,IvanTitov. In Proceedings of NAACL-HLT, 2018. 6
  • 7. Semantic Role Labeling } Predicting the predicate-argument structure of a sentence Sequa makes and repairs jet engines. Sequa makes and repairs jet engines. 7
  • 8. Semantic Role Labeling } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates 8 Sequa makes and repairs jet engines. make.01 repair.01 Sequa makes and repairs jet engines.
  • 9. } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates } Identify arguments and label them with their semantic roles Sequa makes and repairs jet engines. make.01 repair.01 Creator Semantic Role Labeling 9
  • 10. } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates } Identify arguments and label them with their semantic roles Sequa makes and repairs jet engines. make.01 repair.01 Creator Creation Semantic Role Labeling 10
  • 11. } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates } Identify arguments and label them with their semantic roles Sequa makes and repairs jet engines. make.01 repair.01 Creator Creation Entity repaired Repairer Semantic Role Labeling 11
  • 12. Semantic Role Labeling } Only the head of an argument is labeled } Sequence labeling task for each predicate } Focus on argument identification and labeling 12 Sequa makes and repairs jet engines. make.01 repair.01 Creator Creation Entity repaired Repairer
  • 13. Semantic Role Labeling 13 Question answering Narayanan and Harabagiu 2004 Shen and Lapata 2007 Khashabi et al. 2018 Machine translation Wu and Fung 2009 Aziz et al. 2011 Information extraction Surdeanu et al. 2003 Christensen et al. 2010
  • 14. Related work 14 Tutorial on Semantic Role Labeling at EMNLP 2017
  • 15. Related work } SRL systems that use syntax with simple NN architectures } [FitzGerald et al., 2015] } [Roth and Lapata,2016] } Recent models ignore linguistic bias } [Zhou and Xu, 2014] } [He et al., 2017] } [Marcheggiani et al., 2017] 15 Tutorial on Semantic Role Labeling at EMNLP 2017
  • 16. Motivations } Some semantic dependencies are mirrored in the syntactic graph Sequa makes and repairs jet engines. creator creation SBJ COORD OBJ CONJ NMOD ROOT 16
  • 17. Sequa makes and repairs jet engines. creator creation entity repaired repairer SBJ COORD OBJ CONJ NMOD ROOT Motivations } Some semantic dependencies are mirrored in the syntactic graph } Not all of them – syntax-semantics interface is not trivial 17
  • 18. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 18
  • 19. Graph Convolutional Networks (message passing) Undirected graph [Gori et al. 2005 Scarselli et al. 2009 Kipf and Welling,2016] 19
  • 20. Graph Convolutional Networks (message passing) Undirected graph Update of the blue node [Gori et al. 2005 Scarselli et al. 2009 Kipf and Welling,2016] 20
  • 21. Graph Convolutional Networks (message passing) Undirected graph Update of the blue node [Kipf and Welling,2016] 21 hi = ReLU 0 @W0hi + X j2N (v) W1hj 1 A <latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit> Neighborhood Self loop
  • 22. GCNs Pipeline Hidden layer Hidden layer Input Output X = H(0) H(1) H(2) Z = H(n) Initial feature representation of nodes Representation informed by nodes’ neighborhood [Kipf and Welling,2016] … … … 22
  • 23. GCNs Pipeline Hidden layer Hidden layer Input Output X = H(0) H(1) H(2) Z = H(n) [Kipf and Welling,2016] … … … Extend GCNs for syntactic dependency trees Initial feature representation of nodes Representation informed by nodes’ neighborhood 23
  • 24. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 24
  • 25. Example Lane disputed those estimates NMOD SBJ OBJ [Marcheggiani andTitov, 2017] 25
  • 26. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) [Marcheggiani andTitov, 2017] 26
  • 27. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥ W (1) subj ⇥ W (1) nm od ⇥W (1) obj [Marcheggiani andTitov, 2017] 27
  • 28. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥ W (1) subj ⇥ W (1) nm od ⇥W (1) obj ⇥W (1) obj 0 ⇥ W(1)nm od0 ⇥ W(1)subj0 [Marcheggiani andTitov, 2017] 28
  • 29. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥ W (1) subj ⇥ W (1) nm od ⇥W (1) obj ⇥W (1) obj 0 ⇥ W(1)nm od0 ⇥ W(1)subj0 [Marcheggiani andTitov, 2017] 29
  • 30. Example ⇥W (1) self Lane disputed those estimates NMOD SBJ OBJ ⇥ W (1) subj ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) obj 0 ⇥ W (1) nm od ⇥ W(1)nm od0 ⇥W (1) obj ⇥ W(1)subj0 ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) [Marcheggiani andTitov, 2017] 30
  • 31. Example ⇥W (1) self Lane disputed those estimates NMOD SBJ OBJ ⇥ W (1) subj ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) obj 0 ⇥ W (1) nm od ⇥ W(1)nm od0 ⇥W (1) obj ⇥ W(1)subj0 ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥W (2) self ⇥W (2) self ⇥W (2) self ⇥W (2) self ⇥ W (2) subj ⇥ W(2)subj0 ⇥W (2) obj 0 ⇥W (2) obj ⇥ W (2)nm od ⇥ W (2) nm od 0 Stacking GCNs widens the syntactic neighborhood [Marcheggiani andTitov, 2017] 31
  • 32. Syntactic GCNs h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A [Marcheggiani andTitov, 2017] 32
  • 33. Syntactic GCNs h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Syntactic neighborhood [Marcheggiani andTitov, 2017] 33
  • 34. Syntactic GCNs Syntactic neighborhood h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Message [Marcheggiani andTitov, 2017] 34
  • 35. Syntactic GCNs Syntactic neighborhood Self-loop is included in N Messages are direction and label specific h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Message [Marcheggiani andTitov, 2017] 35
  • 36. } Overparametrized: one matrix for each label-direction pair } Syntactic GCNs Syntactic neighborhood W (k) L(u,v) = V (k) dir(u,v) Self-loop is included in N Messages are direction and label specific h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Message [Marcheggiani andTitov, 2017] 36
  • 37. Edge-wise Gates } Not all edges are equally important for the final task [Marcheggiani andTitov, 2017] 37
  • 38. Edge-wise Gates } Not all edges are equally important for the final task } We should not blindly rely on predicted syntax [Marcheggiani andTitov, 2017] 38
  • 39. Edge-wise Gates } Not all edges are equally important for the final task } We should not blindly rely on predicted syntax } Gates decide the“importance” of each message Lane disputed those estimates NMOD SBJ OBJ ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) g g g g g g g g g g [Marcheggiani andTitov, 2017] 39
  • 40. Edge-wise Gates } Not all edges are equally important for the final task } We should not blindly rely on predicted syntax } Gates decide the“importance” of each message Gates depend on nodes and edges Lane disputed those estimates NMOD SBJ OBJ ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) g g g g g g g g g g [Marcheggiani andTitov, 2017] 40
  • 41. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 41
  • 42. Our Model } Word representation } Bidirectional LSTM encoder } GCN Encoder } Local role classifier [Marcheggiani andTitov, 2017] 42
  • 43. Word Representation } Pretrained word embeddings } Word embeddings } POS tag embeddings } Predicate lemma embeddings Lane disputed those estimates word representation [Marcheggiani andTitov, 2017] 43
  • 44. BiLSTM Encoder } Encode each word with its left and right context } Stacked BiLSTM Lane disputed those estimates word representation J layers BiLSTM [Marcheggiani andTitov, 2017] 44
  • 45. GCNs Encoder } Syntactic GCNs after BiLSTM encoder } Add syntactic information } Skip connections } Longer dependencies are captured Lane disputed those estimates word representation J layers BiLSTM dobj nmodnsubj K layers GCN [Marcheggiani andTitov, 2017] 45
  • 46. Semantic Role Classifier Lane disputed those estimates word representation J layers BiLSTM dobj nmodnsubj K layers GCN A1 Classifier predicate representation candidate argument representation } Local log-linear classifier p(r|ti, tp, l) / exp(Wl,r(ti tp)) 46
  • 47. Experiments } Data } CoNLL-2009 dataset - English and Chinese } F1 evaluation measure } Model } Hyperparameters tuned on English development set } State-of-the-art predicate disambiguation models [Marcheggiani andTitov, 2017] 47
  • 48. Ablation Experiments (Dev set) 82.7 83.3 81 82 83 84 85 English SRL w/o predicate disambiguation BiLSTM GCN [Marcheggiani andTitov, 2017] 48 75.2 77.1 73 74 75 76 77 78 Chinese SRL w/o predicate disambiguation BiLSTM GCN
  • 49. English Test Set 87.3 87.7 87.7 88 86 87 88 89 FitzGerald et al. (2015) (global) Roth and Lapata (2016) (global) Marcheggiani et al. (2017, CoNLL) (local) Ours (Bi-LSTM + GCN) (local) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 49
  • 50. English Out of Domain 75.2 76.1 77.7 77.2 74 75 76 77 78 FitzGerald et al. (2015) (global) Roth and Lapata (2016) (global) Marcheggiani et al. (2017, CoNLL) (local) Ours (Bi-LSTM + GCN) (local) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 50
  • 51. English Test Set (Ensemble) 87.7 87.9 89.1 86 87 88 89 90 FitzGerald et al. (2015) (ensemble) Roth and Lapata (2016) (ensemble) Ours (Bi-LSTM + GCN) (ensemble) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 51
  • 52. Chinese Test Set 77.7 78.6 79.4 82.5 76 77 78 79 80 81 82 83 Zhao et al. (2009) (global) Bjö̈rkelund et al. (2009) (global) Roth and Lapata (2016) (global) Ours (Bi-LSTM + GCN) (local) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 52
  • 53. Syntactic Graph Convolutional Networks 53 } Fast and simple } Can be seamlessly applied to other tasks
  • 54. Syntactic Graph Convolutional Networks 54 } Fast and simple } Can be seamlessly applied to other tasks Graph Convolutional Encoders for Syntax-aware Machine Translation Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an. In Proceedings of EMNLP, 2017.
  • 55. Syntactic Graph Convolutional Networks 55 } Fast and simple } Can be seamlessly applied to other tasks Graph Convolutional Encoders for Syntax-aware Machine Translation Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an. In Proceedings of EMNLP, 2017. Improvements on English to German and English to Czech translations
  • 56. Multi-document Question Answering 56 [De Cao et al., 2018] • Nodes are entities and edges are co-reference links • Inference on a graph representing the documents collection
  • 61. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 61
  • 62. Motivations [Marcheggiani at al., 2018] 62 John gave his wonderful wife a nice present . Giver Thing given Entity given to John gave a nice present to his wonderful wife . Giver Entity given to Thing given
  • 63. Motivations SRL helps to generalize over different surface realizations of the same underlying “meaning”. [Marcheggiani at al., 2018] 63 John gave his wonderful wife a nice present . Giver Thing given Entity given to John gave a nice present to his wonderful wife . Giver Entity given to Thing given
  • 66. Related work } Semantics in statistical MT } [Wu and Fung,2009] } [Liu and Gildea, 2010] } [Aziz et al., 2011] } ... } Syntax in neural MT } [Sennrich and Haddow,2016] } [Aharoni and Goldberg,2017 ] } [Bastings et al., 2017] } … } Semantics in neural MT } ??? [Marcheggiani at al., 2018] 66
  • 67. Predicate-argument encoding 67 John gave his wonderful wife a nice present WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself Semantic GCN Semantic GCN WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself Giver Thing given Entity given to
  • 68. Our Model } Standard sequence2sequence with attention } Semantic GCN encoder on top of a bidirectional RNN } RNN decoder [Marcheggiani at al., 2018] 68
  • 69. Our model John gave his wonderful wife a nice present WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself BiRNN/ CNN Semantic GCN Semantic GCN WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself <bos> John John + RNN DECODER ATTENTION MECHANISM [Marcheggiani at al., 2018] 69
  • 70. Our model John gave his wonderful wife a nice present WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself BiRNN/ CNN Semantic GCN Semantic GCN WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself <bos> John John + RNN DECODER ATTENTION MECHANISM [Marcheggiani at al., 2018] 70
  • 71. Experiments } Data } WMT‘16 English-German dataset (~4.5 million sentence pairs) } BLEU as evaluation measure } Model } Hyperparameters tuned on News Commentary En-De (~226K sentence pairs) } GRU as RNN [Marcheggiani at al., 2018] 71
  • 72. Results 23.3 23.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 72
  • 73. Results 23.3 23.9 24.5 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 73
  • 74. Results 23.3 23.9 24.5 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 74 + 1.2 BLEU
  • 75. Results 23.3 23.9 24.5 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU Semantics is helpful [Marcheggiani at al., 2018] 75 + 1.2 BLEU
  • 76. Results 23.3 23.9 24.5 24.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 76
  • 77. Results 23.3 23.9 24.5 24.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 77 + 1.6 BLEU
  • 78. Results 23.3 23.9 24.5 24.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU Syntax and semantics are complementary [Marcheggiani at al., 2018] 78 + 1.6 BLEU
  • 79. Analysis John sold the car to Mark . Seller Thing sold Buyer The boy walking down the dusty road is drinking a beer Walker AM-DIR Drinker Liquid SOURCE SEM GCN BiRNN John verkaufte das Auto nach Mark . John verkaufte das Auto an Mark . SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SOURCE [Marcheggiani at al., 2018] 79 BiRNN mistranslates “to” as “nach” (directionality)
  • 80. Analysis John sold the car to Mark . Seller Thing sold Buyer The boy walking down the dusty road is drinking a beer Walker AM-DIR Drinker Liquid SOURCE SEM GCN BiRNN John verkaufte das Auto nach Mark . John verkaufte das Auto an Mark . SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SOURCE [Marcheggiani at al., 2018] 80 BiRNN mistranslates “to” as “nach” (directionality)
  • 81. John sold the car to Mark . Seller Thing sold Buyer The boy walking down the dusty road is drinking a beer Walker AM-DIR Drinker Liquid SOURCE SEM GCN BiRNN John verkaufte das Auto nach Mark . John verkaufte das Auto an Mark . SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SOURCE 81 BiRNN mistranslates “to” as “nach” (directionality) Analysis [Marcheggiani at al., 2018]
  • 82. The boy sitting on a bench in the park plays chess . Thing sitting Location Player Game AM-LOC SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SEM GCN BiRNN Der Junge auf einer Bank im Park spielt Schach . Der Junge sitzt auf einer Bank im Park Schach . SOURCE Analysis [Marcheggiani at al., 2018] 82 Both translations are wrong, but the BiRNN’s one is grammatically correct
  • 83. The boy sitting on a bench in the park plays chess . Thing sitting Location Player Game AM-LOC SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SEM GCN BiRNN Der Junge auf einer Bank im Park spielt Schach . Der Junge sitzt auf einer Bank im Park Schach . SOURCE Analysis [Marcheggiani at al., 2018] 83 Both translations are wrong, but the BiRNN’s one is grammatically correct
  • 84. The boy sitting on a bench in the park plays chess . Thing sitting Location Player Game AM-LOC SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SEM GCN BiRNN Der Junge auf einer Bank im Park spielt Schach . Der Junge sitzt auf einer Bank im Park Schach . SOURCE Analysis [Marcheggiani at al., 2018] 84 Both translations are wrong, but the BiRNN’s one is grammatically correct
  • 85. Conclusion } GCNs for encoding linguistic structures into NN } Semantics, coreference, discourse } Fast } Cheap } State-of-the-art model for dependency-based SRL } First to exploit semantics in NMT 85
  • 86. Roadmap 86 Including structured bias into neural NLP models
  • 87. Roadmap 87 Including structured bias into neural NLP models Low-resource setting
  • 88. Roadmap 88 Including structured bias into neural NLP models Low-resource setting Long-range dependencies Document level Cross-document level
  • 89. Roadmap 89 Including structured bias into neural NLP models Low-resource setting Long-range dependencies Document level Cross-document level Integrating external knowledge i.e., knowledge graphs
  • 90. Roadmap 90 Including structured bias into neural NLP models Low-resource setting Long-range dependencies Document level Cross-document level Integrating external knowledge i.e., knowledge graphs Thanks for your attention!