SlideShare a Scribd company logo
Encoding Linguistic Structures with
Graph Convolutional Networks
Diego Marcheggiani
Joint work with IvanTitov and Joost Bastings
University of Amsterdam
University of Edinburgh
@South England NLP Meetup
Structured (Linguistic) Priors
Sequa makes and repairs jet engines.
creator
creation
entity repaired
repairer
SBJ COORD
OBJ
CONJ NMOD
ROOT
“I voted for Palpatine because he was
most aligned with my values,” she said.
2
Sequence to Sequence
3
[Sutskever et al., 2014]
the black cat
le chat noire <s>
<s> le chat noire
Sequence to Sequence
} Language is not (only) a sequence of words
} We have linguistic knowledge
4
[Sutskever et al., 2014]
the black cat
le chat noire <s>
<s> le chat noire
Sequence to Sequence
} Language is not (only) a sequence of words
} We have linguistic knowledge
Encode structured linguistic knowledge into NN using
Graph Convolutional Networks
5
the black cat
le chat noire <s>
<s> le chat noire
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Diego Marcheggiani,IvanTitov. In Proceedings of EMNLP, 2017.
Exploiting Semantics in Neural MachineTranslation with Graph Convolutional Networks
Diego Marcheggiani,Joost Bastings,IvanTitov. In Proceedings of NAACL-HLT, 2018.
6
Semantic Role Labeling
} Predicting the predicate-argument structure of a sentence
Sequa makes and repairs jet engines.
Sequa makes and repairs jet engines.
7
Semantic Role Labeling
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
8
Sequa makes and repairs jet engines.
make.01 repair.01
Sequa makes and repairs jet engines.
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Semantic Role Labeling
9
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Creation
Semantic Role Labeling
10
} Predicting the predicate-argument structure of a sentence
} Discover and disambiguate predicates
} Identify arguments and label them with their semantic roles
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Creation
Entity repaired
Repairer
Semantic Role Labeling
11
Semantic Role Labeling
} Only the head of an argument is labeled
} Sequence labeling task for each predicate
} Focus on argument identification and labeling
12
Sequa makes and repairs jet engines.
make.01 repair.01
Creator
Creation
Entity repaired
Repairer
Semantic Role Labeling
13
Question answering
Narayanan and Harabagiu 2004
Shen and Lapata 2007
Khashabi et al. 2018
Machine translation
Wu and Fung 2009
Aziz et al. 2011
Information extraction
Surdeanu et al. 2003
Christensen et al. 2010
Related work
14
Tutorial on Semantic Role
Labeling at EMNLP 2017
Related work
} SRL systems that use syntax with simple NN architectures
} [FitzGerald et al., 2015]
} [Roth and Lapata,2016]
} Recent models ignore linguistic bias
} [Zhou and Xu, 2014]
} [He et al., 2017]
} [Marcheggiani et al., 2017]
15
Tutorial on Semantic Role
Labeling at EMNLP 2017
Motivations
} Some semantic dependencies are mirrored in the syntactic graph
Sequa makes and repairs jet engines.
creator
creation
SBJ COORD
OBJ
CONJ NMOD
ROOT
16
Sequa makes and repairs jet engines.
creator
creation
entity repaired
repairer
SBJ COORD
OBJ
CONJ NMOD
ROOT
Motivations
} Some semantic dependencies are mirrored in the syntactic graph
} Not all of them – syntax-semantics interface is not trivial
17
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
18
Graph Convolutional Networks (message passing)
Undirected graph
[Gori et al. 2005
Scarselli et al. 2009
Kipf and Welling,2016]
19
Graph Convolutional Networks (message passing)
Undirected graph Update of the blue node
[Gori et al. 2005
Scarselli et al. 2009
Kipf and Welling,2016]
20
Graph Convolutional Networks (message passing)
Undirected graph Update of the blue node
[Kipf and Welling,2016]
21
hi = ReLU
0
@W0hi +
X
j2N (v)
W1hj
1
A
<latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit>
Neighborhood
Self loop
GCNs Pipeline
Hidden layer Hidden layer
Input Output
X = H(0)
H(1) H(2)
Z = H(n)
Initial feature
representation of
nodes
Representation
informed by nodes’
neighborhood
[Kipf and Welling,2016]
…
…
…
22
GCNs Pipeline
Hidden layer Hidden layer
Input Output
X = H(0)
H(1) H(2)
Z = H(n)
[Kipf and Welling,2016]
…
…
…
Extend GCNs for syntactic dependency trees
Initial feature
representation of
nodes
Representation
informed by nodes’
neighborhood
23
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
24
Example
Lane disputed those estimates
NMOD
SBJ OBJ
[Marcheggiani andTitov, 2017]
25
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
[Marcheggiani andTitov, 2017]
26
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥
W
(1)
subj
⇥
W
(1)
nm
od
⇥W
(1)
obj
[Marcheggiani andTitov, 2017]
27
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥
W
(1)
subj
⇥
W
(1)
nm
od
⇥W
(1)
obj
⇥W (1)
obj 0
⇥
W(1)nm
od0
⇥
W(1)subj0
[Marcheggiani andTitov, 2017]
28
Example
Lane disputed those estimates
NMOD
SBJ OBJ
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥
W
(1)
subj
⇥
W
(1)
nm
od
⇥W
(1)
obj
⇥W (1)
obj 0
⇥
W(1)nm
od0
⇥
W(1)subj0
[Marcheggiani andTitov, 2017]
29
Example
⇥W
(1)
self
Lane disputed those estimates
NMOD
SBJ OBJ
⇥
W
(1)
subj
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W (1)
obj 0
⇥
W
(1)
nm
od
⇥
W(1)nm
od0
⇥W
(1)
obj
⇥
W(1)subj0
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
[Marcheggiani andTitov, 2017]
30
Example
⇥W
(1)
self
Lane disputed those estimates
NMOD
SBJ OBJ
⇥
W
(1)
subj
⇥W
(1)
self
⇥W
(1)
self
⇥W
(1)
self
⇥W (1)
obj 0
⇥
W
(1)
nm
od
⇥
W(1)nm
od0
⇥W
(1)
obj
⇥
W(1)subj0
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
⇥W
(2)
self
⇥W
(2)
self
⇥W
(2)
self
⇥W
(2)
self
⇥
W
(2)
subj
⇥
W(2)subj0
⇥W (2)
obj 0
⇥W
(2)
obj
⇥
W (2)nm
od
⇥
W
(2)
nm
od
0
Stacking GCNs widens the
syntactic neighborhood
[Marcheggiani andTitov, 2017]
31
Syntactic GCNs
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
[Marcheggiani andTitov, 2017]
32
Syntactic GCNs
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Syntactic neighborhood
[Marcheggiani andTitov, 2017]
33
Syntactic GCNs
Syntactic neighborhood
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Message
[Marcheggiani andTitov, 2017]
34
Syntactic GCNs
Syntactic neighborhood Self-loop is included in N
Messages are direction and
label specific
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Message
[Marcheggiani andTitov, 2017]
35
} Overparametrized: one matrix for each label-direction pair
}
Syntactic GCNs
Syntactic neighborhood
W
(k)
L(u,v) = V
(k)
dir(u,v)
Self-loop is included in N
Messages are direction and
label specific
h(k+1)
v = ReLU
0
@
X
u2N (v)
W
(k)
L(u,v)h(k)
u + b
(k)
L(u,v)
1
A
Message
[Marcheggiani andTitov, 2017]
36
Edge-wise Gates
} Not all edges are equally important for the final task
[Marcheggiani andTitov, 2017]
37
Edge-wise Gates
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
[Marcheggiani andTitov, 2017]
38
Edge-wise Gates
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
} Gates decide the“importance” of each message
Lane disputed those estimates
NMOD
SBJ
OBJ
ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
g g g g g g g g g g
[Marcheggiani andTitov, 2017]
39
Edge-wise Gates
} Not all edges are equally important for the final task
} We should not blindly rely on predicted syntax
} Gates decide the“importance” of each message
Gates depend on
nodes and edges Lane disputed those estimates
NMOD
SBJ
OBJ
ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)
g g g g g g g g g g
[Marcheggiani andTitov, 2017]
40
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
41
Our Model
} Word representation
} Bidirectional LSTM encoder
} GCN Encoder
} Local role classifier
[Marcheggiani andTitov, 2017]
42
Word Representation
} Pretrained word embeddings
} Word embeddings
} POS tag embeddings
} Predicate lemma embeddings
Lane disputed those estimates
word
representation
[Marcheggiani andTitov, 2017]
43
BiLSTM Encoder
} Encode each word with its left and right context
} Stacked BiLSTM
Lane disputed those estimates
word
representation
J layers
BiLSTM
[Marcheggiani andTitov, 2017]
44
GCNs Encoder
} Syntactic GCNs after BiLSTM encoder
} Add syntactic information
} Skip connections
} Longer dependencies are captured
Lane disputed those estimates
word
representation
J layers
BiLSTM
dobj
nmodnsubj
K layers
GCN
[Marcheggiani andTitov, 2017]
45
Semantic Role Classifier
Lane disputed those estimates
word
representation
J layers
BiLSTM
dobj
nmodnsubj
K layers
GCN
A1
Classifier
predicate
representation
candidate argument
representation
} Local log-linear classifier
p(r|ti, tp, l) / exp(Wl,r(ti tp))
46
Experiments
} Data
} CoNLL-2009 dataset - English and Chinese
} F1 evaluation measure
} Model
} Hyperparameters tuned on English development set
} State-of-the-art predicate disambiguation models
[Marcheggiani andTitov, 2017]
47
Ablation Experiments (Dev set)
82.7
83.3
81
82
83
84
85
English SRL w/o predicate disambiguation
BiLSTM GCN
[Marcheggiani andTitov, 2017]
48
75.2
77.1
73
74
75
76
77
78
Chinese SRL w/o predicate disambiguation
BiLSTM GCN
English Test Set
87.3
87.7 87.7
88
86
87
88
89
FitzGerald et al. (2015)
(global)
Roth and Lapata (2016)
(global)
Marcheggiani et al. (2017,
CoNLL) (local)
Ours (Bi-LSTM + GCN)
(local)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
49
English Out of Domain
75.2
76.1
77.7
77.2
74
75
76
77
78
FitzGerald et al. (2015)
(global)
Roth and Lapata (2016)
(global)
Marcheggiani et al. (2017,
CoNLL) (local)
Ours (Bi-LSTM + GCN)
(local)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
50
English Test Set (Ensemble)
87.7
87.9
89.1
86
87
88
89
90
FitzGerald et al. (2015) (ensemble) Roth and Lapata (2016) (ensemble) Ours (Bi-LSTM + GCN) (ensemble)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
51
Chinese Test Set
77.7
78.6
79.4
82.5
76
77
78
79
80
81
82
83
Zhao et al. (2009) (global) Bjö̈rkelund et al. (2009)
(global)
Roth and Lapata (2016)
(global)
Ours (Bi-LSTM + GCN)
(local)
SRL with predicate disambiguation
[Marcheggiani andTitov, 2017]
52
Syntactic Graph Convolutional Networks
53
} Fast and simple
} Can be seamlessly applied to other tasks
Syntactic Graph Convolutional Networks
54
} Fast and simple
} Can be seamlessly applied to other tasks
Graph Convolutional Encoders for Syntax-aware Machine Translation
Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an.
In Proceedings of EMNLP, 2017.
Syntactic Graph Convolutional Networks
55
} Fast and simple
} Can be seamlessly applied to other tasks
Graph Convolutional Encoders for Syntax-aware Machine Translation
Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an.
In Proceedings of EMNLP, 2017.
Improvements on
English to German and
English to Czech translations
Multi-document Question Answering
56
[De Cao et al., 2018]
• Nodes are entities and edges are co-reference links
• Inference on a graph representing the documents collection
Multi-document Question Answering
57
[De Cao et al., 2018]
Syntactic Graph Convolutional Networks
58
Syntactic Graph Convolutional Networks
59
Syntactic Graph Convolutional Networks
60
Outline
} Semantic Role Labeling
} Graph Convolutional Networks (GCN)
} Syntactic GCN for Semantic Role Labeling (SRL)
} SRL Model
} Exploiting Semantics in Neural MachineTranslation with GCNs
61
Motivations [Marcheggiani at al., 2018]
62
John gave his wonderful wife a nice present .
Giver
Thing given
Entity given to
John gave a nice present to his wonderful wife .
Giver
Entity given to
Thing given
Motivations
SRL helps to generalize over different surface realizations
of the same underlying “meaning”.
[Marcheggiani at al., 2018]
63
John gave his wonderful wife a nice present .
Giver
Thing given
Entity given to
John gave a nice present to his wonderful wife .
Giver
Entity given to
Thing given
Motivations
64
Motivations
65
Lost in translation
Related work
} Semantics in statistical MT
} [Wu and Fung,2009]
} [Liu and Gildea, 2010]
} [Aziz et al., 2011]
} ...
} Syntax in neural MT
} [Sennrich and Haddow,2016]
} [Aharoni and Goldberg,2017 ]
} [Bastings et al., 2017]
} …
} Semantics in neural MT
} ???
[Marcheggiani at al., 2018]
66
Predicate-argument encoding
67
John gave his wonderful wife a nice present
WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Semantic
GCN
Semantic
GCN WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Giver
Thing given
Entity given to
Our Model
} Standard sequence2sequence with attention
} Semantic GCN encoder on top of a bidirectional RNN
} RNN decoder
[Marcheggiani at al., 2018]
68
Our model
John gave his wonderful wife a nice present
WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
BiRNN/
CNN
Semantic
GCN
Semantic
GCN WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
<bos> John
John
+
RNN
DECODER
ATTENTION
MECHANISM
[Marcheggiani at al., 2018]
69
Our model
John gave his wonderful wife a nice present
WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
BiRNN/
CNN
Semantic
GCN
Semantic
GCN WA0
WA1
WA2
WA0’
WA2’
WA1’
Wself
Wself
Wself
Wself
Wself
Wself
Wself
Wself
<bos> John
John
+
RNN
DECODER
ATTENTION
MECHANISM
[Marcheggiani at al., 2018]
70
Experiments
} Data
} WMT‘16 English-German dataset (~4.5 million sentence pairs)
} BLEU as evaluation measure
} Model
} Hyperparameters tuned on News Commentary En-De (~226K sentence pairs)
} GRU as RNN
[Marcheggiani at al., 2018]
71
Results
23.3
23.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
72
Results
23.3
23.9
24.5
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
73
Results
23.3
23.9
24.5
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
74
+ 1.2 BLEU
Results
23.3
23.9
24.5
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
Semantics is helpful
[Marcheggiani at al., 2018]
75
+ 1.2 BLEU
Results
23.3
23.9
24.5
24.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
76
Results
23.3
23.9
24.5
24.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
[Marcheggiani at al., 2018]
77
+ 1.6 BLEU
Results
23.3
23.9
24.5
24.9
20
21
22
23
24
25
26
BiRNN
(Bastings et al.2017)
BiRNN + Syntactic
GCN
(Bastings et al.2017)
BiRNN + Semantic
GCN
BiRNN+Syntactic GCN
+Semantic GCN
FullWMT 2016 English-German BLEU
Syntax and
semantics are
complementary
[Marcheggiani at al., 2018]
78
+ 1.6 BLEU
Analysis
John sold the car to Mark .
Seller Thing sold Buyer
The boy walking down the dusty road is drinking a beer
Walker AM-DIR
Drinker Liquid
SOURCE
SEM GCN
BiRNN John verkaufte das Auto nach Mark .
John verkaufte das Auto an Mark .
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SOURCE
[Marcheggiani at al., 2018]
79
BiRNN mistranslates “to” as “nach” (directionality)
Analysis
John sold the car to Mark .
Seller Thing sold Buyer
The boy walking down the dusty road is drinking a beer
Walker AM-DIR
Drinker Liquid
SOURCE
SEM GCN
BiRNN John verkaufte das Auto nach Mark .
John verkaufte das Auto an Mark .
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SOURCE
[Marcheggiani at al., 2018]
80
BiRNN mistranslates “to” as “nach” (directionality)
John sold the car to Mark .
Seller Thing sold Buyer
The boy walking down the dusty road is drinking a beer
Walker AM-DIR
Drinker Liquid
SOURCE
SEM GCN
BiRNN John verkaufte das Auto nach Mark .
John verkaufte das Auto an Mark .
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SOURCE
81
BiRNN mistranslates “to” as “nach” (directionality)
Analysis [Marcheggiani at al., 2018]
The boy sitting on a bench in the park plays chess .
Thing sitting Location Player Game
AM-LOC
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SEM GCN
BiRNN Der Junge auf einer Bank im Park spielt Schach .
Der Junge sitzt auf einer Bank im Park Schach .
SOURCE
Analysis [Marcheggiani at al., 2018]
82
Both translations are wrong,
but the BiRNN’s one is grammatically correct
The boy sitting on a bench in the park plays chess .
Thing sitting Location Player Game
AM-LOC
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SEM GCN
BiRNN Der Junge auf einer Bank im Park spielt Schach .
Der Junge sitzt auf einer Bank im Park Schach .
SOURCE
Analysis [Marcheggiani at al., 2018]
83
Both translations are wrong,
but the BiRNN’s one is grammatically correct
The boy sitting on a bench in the park plays chess .
Thing sitting Location Player Game
AM-LOC
SEM GCN
BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken .
Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier .
SEM GCN
BiRNN Der Junge auf einer Bank im Park spielt Schach .
Der Junge sitzt auf einer Bank im Park Schach .
SOURCE
Analysis [Marcheggiani at al., 2018]
84
Both translations are wrong,
but the BiRNN’s one is grammatically correct
Conclusion
} GCNs for encoding linguistic structures into NN
} Semantics, coreference, discourse
} Fast
} Cheap
} State-of-the-art model for dependency-based SRL
} First to exploit semantics in NMT
85
Roadmap
86
Including structured bias into neural NLP models
Roadmap
87
Including structured bias into neural NLP models
Low-resource setting
Roadmap
88
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Roadmap
89
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Integrating external knowledge
i.e., knowledge graphs
Roadmap
90
Including structured bias into neural NLP models
Low-resource setting
Long-range dependencies
Document level
Cross-document level
Integrating external knowledge
i.e., knowledge graphs
Thanks for your attention!

More Related Content

What's hot

Learning to rankの評価手法
Learning to rankの評価手法Learning to rankの評価手法
Learning to rankの評価手法
Kensuke Mitsuzawa
 
Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...
Tomonari Masada
 
Sparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image Annotation
Sean Moran
 
Sara el hassad
Sara el hassadSara el hassad
Sara el hassad
Sara EL HASSAD
 
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency ParsingGraph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Alireza Mohammadshahi
 
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Cemal Ardil
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept Analysis
Mehwish Alam
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at Scale
GoDataDriven
 
Link Discovery Tutorial Part II: Accuracy
Link Discovery Tutorial Part II: AccuracyLink Discovery Tutorial Part II: Accuracy
Link Discovery Tutorial Part II: Accuracy
Holistic Benchmarking of Big Linked Data
 
2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin
Giovanni Ciatto
 
Looking for Invariant Operators in Argumentation
Looking for Invariant Operators in ArgumentationLooking for Invariant Operators in Argumentation
Looking for Invariant Operators in Argumentation
Carlo Taticchi
 
Link Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: EfficiencyLink Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: Efficiency
Holistic Benchmarking of Big Linked Data
 
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsLink Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Holistic Benchmarking of Big Linked Data
 
Probabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible WorldsProbabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible Worlds
Fulvio Rotella
 
Learning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesLearning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniques
UKM university
 
SEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial NetworkSEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial Network
Universitat Politècnica de Catalunya
 
Link Discovery Tutorial Part V: Hands-On
Link Discovery Tutorial Part V: Hands-OnLink Discovery Tutorial Part V: Hands-On
Link Discovery Tutorial Part V: Hands-On
Holistic Benchmarking of Big Linked Data
 
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman ProblemA lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
CSCJournals
 
Extending OWL with Integrity Constraints
Extending OWL with Integrity ConstraintsExtending OWL with Integrity Constraints
Extending OWL with Integrity Constraints
Jie Bao
 

What's hot (19)

Learning to rankの評価手法
Learning to rankの評価手法Learning to rankの評価手法
Learning to rankの評価手法
 
Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...Supplementary material for my following paper: Infinite Latent Process Decomp...
Supplementary material for my following paper: Infinite Latent Process Decomp...
 
Sparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image AnnotationSparse Kernel Learning for Image Annotation
Sparse Kernel Learning for Image Annotation
 
Sara el hassad
Sara el hassadSara el hassad
Sara el hassad
 
Graph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency ParsingGraph-to-Graph Transformer for Transition-based Dependency Parsing
Graph-to-Graph Transformer for Transition-based Dependency Parsing
 
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
Compact binary-tree-representation-of-logic-function-with-enhanced-throughput-
 
Navigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept AnalysisNavigating and Exploring RDF Data using Formal Concept Analysis
Navigating and Exploring RDF Data using Formal Concept Analysis
 
PyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at ScalePyData Amsterdam - Name Matching at Scale
PyData Amsterdam - Name Matching at Scale
 
Link Discovery Tutorial Part II: Accuracy
Link Discovery Tutorial Part II: AccuracyLink Discovery Tutorial Part II: Accuracy
Link Discovery Tutorial Part II: Accuracy
 
2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin
 
Looking for Invariant Operators in Argumentation
Looking for Invariant Operators in ArgumentationLooking for Invariant Operators in Argumentation
Looking for Invariant Operators in Argumentation
 
Link Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: EfficiencyLink Discovery Tutorial Part I: Efficiency
Link Discovery Tutorial Part I: Efficiency
 
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching SystemsLink Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
Link Discovery Tutorial Part III: Benchmarking for Instance Matching Systems
 
Probabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible WorldsProbabilistic Abductive Logic Programming using Possible Worlds
Probabilistic Abductive Logic Programming using Possible Worlds
 
Learning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniquesLearning for semantic parsing using statistical syntactic parsing techniques
Learning for semantic parsing using statistical syntactic parsing techniques
 
SEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial NetworkSEGAN: Speech Enhancement Generative Adversarial Network
SEGAN: Speech Enhancement Generative Adversarial Network
 
Link Discovery Tutorial Part V: Hands-On
Link Discovery Tutorial Part V: Hands-OnLink Discovery Tutorial Part V: Hands-On
Link Discovery Tutorial Part V: Hands-On
 
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman ProblemA lexisearch algorithm for the Bottleneck Traveling Salesman Problem
A lexisearch algorithm for the Bottleneck Traveling Salesman Problem
 
Extending OWL with Integrity Constraints
Extending OWL with Integrity ConstraintsExtending OWL with Integrity Constraints
Extending OWL with Integrity Constraints
 

Similar to Encoding Linguistic Structures with Graph Convolutional Networks

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
Nesreen K. Ahmed
 
On the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSEOn the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSE
Jianfeng Chen
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
cvpaper. challenge
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured Data
Jordan Open Source Association
 
15 unionfind
15 unionfind15 unionfind
15 unionfind
Carlos andré dantas
 
Algorithms, Union Find
Algorithms, Union FindAlgorithms, Union Find
Algorithms, Union Find
Nikita Shpilevoy
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
Taeksoo Kim
 
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
GUANGYUAN PIAO
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service Architectures
Martin Szomszor
 
DSA Report.pdf
DSA Report.pdfDSA Report.pdf
DSA Report.pdf
ChhaviCoachingCenter
 
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion DubaiChaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Codemotion Dubai
 
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
GreeceJS
 
IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptx
KtonNguyn2
 
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingBootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Nanjing University
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
Jean Ihm
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent Systems
Miguel Rebollo
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
ConnorShorten2
 
Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014
Marius Georgescu
 
Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074
Paul Sandoz
 
Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...
ijcsit
 

Similar to Encoding Linguistic Structures with Graph Convolutional Networks (20)

High-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and ModelingHigh-Performance Graph Analysis and Modeling
High-Performance Graph Analysis and Modeling
 
On the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSEOn the value of Sampling and Pruning for SBSE
On the value of Sampling and Pruning for SBSE
 
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
教師なし画像特徴表現学習の動向 {Un, Self} supervised representation learning (CVPR 2018 完全読破...
 
JOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured DataJOSA TechTalks - Machine Learning on Graph-Structured Data
JOSA TechTalks - Machine Learning on Graph-Structured Data
 
15 unionfind
15 unionfind15 unionfind
15 unionfind
 
Algorithms, Union Find
Algorithms, Union FindAlgorithms, Union Find
Algorithms, Union Find
 
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Advers...
 
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
A Study of the Similarities of Entity Embeddings Learned from Different Aspec...
 
Syntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service ArchitecturesSyntactic Mediation in Grid and Web Service Architectures
Syntactic Mediation in Grid and Web Service Architectures
 
DSA Report.pdf
DSA Report.pdfDSA Report.pdf
DSA Report.pdf
 
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion DubaiChaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
Chaos Testing with F# and Azure by Rachel Reese at Codemotion Dubai
 
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
All About GRAND Stack: GraphQL, React, Apollo, and Neo4j (Mark Needham) - Gre...
 
IA3_presentation.pptx
IA3_presentation.pptxIA3_presentation.pptx
IA3_presentation.pptx
 
Bootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph EmbeddingBootstrapping Entity Alignment with Knowledge Graph Embedding
Bootstrapping Entity Alignment with Knowledge Graph Embedding
 
PGQL: A Language for Graphs
PGQL: A Language for GraphsPGQL: A Language for Graphs
PGQL: A Language for Graphs
 
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 Co-Learning: Consensus-based Learning for Multi-Agent Systems Co-Learning: Consensus-based Learning for Multi-Agent Systems
Co-Learning: Consensus-based Learning for Multi-Agent Systems
 
Weaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdfWeaviate Air #3 - New in AI segment.pdf
Weaviate Air #3 - New in AI segment.pdf
 
Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014Introduction of IPv6NET in Tridentcom 2014
Introduction of IPv6NET in Tridentcom 2014
 
Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074Codeone18 jdk-incremental-dev6074
Codeone18 jdk-incremental-dev6074
 
Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...Labeled generalized stochastic petri net Based approach for web services Comp...
Labeled generalized stochastic petri net Based approach for web services Comp...
 

Recently uploaded

“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
Zilliz
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
David Brossard
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
Claudio Di Ciccio
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
Techgropse Pvt.Ltd.
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 

Recently uploaded (20)

“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
CAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on BlockchainCAKE: Sharing Slices of Confidential Data on Blockchain
CAKE: Sharing Slices of Confidential Data on Blockchain
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdfAI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
AI-Powered Food Delivery Transforming App Development in Saudi Arabia.pdf
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 

Encoding Linguistic Structures with Graph Convolutional Networks

  • 1. Encoding Linguistic Structures with Graph Convolutional Networks Diego Marcheggiani Joint work with IvanTitov and Joost Bastings University of Amsterdam University of Edinburgh @South England NLP Meetup
  • 2. Structured (Linguistic) Priors Sequa makes and repairs jet engines. creator creation entity repaired repairer SBJ COORD OBJ CONJ NMOD ROOT “I voted for Palpatine because he was most aligned with my values,” she said. 2
  • 3. Sequence to Sequence 3 [Sutskever et al., 2014] the black cat le chat noire <s> <s> le chat noire
  • 4. Sequence to Sequence } Language is not (only) a sequence of words } We have linguistic knowledge 4 [Sutskever et al., 2014] the black cat le chat noire <s> <s> le chat noire
  • 5. Sequence to Sequence } Language is not (only) a sequence of words } We have linguistic knowledge Encode structured linguistic knowledge into NN using Graph Convolutional Networks 5 the black cat le chat noire <s> <s> le chat noire
  • 6. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling Diego Marcheggiani,IvanTitov. In Proceedings of EMNLP, 2017. Exploiting Semantics in Neural MachineTranslation with Graph Convolutional Networks Diego Marcheggiani,Joost Bastings,IvanTitov. In Proceedings of NAACL-HLT, 2018. 6
  • 7. Semantic Role Labeling } Predicting the predicate-argument structure of a sentence Sequa makes and repairs jet engines. Sequa makes and repairs jet engines. 7
  • 8. Semantic Role Labeling } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates 8 Sequa makes and repairs jet engines. make.01 repair.01 Sequa makes and repairs jet engines.
  • 9. } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates } Identify arguments and label them with their semantic roles Sequa makes and repairs jet engines. make.01 repair.01 Creator Semantic Role Labeling 9
  • 10. } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates } Identify arguments and label them with their semantic roles Sequa makes and repairs jet engines. make.01 repair.01 Creator Creation Semantic Role Labeling 10
  • 11. } Predicting the predicate-argument structure of a sentence } Discover and disambiguate predicates } Identify arguments and label them with their semantic roles Sequa makes and repairs jet engines. make.01 repair.01 Creator Creation Entity repaired Repairer Semantic Role Labeling 11
  • 12. Semantic Role Labeling } Only the head of an argument is labeled } Sequence labeling task for each predicate } Focus on argument identification and labeling 12 Sequa makes and repairs jet engines. make.01 repair.01 Creator Creation Entity repaired Repairer
  • 13. Semantic Role Labeling 13 Question answering Narayanan and Harabagiu 2004 Shen and Lapata 2007 Khashabi et al. 2018 Machine translation Wu and Fung 2009 Aziz et al. 2011 Information extraction Surdeanu et al. 2003 Christensen et al. 2010
  • 14. Related work 14 Tutorial on Semantic Role Labeling at EMNLP 2017
  • 15. Related work } SRL systems that use syntax with simple NN architectures } [FitzGerald et al., 2015] } [Roth and Lapata,2016] } Recent models ignore linguistic bias } [Zhou and Xu, 2014] } [He et al., 2017] } [Marcheggiani et al., 2017] 15 Tutorial on Semantic Role Labeling at EMNLP 2017
  • 16. Motivations } Some semantic dependencies are mirrored in the syntactic graph Sequa makes and repairs jet engines. creator creation SBJ COORD OBJ CONJ NMOD ROOT 16
  • 17. Sequa makes and repairs jet engines. creator creation entity repaired repairer SBJ COORD OBJ CONJ NMOD ROOT Motivations } Some semantic dependencies are mirrored in the syntactic graph } Not all of them – syntax-semantics interface is not trivial 17
  • 18. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 18
  • 19. Graph Convolutional Networks (message passing) Undirected graph [Gori et al. 2005 Scarselli et al. 2009 Kipf and Welling,2016] 19
  • 20. Graph Convolutional Networks (message passing) Undirected graph Update of the blue node [Gori et al. 2005 Scarselli et al. 2009 Kipf and Welling,2016] 20
  • 21. Graph Convolutional Networks (message passing) Undirected graph Update of the blue node [Kipf and Welling,2016] 21 hi = ReLU 0 @W0hi + X j2N (v) W1hj 1 A <latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit><latexit sha1_base64="dRNZOAdr3+64yfJmCNqaHzngt30=">AAACcXicbVFdS9xAFJ2kttqtrdv6VEQYXGxXhCWRQvtSkPbFBxEt3Q8wS5jM3mxGJ5MwcyMNIT/Cn9Wf0N/Rh746WaOwbi8MnDnn3LkfE+VSGPS8P477bO35i/WNl51Xm6/fbHXfvhuZrNAchjyTmZ5EzIAUCoYoUMIk18DSSMI4uv7e6OMb0EZk6ieWOUxTNlciFpyhpcLubbB4o4pkATVNQkEfCcav66/0B5wOaSAhxn6r8JKpehx6q+7Dh6uGWR2YIg2rKxoIRYOUYcKZrM7q/s1BTcehb7OvlrMDLeYJHoTdnjfwFkFXgd+CHmnjPOz+DmYZL1JQyCUz5tL3cpxWTKPgEupOUBjIbQE2h+rXomBN9y03o3Gm7VFIF+ySkaXGlGlknU3r5qnWkP/TLguMv0wrofICQfH7QnEhKWa0WT+dCQ0cZWkB41rYFilPmGYc7Sd17Oz+00lXweho4HsD/+JT7/hbu4UNskP2SJ/45DM5JifknAwJJ/+cXeeD89H56753qbt3b3WdNmebLIV7eAfNqr4U</latexit> Neighborhood Self loop
  • 22. GCNs Pipeline Hidden layer Hidden layer Input Output X = H(0) H(1) H(2) Z = H(n) Initial feature representation of nodes Representation informed by nodes’ neighborhood [Kipf and Welling,2016] … … … 22
  • 23. GCNs Pipeline Hidden layer Hidden layer Input Output X = H(0) H(1) H(2) Z = H(n) [Kipf and Welling,2016] … … … Extend GCNs for syntactic dependency trees Initial feature representation of nodes Representation informed by nodes’ neighborhood 23
  • 24. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 24
  • 25. Example Lane disputed those estimates NMOD SBJ OBJ [Marcheggiani andTitov, 2017] 25
  • 26. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) [Marcheggiani andTitov, 2017] 26
  • 27. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥ W (1) subj ⇥ W (1) nm od ⇥W (1) obj [Marcheggiani andTitov, 2017] 27
  • 28. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥ W (1) subj ⇥ W (1) nm od ⇥W (1) obj ⇥W (1) obj 0 ⇥ W(1)nm od0 ⇥ W(1)subj0 [Marcheggiani andTitov, 2017] 28
  • 29. Example Lane disputed those estimates NMOD SBJ OBJ ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) self ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥ W (1) subj ⇥ W (1) nm od ⇥W (1) obj ⇥W (1) obj 0 ⇥ W(1)nm od0 ⇥ W(1)subj0 [Marcheggiani andTitov, 2017] 29
  • 30. Example ⇥W (1) self Lane disputed those estimates NMOD SBJ OBJ ⇥ W (1) subj ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) obj 0 ⇥ W (1) nm od ⇥ W(1)nm od0 ⇥W (1) obj ⇥ W(1)subj0 ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) [Marcheggiani andTitov, 2017] 30
  • 31. Example ⇥W (1) self Lane disputed those estimates NMOD SBJ OBJ ⇥ W (1) subj ⇥W (1) self ⇥W (1) self ⇥W (1) self ⇥W (1) obj 0 ⇥ W (1) nm od ⇥ W(1)nm od0 ⇥W (1) obj ⇥ W(1)subj0 ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) ⇥W (2) self ⇥W (2) self ⇥W (2) self ⇥W (2) self ⇥ W (2) subj ⇥ W(2)subj0 ⇥W (2) obj 0 ⇥W (2) obj ⇥ W (2)nm od ⇥ W (2) nm od 0 Stacking GCNs widens the syntactic neighborhood [Marcheggiani andTitov, 2017] 31
  • 32. Syntactic GCNs h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A [Marcheggiani andTitov, 2017] 32
  • 33. Syntactic GCNs h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Syntactic neighborhood [Marcheggiani andTitov, 2017] 33
  • 34. Syntactic GCNs Syntactic neighborhood h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Message [Marcheggiani andTitov, 2017] 34
  • 35. Syntactic GCNs Syntactic neighborhood Self-loop is included in N Messages are direction and label specific h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Message [Marcheggiani andTitov, 2017] 35
  • 36. } Overparametrized: one matrix for each label-direction pair } Syntactic GCNs Syntactic neighborhood W (k) L(u,v) = V (k) dir(u,v) Self-loop is included in N Messages are direction and label specific h(k+1) v = ReLU 0 @ X u2N (v) W (k) L(u,v)h(k) u + b (k) L(u,v) 1 A Message [Marcheggiani andTitov, 2017] 36
  • 37. Edge-wise Gates } Not all edges are equally important for the final task [Marcheggiani andTitov, 2017] 37
  • 38. Edge-wise Gates } Not all edges are equally important for the final task } We should not blindly rely on predicted syntax [Marcheggiani andTitov, 2017] 38
  • 39. Edge-wise Gates } Not all edges are equally important for the final task } We should not blindly rely on predicted syntax } Gates decide the“importance” of each message Lane disputed those estimates NMOD SBJ OBJ ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) g g g g g g g g g g [Marcheggiani andTitov, 2017] 39
  • 40. Edge-wise Gates } Not all edges are equally important for the final task } We should not blindly rely on predicted syntax } Gates decide the“importance” of each message Gates depend on nodes and edges Lane disputed those estimates NMOD SBJ OBJ ReLU(⌃·) ReLU(⌃·)ReLU(⌃·)ReLU(⌃·) g g g g g g g g g g [Marcheggiani andTitov, 2017] 40
  • 41. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 41
  • 42. Our Model } Word representation } Bidirectional LSTM encoder } GCN Encoder } Local role classifier [Marcheggiani andTitov, 2017] 42
  • 43. Word Representation } Pretrained word embeddings } Word embeddings } POS tag embeddings } Predicate lemma embeddings Lane disputed those estimates word representation [Marcheggiani andTitov, 2017] 43
  • 44. BiLSTM Encoder } Encode each word with its left and right context } Stacked BiLSTM Lane disputed those estimates word representation J layers BiLSTM [Marcheggiani andTitov, 2017] 44
  • 45. GCNs Encoder } Syntactic GCNs after BiLSTM encoder } Add syntactic information } Skip connections } Longer dependencies are captured Lane disputed those estimates word representation J layers BiLSTM dobj nmodnsubj K layers GCN [Marcheggiani andTitov, 2017] 45
  • 46. Semantic Role Classifier Lane disputed those estimates word representation J layers BiLSTM dobj nmodnsubj K layers GCN A1 Classifier predicate representation candidate argument representation } Local log-linear classifier p(r|ti, tp, l) / exp(Wl,r(ti tp)) 46
  • 47. Experiments } Data } CoNLL-2009 dataset - English and Chinese } F1 evaluation measure } Model } Hyperparameters tuned on English development set } State-of-the-art predicate disambiguation models [Marcheggiani andTitov, 2017] 47
  • 48. Ablation Experiments (Dev set) 82.7 83.3 81 82 83 84 85 English SRL w/o predicate disambiguation BiLSTM GCN [Marcheggiani andTitov, 2017] 48 75.2 77.1 73 74 75 76 77 78 Chinese SRL w/o predicate disambiguation BiLSTM GCN
  • 49. English Test Set 87.3 87.7 87.7 88 86 87 88 89 FitzGerald et al. (2015) (global) Roth and Lapata (2016) (global) Marcheggiani et al. (2017, CoNLL) (local) Ours (Bi-LSTM + GCN) (local) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 49
  • 50. English Out of Domain 75.2 76.1 77.7 77.2 74 75 76 77 78 FitzGerald et al. (2015) (global) Roth and Lapata (2016) (global) Marcheggiani et al. (2017, CoNLL) (local) Ours (Bi-LSTM + GCN) (local) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 50
  • 51. English Test Set (Ensemble) 87.7 87.9 89.1 86 87 88 89 90 FitzGerald et al. (2015) (ensemble) Roth and Lapata (2016) (ensemble) Ours (Bi-LSTM + GCN) (ensemble) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 51
  • 52. Chinese Test Set 77.7 78.6 79.4 82.5 76 77 78 79 80 81 82 83 Zhao et al. (2009) (global) Bjö̈rkelund et al. (2009) (global) Roth and Lapata (2016) (global) Ours (Bi-LSTM + GCN) (local) SRL with predicate disambiguation [Marcheggiani andTitov, 2017] 52
  • 53. Syntactic Graph Convolutional Networks 53 } Fast and simple } Can be seamlessly applied to other tasks
  • 54. Syntactic Graph Convolutional Networks 54 } Fast and simple } Can be seamlessly applied to other tasks Graph Convolutional Encoders for Syntax-aware Machine Translation Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an. In Proceedings of EMNLP, 2017.
  • 55. Syntactic Graph Convolutional Networks 55 } Fast and simple } Can be seamlessly applied to other tasks Graph Convolutional Encoders for Syntax-aware Machine Translation Joost Bastings,IvanTitov,Wilker Aziz,Diego Marcheggiani,Khalil Sima'an. In Proceedings of EMNLP, 2017. Improvements on English to German and English to Czech translations
  • 56. Multi-document Question Answering 56 [De Cao et al., 2018] • Nodes are entities and edges are co-reference links • Inference on a graph representing the documents collection
  • 61. Outline } Semantic Role Labeling } Graph Convolutional Networks (GCN) } Syntactic GCN for Semantic Role Labeling (SRL) } SRL Model } Exploiting Semantics in Neural MachineTranslation with GCNs 61
  • 62. Motivations [Marcheggiani at al., 2018] 62 John gave his wonderful wife a nice present . Giver Thing given Entity given to John gave a nice present to his wonderful wife . Giver Entity given to Thing given
  • 63. Motivations SRL helps to generalize over different surface realizations of the same underlying “meaning”. [Marcheggiani at al., 2018] 63 John gave his wonderful wife a nice present . Giver Thing given Entity given to John gave a nice present to his wonderful wife . Giver Entity given to Thing given
  • 66. Related work } Semantics in statistical MT } [Wu and Fung,2009] } [Liu and Gildea, 2010] } [Aziz et al., 2011] } ... } Syntax in neural MT } [Sennrich and Haddow,2016] } [Aharoni and Goldberg,2017 ] } [Bastings et al., 2017] } … } Semantics in neural MT } ??? [Marcheggiani at al., 2018] 66
  • 67. Predicate-argument encoding 67 John gave his wonderful wife a nice present WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself Semantic GCN Semantic GCN WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself Giver Thing given Entity given to
  • 68. Our Model } Standard sequence2sequence with attention } Semantic GCN encoder on top of a bidirectional RNN } RNN decoder [Marcheggiani at al., 2018] 68
  • 69. Our model John gave his wonderful wife a nice present WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself BiRNN/ CNN Semantic GCN Semantic GCN WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself <bos> John John + RNN DECODER ATTENTION MECHANISM [Marcheggiani at al., 2018] 69
  • 70. Our model John gave his wonderful wife a nice present WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself BiRNN/ CNN Semantic GCN Semantic GCN WA0 WA1 WA2 WA0’ WA2’ WA1’ Wself Wself Wself Wself Wself Wself Wself Wself <bos> John John + RNN DECODER ATTENTION MECHANISM [Marcheggiani at al., 2018] 70
  • 71. Experiments } Data } WMT‘16 English-German dataset (~4.5 million sentence pairs) } BLEU as evaluation measure } Model } Hyperparameters tuned on News Commentary En-De (~226K sentence pairs) } GRU as RNN [Marcheggiani at al., 2018] 71
  • 72. Results 23.3 23.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 72
  • 73. Results 23.3 23.9 24.5 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 73
  • 74. Results 23.3 23.9 24.5 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 74 + 1.2 BLEU
  • 75. Results 23.3 23.9 24.5 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU Semantics is helpful [Marcheggiani at al., 2018] 75 + 1.2 BLEU
  • 76. Results 23.3 23.9 24.5 24.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 76
  • 77. Results 23.3 23.9 24.5 24.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU [Marcheggiani at al., 2018] 77 + 1.6 BLEU
  • 78. Results 23.3 23.9 24.5 24.9 20 21 22 23 24 25 26 BiRNN (Bastings et al.2017) BiRNN + Syntactic GCN (Bastings et al.2017) BiRNN + Semantic GCN BiRNN+Syntactic GCN +Semantic GCN FullWMT 2016 English-German BLEU Syntax and semantics are complementary [Marcheggiani at al., 2018] 78 + 1.6 BLEU
  • 79. Analysis John sold the car to Mark . Seller Thing sold Buyer The boy walking down the dusty road is drinking a beer Walker AM-DIR Drinker Liquid SOURCE SEM GCN BiRNN John verkaufte das Auto nach Mark . John verkaufte das Auto an Mark . SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SOURCE [Marcheggiani at al., 2018] 79 BiRNN mistranslates “to” as “nach” (directionality)
  • 80. Analysis John sold the car to Mark . Seller Thing sold Buyer The boy walking down the dusty road is drinking a beer Walker AM-DIR Drinker Liquid SOURCE SEM GCN BiRNN John verkaufte das Auto nach Mark . John verkaufte das Auto an Mark . SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SOURCE [Marcheggiani at al., 2018] 80 BiRNN mistranslates “to” as “nach” (directionality)
  • 81. John sold the car to Mark . Seller Thing sold Buyer The boy walking down the dusty road is drinking a beer Walker AM-DIR Drinker Liquid SOURCE SEM GCN BiRNN John verkaufte das Auto nach Mark . John verkaufte das Auto an Mark . SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SOURCE 81 BiRNN mistranslates “to” as “nach” (directionality) Analysis [Marcheggiani at al., 2018]
  • 82. The boy sitting on a bench in the park plays chess . Thing sitting Location Player Game AM-LOC SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SEM GCN BiRNN Der Junge auf einer Bank im Park spielt Schach . Der Junge sitzt auf einer Bank im Park Schach . SOURCE Analysis [Marcheggiani at al., 2018] 82 Both translations are wrong, but the BiRNN’s one is grammatically correct
  • 83. The boy sitting on a bench in the park plays chess . Thing sitting Location Player Game AM-LOC SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SEM GCN BiRNN Der Junge auf einer Bank im Park spielt Schach . Der Junge sitzt auf einer Bank im Park Schach . SOURCE Analysis [Marcheggiani at al., 2018] 83 Both translations are wrong, but the BiRNN’s one is grammatically correct
  • 84. The boy sitting on a bench in the park plays chess . Thing sitting Location Player Game AM-LOC SEM GCN BiRNN Der Junge zu Fuß die staubige Straße ist ein Bier trinken . Der Junge , der die staubige Straße hinunter geht , trinkt ein Bier . SEM GCN BiRNN Der Junge auf einer Bank im Park spielt Schach . Der Junge sitzt auf einer Bank im Park Schach . SOURCE Analysis [Marcheggiani at al., 2018] 84 Both translations are wrong, but the BiRNN’s one is grammatically correct
  • 85. Conclusion } GCNs for encoding linguistic structures into NN } Semantics, coreference, discourse } Fast } Cheap } State-of-the-art model for dependency-based SRL } First to exploit semantics in NMT 85
  • 86. Roadmap 86 Including structured bias into neural NLP models
  • 87. Roadmap 87 Including structured bias into neural NLP models Low-resource setting
  • 88. Roadmap 88 Including structured bias into neural NLP models Low-resource setting Long-range dependencies Document level Cross-document level
  • 89. Roadmap 89 Including structured bias into neural NLP models Low-resource setting Long-range dependencies Document level Cross-document level Integrating external knowledge i.e., knowledge graphs
  • 90. Roadmap 90 Including structured bias into neural NLP models Low-resource setting Long-range dependencies Document level Cross-document level Integrating external knowledge i.e., knowledge graphs Thanks for your attention!