Slide 12
• Text representation, e.g., word and document representation, …
• Deep learning has been attracting increasing attention …
• A future direction of deep learning is to integrate unlabeled data …
• The Skip-gram model is quite effective and efficient …
[Figure: word cloud of key terms: text, word, document, embedding, classification, network, node, edge, degree]
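The Skip-gram model is only named on the slide; as a concrete illustration, here is a minimal, hypothetical sketch of Skip-gram with negative sampling in PyTorch. The vocabulary size, embedding dimension, number of negatives, and the random toy batch are assumptions for illustration, not the slides' setup.

# Minimal Skip-gram with negative sampling (illustrative sketch, not the slides' code).
import torch
import torch.nn as nn

class SkipGram(nn.Module):
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, dim)    # center-word vectors
        self.out_embed = nn.Embedding(vocab_size, dim)   # context-word vectors

    def forward(self, center, context, negatives):
        v = self.in_embed(center)                        # (B, d)
        u_pos = self.out_embed(context)                  # (B, d)
        u_neg = self.out_embed(negatives)                # (B, k, d)
        pos = torch.log(torch.sigmoid((v * u_pos).sum(-1)) + 1e-10)
        neg = torch.log(torch.sigmoid(-(u_neg @ v.unsqueeze(-1)).squeeze(-1)) + 1e-10).sum(-1)
        return -(pos + neg).mean()                       # negative-sampling objective

# Toy usage: a vocabulary of 100 word ids, a batch of (center, context, 5 negatives).
model = SkipGram(vocab_size=100)
center = torch.randint(0, 100, (8,))
context = torch.randint(0, 100, (8,))
negatives = torch.randint(0, 100, (8, 5))
loss = model(center, context, negatives)
loss.backward()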
Slide 40
• Information networks encode the relationships between the data objects …
[Figure: a text information network linking words (text, information, network, word, …, classification) to documents doc_1, doc_2, doc_3, doc_4, …]
Slide 59
[Figure: the text information network extended with labels: words (text, information, network, word, …, classification) link both to documents doc_1 … doc_4 and to labels label_1, label_2, label_3; some documents carry labels while others are unlabeled (null)]
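To make the figure concrete, here is a minimal, hypothetical sketch (not from the slides) of assembling such word-document and word-label networks from a toy corpus in which some documents are unlabeled; the corpus, identifiers, and edge weighting by raw counts are assumptions.

# Illustrative sketch: build word-document and word-label bipartite networks
# from a toy, partially labeled corpus: (doc id, text, label or None).
from collections import Counter

corpus = [
    ("doc_1", "text information network word classification", "label_1"),
    ("doc_2", "word embedding text network", "label_2"),
    ("doc_3", "information network node edge", None),    # unlabeled ("null") document
    ("doc_4", "document classification text", None),
]

word_doc_edges = Counter()    # (word, doc)   -> term frequency
word_label_edges = Counter()  # (word, label) -> co-occurrence count

for doc_id, text, label in corpus:
    for word in text.split():
        word_doc_edges[(word, doc_id)] += 1
        if label is not None:         # only labeled documents contribute word-label edges
            word_label_edges[(word, label)] += 1

print(word_doc_edges[("text", "doc_1")])      # 1
print(word_label_edges[("word", "label_2")])  # 1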
Slide 101
Efficient graphlet kernels for large graph comparison
[Figure 2: All graphlets of size 4, denoted F1 through F11]
We now consider size 4 graphlets. Modulo isomorphism there are 11 graphlets of size 4 (see Figure 2). Let us denote these graphlets Fi and their counts |Fi|, i ∈ {1, 2, …, 11}. As in the previous case, we will first count all graphlets which contain at least one edge.
Assume we want to count subgraphs containing edge (v1, v2). As before, for v2 there are |N(v1)| choices and for each pair (v1, v2) we have 4 cases for the third node v3: v3 ∈ N(v1) ∩ N(v2), v3 ∈ N(v1) \ N(v2), …
… these graphlets by 2.

Table 1: Statistics on classification datasets
dataset    size    classes
MUTAG       188    2 (125 vs. 63)
PTC         344    2 (192 vs. 152)
Enzyme      600    6 (100 each)
D & D      1178    2 (691 vs. 487)
5 Experiments
In this section, we evaluate the graphlet kernel and compare it with state-of-the-art graph kernels in terms of runtime, scalability, and accuracy. Our baseline comparators are the random walk kernel of (Gärtner et al., 2004; Vishwanathan et al., …), which counts common walks in two graphs, and the shortest path kernel of (Borgwardt & Kriegel, …), which compares shortest path lengths in two graphs.
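To illustrate what a graphlet kernel computes, here is a small brute-force sketch (not the paper's counting scheme, which is designed to avoid exactly this enumeration), assuming NetworkX is available: it enumerates all induced 4-node subgraphs, groups them by isomorphism type via an edge-count plus degree-sequence signature (sufficient to separate the 11 size-4 graphlets), and takes the kernel value as the dot product of the normalized count vectors.

# Illustrative brute-force graphlet kernel (exponential in graph size).
from itertools import combinations
import networkx as nx

def graphlet_frequencies(G, k=4):
    """Brute-force counts of induced k-node subgraphs, grouped by isomorphism type."""
    counts = {}
    for nodes in combinations(G.nodes(), k):
        sub = G.subgraph(nodes)
        # Edge count + degree sequence is enough to separate the 11 graphlets of size 4.
        sig = (sub.number_of_edges(), tuple(sorted(d for _, d in sub.degree())))
        counts[sig] = counts.get(sig, 0) + 1
    total = sum(counts.values())
    return {sig: c / total for sig, c in counts.items()}   # normalized frequency vector

def graphlet_kernel(G1, G2, k=4):
    """Kernel value: dot product of the two normalized graphlet-frequency vectors."""
    f1, f2 = graphlet_frequencies(G1, k), graphlet_frequencies(G2, k)
    return sum(f1[s] * f2.get(s, 0.0) for s in f1)

# Toy usage on two small random graphs.
G1 = nx.gnp_random_graph(10, 0.3, seed=1)
G2 = nx.gnp_random_graph(10, 0.3, seed=2)
print(graphlet_kernel(G1, G2))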
Slide 114
• How to assemble by end-to-end learning?
• We can adapt deep learning methods developed for text
Slide 115

[Figure: end-to-end architecture. (a) Sample through random walks: from a graph on nodes A…F, K node sequences of T nodes each are drawn. (b) Sequence input and node embedding: a sampled sequence such as A B C F is mapped to node embeddings x1, x2, x3, x4. (c) Bi-directional GRU: the embedded sequence is encoded into hidden states h1, h2, h3, h4. (d) Attention followed by dense layers produces the output.]
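As a rough, hypothetical sketch of the pipeline in the figure (random-walk sampling, node embedding, bi-directional GRU, attention, dense output) in PyTorch; the toy graph, walk length, dimensions, and the attention-pooling details are assumptions rather than the original model.

# Minimal sketch: random-walk sampling -> node embedding -> bi-directional GRU
# -> attention pooling -> dense output. All hyperparameters are illustrative.
import random
import torch
import torch.nn as nn

def random_walk(adj, start, length):
    """Sample one node sequence of `length` nodes by a simple random walk."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(adj[walk[-1]]))
    return walk

class WalkEncoder(nn.Module):
    def __init__(self, num_nodes, embed_dim=32, hidden_dim=64, out_dim=2):
        super().__init__()
        self.embed = nn.Embedding(num_nodes, embed_dim)            # node embedding
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True,
                          bidirectional=True)                      # bi-directional GRU
        self.att = nn.Linear(2 * hidden_dim, 1)                    # attention scores
        self.dense = nn.Sequential(nn.Linear(2 * hidden_dim, hidden_dim),
                                   nn.ReLU(), nn.Linear(hidden_dim, out_dim))

    def forward(self, walks):                                      # walks: (K, T) node ids
        h, _ = self.gru(self.embed(walks))                         # (K, T, 2*hidden)
        w = torch.softmax(self.att(h), dim=1)                      # attention over the T steps
        pooled = (w * h).sum(dim=1)                                # (K, 2*hidden)
        return self.dense(pooled)                                  # one output per sequence

# Toy graph on 6 nodes (0..5); sample K=4 walks of T=4 nodes and run the model.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4]}
walks = torch.tensor([random_walk(adj, start=0, length=4) for _ in range(4)])
model = WalkEncoder(num_nodes=6)
print(model(walks).shape)  # torch.Size([4, 2])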