Scalable Global Alignment Graph Kernel Using
Random Features: From Node Embedding to
Graph Embedding
KDD2019
Lingfei Wu, Ian En-Hsu Yen, Zhen Zhang †, Kun Xu, Liang Zhao, Xi
Peng, Yinglong Xia, Charu Aggarwal
Presenter: Hagawa, Nishi, Eugene
2019.11.11
1 / 35
Problem Setup
Goal:
▶ Create a good kernel to measure graph similarity
▶ Low computational complexity
▶ Takes into account global and local graph properties
▶ Is positive definite
▶ Leads to a good classifier
Application:
▶ Kernel SVM (input: graph, output: binary)
▶ Kernel PCA
▶ Kernel Ridge Regression
▶ . . .
How similar?
[Figure: two graphs compared, k(·, ·) = 0.5]
2 / 35
Difficulty: Graph isomorphism
It is difficult to define similarity between graphs.
▶ Two graphs: G1(V1, E1, ℓ1, L1), G2(V2, E2, ℓ2, L2)
▶ G1 is isomorphic to G2 if and only if a bijection¹ f exists
▶ Bijection f : V1 → V2 s.t. {va, vb} ∈ E1 iff {f(va), f(vb)} ∈ E2
▶ Subgraph (partial) isomorphism is NP-complete
1
Bijection
3 / 35
Related Work
Two groups of recent graph kernel methods
Comparing sub-structure:
▶ The major difference is how to define and explore sub-structures
- random walks, shortest paths, cycles, subtree patterns, graphlets...
Geometric node embeddings:
▶ Capture global property
▶ Achieved state-of-the-art performance in the graph classification task
Drawbacks of related work
Comparing sub-structure:
▶ Does not take the global graph property into account
Geometric node embeddings:
▶ Do not necessarily yield a positive definite kernel
▶ Poor scalability
4 / 35
Contribution
▶ Propose a positive definite kernel
▶ Reduce computational complexity
▶ From quadratic to (quasi-)linear 2
▶ Propose an approximation of the kernel with convergence analysis
▶ Take into account global property
▶ Outperforms 12 state-of-the-art graph classification algorithms
- Including graph kernels and deep graph neural networks
2
quasi-linear: n log n, in both time and space.
5 / 35
Common kernel
Compare two graphs directly using a kernel
Similarity: k(·, ·)
Figure: calculation of kernel value between 2 graphs
6 / 35
Proposed kernel
Compare two graphs via their similarities to a set of random graphs
Similarity: k(·, ·)
Random Graphs
Similarity with k(·, ·)
Figure: calculation of kernel value between 2 graphs
7 / 35
Notation : Graph definition
Graph: G = (V, E, ℓ)
Nodes: V = {v_i}_{i=1}^n
Edges: E ⊆ (V × V)
Label assignment function: ℓ : V → Σ
Number of nodes: n
Number of edges: m
Node label: l
Number of graphs: N
[Figure: an example graph G with V = {v1, v2, v3} and label alphabet Σ = {●, ○}]
8 / 35
Notation
Set of graphs: G = {G_i}_{i=1}^N
Set of graph labels: Y = {Y_i}_{i=1}^N
Set of geometric embeddings (one per graph): U = {u_i}_{i=1}^n ∈ R^{n×d}
Latent node embedding space (one per node): u ∈ R^d
[Figure: N data graphs G_1 ... G_N with graph labels Y_1 ... Y_N; each graph has n nodes, each with a latent node embedding u_i ∈ R^d]
9 / 35
Geometric Embeddings
Use partial eigendecomposition 3 to extract node embeddings:
1. Create the normalized Laplacian matrix L ∈ R^{n×n}
2. Do a partial eigendecomposition L ≈ UΛU^T
3. Use the d eigenvectors with the smallest eigenvalues as U
[Figure: normalized Laplacian L (n×n) → partial eigendecomposition UΛU^T (n×d, d×d, d×n); keep the smallest d eigenvectors]
Example (nodes A, B, C; edges A–B, A–C):

Adjacency matrix:    Degree matrix:    Laplacian (Degree − Adjacency):
   A B C                A B C              A  B  C
A  0 1 1             A  2 0 0          A   2 -1 -1
B  1 0 0             B  0 1 0          B  -1  1  0
C  1 0 0             C  0 0 1          C  -1  0  1

...then normalize.
Figure: Example obtaining U
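A minimal code sketch of steps 1–3 (my own naming, not the authors' code), assuming an undirected graph given as a SciPy sparse adjacency matrix:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def node_embeddings(A, d):
    """Return U: the d eigenvectors of the normalized Laplacian
    with the smallest eigenvalues (one row per node). Assumes d < n."""
    deg = np.asarray(A.sum(axis=1)).ravel()
    inv_sqrt = np.zeros_like(deg, dtype=float)
    nz = deg > 0
    inv_sqrt[nz] = 1.0 / np.sqrt(deg[nz])
    D_inv_sqrt = sp.diags(inv_sqrt)
    # Normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    L = sp.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt
    # Partial eigendecomposition: only the d smallest-magnitude eigenpairs
    _, U = eigsh(L, k=d, which="SM")
    return U  # shape (n, d)
```

For the 3-node example above, node_embeddings(A, 2) returns a 3×2 matrix U.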
3
Time complexity: linear in the number of graph edges (I don't know how this is achieved).
10 / 35
Transportation Distance [1]
Earth Mover’s Distance (EMD): measure of dissimilarity
EMD(G_x, G_y) := min_{T ∈ R₊^{n_x×n_y}} ⟨D, T⟩   s.t.  T1 = t(G_x),  Tᵀ1 = t(G_y)
▶ A linear programming problem (a code sketch follows below)
▶ Flow matrix T
- T_ij: how much of v_i in G_x travels to v_j in G_y
▶ G_X → U_X = {u^x_1, u^x_2, ..., u^x_{n_x}}
▶ G_Y → U_Y = {u^y_1, u^y_2, ..., u^y_{n_y}}
▶ Transport cost matrix D
- D_ij = ∥u^x_i − u^y_j∥₂
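A minimal sketch of this linear program, assuming the POT library (`pip install pot`) and node embedding matrices U_x, U_y from the previous slide; t_x, t_y are the nBOW weight vectors defined on the next slide:

```python
import numpy as np
import ot  # Python Optimal Transport

def emd(Ux, Uy, tx, ty):
    # Transport cost matrix: D_ij = ||u_i^x - u_j^y||_2
    D = np.linalg.norm(Ux[:, None, :] - Uy[None, :, :], axis=2)
    # ot.emd2 solves min_{T >= 0, T1 = tx, T^T 1 = ty} <D, T>
    return ot.emd2(tx, ty, D)
```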
11 / 35
Transportation Distance [1]
Earth Mover’s Distance (EMD): measure of dissimilarity
EMD(G_x, G_y) := min_{T ∈ R₊^{n_x×n_y}} ⟨D, T⟩   s.t.  T1 = t(G_x),  Tᵀ1 = t(G_y)
▶ Node v_i has c_i outgoing edges
▶ Normalized bag-of-words (nBOW) weight: t_i = c_i / Σ_{j=1}^n c_j ∈ R
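A minimal sketch of the nBOW weights, assuming the (possibly sparse) adjacency matrix A from before; names are mine:

```python
import numpy as np

def nbow(A):
    c = np.asarray(A.sum(axis=1)).ravel()  # c_i: outgoing edges of node v_i
    return c / c.sum()                     # t_i = c_i / sum_j c_j
```

For the star graph above (A–B, A–C): c = (2, 1, 1), so t = (1/2, 1/4, 1/4), matching the example on slide 18.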
12 / 35
Transportation Distance: Example
Figure: EMD example
▶ EMD focuses on the node sizes and outgoing edges of each graph
13 / 35
A straightforward way to define a kernel, but it is costly
EMD-based kernel:  K = −(1/2) J D_emd J,   where J = I − (1/N) 1 1ᵀ
▶ Not necessarily positive definite
▶ Time complexity: O(N² n³ log n), space complexity: O(N²)
Graphs A, B, C → pairwise distance matrix D_emd:
      A          B          C
A  EMD(A,A)  EMD(A,B)  EMD(A,C)
B  EMD(B,A)  EMD(B,B)  EMD(B,C)
C  EMD(C,A)  EMD(C,B)  EMD(C,C)
Figure: Straightforward kernel based on EMD
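A minimal sketch of this double-centering construction, assuming D_emd is the precomputed N × N pairwise EMD matrix:

```python
import numpy as np

def emd_kernel(D_emd):
    N = D_emd.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N   # centering matrix J = I - (1/N) 1 1^T
    return -0.5 * J @ D_emd @ J           # not guaranteed positive definite
```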
14 / 35
Global Alignment Graph Kernel
Using EMD and Random Features (RF)
Proposed Kernel: 4
k(G_x, G_y) := ∫ p(G_ω) φ_{G_ω}(G_x) φ_{G_ω}(G_y) dG_ω
where φ_{G_ω}(G_x) := exp(−γ EMD(G_x, G_ω))
▶ G_ω: a random graph with node embeddings W = {w_i}_{i=1}^D
▶ each w_i is sampled from V ⊆ R^d
▶ p(G_ω) is a distribution over the space of all random graphs of variable sizes Ω := ∪_{D=1}^{D_max} V^D
4
I meant to dig into the details of the random graphs, but the discussion is quite involved and I (Hagawa) gave up; see the paper if you are curious. I really don't get probability.
15 / 35
Global Alignment Graph Kernel Using EMD and RF
Approximation5:
Approximation⁵:
k̃(G_x, G_y) = (1/R) Σ_{i=1}^R φ_{G_ωi}(G_x) φ_{G_ωi}(G_y) → k(G_x, G_y)  as R → ∞
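A minimal sketch of this R-sample Monte Carlo approximation, reusing the emd helper sketched on slide 11; random_graphs is a list of (W, t_ω) pairs and gamma is the kernel hyperparameter (my naming, not the paper's):

```python
import numpy as np

def approx_kernel(Ux, tx, Uy, ty, random_graphs, gamma):
    # phi_{G_omega}(G) = exp(-gamma * EMD(G, G_omega)), one feature per random graph
    phi_x = np.array([np.exp(-gamma * emd(Ux, W, tx, tw))
                      for W, tw in random_graphs])
    phi_y = np.array([np.exp(-gamma * emd(Uy, W, ty, tw))
                      for W, tw in random_graphs])
    return phi_x @ phi_y / len(random_graphs)  # -> k(Gx, Gy) as R grows
```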
[Figure: G_x and G_y are each compared to random graphs G_ω through φ_{G_ω}(·)]
5
The approximation converges uniformly to the proposed kernel.
16 / 35
Algorithm
Set data and hyperparameters
▶ Node embedding size (dimension): d
▶ Max size of random graphs: Dmax
▶ Graph embedding size: R
[Figure: N data graphs, R random graphs of size up to D_max, node embedding dimension d]
Algorithm 1 Random Graph Embedding
Input: Data graphs {G_i}_{i=1}^N, node embedding size d, maximum size of random graphs D_max, graph embedding size R.
Output: Feature matrix Z ∈ R^{N×R} for the data graphs
1: Compute nBOW weight vectors {t(G_i)}_{i=1}^N and the normalized Laplacian L of all graphs
2: Obtain node embedding vectors {u_i}_{i=1}^n by computing the d smallest eigenvectors of L
3: for j = 1, ..., R do
4:   Draw D_j uniformly from [1, D_max].
5:   Generate a random graph G_ωj with D_j node embeddings W from Algorithm 2.
6:   Compute a feature vector Z_j = φ_{G_ωj}({G_i}_{i=1}^N) using EMD or another optimal transportation distance in Equation (3).
7: end for
8: Return feature matrix Z({G_i}_{i=1}^N) = (1/√R) {Z_i}_{i=1}^R
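A minimal end-to-end sketch of Algorithm 1, gluing together the helpers sketched on earlier slides (node_embeddings, nbow, emd). Note the random-graph generator here is the simple data-independent variant (uniform points in the unit hypercube), not the paper's Algorithm 2:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_graph_embedding(adjs, d, D_max, R, gamma):
    # Steps 1-2: nBOW weights and node embeddings for every data graph
    graphs = [(node_embeddings(A, d), nbow(A)) for A in adjs]
    Z = np.zeros((len(adjs), R))
    for j in range(R):                       # step 3
        Dj = rng.integers(1, D_max + 1)      # step 4: D_j ~ U[1, D_max]
        W = rng.random((Dj, d))              # step 5: random node embeddings
        tw = np.full(Dj, 1.0 / Dj)           # uniform weights on G_omega
        for i, (U, t) in enumerate(graphs):  # step 6: Z_ji = phi_{G_omega_j}(G_i)
            Z[i, j] = np.exp(-gamma * emd(U, W, t, tw))
    return Z / np.sqrt(R)                    # step 8: N x R feature matrix
```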
17 / 35
Compute {t(G_i)}_{i=1}^N and the Laplacian matrix L
Example (nodes A, B, C; edges A–B, A–C):
Laplacian matrix:
   A  B  C
A  2 -1 -1
B -1  1  0
C -1  0  1
t(G_x) = (1/2, 1/4, 1/4)
→ repeat for all graphs
18 / 35
Obtain node embedding vectors
For each graph: normalized Laplacian L (n×n) → partial eigendecomposition UΛUᵀ (n×d, d×d, d×n) → the smallest d eigenvectors give node embeddings u_1, u_2, u_3, ... ∈ R^d
19 / 35
Generate random graph 6
D_j ← Rand(1, D_max)
W_{D_j×d} ← Generate_random_graph(D_j, d)
Figure: example of a random graph with 2 nodes (embeddings u_1, u_2 ∈ R^d)
6
In a later section, I show two ways to generate random graphs.
20 / 35
Compute a feature vector Z_j
[Figure: column vector z_j = (Z_j1, ..., Z_jN)ᵀ of similarities to the random graph G_ω]
Z_ji = φ_{G_ω}(G_i) := exp(−γ EMD(G_i, G_ω))
21 / 35
Generate random graphs R times
[Figure: repeating the construction R times yields columns z_1, z_2, ..., z_R, each z_j = (Z_j1, ..., Z_jN)ᵀ]
7
R: the number of random graphs
22 / 35
Output N × R Matrix Z
Z = (1/√R) [z_1, ..., z_R] ∈ R^{N×R}, with entries Z_ji
23 / 35
How to generate Random Graphs
Data-independent and Data-dependent Distributions
Data-dependent⁸
Random Graph Embedding (Anchor Sub-Graphs, ASG):
1. Pick a graph G_k from the data set
2. Uniformly draw D_j nodes
3. {w_i}_{i=1}^{D_j} = {u_{n_1}, u_{n_2}, ..., u_{n_{D_j}}}
Incorporating label information:
▶ d(u_i, u_j) = max(∥u_i − u_j∥₂, √d) if v_i and v_j have different node labels
▶ This enlarges the distance between nodes with different labels
▶ √d is the largest distance in a d-dimensional unit hypercube
(A sketch of the ASG generator follows below.)
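A minimal sketch of the ASG generator under the above description (names mine; the label-aware distance would additionally replace the Euclidean cost inside EMD):

```python
import numpy as np

rng = np.random.default_rng(0)

def anchor_subgraph(graphs, D_max):
    """graphs: list of (U, t) pairs from the data set; returns (W, tw)."""
    U, _ = graphs[rng.integers(len(graphs))]   # 1. pick G_k from the data set
    Dj = rng.integers(1, D_max + 1)            # 2. draw D_j uniformly
    idx = rng.choice(U.shape[0], size=min(Dj, U.shape[0]), replace=False)
    W = U[idx]                                 # 3. {w_i}: sampled node embeddings
    return W, np.full(len(W), 1.0 / len(W))
```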
8
For the data-independent distribution, see the appendix.
24 / 35
Complexity comparison (Left: Proposed, Right: Straightforward)
[Figure: proposed kernel — G_x, G_y compared to random graphs G_ω via φ_{G_ω}(·)]
Figure: Proposed kernel
[Figure: straightforward kernel — N × N pairwise EMD distance matrix]
Figure: Straightforward kernel
Time complexity (dmz is the partial eigendecomposition cost)⁹:
▶ Proposed: O(N·R·D²·n·log n + dmz)   ▶ Straightforward: O(N²·n³·log n + dmz)
※ R is the number of random graphs, D the number of random-graph nodes (D < n)
Space complexity:
▶ Proposed: O(N·R)   ▶ Straightforward: O(N²)
9
dmz is eigendecomposition cost.
25 / 35
Experiments
Experimental setup
Classifier:
▶ Linear SVM (LIBLINEAR)
Data:
▶ 9 Datasets
Hyperparameters:
▶ γ (kernel) ∈ {1e-3, 1e-2, 1e-1, 1, 10}
▶ D_max (size of random graphs) ∈ {3, 6, ..., 30}
▶ SVM parameters
Evaluation:
▶ 10-fold cross-validation (see the sketch below)
▶ Accuracy averaged over 10 repetitions
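A minimal sketch of this protocol, assuming the feature matrix Z from Algorithm 1 and labels y; the paper uses LIBLINEAR, approximated here with scikit-learn's LinearSVC:

```python
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

def evaluate(Z, y, C=1.0):
    clf = LinearSVC(C=C, max_iter=10000)
    scores = cross_val_score(clf, Z, y, cv=10)  # 10-fold cross-validation
    return scores.mean(), scores.std()
```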
26 / 35
Number of random graphs (R) vs. testing accuracy:
[Figure 2, accuracy panels: Testing Accuracy vs. R (R = 10⁰ ... 10⁴) for RGE(RF), RGE(ASG), and RGE(ASG)-NodeLab on (a) ENZYMES, (b) NCI109, (c) IMDBBINARY, (d) COLLAB; runtime panels (e)–(h): Total Runtime (seconds) vs. R on the same datasets]
Figure 2: Test accuracies and runtime of three variants of RGE with and without node labels when varying R.
▶ Accuracy converges very rapidly as R increases
Number of random graphs (R) vs. runtime:
[Figure 2 shown again; see the runtime panels (e)–(h) above.]
▶ Runtime shows quasi-linear scalability with respect to R
27 / 35
[Figure 3: (a) runtime vs. number of graphs N (10²–10⁴) and (b) runtime vs. graph size n (10²–10³), log–log scale, for RGE(Eigentime), RGE(FeaGentime), and RGE(Runtime), with linear and quadratic reference lines]
▶ Shows linear scalability with respect to N (a)
▶ Shows quasi-linear scalability with respect to n (b)
28 / 35
Classification accuracy:
Table 1: Comparison of classification accuracy against graph kernel methods without node labels.
Datasets MUTAG PTC-MR ENZYMES NCI1 NCI109
RGE(RF) 86.33 ± 1.39(1s) 59.82 ± 1.42(1s) 35.98 ± 0.89(38s) 74.70 ± 0.56(727s) 72.50 ± 0.32(865s)
RGE(ASG) 85.56 ± 0.91(2s) 59.97 ± 1.65 (1s) 38.52 ± 0.91(18s) 74.30 ± 0.45(579s) 72.70 ± 0.42(572s)
EMD 84.66 ± 2.69 (7s) 57.65 ± 0.59 (46s) 35.45 ± 0.93 (216s) 72.65 ± 0.34 (8359s) 70.84 ± 0.18 (8281s)
PM 83.83 ± 2.86 59.41 ± 0.68 28.17 ± 0.37 69.73 ± 0.11 68.37 ± 0.14
Lo- 82.58 ± 0.79 55.21 ± 0.72 26.5 ± 0.54 62.28 ± 0.34 62.52 ± 0.29
OA-E (A) 79.89 ± 0.98 56.77 ± 0.85 36.12 ± 0.81 67.99 ± 0.28 67.14 ± 0.26
RW 77.78 ± 0.98 56.18 ± 1.12 20.17 ± 0.83 56.89 ± 0.34 56.13 ± 0.31
GL 66.11 ± 1.31 57.05 ± 0.83 18.16 ± 0.47 47.37 ± 0.15 48.39 ± 0.18
SP 82.22 ± 1.14 56.18 ± 0.56 28.17 ± 0.64 62.02 ± 0.17 61.41 ± 0.32
(Table 2, comparing accuracy against graph kernel methods with node labels or the WL technique, appears in full on the next slide.)
▶ RGE is much faster than EMD
29 / 35
Table 2: Comparison of classification accuracy against graph kernel methods with node labels or the WL technique.
Datasets PTC-MR ENZYMES PROTEINS NCI1 NCI109
RGE(ASG) 61.5 ± 2.34(1s) 48.27 ± 0.99(28s) 75.98 ± 0.71(20s) 76.46 ± 0.45(379s) 74.42 ± 0.30(526s)
EMD 57.67 ± 2.11 (42s) 42.85 ± 0.72 (296s) 76.03 ± 0.28 (1936s) 75.89 ± 0.16 (7942s) 73.63 ± 0.33 (8073s)
PM 60.38 ± 0.86 40.33 ± 0.34 74.39 ± 0.45 72.91 ± 0.53 71.97 ± 0.15
OA-E (A) 58.76 ± 0.92 43.56 ± 0.66 — 69.83 ± 0.30 68.96 ± 0.35
V-OA 56.4 ± 1.8 35.1 ± 1.1 73.8 ± 0.5 65.6 ± 0.4 65.1 ± 0.4
RW 57.06 ± 0.86 19.33 ± 0.62 71.67 ± 0.78 63.34 ± 0.27 63.51 ± 0.18
GL 59.41 ± 0.94 32.70 ± 1.20 71.63 ± 0.33 66.00 ± 0.07 66.59 ± 0.08
SP 60.00 ± 0.72 41.68 ± 1.79 73.32 ± 0.45 73.47 ± 0.11 73.07 ± 0.11
WL-RGE(ASG) 62.20 ± 1.67(1s) 57.97 ± 1.16(38s) 76.63 ± 0.82(30s) 85.85 ± 0.42(401s) 85.32 ± 0.29(798s)
WL-ST 57.64 ± 0.68 52.22 ± 0.71 72.92 ± 0.67 82.19 ± 0.18 82.46 ± 0.24
WL-SP 56.76 ± 0.78 59.05 ± 1.05 74.49 ± 0.74 84.55 ± 0.36 83.53 ± 0.30
WL-OA-E (A) 59.72 ± 1.10 53.76 ± 0.82 — 84.75 ± 0.21 84.23 ± 0.19
Table 3: Comparison of classification accuracy against recent deep learning models on graphs.
Datasets PTC-MR PROTEINS NCI1 IMDB-B IMDB-M COLLAB
(WL-)RGE(ASG) 62.20 ± 1.67 76.63 ± 0.82 85.85 ± 0.42 71.48 ± 1.01 47.26 ± 0.89 76.85 ± 0.34
DGCNN 58.59 ± 2.47 75.54 ± 0.94 74.44 ± 0.47 70.03 ± 0.86 47.83 ± 0.85 73.76 ± 0.49
PSCN 62.30 ± 5.70 75.00 ± 2.51 76.34 ± 1.68 71.00 ± 2.29 45.23 ± 2.84 72.60 ± 2.15
DCNN 56.6 ± 1.20 61.29 ± 1.60 56.61 ± 1.04 49.06 ± 1.37 33.49 ± 1.42 52.11 ± 0.53
DGK 57.32 ± 1.13 71.68 ± 0.50 62.48 ±0.25 66.96 ± 0.56 44.55 ± 0.52 73.09 ± 0.25
▶ Outperforms other graph kernels and deep learning approaches
▶ RGE is much faster than EMD
▶ The WL technique improves performance
30 / 35
Conclusion
We proposed a good graph kernel!
▶ Scalable
▶ Takes the global graph property into account
Thank you.
31 / 35
Appendix I
▶ If two graphs are isomorphic, the eigenvalues of their adjacency matrices coincide, but the converse does not hold
Normalized Laplacian matrix:
L_{i,j} :=
  1                             if i = j and deg(v_i) ≠ 0
  −1 / √(deg(v_i) deg(v_j))     if i ≠ j and v_i is adjacent to v_j
  0                             otherwise
deg(v): Degree of node (vertex) v
32 / 35
Appendix II
33 / 35
Appendix III Table 4: Properties of the datasets.
Dataset MUTAG PTC ENZYMES PROTEINS NCI1 NCI109 IMDB-B IMDB-M COLLAB
Max # Nodes 28 109 126 620 111 111 136 89 492
Min # Nodes 10 2 2 4 3 4 12 7 32
Ave # Nodes 17.9 25.6 32.6 39.05 29.9 29.7 19.77 13.0 74.49
Max # Edges 33 108 149 1049 119 119 1249 1467 40119
Min # Edges 10 1 1 5 2 3 26 12 60
Ave # Edges 19.8 26.0 62.1 72.81 32.3 32.1 96.53 65.93 2457.34
# Graph 188 344 600 1113 4110 4127 1000 1500 5000
# Graph Labels 2 2 6 2 2 2 2 3 3
# Node Labels 7 19 3 3 37 38 — — —
(Truncated excerpt from the paper's experimental setup: generated graphs have twice as many edges as nodes; node embedding size d = 6 for the scalability runs (otherwise 4, 6, or 8, kept the same for all RGE variants on a dataset); D_max = 10 and R = 128; runtimes are reported for the eigensolver, RGE graph embedding generation, and the total; a linear SVM implemented in LIBLINEAR is used so accuracy faithfully reflects the feature representation; hyperparameter ranges are γ ∈ [1e-3 1e-2 1e-1 1 10] and D_max ∈ [3:3:30], optimized only on the training set; the whole experiment is repeated ten times (thus 100 runs per dataset) and average accuracies with standard deviations are reported; baseline numbers are taken from the original papers, except EMD, which was rerun for both accuracy and runtime comparisons.)
Terms
WL (Weisfeiler-Lehman) test:
▶ A technique that improves the kernel using node labels
RGE(ASG)-NodeLab:
▶ Data-dependent random graph + Incorporating Label information
WL-RGE:
▶ Data-dependent random graph + WL test
34 / 35
References I
[1] Giannis Nikolentzos, Polykarpos Meladianos, and Michalis Vazirgiannis. Matching node embeddings for graph similarity. In Thirty-First AAAI Conference on Artificial Intelligence, 2017.
35 / 35

More Related Content

What's hot

Lecture 3 image sampling and quantization
Lecture 3 image sampling and quantizationLecture 3 image sampling and quantization
Lecture 3 image sampling and quantizationVARUN KUMAR
 
Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Frank Nielsen
 
Litvinenko low-rank kriging +FFT poster
Litvinenko low-rank kriging +FFT  posterLitvinenko low-rank kriging +FFT  poster
Litvinenko low-rank kriging +FFT posterAlexander Litvinenko
 
Graph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & TrendsGraph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & TrendsLuc Brun
 
QMC Error SAMSI Tutorial Aug 2017
QMC Error SAMSI Tutorial Aug 2017QMC Error SAMSI Tutorial Aug 2017
QMC Error SAMSI Tutorial Aug 2017Fred J. Hickernell
 
Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics Alexander Litvinenko
 
Tailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsFrank Nielsen
 
Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016Daniel Omunting
 
Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...
Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...
Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...Communication Systems & Networks
 
Efficient Technique for Image Stenography Based on coordinates of pixels
Efficient Technique for Image Stenography Based on coordinates of pixelsEfficient Technique for Image Stenography Based on coordinates of pixels
Efficient Technique for Image Stenography Based on coordinates of pixelsIOSR Journals
 
Efficient end-to-end learning for quantizable representations
Efficient end-to-end learning for quantizable representationsEfficient end-to-end learning for quantizable representations
Efficient end-to-end learning for quantizable representationsNAVER Engineering
 
Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...
Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...
Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...CDSL_at_SNU
 

What's hot (20)

Lecture 3 image sampling and quantization
Lecture 3 image sampling and quantizationLecture 3 image sampling and quantization
Lecture 3 image sampling and quantization
 
Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...Optimal interval clustering: Application to Bregman clustering and statistica...
Optimal interval clustering: Application to Bregman clustering and statistica...
 
Litvinenko low-rank kriging +FFT poster
Litvinenko low-rank kriging +FFT  posterLitvinenko low-rank kriging +FFT  poster
Litvinenko low-rank kriging +FFT poster
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
201707 SER332 Lecture 23
201707 SER332 Lecture 23  201707 SER332 Lecture 23
201707 SER332 Lecture 23
 
Graph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & TrendsGraph Edit Distance: Basics & Trends
Graph Edit Distance: Basics & Trends
 
QMC Error SAMSI Tutorial Aug 2017
QMC Error SAMSI Tutorial Aug 2017QMC Error SAMSI Tutorial Aug 2017
QMC Error SAMSI Tutorial Aug 2017
 
Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics Tucker tensor analysis of Matern functions in spatial statistics
Tucker tensor analysis of Matern functions in spatial statistics
 
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
 
Tailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest Neighbors
 
Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016Presentation 2(power point presentation) dis2016
Presentation 2(power point presentation) dis2016
 
ikh323-05
ikh323-05ikh323-05
ikh323-05
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...
Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...
Novel Performance Analysis of Network Coded Communications in Single-Relay Ne...
 
Efficient Technique for Image Stenography Based on coordinates of pixels
Efficient Technique for Image Stenography Based on coordinates of pixelsEfficient Technique for Image Stenography Based on coordinates of pixels
Efficient Technique for Image Stenography Based on coordinates of pixels
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Efficient end-to-end learning for quantizable representations
Efficient end-to-end learning for quantizable representationsEfficient end-to-end learning for quantizable representations
Efficient end-to-end learning for quantizable representations
 
Pixelrelationships
PixelrelationshipsPixelrelationships
Pixelrelationships
 
Clustering lect
Clustering lectClustering lect
Clustering lect
 
Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...
Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...
Need for Controllers having Integer Coefficients in Homomorphically Encrypted D...
 

Similar to Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding

VJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCNVJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCNDat Nguyen
 
Graphical Model Selection for Big Data
Graphical Model Selection for Big DataGraphical Model Selection for Big Data
Graphical Model Selection for Big DataAlexander Jung
 
Graph Kernels for Chemical Informatics
Graph Kernels for Chemical InformaticsGraph Kernels for Chemical Informatics
Graph Kernels for Chemical InformaticsMukund Raj
 
ASCC2022_JunsooKim_220530_.pdf
ASCC2022_JunsooKim_220530_.pdfASCC2022_JunsooKim_220530_.pdf
ASCC2022_JunsooKim_220530_.pdfJunsoo Kim
 
Introduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksIntroduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksStratio
 
Stochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of MultipliersStochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of MultipliersTaiji Suzuki
 
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Chiheb Ben Hammouda
 
Talk_HU_Berlin_Chiheb_benhammouda.pdf
Talk_HU_Berlin_Chiheb_benhammouda.pdfTalk_HU_Berlin_Chiheb_benhammouda.pdf
Talk_HU_Berlin_Chiheb_benhammouda.pdfChiheb Ben Hammouda
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNNtuxette
 
Math behind the kernels
Math behind the kernelsMath behind the kernels
Math behind the kernelsRevanth Kumar
 
reservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdf
reservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdfreservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdf
reservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdfRTEFGDFGJU
 
GPU Accelerated Domain Decomposition
GPU Accelerated Domain DecompositionGPU Accelerated Domain Decomposition
GPU Accelerated Domain DecompositionRichard Southern
 
Computer Graphics Unit 1
Computer Graphics Unit 1Computer Graphics Unit 1
Computer Graphics Unit 1aravindangc
 
Fast dct algorithm using winograd’s method
Fast dct algorithm using winograd’s methodFast dct algorithm using winograd’s method
Fast dct algorithm using winograd’s methodIAEME Publication
 
NTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsNTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsMark Chang
 
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdfCD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdfRajJain516913
 

Similar to Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding (20)

VJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCNVJAI Paper Reading#3-KDD2019-ClusterGCN
VJAI Paper Reading#3-KDD2019-ClusterGCN
 
Graphical Model Selection for Big Data
Graphical Model Selection for Big DataGraphical Model Selection for Big Data
Graphical Model Selection for Big Data
 
Graph Kernels for Chemical Informatics
Graph Kernels for Chemical InformaticsGraph Kernels for Chemical Informatics
Graph Kernels for Chemical Informatics
 
ASCC2022_JunsooKim_220530_.pdf
ASCC2022_JunsooKim_220530_.pdfASCC2022_JunsooKim_220530_.pdf
ASCC2022_JunsooKim_220530_.pdf
 
Introduction to Artificial Neural Networks
Introduction to Artificial Neural NetworksIntroduction to Artificial Neural Networks
Introduction to Artificial Neural Networks
 
Stochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of MultipliersStochastic Alternating Direction Method of Multipliers
Stochastic Alternating Direction Method of Multipliers
 
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
Hierarchical Deterministic Quadrature Methods for Option Pricing under the Ro...
 
Talk_HU_Berlin_Chiheb_benhammouda.pdf
Talk_HU_Berlin_Chiheb_benhammouda.pdfTalk_HU_Berlin_Chiheb_benhammouda.pdf
Talk_HU_Berlin_Chiheb_benhammouda.pdf
 
A review on structure learning in GNN
A review on structure learning in GNNA review on structure learning in GNN
A review on structure learning in GNN
 
Presentation.pdf
Presentation.pdfPresentation.pdf
Presentation.pdf
 
Math behind the kernels
Math behind the kernelsMath behind the kernels
Math behind the kernels
 
reservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdf
reservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdfreservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdf
reservoir-modeling-using-matlab-the-matalb-reservoir-simulation-toolbox-mrst.pdf
 
GPU Accelerated Domain Decomposition
GPU Accelerated Domain DecompositionGPU Accelerated Domain Decomposition
GPU Accelerated Domain Decomposition
 
graph theory
graph theorygraph theory
graph theory
 
Computer Graphics Unit 1
Computer Graphics Unit 1Computer Graphics Unit 1
Computer Graphics Unit 1
 
Fast dct algorithm using winograd’s method
Fast dct algorithm using winograd’s methodFast dct algorithm using winograd’s method
Fast dct algorithm using winograd’s method
 
NTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANsNTHU AI Reading Group: Improved Training of Wasserstein GANs
NTHU AI Reading Group: Improved Training of Wasserstein GANs
 
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdfCD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
 
sheet6.pdf
sheet6.pdfsheet6.pdf
sheet6.pdf
 
doc6.pdf
doc6.pdfdoc6.pdf
doc6.pdf
 

Recently uploaded

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 

Recently uploaded (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 

Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding

  • 1. Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding KDD2019 Lingfei Wu, Ian En-Hsu Yen, Zhen Zhang †, Kun Xu, Liang Zhao, Xi Peng, Yinglong Xia, Charu Aggarwal Presenter: Hagawa, Nishi, Eugene 2019.11.11 1 / 35
  • 2. Problem Setup Goal: ▶ Create a good kernel to measure Graph similarity ▶ Less computational complexity ▶ Take into account global and local graph property ▶ Have positive definite ▶ Leads to good classifier Application: ▶ Kernel SVM (input: graph, output: binary) ▶ Kernel PCA ▶ Kernel Ridge Regression ▶ . . . How similar? 𝑘( ) = 0.5, 2 / 35
  • 3. Difficulty : Graph isomorphism difficulty to define similarity between graphs ▶ 2 graphs : G1(V1, E1, ℓ1, L1), G2(V2, E2, ℓ2, L2) ▶ Bijection1 f exists, if and only if, G1 is isomorphism with G2 ▶ Bijection f : V1 → V2 s.t {va, vb} ∈ E1, va and vb are adjacent. ▶ Partial isomorphism is NP-complete 1 全単射 3 / 35
  • 4. Related Work 2 groups of recent graph kernel method Comparing sub-structure: ▶ The major difference is how to define and explore sub-structures - random walks, shortest paths, cycles, subtree patterns, graphlets... Geometric node embeddings: ▶ Capture global property ▶ Achieved state-of-the-art performance in the graph classification task Bad points of related works Comparing sub-structure: ▶ Do not take into account the global property Geometric node embeddings: ▶ Do not necessarily use positive definite for Kernel Poor scalability: 4 / 35
  • 5. Contribution ▶ Propose a Positive definite Kernel ▶ Reduce computational complexity ▶ From quadratic to (quasi-)linear 2 ▶ Propose an approximation of the kernel with convergence analysis ▶ Take into account global property ▶ Outperforms 12 state-of-the-art graph classification algorithms - Include graph kernels, deep graph neural networks 2 quasi-linear : n log n. Time and Space. 5 / 35
  • 6. Common kernel Compare directly 2 graphs using kernel Similarity 𝒌(・, ・) Figure: calculation of kernel value between 2 graphs 6 / 35
  • 7. Proposed kernel Compare directly 2 graphs using kernel Similarity 𝒌(・, ・) Random Graphs Similarity with 𝒌(・, ・) Figure: calculation of kernel value between 2 graphs 7 / 35
  • 8. Notation : Graph definition Graph: G = (V , E, ℓ) Node: V = {vi }n i=1 Edge: E = (V × V ) Assign label function: ℓ : V → Σ Size of node: n # of edge: m Node label: l # of graphs: N G<latexit sha1_base64="QLLEFqFGXJzmcwbhRTcNSo8/+r8=">AAAB6HicbVDLSsNAFJ3UV62vqks3g0VwVRItPnZFF7pswT6gDWUyvWnHTiZhZiKU0C9w40IRt36SO//GSRpErQcuHM65l3vv8SLOlLbtT6uwtLyyulZcL21sbm3vlHf32iqMJYUWDXkoux5RwJmAlmaaQzeSQAKPQ8ebXKd+5wGkYqG409MI3ICMBPMZJdpIzZtBuWJX7Qx4kTg5qaAcjUH5oz8MaRyA0JQTpXqOHWk3IVIzymFW6scKIkInZAQ9QwUJQLlJdugMHxlliP1QmhIaZ+rPiYQESk0Dz3QGRI/VXy8V//N6sfYv3ISJKNYg6HyRH3OsQ5x+jYdMAtV8agihkplbMR0TSag22ZSyEC5TnH2/vEjaJ1XntFpr1ir1qzyOIjpAh+gYOegc1dEtaqAWogjQI3pGL9a99WS9Wm/z1oKVz+yjX7DevwC1D40D</latexit> v1<latexit sha1_base64="6r48FeRijmeRwM0ce/9YOgxnVX0=">AAAB7HicbVBNS8NAEJ3Ur1q/qh69LBbBU0m0+HErevFYwbSFNpTNdtMu3WzC7qZQQn+DFw+KePUHefPfuEmDqPXBwOO9GWbm+TFnStv2p1VaWV1b3yhvVra2d3b3qvsHbRUlklCXRDySXR8rypmgrmaa024sKQ59Tjv+5DbzO1MqFYvEg57F1AvxSLCAEayN5E4HqTMfVGt23c6BlolTkBoUaA2qH/1hRJKQCk04Vqrn2LH2Uiw1I5zOK/1E0RiTCR7RnqECh1R5aX7sHJ0YZYiCSJoSGuXqz4kUh0rNQt90hliP1V8vE//zeokOrryUiTjRVJDFoiDhSEco+xwNmaRE85khmEhmbkVkjCUm2uRTyUO4znDx/fIyaZ/VnfN6475Ra94UcZThCI7hFBy4hCbcQQtcIMDgEZ7hxRLWk/VqvS1aS1Yxcwi/YL1/AeZQjuI=</latexit> v2<latexit sha1_base64="HvFip7AjDkPR91+3+J6CugKM0SQ=">AAAB7HicbVBNS8NAEJ34WetX1aOXxSJ4KkktftyKXjxWMG2hDWWz3bRLN5uwuymU0N/gxYMiXv1B3vw3btIgan0w8Hhvhpl5fsyZ0rb9aa2srq1vbJa2yts7u3v7lYPDtooSSahLIh7Jro8V5UxQVzPNaTeWFIc+px1/cpv5nSmVikXiQc9i6oV4JFjACNZGcqeDtD4fVKp2zc6BlolTkCoUaA0qH/1hRJKQCk04Vqrn2LH2Uiw1I5zOy/1E0RiTCR7RnqECh1R5aX7sHJ0aZYiCSJoSGuXqz4kUh0rNQt90hliP1V8vE//zeokOrryUiTjRVJDFoiDhSEco+xwNmaRE85khmEhmbkVkjCUm2uRTzkO4znDx/fIyaddrznmtcd+oNm+KOEpwDCdwBg5cQhPuoAUuEGDwCM/wYgnryXq13hatK1YxcwS/YL1/AefVjuM=</latexit> v3<latexit sha1_base64="+XpoULfOHqCHvyZwfk/DV8G7sg0=">AAAB7HicbVBNS8NAEJ3Ur1q/qh69LBbBU0m0+HErevFYwbSFNpTNdtMu3WzC7qZQQn+DFw+KePUHefPfuEmDqPXBwOO9GWbm+TFnStv2p1VaWV1b3yhvVra2d3b3qvsHbRUlklCXRDySXR8rypmgrmaa024sKQ59Tjv+5DbzO1MqFYvEg57F1AvxSLCAEayN5E4H6fl8UK3ZdTsHWiZOQWpQoDWofvSHEUlCKjThWKmeY8faS7HUjHA6r/QTRWNMJnhEe4YKHFLlpfmxc3RilCEKImlKaJSrPydSHCo1C33TGWI9Vn+9TPzP6yU6uPJSJuJEU0EWi4KEIx2h7HM0ZJISzWeGYCKZuRWRMZaYaJNPJQ/hOsPF98vLpH1Wd87rjftGrXlTxFGGIziGU3DgEppwBy1wgQCDR3iGF0tYT9ar9bZoLVnFzCH8gvX+BelajuQ=</latexit> V = {v1, v2, v3}<latexit sha1_base64="5/LCIMtGZ5h5wVQCMmTMg/5dkCc=">AAAB/3icbVDLSsNAFJ34rPUVFdy4GSyCCylJW3wshKIblxXsA5oQJtNJO3QyCTOTQold+CtuXCji1t9w5984SYuo9cBcDufcy71z/JhRqSzr01hYXFpeWS2sFdc3Nre2zZ3dlowSgUkTRywSHR9JwignTUUVI51YEBT6jLT94XXmt0dESBrxOzWOiRuiPqcBxUhpyTP3W/ASOikcefaJLpWsVJ2JZ5asspUDzhN7RkpghoZnfji9CCch4QozJGXXtmLlpkgoihmZFJ1EkhjhIeqTrqYchUS6aX7/BB5ppQeDSOjHFczVnxMpCqUch77uDJEayL9eJv7ndRMVnLsp5XGiCMfTRUHCoIpgFgbsUUGwYmNNEBZU3wrxAAmElY6smIdwkeH0+8vzpFUp29Vy7bZWql/N4iiAA3AIjoENzkAd3IAGaAIM7sEjeAYvxoPxZLwab9PWBWM2swd+wXj/AprvlA8=</latexit> ⌃ = { , }<latexit sha1_base64="ZY89SR6jHBd25PoJ2nDrWsihEs4=">AAAB/XicbVDLSsNAFJ34rPUVHzs3g0VwISXR4mMhFN24rGgf0IQymU7aoTNJmJkINbT+ihsXirj1P9z5N07SIGo9cOFwzr3ce48XMSqVZX0aM7Nz8wuLhaXi8srq2rq5sdmQYSwwqeOQhaLlIUkYDUhdUcVIKxIEcY+Rpje4TP3mHRGShsGtGkbE5agXUJ9ipLTUMbedG9rjCJ5DJxmPxwe6nFHHLFllKwOcJnZOSiBHrWN+ON0Qx5wECjMkZdu2IuUmSCiKGRkVnViSCOEB6pG2pgHiRLpJdv0I7mmlC/1Q6AoUzNSfEwniUg65pzs5Un3510vF/7x2rPxTN6FBFCsS4MkiP2ZQhTCNAnapIFixoSYIC6pvhbiPBMJKB1bMQjhLcfz98jRpHJbto3LlulKqXuRxFMAO2AX7wAYnoAquQA3UAQb34BE8gxfjwXgyXo23SeuMkc9sgV8w3r8ASdWVRQ==</latexit> 8 / 35
  • 9. Notation
Set of graphs: G = {G_i}_{i=1}^N
Set of graph labels: Y = {Y_i}_{i=1}^N
Set of geometric embeddings (per graph): U = {u_i}_{i=1}^n ∈ R^{n×d}
Latent node embedding space (per node): u ∈ R^d
Figure: N data graphs G_1, ..., G_N with graph labels Y_1, ..., Y_N; each graph's n nodes carry latent node embeddings u_1, u_2, u_3 ∈ R^d
9 / 35
  • 10. Geometric embeddings
Use partial eigendecomposition³ to extract node embeddings:
1. Create the normalized Laplacian matrix L ∈ R^{n×n}
2. Perform a partial eigendecomposition L ≈ UΛUᵀ (U: n×d, Λ: d×d, Uᵀ: d×n)
3. Use the d eigenvectors with the smallest eigenvalues as node embeddings
Figure: example of obtaining U. For a 3-node graph with edges {A,B} and {A,C}: adjacency matrix A = [[0,1,1],[1,0,0],[1,0,0]], degree matrix D = diag(2,1,1), Laplacian L = D − A = [[2,−1,−1],[−1,1,0],[−1,0,1]]; normalize L, then take the d smallest eigenvectors. A code sketch of these steps follows below.
³ Time complexity: linear in the number of graph edges (...I don't know how.)
10 / 35
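As a concrete illustration of steps 1-3, here is a minimal sketch in Python, using SciPy's sparse eigensolver as a stand-in for the specialized linear-time eigensolver the paper relies on; the function name node_embeddings and the choice of numpy/scipy are mine, not the authors' code.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

def node_embeddings(adj: np.ndarray, d: int) -> np.ndarray:
    """Rows are node embeddings: the d eigenvectors of the normalized
    Laplacian with the smallest eigenvalues."""
    L = laplacian(csr_matrix(adj.astype(float)), normed=True)
    # Partial eigendecomposition: only d of the n eigenpairs are computed.
    vals, vecs = eigsh(L, k=d, which="SM")
    return vecs  # shape (n, d)

# The 3-node example from the slide: edges {A,B} and {A,C}.
adj = np.array([[0, 1, 1],
                [1, 0, 0],
                [1, 0, 0]])
U = node_embeddings(adj, d=2)  # one 2-dimensional embedding per node
```

Here which="SM" asks ARPACK for the smallest eigenvalues directly; on large sparse graphs one would use shift-invert or the specialized solver the paper cites.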
  • 11. Transportation distance [1]
Earth Mover's Distance (EMD), a measure of dissimilarity:
EMD(Gx, Gy) := min_{T ∈ R₊^{nx×ny}} ⟨D, T⟩  s.t.  T1 = t(Gx), Tᵀ1 = t(Gy)
▶ A linear programming problem
▶ Flow matrix T: T_ij is how much of v_i in Gx travels to v_j in Gy
▶ Gx → U_X = {u_1^x, u_2^x, ..., u_{nx}^x}
▶ Gy → U_Y = {u_1^y, u_2^y, ..., u_{ny}^y}
▶ Transport cost matrix D: D_ij = ∥u_i^x − u_j^y∥₂
11 / 35
  • 12. Transportation distance [1]
EMD(Gx, Gy) := min_{T ∈ R₊^{nx×ny}} ⟨D, T⟩  s.t.  T1 = t(Gx), Tᵀ1 = t(Gy)
▶ Node v_i has c_i outgoing edges
▶ Normalized bag-of-words (nBOW) weights: t_i = c_i / Σ_{j=1}^n c_j ∈ R
A code sketch of this linear program follows below.
12 / 35
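Putting the two slides together, here is a minimal sketch of the EMD linear program solved with scipy.optimize.linprog; the function name emd and the dense constraint matrix are my own simplifications, and a dedicated optimal-transport solver would be used in practice.

```python
import numpy as np
from scipy.optimize import linprog

def emd(Ux, tx, Uy, ty):
    """EMD between two graphs, given node embeddings (rows of Ux, Uy)
    and nBOW weight vectors tx, ty."""
    nx, ny = len(tx), len(ty)
    # Cost matrix D_ij = ||u_i^x - u_j^y||_2, flattened to match vec(T).
    D = np.linalg.norm(Ux[:, None, :] - Uy[None, :, :], axis=-1)
    A_eq = np.zeros((nx + ny, nx * ny))
    for i in range(nx):
        A_eq[i, i * ny:(i + 1) * ny] = 1.0   # row sums:    T 1   = t(Gx)
    for j in range(ny):
        A_eq[nx + j, j::ny] = 1.0            # column sums: T^T 1 = t(Gy)
    b_eq = np.concatenate([tx, ty])
    res = linprog(D.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun  # <D, T*> at the optimal flow T*
```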
  • 13. Transportation distance: example
Figure: EMD example, showing optimal flows between the nodes {A, B, C} of one graph and {a, b, c} of another
▶ EMD takes into account the node sizes and the outgoing edges of each graph
13 / 35
  • 14. A straightforward way to define a kernel, but it is expensive
EMD-based kernel: K = −(1/2) J D_emd J, with centering matrix J = I − (1/N) 11ᵀ
▶ Not necessarily positive definite
▶ Time complexity: O(N² n³ log(n)), space complexity: O(N²)
Figure: straightforward kernel based on EMD. An N×N distance matrix D_emd holds EMD(G_i, G_j) for every pair of data graphs (EMD(A,A), EMD(A,B), EMD(A,C), ...). A sketch of this construction follows below.
14 / 35
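For reference, a small sketch of this distance-substitution construction (naming is mine); as the slide notes, the centered matrix is not guaranteed to be positive definite, so in practice its negative eigenvalues would have to be clipped or shifted before use in an SVM.

```python
import numpy as np

def emd_substitution_kernel(D_emd: np.ndarray) -> np.ndarray:
    """Center a pairwise EMD matrix into a (possibly indefinite) kernel."""
    N = D_emd.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N  # centering matrix J = I - (1/N) 11^T
    return -0.5 * J @ D_emd @ J
```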
  • 15. Global alignment graph kernel using EMD and random features (RF)
Proposed kernel:⁴
k(Gx, Gy) := ∫ p(Gω) φ_{Gω}(Gx) φ_{Gω}(Gy) dGω, where φ_{Gω}(Gx) := exp(−γ EMD(Gx, Gω))
▶ Gω: a random graph with node embeddings W = {w_i}_{i=1}^D
▶ Each w_i is sampled from the node embedding space V ⊆ R^d
▶ p(Gω) is a distribution over the space of all random graphs of variable sizes, Ω := ∪_{D=1}^{Dmax} V^D
⁴ I meant to dig into the details of the random graphs, but it got very involved and I (Hagawa) gave up. If you are curious, see the linked reference. I don't understand probability at all.
15 / 35
  • 16. Global alignment graph kernel using EMD and RF
Approximation:⁵
k̃(Gx, Gy) = (1/R) Σ_{i=1}^R φ_{Gωi}(Gx) φ_{Gωi}(Gy) → k(Gx, Gy) as R → ∞
Figure: both data graphs Gx and Gy are compared against the same set of R random graphs Gω
⁵ The paper proves uniform convergence of this approximation to the proposed kernel.
16 / 35
  • 17. Algorithm
Set the data and hyperparameters:
▶ Node embedding size (dimension): d
▶ Max size of random graphs: Dmax
▶ Graph embedding size: R
Algorithm 1: Random Graph Embedding
Input: data graphs {G_i}_{i=1}^N, node embedding size d, maximum size of random graphs Dmax, graph embedding size R.
Output: feature matrix Z ∈ R^{N×R} for the data graphs
1: Compute the nBOW weight vectors {t(G_i)}_{i=1}^N of the normalized Laplacian L of all graphs
2: Obtain node embedding vectors {u_i}_{i=1}^n by computing the d smallest eigenvectors of L
3: for j = 1, ..., R do
4:   Draw D_j uniformly from [1, Dmax].
5:   Generate a random graph Gω_j with D_j node embeddings W from Algorithm 2.
6:   Compute a feature vector Z_j = φ_{Gω_j}({G_i}_{i=1}^N) using EMD or another optimal transportation distance in Equation (3).
7: end for
8: Return the feature matrix Z({G_i}_{i=1}^N) = (1/√R) {Z_i}_{i=1}^R
(The next slides walk through the algorithm line by line; a code sketch of the whole pipeline follows slide 23.)
17 / 35
  • 18. Compute {t(G_i)}_{i=1}^N and the Laplacian matrix L, for all graphs (Algorithm 1, line 1)
Example: for the Laplacian L = [[2,−1,−1],[−1,1,0],[−1,0,1]], node A has 2 outgoing edges and nodes B, C have 1 each, so t(Gx) = (1/2, 1/4, 1/4).
18 / 35
  • 19. Obtain node embedding vectors, for all graphs (Algorithm 1, line 2)
Partially eigendecompose the normalized Laplacian L (n×n) as UΛUᵀ (n×d, d×d, d×n) and keep the d smallest eigenvectors, giving embeddings u_1, u_2, u_3, ... ∈ R^d for each graph G_i.
19 / 35
  • 20. Generate a random graph⁶ (Algorithm 1, lines 4-5)
Example with 2 nodes: D ← Rand(1, Dmax) = 2; W (2×d) ← generate_random_graph(2, d)
Figure: a 2-node random graph with node embeddings u_1, u_2 ∈ R^d. A sketch of this sampler follows below.
⁶ A later section shows 2 ways to generate random graphs.
20 / 35
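A minimal sketch of the data-independent sampler, assuming node embeddings live in the d-dimensional unit hypercube (as the √d remark on slide 24 suggests) and that a random graph's nodes get uniform nBOW weights; both are my assumptions, not statements from the slides.

```python
import numpy as np

def random_graph(d_max: int, d: int, rng: np.random.Generator):
    """Sample a random graph: D node embeddings plus uniform node weights."""
    D = int(rng.integers(1, d_max + 1))      # D_j drawn uniformly from [1, Dmax]
    W = rng.uniform(0.0, 1.0, size=(D, d))   # assumption: unit-hypercube embeddings
    t = np.full(D, 1.0 / D)                  # assumption: uniform nBOW weights
    return W, t
```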
  • 21. Compute a feature vector Z_j (Algorithm 1, line 6)
Z_ji = φ_{Gω}(G_i) := exp(−γ EMD(G_i, Gω)), so z_j = (Z_j1, ..., Z_jN)ᵀ collects the similarity of every data graph to the j-th random graph.
21 / 35
  • 22. Generate random graphs R times⁷ (Algorithm 1, lines 3-7)
The loop produces columns z_1, z_2, ..., z_R, each z_j = (Z_j1, ..., Z_jN)ᵀ with Z_ji = φ_{Gωj}(G_i).
⁷ R: the number of random graphs
22 / 35
  • 23. Output the N × R matrix Z (Algorithm 1, line 8)
Z = (1/√R) [Z_ij] with Z_ij = φ_{Gωj}(G_i): one row per data graph (its graph embedding), one column per random graph. A sketch of the full pipeline follows below.
23 / 35
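Here is a sketch of Algorithm 1 end to end, reusing node_embeddings, emd, and random_graph from the sketches above; with Z scaled by 1/√R, the approximate kernel of slide 16 is simply K̃ = Z Zᵀ. The plain double loop is mine; any vectorization or parallelization details are assumptions.

```python
import numpy as np

def rge_features(graphs, d, d_max, R, gamma, seed=0):
    """graphs: list of adjacency matrices. Returns the N x R feature matrix Z."""
    rng = np.random.default_rng(seed)
    embs, weights = [], []
    for adj in graphs:
        embs.append(node_embeddings(adj, d))     # Algorithm 1, line 2
        deg = adj.sum(axis=1).astype(float)
        weights.append(deg / deg.sum())          # nBOW weights t(G), line 1
    Z = np.zeros((len(graphs), R))
    for j in range(R):                           # lines 3-7
        W, t = random_graph(d_max, d, rng)       # random graph G_omega_j
        for i, (U, tG) in enumerate(zip(embs, weights)):
            Z[i, j] = np.exp(-gamma * emd(U, tG, W, t))
    return Z / np.sqrt(R)                        # line 8

# Approximate kernel matrix between all pairs of data graphs:
# K_tilde = Z @ Z.T, which converges to k(Gx, Gy) as R grows (slide 16).
```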
  • 24. How to generate random graphs
Data-independent and data-dependent distributions.⁸
Data-dependent Random Graph Embedding (Anchor Sub-Graphs (ASG)):
1. Pick a graph G_k from the data set
2. Uniformly draw D_j of its nodes
3. {w_i}_{i=1}^{D_j} = {u_{n1}, u_{n2}, ..., u_{nD_j}}
Incorporating label information:
▶ d(u_i, u_j) = max(∥u_i − u_j∥₂, √d) if v_i and v_j have different node labels
▶ This enforces a minimum distance between nodes with different labels
▶ √d is the largest distance in a d-dimensional unit hypercube
A sketch of both ideas follows below.
⁸ For the data-independent variant, see the appendix.
24 / 35
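A minimal sketch of the ASG sampler and the label-aware distance, under stated assumptions: renormalizing the anchored nodes' nBOW weights and the function names are my choices, not the paper's code.

```python
import numpy as np

def asg_random_graph(embs, weights, d_max, rng):
    """Anchor sub-graph: D_j node embeddings drawn from a random data graph."""
    k = int(rng.integers(len(embs)))             # pick a data graph G_k
    Dj = int(rng.integers(1, d_max + 1))
    idx = rng.choice(len(embs[k]), size=min(Dj, len(embs[k])), replace=False)
    W = embs[k][idx]
    t = weights[k][idx] / weights[k][idx].sum()  # assumption: renormalized weights
    return W, t

def label_aware_cost(Ux, lx, Uy, ly, d):
    """D_ij = ||u_i - u_j||_2, floored at sqrt(d) when node labels differ."""
    D = np.linalg.norm(Ux[:, None, :] - Uy[None, :, :], axis=-1)
    differ = lx[:, None] != ly[None, :]
    return np.where(differ, np.maximum(D, np.sqrt(d)), D)
```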
  • 25. Complexity comparison (left: proposed, right: straightforward)
Figure: the proposed kernel compares each data graph to R random graphs; the straightforward kernel builds an N×N EMD distance matrix
Time complexity:⁹
▶ Proposed: O(N R D² n log(n) + dmz)
▶ Straightforward: O(N² n³ log(n) + dmz)
※ R is the number of random graphs, D is the number of random-graph nodes (D < n)
Space complexity:
▶ Proposed: O(NR)
▶ Straightforward: O(N²)
⁹ dmz is the partial eigendecomposition cost.
25 / 35
  • 26. Experiments
Experimental setup
Machine:
▶ Linear SVM (LIBLINEAR)
Data:
▶ 9 datasets
Hyperparameters:
▶ γ (kernel) ∈ {1e-3, 1e-2, 1e-1, 1, 10}
▶ Dmax (size of random graphs) ∈ [3:3:30]
▶ SVM parameters
Evaluation:
▶ 10-fold cross-validation
▶ Accuracy averaged over 10 repetitions
A sketch of this protocol follows below.
26 / 35
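Since the embedding is an explicit feature matrix, a linear classifier suffices. A sketch of the protocol, assuming scikit-learn's LinearSVC (which wraps LIBLINEAR) and the rge_features sketch above; d = 6, Dmax = 10, R = 128 are the appendix's settings (slide 34), and γ = 0.1 is one value from the searched grid.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# graphs: list of adjacency matrices; y: array of graph labels (both assumed given)
Z = rge_features(graphs, d=6, d_max=10, R=128, gamma=0.1)
scores = cross_val_score(LinearSVC(C=1.0), Z, y, cv=10)
print(f"10-fold accuracy: {scores.mean():.4f} +/- {scores.std():.4f}")
```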
  • 27. Number of random graphs (R) vs. testing accuracy and runtime
Figure 2: test accuracies and runtime of three variants of RGE (RGE(RF), RGE(ASG), RGE(ASG)-NodeLab) with and without node labels when varying R, on (a) ENZYMES, (b) NCI109, (c) IMDB-BINARY, (d) COLLAB (accuracy), and (e)-(h) the same datasets (runtime)
▶ Accuracy converges very rapidly as R increases
▶ Runtime shows quasi-linear scalability with respect to R
27 / 35
  • 28. Scalability in the number and size of graphs
Figure: runtime (eigendecomposition, feature generation, total) vs. (a) the number of graphs N and (b) the size of graph n, with linear and quadratic reference curves
▶ (a) shows linear scalability with respect to N
▶ (b) shows quasi-linear scalability with respect to n
28 / 35
  • 29. Classification accuracy
Table 1: comparison of classification accuracy against graph kernel methods without node labels (runtime in parentheses where reported).
| Method | MUTAG | PTC-MR | ENZYMES | NCI1 | NCI109 |
|---|---|---|---|---|---|
| RGE(RF) | 86.33 ± 1.39 (1s) | 59.82 ± 1.42 (1s) | 35.98 ± 0.89 (38s) | 74.70 ± 0.56 (727s) | 72.50 ± 0.32 (865s) |
| RGE(ASG) | 85.56 ± 0.91 (2s) | 59.97 ± 1.65 (1s) | 38.52 ± 0.91 (18s) | 74.30 ± 0.45 (579s) | 72.70 ± 0.42 (572s) |
| EMD | 84.66 ± 2.69 (7s) | 57.65 ± 0.59 (46s) | 35.45 ± 0.93 (216s) | 72.65 ± 0.34 (8359s) | 70.84 ± 0.18 (8281s) |
| PM | 83.83 ± 2.86 | 59.41 ± 0.68 | 28.17 ± 0.37 | 69.73 ± 0.11 | 68.37 ± 0.14 |
| Lo- | 82.58 ± 0.79 | 55.21 ± 0.72 | 26.5 ± 0.54 | 62.28 ± 0.34 | 62.52 ± 0.29 |
| OA-E (A) | 79.89 ± 0.98 | 56.77 ± 0.85 | 36.12 ± 0.81 | 67.99 ± 0.28 | 67.14 ± 0.26 |
| RW | 77.78 ± 0.98 | 56.18 ± 1.12 | 20.17 ± 0.83 | 56.89 ± 0.34 | 56.13 ± 0.31 |
| GL | 66.11 ± 1.31 | 57.05 ± 0.83 | 18.16 ± 0.47 | 47.37 ± 0.15 | 48.39 ± 0.18 |
| SP | 82.22 ± 1.14 | 56.18 ± 0.56 | 28.17 ± 0.64 | 62.02 ± 0.17 | 61.41 ± 0.32 |
(Table 2, comparing against methods with node labels or the WL technique, appears in full on the next slide.)
▶ RGE is much faster than EMD
29 / 35
  • 30. Classification accuracy (continued)
Table 2: comparison of classification accuracy against graph kernel methods with node labels or the WL technique.
| Method | PTC-MR | ENZYMES | PROTEINS | NCI1 | NCI109 |
|---|---|---|---|---|---|
| RGE(ASG) | 61.5 ± 2.34 (1s) | 48.27 ± 0.99 (28s) | 75.98 ± 0.71 (20s) | 76.46 ± 0.45 (379s) | 74.42 ± 0.30 (526s) |
| EMD | 57.67 ± 2.11 (42s) | 42.85 ± 0.72 (296s) | 76.03 ± 0.28 (1936s) | 75.89 ± 0.16 (7942s) | 73.63 ± 0.33 (8073s) |
| PM | 60.38 ± 0.86 | 40.33 ± 0.34 | 74.39 ± 0.45 | 72.91 ± 0.53 | 71.97 ± 0.15 |
| OA-E (A) | 58.76 ± 0.92 | 43.56 ± 0.66 | — | 69.83 ± 0.30 | 68.96 ± 0.35 |
| V-OA | 56.4 ± 1.8 | 35.1 ± 1.1 | 73.8 ± 0.5 | 65.6 ± 0.4 | 65.1 ± 0.4 |
| RW | 57.06 ± 0.86 | 19.33 ± 0.62 | 71.67 ± 0.78 | 63.34 ± 0.27 | 63.51 ± 0.18 |
| GL | 59.41 ± 0.94 | 32.70 ± 1.20 | 71.63 ± 0.33 | 66.00 ± 0.07 | 66.59 ± 0.08 |
| SP | 60.00 ± 0.72 | 41.68 ± 1.79 | 73.32 ± 0.45 | 73.47 ± 0.11 | 73.07 ± 0.11 |
| WL-RGE(ASG) | 62.20 ± 1.67 (1s) | 57.97 ± 1.16 (38s) | 76.63 ± 0.82 (30s) | 85.85 ± 0.42 (401s) | 85.32 ± 0.29 (798s) |
| WL-ST | 57.64 ± 0.68 | 52.22 ± 0.71 | 72.92 ± 0.67 | 82.19 ± 0.18 | 82.46 ± 0.24 |
| WL-SP | 56.76 ± 0.78 | 59.05 ± 1.05 | 74.49 ± 0.74 | 84.55 ± 0.36 | 83.53 ± 0.30 |
| WL-OA-E (A) | 59.72 ± 1.10 | 53.76 ± 0.82 | — | 84.75 ± 0.21 | 84.23 ± 0.19 |
Table 3: comparison of classification accuracy against recent deep learning models on graphs.
| Method | PTC-MR | PROTEINS | NCI1 | IMDB-B | IMDB-M | COLLAB |
|---|---|---|---|---|---|---|
| (WL-)RGE(ASG) | 62.20 ± 1.67 | 76.63 ± 0.82 | 85.85 ± 0.42 | 71.48 ± 1.01 | 47.26 ± 0.89 | 76.85 ± 0.34 |
| DGCNN | 58.59 ± 2.47 | 75.54 ± 0.94 | 74.44 ± 0.47 | 70.03 ± 0.86 | 47.83 ± 0.85 | 73.76 ± 0.49 |
| PSCN | 62.30 ± 5.70 | 75.00 ± 2.51 | 76.34 ± 1.68 | 71.00 ± 2.29 | 45.23 ± 2.84 | 72.60 ± 2.15 |
| DCNN | 56.6 ± 1.20 | 61.29 ± 1.60 | 56.61 ± 1.04 | 49.06 ± 1.37 | 33.49 ± 1.42 | 52.11 ± 0.53 |
| DGK | 57.32 ± 1.13 | 71.68 ± 0.50 | 62.48 ± 0.25 | 66.96 ± 0.56 | 44.55 ± 0.52 | 73.09 ± 0.25 |
▶ Outperforms other graph kernels and deep learning approaches
▶ RGE is much faster than EMD
▶ The WL technique further improves performance
30 / 35
  • 31. Conclusion
Proposed a good graph kernel!
▶ Scalable
▶ Takes the global graph property into account
Thank you.
31 / 35
  • 32. Appendix I
▶ If two graphs are isomorphic, the eigenvalues of their adjacency matrices coincide, but the converse does not hold.
Normalized Laplacian matrix:
L_{i,j} := 1, if i = j and deg(v_i) ≠ 0
L_{i,j} := −1 / √(deg(v_i) deg(v_j)), if i ≠ j and v_i is adjacent to v_j
L_{i,j} := 0, otherwise
deg(v): degree of node (vertex) v
32 / 35
  • 34. Appendix III
Table 4: properties of the datasets.
| | MUTAG | PTC | ENZYMES | PROTEINS | NCI1 | NCI109 | IMDB-B | IMDB-M | COLLAB |
|---|---|---|---|---|---|---|---|---|---|
| Max # nodes | 28 | 109 | 126 | 620 | 111 | 111 | 136 | 89 | 492 |
| Min # nodes | 10 | 2 | 2 | 4 | 3 | 4 | 12 | 7 | 32 |
| Avg # nodes | 17.9 | 25.6 | 32.6 | 39.05 | 29.9 | 29.7 | 19.77 | 13.0 | 74.49 |
| Max # edges | 33 | 108 | 149 | 1049 | 119 | 119 | 1249 | 1467 | 40119 |
| Min # edges | 10 | 1 | 1 | 5 | 2 | 3 | 26 | 12 | 60 |
| Avg # edges | 19.8 | 26.0 | 62.1 | 72.81 | 32.3 | 32.1 | 96.53 | 65.93 | 2457.34 |
| # graphs | 188 | 344 | 600 | 1113 | 4110 | 4127 | 1000 | 1500 | 5000 |
| # graph labels | 2 | 2 | 6 | 2 | 2 | 2 | 2 | 3 | 3 |
| # node labels | 7 | 19 | 3 | 3 | 37 | 38 | — | — | — |
(Fragment of the paper's appendix text, recoverable settings only: node embedding size d = 6; DMax = 10 and R = 128; γ and Dmax searched over [1e-3 1e-2 1e-1 1 10] and [3:3:30] on the training set; linear SVM via LIBLINEAR; the whole experiment repeated ten times, reporting average accuracies and standard deviations.)
Terms
WL test:
▶ A technique to improve kernels using node labels
RGE(ASG)-NodeLab:
▶ Data-dependent random graphs + incorporated label information
WL-RGE:
▶ Data-dependent random graphs + WL test
34 / 35
  • 35. References
[1] Giannis Nikolentzos, Polykarpos Meladianos, and Michalis Vazirgiannis. Matching node embeddings for graph similarity. In Thirty-First AAAI Conference on Artificial Intelligence, 2017.
35 / 35