Omni-Prop: Seamless Node Classification on Arbitrary Label Correlation

AAAI 2015
Yuto Yamaguchi, University of Tsukuba
Hiroyuki Kitagawa, University of Tsukuba
Christos Faloutsos, Carnegie Mellon University

Background
Node Classification
– Given a partially labeled graph G=(V,E)
– Find the correct labels of the unlabeled nodes

Background
Label Correlation: Homophily, Heterophily

Motivation
Existing algorithms make a strong assumption:
"every node has the same label correlation"

Motivation
This assumption does not hold on real graphs

Related Work
● Label Propagation
● Random Walk with Restarts
● Belief Propagation

Label Propagation
● Main Idea
If nodes are near, their labels should be similar
● Objective
\min_{Y} \sum_{i,j} \sum_{k=1}^{K} A_{ij} (Y_{ik} - Y_{jk})^{2}
such that Y_{L} = Y_{L}^{(0)}

Label Propagation
● Iterative approach
Y^{(t)} = P^{T} Y^{(t-1)}
– update until convergence
– aggregation of in-edges

Label Propagation
A = \begin{bmatrix} A_{LL} & A_{LU} \\ A_{UL} & A_{UU} \end{bmatrix}
A' = \begin{bmatrix} I & A_{LU} \\ 0 & A_{UU} \end{bmatrix}
– first block column (I, 0): labeled nodes don't update
– second block column (A_{LU}, A_{UU}): propagation
P = \begin{bmatrix} I & P_{LU} \\ 0 & P_{UU} \end{bmatrix}
– P is A' normalized on each column

Label Propagation
Y^{(t)} = P^{T} Y^{(t-1)}
[Figure: example graph G with labeled and unlabeled (?) nodes, and the propagation graph drawn from P]

Random Walk with Restarts
● Main Idea
– a random walk starts from an unlabeled node (?) and ends at a labeled node
– ? = the most frequently encountered label

Random Walk with Restarts
● Keep updating until convergence
Y^{(t)} = (1-c) P Y^{(t-1)} + c Y^{(0)}
P = \begin{bmatrix} I & 0 \\ P_{UL} & P_{UU} \end{bmatrix}
– aggregation of out-edges

Belief Propagation
P(Y_i) = \sum_{Y_j = l_k,\ j \neq i} P(Y_1, \ldots, Y_n)

Belief Propagation
Homophily table (from / to):
0.1 0.9
0.9 0.1
Heterophily table (from / to):
0.9 0.1
0.1 0.9

Problem Formulation
● Node Classification
– Given a partially labeled graph G=(V, E)
● Requirement
– works on mixed label correlation

Main Idea
● If most of the neighbors of a node have the same label,
→ so do the rest of its neighbors

Algorithm
● Propagating two variables
s_{ij} = how likely node i has label j
t_{ij} = how likely the neighbors of node i have label j
– until convergence
● Output
– label of node i = \arg\max_{j} (s_{ij})

Algorithm
● s-propagation (only for unlabeled nodes)
s_{ik} \leftarrow \frac{\sum_{j=1}^{N} A_{ij} t_{jk} + \lambda b_k}{\sum_{j=1}^{N} A_{ij} + \lambda}
● t-propagation
t_{jk} \leftarrow \frac{\sum_{i=1}^{N} A_{ij} s_{ik} + \lambda b_k}{\sum_{i=1}^{N} A_{ij} + \lambda}
– b_k is the prior
– s aggregates over out-edges (followees); t aggregates over in-edges (followers)
– t is therefore the follower-score

Algorithm
● Matrix formulation

Analysis
● Time complexity
s_{ik} \leftarrow \frac{\sum_{j=1}^{N} A_{ij} t_{jk} + \lambda b_k}{\sum_{j=1}^{N} A_{ij} + \lambda}
– O(M) to accumulate the sums of s_{ik} for all i
– O(N) to combine them into a score
– repeated for all k: a factor of O(K)
– at each iteration: O(K(N+M))
– if it converges after h iterations: O(hK(N+M))

Analysis
● OMNI-Prop always converges if λ>0

Analysis
– the series converges to (I - Q_{UU})^{-1} if \|Q_{UU}\| < 1
– D_{ii} = \sum_{j} A_{ij} (out-degree ≥ 0)
– If λ > 0, every element < 1

Analysis
● Connection to Label Propagation
– LP is a special case of OMNI-Prop when λ=0
● Keep updating until convergence
Y^{(t)} = P^{T} Y^{(t-1)}
[Figure: graph G with nodes 1, 2, 3, shown next to its S copy and its T copy]

Analysis
● Connection to Random Walk
– RWR is a special case of OMNI-Prop when λ=0
● Keep updating until convergence
Y^{(t)} = (1-c) P Y^{(t-1)} + c Y^{(0)}
[Figure: graph G with nodes 1, 2, 3, shown next to its S copy and its T copy]

Analysis
● Connection to Random Walk when λ>0
[Figure: graph G with nodes 1, 2, 3, its S copy and its T copy, with prior nodes b added]

Experiment
● Q1 – Parameter
– How does the parameter affect performance?
● Q2 – Convergence
– How many iterations does OMNI-Prop need to converge?
● Q3 – Accuracy
– How accurate is OMNI-Prop compared to LP and BP?

Experiment
● Dataset
● Evaluation
– hide 70% of the labels

Experiment
[Figure: precision on the top p% of nodes ordered by max(self-score)]

Conclusion
● Simple idea, deep insight
<3 <3

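Background
Node Classification
– Given a partially labeled graph G=(V,E)
A simple way to approach node classification is to look directly for regularities among the node labels.
For example, suppose all 3 neighbors of this node are red.
If you have observed homophily, where connected nodes usually share the same label, you would say this node is very likely red.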

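Background
Label Correlation: Homophily, Heterophily
The regularity just mentioned is label correlation: the relationship between your label and the labels of your neighbors.
There are many kinds of label correlation. This paper discusses the two most common types, homophily and heterophily.
Homophily: connected nodes have the same label.
Heterophily: complementary, so the nodes linked to you have labels opposite to yours. A common example of a graph with heterophily correlation is a partner graph, where gender is the label and an edge means the two people have been married. Most of society is heterosexual, so a partner graph shows heterophily label correlation.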

Motivation
Current node classification methods are essentially all based on an assumption about label correlation.
But their assumption is too strong: it requires every node on the graph to have the same label correlation.

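Motivation
This assumption does not hold on real graphs
This is unreasonable. Take the partner example again: most people are heterosexual, but homosexual and bisexual people also exist, and this differs from person to person.
So a major reason current node classification algorithms perform poorly on real graphs is that the assumption is too strong.
There is even a paper that specifically analyzes how much performance is hurt when the label correlation assumption is wrong.
This paper addresses that weakness at its root.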

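Related Work
● Label Propagation
● Random Walk with Restarts
● Belief Propagation
This is the first paper to propose a node classification method that does not assume a label correlation.
The table on this slide summarizes how the three mainstream approaches to node classification compare with OMNI-Prop.
LP and RWR are PageRank-based algorithms, so they are fast, but they only work on graphs where all nodes are homophily correlated.
BP additionally handles heterophily but not the mixed case, and on graphs with cycles belief propagation is not guaranteed to converge, so it may run for a long time.
OMNI-Prop does not depend on label correlation, so it also performs well on graphs with mixed label correlation.
Like LP and RWR, it is an iterative algorithm with guaranteed convergence, so it runs fast.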

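Label Propagation
● Main Idea
If nodes are near, their labels should be similar
● Objective
\min_{Y} \sum_{i,j} \sum_{k=1}^{K} A_{ij} (Y_{ik} - Y_{jk})^{2}
such that Y_{L} = Y_{L}^{(0)}
The main idea of LP is comparable to KNN: it assumes that if two nodes are close in space, their labels will be similar. The main idea alone shows that it assumes the whole graph is homophily correlated.
Written as an objective function, the main idea looks like this. A is the adjacency matrix. You can also generalize it to a weight matrix that expresses how close two nodes are, where closer means a larger weight.
Y_{ik} is the probability that node i has label k.
If two nodes are close but their labels differ a lot, the penalty is large, so the label assignment that minimizes this objective gives closer nodes more similar labels.
The only constraint is that nodes that already have a label must keep it.
The key question is how to solve this objective function. You can differentiate it and solve the resulting system of simultaneous equations, or you can use an iterative method.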

Label Propagation
Y^{(t)} = P^{T} Y^{(t-1)}
The method relevant to this paper is the iterative one. The updating rule looks like this.
P is the propagation matrix: the entry in row i, column j is the probability of moving from node i to node j. When node i is updated, it looks at the labels of all the nodes that can reach it, averages them, and takes the most probable label as its own.
This is how LP propagates.
Next, how to obtain P.

Label Propagation
Y^{(t)} = P^{T} Y^{(t-1)}
A = \begin{bmatrix} A_{LL} & A_{LU} \\ A_{UL} & A_{UU} \end{bmatrix}
A' = \begin{bmatrix} I & A_{LU} \\ 0 & A_{UU} \end{bmatrix}
P = \begin{bmatrix} I & P_{LU} \\ 0 & P_{UU} \end{bmatrix} (A' normalized on each column)
Let A be the adjacency matrix, with the labeled nodes ordered first and the unlabeled nodes last. A_{LL} is the sub-adjacency matrix from labeled nodes to labeled nodes.
Following label propagation, we design a new adjacency matrix A' to use for propagation.
Labeled nodes don't update: by the constraint their labels cannot change, so the only in-edge of a labeled node is the one from itself.
Propagation: the unlabeled nodes need to receive labels, so their columns are left unchanged, and the labeled nodes can propagate to them.
Because Y holds label probabilities, normalizing A' on each column gives the update matrix P for Y.
Why does the update use the transpose of P? Updating a node means collecting all of its in-edges. Row i of P holds the out-edges of node i, but we want the in-edges of i, so we transpose P to collect them.
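The construction of A' and P described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the helper names `column_normalize` and `build_lp_matrix`, the 3-node path graph, and the choice to leave all-zero columns as zeros are assumptions made for the example.

```python
def column_normalize(matrix):
    """Divide every column by its sum; all-zero columns stay zero (assumption)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_sums = [sum(matrix[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


def build_lp_matrix(adj, num_labeled):
    """Build A' = [[I, A_LU], [0, A_UU]] from A, then column-normalize it into P.

    Assumes the first `num_labeled` nodes are the labeled ones.
    """
    n = len(adj)
    a_prime = [row[:] for row in adj]
    for j in range(num_labeled):                        # columns of labeled nodes
        for i in range(n):
            a_prime[i][j] = 1.0 if i == j else 0.0      # only in-edge: the self-loop
    return column_normalize(a_prime)


# Hypothetical path graph 0 - 1 - 2 in which node 0 is the only labeled node.
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
P = build_lp_matrix(A, num_labeled=1)
for row in P:
    print(row)
```

Column j of P lists the normalized in-edges of node j, which is why the update multiplies by the transpose.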

Label Propagation
Y^{(t)} = P^{T} Y^{(t-1)}
P = \begin{bmatrix} I & P_{LU} \\ 0 & P_{UU} \end{bmatrix}
[Figure: example graph G with 3 nodes, and the propagation graph drawn from P]
As an example, take a graph G with 3 nodes.
From P we can draw the graph on the right, where an edge means the corresponding element of P is > 0.
Now look at what one iteration of propagation does. The first node is labeled, so it stays red.
The second node is unlabeled, so it looks at the labels of all the nodes that can reach it.
After averaging, the most probable label becomes its label, so it is updated to red.
The same goes for the third node.
After averaging, red has the highest probability, so it is updated to red.
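The walk-through above can be reproduced with a short sketch of the update Y^{(t)} = P^T Y^{(t-1)}. The matrix P, the two labels (index 0 = red, index 1 = blue), the uniform starting scores for unlabeled nodes, and the stopping tolerance are all assumptions chosen for illustration; the slide's own example graph is not recoverable from the text.

```python
def lp_step(P, Y):
    """One update Y_new = P^T Y: node j averages the scores of its in-neighbors."""
    n, k = len(Y), len(Y[0])
    return [
        [sum(P[i][j] * Y[i][c] for i in range(n)) for c in range(k)]
        for j in range(n)
    ]


def label_propagation(P, Y0, max_iter=1000, tol=1e-9):
    """Repeat the update until the largest score change drops below tol."""
    Y = [row[:] for row in Y0]
    for _ in range(max_iter):
        Y_new = lp_step(P, Y)
        change = max(abs(a - b)
                     for r_new, r_old in zip(Y_new, Y)
                     for a, b in zip(r_new, r_old))
        Y = Y_new
        if change < tol:
            break
    return Y


# Hypothetical column-normalized P for the path 0 - 1 - 2 with node 0 labeled.
P = [[1.0, 0.5, 0.0],
     [0.0, 0.0, 1.0],
     [0.0, 0.5, 0.0]]
Y0 = [[1.0, 0.0],    # node 0: labeled red
      [0.5, 0.5],    # node 1: unlabeled
      [0.5, 0.5]]    # node 2: unlabeled
Y = label_propagation(P, Y0)
labels = [max(range(2), key=lambda c: row[c]) for row in Y]
print(labels)
```

The labeled node keeps its score because its only in-edge is its self-loop, and the two unlabeled nodes drift toward red.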

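Random Walk with Restarts
● Main Idea
RWR is the complement of LP: LP updates starting from the labeled nodes, while RWR updates starting from the unlabeled nodes.
The main idea: start a random walk from an unlabeled node, moving to a neighbor at each step until a labeled node is reached.
To allow many walks, at each step the walker may also return to the starting point instead of moving to a neighbor.
Keep walking until convergence. The label of the unlabeled node is the label encountered most often during the random walk.

Random Walk with Restarts
● Keep updating until convergence
Y^{(t)} = (1-c) P Y^{(t-1)} + c Y^{(0)}
P = \begin{bmatrix} I & 0 \\ P_{UL} & P_{UU} \end{bmatrix}
– aggregation of out-edges
At each step the walker either moves to a neighbor or returns to the origin. c is the probability of returning to the origin, and P is an update matrix much like the one in LP.
The only difference is that RWR updates a node by looking at the nodes it can walk to.
So P is designed so that a labeled node has no out-edge other than the one to itself.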
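The RWR update can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: P is taken to be row-normalized, unlabeled rows of Y^{(0)} are set to zero, and the edge weights, restart probability c = 0.15, and tolerance are hypothetical values.

```python
def rwr(P, Y0, c=0.15, max_iter=1000, tol=1e-9):
    """Iterate Y = (1 - c) * P @ Y + c * Y0 until the scores stop changing."""
    n, k = len(Y0), len(Y0[0])
    Y = [row[:] for row in Y0]
    for _ in range(max_iter):
        Y_new = [
            [(1 - c) * sum(P[i][j] * Y[j][col] for j in range(n)) + c * Y0[i][col]
             for col in range(k)]
            for i in range(n)
        ]
        change = max(abs(Y_new[i][col] - Y[i][col])
                     for i in range(n) for col in range(k))
        Y = Y_new
        if change < tol:
            break
    return Y


# Hypothetical graph: nodes 0 (red) and 1 (blue) are labeled, node 2 is not.
# Row i of P holds the out-edges of node i; labeled nodes only have a self-loop.
P = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [2 / 3, 1 / 3, 0.0]]
Y0 = [[1.0, 0.0],
      [0.0, 1.0],
      [0.0, 0.0]]
Y = rwr(P, Y0)
print(max(range(2), key=lambda col: Y[2][col]))   # label index chosen for node 2
```

Node 2 walks to the red node more often than to the blue one, so red ends up with the larger score.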

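Belief Propagation
P(Y_i) = \sum_{Y_j = l_k,\ j \neq i} P(Y_1, \ldots, Y_n)
To get the probability that a node has a given label, say red, marginalize the joint probability.
Every combination of labels has a joint probability. Fix this node as red and sum over all combinations of the other nodes; the result is the probability that this node is red.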
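The marginalization itself can be illustrated by brute-force enumeration. Note that this is plain enumeration of every label combination, not the message-passing scheme that belief propagation uses to avoid that cost. The pairwise table, the 3-node chain, and the function name `brute_force_marginal` are assumptions made for the example.

```python
from itertools import product


def brute_force_marginal(n_nodes, edges, potential, observed, node):
    """Marginal of one node, summing the unnormalized joint over all assignments."""
    n_labels = len(potential)
    totals = [0.0] * n_labels
    for assignment in product(range(n_labels), repeat=n_nodes):
        if any(assignment[i] != lab for i, lab in observed.items()):
            continue                          # inconsistent with the known labels
        weight = 1.0
        for i, j in edges:                    # joint = product of edge potentials
            weight *= potential[assignment[i]][assignment[j]]
        totals[assignment[node]] += weight
    z = sum(totals)
    return [t / z for t in totals]


# Hypothetical table that favors equal labels on both ends of an edge.
same_label_table = [[0.9, 0.1],
                    [0.1, 0.9]]
edges = [(0, 1), (1, 2)]                      # chain 0 - 1 - 2
marginal = brute_force_marginal(3, edges, same_label_table, observed={0: 0}, node=2)
print(marginal)
```

The loop visits K^n assignments, which is exactly the blow-up that makes an approximate or message-passing method necessary on real graphs.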

Belief Propagation
P(Y_i) = \sum_{Y_j = l_k,\ j \neq i} P(Y_1, \ldots, Y_n)
Homophily table (from / to):
0.1 0.9
0.9 0.1
So we design this table: given that a neighbor is red or blue, it holds the probability that this node is red or blue.

Problem Formulation
● Node Classification
– Given a partially labeled graph G=(V, E)
● Requirement
– works on mixed label correlation
With related work covered, here is the question:
how do we design a new method that can classify nodes on a graph whose label correlation is a mix of homophily and heterophily?

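Main Idea
Do not assume a direct link between the labels of a node's neighbors and the label of the node itself.
The main idea is this one sentence: if you see that most neighbors share the same label, that only tells you the remaining neighbors probably share that label too.
For example, all 3 neighbors of the center node are red, so the fourth neighbor will also be red.
How then do we decide the label of the center node?
Switch viewpoints: for each of these 4 neighbors, the center node is one of its neighbors.
Take the upper-left node: its neighbors are all blue.
So you would guess that the center node is blue, but there are still the other 3 neighbors.
Suppose the neighbors of the center node's neighbors are all red.
Then the center node should be red.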

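Algorithm
● Propagating two variables
s_{ij} = how likely node i has label j
t_{ij} = how likely the neighbors of node i have label j
– until convergence
● Output
– label of node i = \arg\max_{j} (s_{ij})
To separate the labels of the neighbors from the node's own label, two variables are used.
S is the self-score: the probability that a node has the j-th label.
T is the neighbor-score: the probability that the node's neighbors have the j-th label.
Once the update rules for these two variables are designed, they can be updated iteratively until convergence.
Each node's label is then the label with the highest self-score.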

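Algorithm
● s-propagation (only for unlabeled nodes)
s_{ik} \leftarrow \frac{\sum_{j=1}^{N} A_{ij} t_{jk} + \lambda b_k}{\sum_{j=1}^{N} A_{ij} + \lambda}
● t-propagation
t_{jk} \leftarrow \frac{\sum_{i=1}^{N} A_{ij} s_{ik} + \lambda b_k}{\sum_{i=1}^{N} A_{ij} + \lambda}
– b_k is the prior
The self-score update considers the neighbor-scores of all neighbors,
while the neighbor-score update considers the self-scores of the neighbors.
The prior, weighted by lambda, is added because some nodes have no neighbors and cannot be reached by propagation. The fix is to let their scores fall back to the default prior value.
Lambda controls the strength of the prior.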
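A short worked case shows how the prior takes over for an isolated node. It follows directly from the s-propagation rule; the only assumption is that node i has no out-edges, so A_{ij} = 0 for every j.

```latex
s_{ik}
  = \frac{\sum_{j=1}^{N} A_{ij}\, t_{jk} + \lambda b_k}{\sum_{j=1}^{N} A_{ij} + \lambda}
  = \frac{0 + \lambda b_k}{0 + \lambda}
  = b_k
  \qquad (\lambda > 0)
```

For a node whose out-degree is much larger than \lambda, the same rule is dominated by the weighted average of the t_{jk}, so \lambda decides how many neighbors it takes to outweigh the prior.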

Algorithm
● s-propagation: aggregates over out-edges (followees)
s_{ik} \leftarrow \frac{\sum_{j=1}^{N} A_{ij} t_{jk} + \lambda b_k}{\sum_{j=1}^{N} A_{ij} + \lambda}
● t-propagation: aggregates over in-edges (followers), giving the follower-score
t_{jk} \leftarrow \frac{\sum_{i=1}^{N} A_{ij} s_{ik} + \lambda b_k}{\sum_{i=1}^{N} A_{ij} + \lambda}
You may wonder why the self-score aggregates over out-edges while the neighbor-score aggregates over in-edges.
The reason is to also cover directed graphs, for example the follower and followee relationship on Twitter.
Suppose a link from a to b means that a follows b.
Then t collects the labels of the followers: t is computed from the labels of the nodes that follow this node, so it is called the follower-score for short.
And s should aggregate the follower-scores of the nodes that this node follows.
The nodes that a node follows are found through its out-edges.

Algorithm
● Matrix formulation
Writing the self-score and neighbor-score updates in matrix form gives this.
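One way to write the two scalar updates in matrix form is given below. It is derived here from the s- and t-propagation rules rather than copied from the slide, so the notation is an assumption: D and F are diagonal matrices of out-degrees and in-degrees, \mathbf{1} is the all-ones vector, b is the prior vector, and the rows of S that belong to labeled nodes are held fixed.

```latex
S \leftarrow (D + \lambda I)^{-1} \left( A\,T + \lambda\, \mathbf{1} b^{\top} \right),
\qquad
T \leftarrow (F + \lambda I)^{-1} \left( A^{\top} S + \lambda\, \mathbf{1} b^{\top} \right),
\qquad
D_{ii} = \sum_{j} A_{ij}, \quad F_{jj} = \sum_{i} A_{ij}
```

Entry (i, k) of A T is \sum_j A_{ij} t_{jk} and entry (j, k) of A^{\top} S is \sum_i A_{ij} s_{ik}, which matches the numerators of the scalar rules term by term.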

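Algorithm
The structure of the algorithm:
First initialize the prior, s, and t.
To initialize s, each labeled node gets 1 for its own label and 0 for the others, and each unlabeled node gets some initial value.
For the follower-score there is no reason to prefer any label, so every node gets the initial value.
Then keep updating until convergence.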
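The steps above can be put together in a compact sketch on a dense adjacency matrix. Several details are assumptions, not taken from the slides: the uniform prior, using the prior as the initial value, feeding the freshly updated s into the t-update within the same iteration, and the stopping tolerance. The example graph is a hypothetical complete bipartite graph in which every edge joins nodes with different labels.

```python
def omni_prop(adj, labels, n_labels, lam=1.0, max_iter=100, tol=1e-9):
    """Sketch of OMNI-Prop. adj[i][j] > 0 is an edge i -> j; labels[i] is an index or None."""
    n = len(adj)
    prior = [1.0 / n_labels] * n_labels                  # assumed uniform prior b
    out_deg = [sum(adj[i]) for i in range(n)]
    in_deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]

    def one_hot(k):
        return [1.0 if c == k else 0.0 for c in range(n_labels)]

    s = [one_hot(lab) if lab is not None else prior[:] for lab in labels]
    t = [prior[:] for _ in range(n)]

    for _ in range(max_iter):
        # s-propagation, unlabeled nodes only: aggregate t over out-edges.
        new_s = [
            s[i] if labels[i] is not None else [
                (sum(adj[i][j] * t[j][k] for j in range(n)) + lam * prior[k])
                / (out_deg[i] + lam)
                for k in range(n_labels)
            ]
            for i in range(n)
        ]
        # t-propagation, every node: aggregate s over in-edges.
        new_t = [
            [(sum(adj[i][j] * new_s[i][k] for i in range(n)) + lam * prior[k])
             / (in_deg[j] + lam)
             for k in range(n_labels)]
            for j in range(n)
        ]
        change = max(abs(a - b)
                     for old, new in ((s, new_s), (t, new_t))
                     for r_old, r_new in zip(old, new)
                     for a, b in zip(r_old, r_new))
        s, t = new_s, new_t
        if change < tol:
            break
    return [max(range(n_labels), key=lambda k: s[i][k]) for i in range(n)], s


# Nodes 0-1 form one side, nodes 2-4 the other; nodes 1 and 4 are unlabeled.
adj = [[0, 0, 1, 1, 1],
       [0, 0, 1, 1, 1],
       [1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0]]
labels = [0, None, 1, 1, None]
predicted, scores = omni_prop(adj, labels, n_labels=2)
print(predicted)
```

Even though every edge here is heterophilous, node 1 is assigned the label of node 0, because the two share the same neighbors and therefore the same neighbor-scores.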

Analysis
● Time complexity
s_{ik} \leftarrow \frac{\sum_{j=1}^{N} A_{ij} t_{jk} + \lambda b_k}{\sum_{j=1}^{N} A_{ij} + \lambda}
– repeated for all k: a factor of O(K)
– at each iteration: O(K(N+M))
– if it converges after h iterations: O(hK(N+M))
Now for the running time. One iteration takes time linear in #nodes + #edges.
s and t have similar complexity, so s is used to explain.
One pass over all edges of the graph tells you which followers each node has, which is enough to compute the numerator and denominator of every node's self-score.
It then takes O(number of nodes) time to divide each node's numerator by its denominator and obtain the self-score.
That is the time complexity for one label. There are K labels, so multiply by K. K is a constant, though, so one iteration is still linear in #nodes + #edges.
Assuming the algorithm converges, the overall complexity is that cost times the number of iterations:
O(number of iterations × number of labels × (number of nodes + number of edges)).
Next, does the algorithm converge?
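The per-label cost can be made concrete with an edge-list version of the s-update. This is a sketch for illustration: the function name, the (source, target, weight) edge format, and the sample values are assumptions, and labeled nodes are not treated specially so that the two loops stay visible.

```python
def s_update_one_label(n_nodes, edges, t_k, prior_k, lam):
    """s-propagation for a single label k with one pass over the edge list.

    edges holds (i, j, weight) triples for edges i -> j, and t_k[j] is t_jk.
    lam must be positive so that isolated nodes do not divide by zero.
    """
    numer = [lam * prior_k] * n_nodes
    denom = [lam] * n_nodes
    for i, j, w in edges:                 # O(M): each edge is visited once
        numer[i] += w * t_k[j]
        denom[i] += w
    return [numer[i] / denom[i] for i in range(n_nodes)]   # O(N): one division per node


# Hypothetical directed graph with 3 nodes and 3 edges.
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0)]
t_k = [0.5, 1.0, 0.0]
s_k = s_update_one_label(3, edges, t_k, prior_k=0.5, lam=1.0)
print(s_k)
```

Calling this once per label, and doing the same for t, gives the O(K(N+M)) cost per iteration stated on the slide.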

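Analysis
● OMNI-Prop always converges if λ>0
– the series converges to (I - Q_{UU})^{-1} if \|Q_{UU}\| < 1
The result first: the algorithm converges as long as its only parameter, the lambda that controls the prior strength, is > 0.
The proof is simple.
First look at how S after one iteration relates to S from the previous iteration: substitute T into S.
Rearranging, write the coefficients in front of S as a matrix Q that covers all labeled and unlabeled nodes, then pick out the blocks that are needed. That gives the first two terms, and everything that does not depend on S goes into the third term.
The S after N iterations can then be expanded into an expression that looks like a matrix version of a geometric series.
If S exists after infinitely many iterations, the algorithm converges.
S converges when the geometric series at the end converges.
In the one-dimensional case, a geometric series converges when the ratio is less than 1. In the matrix version, the corresponding condition is that the infinity norm of the matrix playing the role of the ratio is less than 1.
So the question is whether the infinity norm of Q is less than 1.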
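The matrix geometric series can be checked numerically on a small case. The 2×2 matrix Q below is a hypothetical stand-in with infinity norm 0.5, not the Q of the proof; the sketch only illustrates that the partial sums I + Q + Q^2 + ... approach (I - Q)^{-1} when the norm is below 1.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inf_norm(m):
    """Infinity norm: the largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in m)


Q = [[0.2, 0.3],
     [0.1, 0.4]]

partial_sum = [[1.0, 0.0], [0.0, 1.0]]     # starts at Q^0 = I
power = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    power = mat_mul(power, Q)
    partial_sum = mat_add(partial_sum, power)

# Closed-form inverse of the 2x2 matrix I - Q, for comparison.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
inverse = [[d / det, -b / det], [-c / det, a / det]]

gap = max(abs(partial_sum[i][j] - inverse[i][j]) for i in range(2) for j in range(2))
print(inf_norm(Q), gap)
```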

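Analysis
– D_{ii} = \sum_{j} A_{ij} (out-degree ≥ 0)
– If λ > 0, every element < 1
Q equals this expression. Look first at the boxed part at the front.
D is a diagonal matrix holding the out-degree of each node, which is >= 0.
If lambda > 0, every element of the matrix obtained by multiplying out the boxed part is < 1,
because the out-degree of a node is always >= any single element in its row of A.
So if lambda > 0, an element of A divided by (the corresponding out-degree + lambda) is always < 1.
The latter part is the in-degree case, and the same reasoning shows every element is < 1.
Finally, use the norm property that the norm of A*B is at most the norm of A times the norm of B.
This completes the proof that the algorithm always converges when lambda > 0.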
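The step from "every element < 1" to "infinity norm < 1" can be spelled out with row sums. The derivation below assumes A is nonnegative and that Q is the product of the out-degree-normalized and in-degree-normalized factors obtained by substituting the t-update into the s-update; F, the diagonal matrix of in-degrees, is a name introduced here.

```latex
\left\| (D + \lambda I)^{-1} A \right\|_{\infty}
  = \max_{i} \sum_{j} \frac{A_{ij}}{D_{ii} + \lambda}
  = \max_{i} \frac{D_{ii}}{D_{ii} + \lambda} < 1,
\qquad
\left\| (F + \lambda I)^{-1} A^{\top} \right\|_{\infty}
  = \max_{j} \frac{F_{jj}}{F_{jj} + \lambda} < 1

\|Q\|_{\infty}
  \le \left\| (D + \lambda I)^{-1} A \right\|_{\infty}
      \left\| (F + \lambda I)^{-1} A^{\top} \right\|_{\infty} < 1
  \qquad (\lambda > 0)
```

Since Q is nonnegative, dropping rows and columns cannot increase a row sum, so the block Q_{UU} satisfies the same bound.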

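Analysis
● Connection to Label Propagation
– LP is a special case of OMNI-Prop when λ=0
● Keep updating until convergence
Y^{(t)} = P^{T} Y^{(t-1)}
[Figure: graph G with nodes 1, 2, 3, shown next to its S copy and its T copy]
As mentioned earlier, OMNI-Prop and LP are both PageRank-like iterative algorithms, so here is how they relate: LP is a special case of OMNI-Prop.
The figure illustrates this. The main idea is that LP computes only the self-score, while OMNI-Prop computes two scores.
So make a copy of every node: the original copy stays as it is and holds the self-score, and the new copy holds the neighbor-score.
In this way OMNI-Prop can degenerate into LP.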

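Analysis
● Connection to Random Walk
– RWR is a special case of OMNI-Prop when λ=0
● Keep updating until convergence
Y^{(t)} = (1-c) P Y^{(t-1)} + c Y^{(0)}
[Figure: graph G with nodes 1, 2, 3, shown next to its S copy and its T copy]
By the same reasoning, it can of course also degenerate into RWR.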

Experiment
● Q1 – Parameter
– How does the parameter affect performance?
● Q2 – Convergence
– How many iterations does OMNI-Prop need to converge?
● Q3 – Accuracy
– How accurate is OMNI-Prop compared to LP and BP?
There are three things the authors want to observe.
First, how the algorithm's only parameter, the lambda that controls the prior strength, affects performance.
Second, how long the algorithm takes to converge on real datasets.
Third, whether OMNI-Prop is more accurate than the baselines LP and BP.

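Experiment
● Dataset
● Evaluation
– hide 70% of the labels
The datasets cover several types of label correlation.
POLBLOGS is a citation network of blogs. The label is political leaning, so it is homophilous: blogs usually cite blogs with the same political leaning.
COAUTHOR has an edge between two authors who have written a paper together. The label is research field, so COAUTHOR shows homophily: people who collaborate usually work in similar fields.
FACEBOOK and POKEC-G are both SNSs with gender as the label, but FB leans toward homophily and POKEC toward heterophily.
POKEC-L uses location as the label.
The experiment hides 70% of the labels, so 30% of the labels are used to predict the remaining 70%.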
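The evaluation protocol can be sketched as a random split. The sampling scheme is an assumption: the slides only say that 70% of the labels are hidden, so uniform sampling with a fixed seed, the function name `hide_labels`, and the toy label list are hypothetical choices.

```python
import random


def hide_labels(labels, hide_fraction=0.7, seed=0):
    """Return labels with a random fraction replaced by None, plus the hidden indices."""
    rng = random.Random(seed)
    labeled = [i for i, lab in enumerate(labels) if lab is not None]
    hidden = set(rng.sample(labeled, round(hide_fraction * len(labeled))))
    visible = [None if i in hidden else lab for i, lab in enumerate(labels)]
    return visible, sorted(hidden)


true_labels = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0]
visible, hidden = hide_labels(true_labels)
print(visible)
print(hidden)
```

The classifier only sees `visible`; accuracy is then measured on the indices in `hidden` against `true_labels`.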

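Experiment
[Figure: precision on the top p% of nodes ordered by max(self-score)]
This experiment looks at how the size of lambda affects performance.
Different colors indicate different lambda values. The x-axis p means that, after running the experiment and obtaining every node's self-score, only the top p% most confident nodes are evaluated.
The y-axis is precision; higher is better.
With a large lambda (= 100), precision is very high on the small fraction of most confident nodes but worse on the less confident nodes further down.
With a small lambda (= 0.01) it is the opposite: precision is higher on the later nodes.
Taking the middle value lambda = 1 therefore gives good performance in most cases.
The authors conclude that, with lambda = 1 as the default, OMNI-Prop is parameter-free.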
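The metric on the x- and y-axes can be sketched as follows. The rounding rule for p%, the function name, and the toy scores are assumptions; in the actual experiment the ranking would presumably cover only the nodes whose labels were hidden.

```python
def precision_at_top_p(self_scores, true_labels, p):
    """Precision over the top p% of nodes ranked by their maximum self-score."""
    ranked = sorted(range(len(self_scores)),
                    key=lambda i: max(self_scores[i]), reverse=True)
    n_top = max(1, round(len(ranked) * p / 100))
    top = ranked[:n_top]
    correct = sum(
        1 for i in top
        if max(range(len(self_scores[i])), key=lambda k: self_scores[i][k]) == true_labels[i]
    )
    return correct / n_top


# Hypothetical self-scores for 4 nodes and 2 labels.
scores = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]]
truth = [0, 1, 1, 0]
top_half = precision_at_top_p(scores, truth, 50)
everything = precision_at_top_p(scores, truth, 100)
print(top_half, everything)
```

Precision is higher on the confident half than on all nodes, which is the kind of trade-off the lambda curves on the slide compare.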

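Experiment
The second experiment measures how long OMNI-Prop takes to converge.
Different colors indicate different numbers of iterations. The result after 10 iterations is already very close to the result after 20.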
