2021/01/24
Sliced Wasserstein Distances and Generative Models
@ohken322
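Contents
1. Wasserstein distance and WGAN
2. Sliced Wasserstein distance and the generative model SWG
3. Max-sliced Wasserstein distance
   1. Statistical properties of sliced Wasserstein
   2. Application to generative models
4. Generalized sliced Wasserstein distance
   1. Radon transform
   2. Generalizing SW
5. Augmented sliced Wasserstein distance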
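Wasserstein distance
$\mu, \nu$: probability measures on $\mathbb{R}^d$ (e.g. $d\mu = f\,dx$, where $f$ is a probability density function)
$\Pi(\mu,\nu) := \{\pi : \text{a measure on } \mathbb{R}^d \times \mathbb{R}^d,\ \pi(A \times \mathbb{R}^d) = \mu(A),\ \pi(\mathbb{R}^d \times B) = \nu(B)\}$
• Wasserstein distance
For $p \ge 1$,
$$W_p(\mu,\nu) := \left( \min_{\pi \in \Pi(\mu,\nu)} \int_{\mathcal{X}\times\mathcal{X}} \|x - y\|^p \, d\pi(x,y) \right)^{1/p}$$
Proposition
$W_p$ defines a distance between probability measures on $\mathbb{R}^d$ (with finite $p$-th moments).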
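As a small illustration of the definition, the sketch below evaluates $W_p$ between two uniform empirical measures with the same number of points. It assumes equal weights, in which case an optimal coupling can be taken to be a permutation, and it simply tries every permutation. The function name `wasserstein_p_bruteforce` and the test points are made up for this example; in practice one would use an assignment or linear-programming solver instead of brute force.

```python
import itertools
import numpy as np

def wasserstein_p_bruteforce(x, y, p=2):
    """W_p between two uniform empirical measures with the same number of points.

    With equal weights and equal sizes an optimal coupling can be chosen as a
    permutation, so every permutation is tried (only feasible for tiny n).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # cost[i, j] = ||x_i - y_j||^p
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1) ** p
    best = min(
        sum(cost[i, perm[i]] for i in range(n)) / n
        for perm in itertools.permutations(range(n))
    )
    return best ** (1.0 / p)

# Hypothetical data: a 3-point cloud and its translate by a vector of length 5.
x = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
y = x + np.array([3.0, 4.0])
w2 = wasserstein_p_bruteforce(x, y, p=2)
print(w2)
```

Translating a point cloud by a vector $v$ costs exactly $\|v\|$, which gives a quick sanity check of the routine.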
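Wasserstein GAN
• Duality
$$W_1(\mu,\nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\| \, d\pi = \sup_{f \in \mathrm{Lip}_1} \left( \int_{\mathbb{R}^d} f \, d\mu - \int_{\mathbb{R}^d} f \, d\nu \right)$$
• Wasserstein GAN (Arjovsky et al., ICML 2017)
With $\mu$ the generated distribution and $\nu$ the data distribution, the generator and $f$ are trained so as to solve $\min W_1$.
• Issue: to estimate $W_1$ accurately, the discriminator $f$ has to be reasonably well trained,
e.g. $f$ is updated 5 times for each update of the generator parameters.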
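The duality can be checked numerically in one dimension, where the primal side has a closed form (sorted samples are paired). The sketch below is only an illustration under that assumption. The three critics are hand-picked 1-Lipschitz functions rather than a trained network, and every one of them gives a lower bound on $W_1$. A poorly chosen (or poorly trained) $f$ underestimates the distance, which is the issue noted above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1000)   # samples from mu
y = rng.normal(loc=2.0, scale=1.0, size=1000)   # samples from nu

# Primal side: in 1D the optimal coupling of two equal-size samples pairs sorted values.
w1_primal = np.mean(np.abs(np.sort(x) - np.sort(y)))

def dual_value(f):
    """Dual objective  int f dmu - int f dnu  evaluated on the samples."""
    return np.mean(f(x)) - np.mean(f(y))

# Hand-picked 1-Lipschitz critics (a WGAN would learn f instead).
critics = {
    "f(t) = -t": lambda t: -t,
    "f(t) = -|t - 1|": lambda t: -np.abs(t - 1.0),
    "f(t) = sin(t)": np.sin,
}
dual_values = {name: dual_value(f) for name, f in critics.items()}

print("primal W1:", w1_primal)
for name, value in dual_values.items():
    print(name, value)
```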
Generative Modeling using the
Sliced Wasserstein Distance
DESHPANDE, ZHANG, SCHWING @ CVPR 2018
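Sliced Wasserstein distance
Sliced Wasserstein (Rabin 2011, Bonneel 2015)
$$SW_2(\mu,\nu)^2 := \int_{\mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu)^2 \, d\theta$$
$P_\theta : \mathcal{X} = \mathbb{R}^d \to \mathbb{R}$, $P_\theta(x) = \langle x, \theta \rangle$, is the projection onto the line in direction $\theta \in \mathbb{S}^{d-1} = \{x \in \mathbb{R}^d \mid \|x\| = 1\}$
- One-dimensional OT can be solved explicitly, so the computation is cheap
- The integral is evaluated by sampling direction vectors
- SW is also a distance (and is moreover equivalent to the Wasserstein distance)
- Application to modeling with Gaussian mixtures: Kolouri et al. 2018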
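The two computational points above (closed-form 1D OT and Monte Carlo over directions) fit in a few lines. This is a minimal sketch, assuming two samples of equal size and a uniform, normalized measure on the sphere. The function name `sliced_wasserstein2`, the number of projections, and the Gaussian test data are choices made for this example.

```python
import numpy as np

def sliced_wasserstein2(x, y, n_projections=1000, rng=None):
    """Monte Carlo estimate of SW_2 between equal-size samples x, y of shape (n, d)."""
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Uniform directions on the unit sphere: normalise Gaussian vectors.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both samples on every direction and sort: shape (n, n_projections).
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    # 1D W_2^2 per direction = mean squared gap between sorted projections.
    w2_sq = np.mean((x_proj - y_proj) ** 2, axis=0)
    return np.sqrt(np.mean(w2_sq))

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 10))
y = rng.normal(size=(500, 10)) + 1.0   # every coordinate shifted by 1
sw_xy = sliced_wasserstein2(x, y, rng=1)
sw_xx = sliced_wasserstein2(x, x, rng=1)
print(sw_xy, sw_xx)
```

Sorting costs $O(n \log n)$ per direction, so the overall cost is governed by the number of directions that have to be sampled.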
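Sliced Wasserstein Generator
Deshpande et al., CVPR 2018
- A generative model that uses the sliced Wasserstein distance as its loss function
- No discriminator has to be trained in order to estimate the distance
- About 10000 direction vectors are sampled (on MNIST)
- A generator update reportedly takes about 1.5 to 2 times as long
(but with no discriminator to train, it is faster than WGAN)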
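SWG: experimental results
- The value the loss converges to is inversely proportional to the batch size (a statistical property of SW, discussed later)
- A batch size of 128 already gives sufficient generation quality and diversity
- Training is stable regardless of the network architecture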
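SWG: the high-dimensional case
Issue: the higher the dimension, the more direction vectors are needed to approximate SW
→ We want to pick direction vectors that are as "informative" as possible
The discriminator is brought back in the role of a map into an "informative low-dimensional space"
[Figure: samples $\mathcal{D}$ from the data distribution $P_d$ and samples $\mathcal{F}$ from the generated distribution $G_\theta(P_z)$ are passed through the discriminator $f_{\theta'}$; the images $f_{\theta'}(\mathcal{D})$ and $f_{\theta'}(\mathcal{F})$ feed the SW loss and the discriminator loss]
Heuristic: a space in which discrimination is easy = a space in which SW is easy to estimate
See the paper for generated results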
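Statistical properties of (sliced) Wasserstein
Estimating the Wasserstein distance becomes less sample-efficient as the dimension grows
• Weak convergence of the empirical distribution
For a probability measure $\mu \in \mathcal{P}_p(\mathbb{R}^d)$ with empirical distribution $\mu_n$, $W_p(\mu_n, \mu) \to 0$ a.s.
However, the rate of convergence is (for $\mu$ absolutely continuous and $d > 2q$)
$$\mathbb{E}[W_p(\mu_n, \mu)] \simeq n^{-1/d}$$
• Sample efficiency of sliced Wasserstein
(Nadjahi et al. 2020, Lin et al. 2020)
(under suitable conditions)
$$\mathbb{E}[SW_p(\mu_n, \mu)] \simeq n^{-1}$$
Experimental results with SWD (Deshpande 2018)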
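The finite-sample behaviour can be probed with a quick experiment. This sketch draws two independent batches from the same distribution, so the true distance is zero and whatever is measured is pure estimation error. The dimension, batch sizes, and number of projections are arbitrary choices for illustration, and the quantity measured is the squared $SW_2$ (the form used as a training loss), not the rate statement above.

```python
import numpy as np

def sw2_squared(x, y, n_projections, rng):
    """Monte Carlo estimate of SW_2^2 for equal-size samples of shape (n, d)."""
    theta = rng.normal(size=(n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort(y @ theta.T, axis=0)
    return np.mean((x_proj - y_proj) ** 2)

rng = np.random.default_rng(0)
d = 20
floor = {}
for n in (50, 200, 800, 3200):
    # Both batches come from N(0, I): the population distance is exactly 0.
    vals = [
        sw2_squared(rng.normal(size=(n, d)), rng.normal(size=(n, d)), 200, rng)
        for _ in range(5)
    ]
    floor[n] = float(np.mean(vals))
    print(n, floor[n], n * floor[n])
```

The printed product `n * floor[n]` gives a rough feel for how the error floor scales with the batch size in this particular setup.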
Max-Sliced Wasserstein Distance
and its use for GANs
DESHPANDE, HU, SUN, PYRROS, SIDDIQUI @ CVPR 2019
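Estimation efficiency of SW
• Beyond the sample efficiency of sliced Wasserstein, the sample efficiency with respect to direction vectors is examined
→ Using only the important directions looks preferable
Estimate $\mu = \mathcal{N}(0, I)$ with $\nu = \mathcal{N}(\beta e, I)$:
$\beta \leftarrow \beta - \alpha \nabla_\beta SW_2(\mu, \nu)$
max-$W$: use $e$ as the direction vector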
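For these two Gaussians the slice along $\theta$ compares $\mathcal{N}(0,1)$ with $\mathcal{N}(\beta\langle\theta, e\rangle, 1)$, so its squared $W_2$ is $\beta^2\langle\theta, e\rangle^2$. Averaged over uniformly random directions (with $d\theta$ normalized) this is $\beta^2/d$, while along $e$ it is $\beta^2$. The sketch below reproduces the experiment in a simplified form. It assumes gradient steps on the squared distance, and the dimension, batch size, step size, and number of directions are all illustrative values, not those of the paper.

```python
import numpy as np

def grad_beta(x, z, beta, e, theta):
    """d/d(beta) of the direction-averaged W_2^2 between samples of mu and nu.

    nu is sampled as z_i + beta * e, so every projected point moves at rate
    <e, theta>; a common shift does not change the sorted pairing.
    """
    x_proj = np.sort(x @ theta.T, axis=0)
    y_proj = np.sort((z + beta * e) @ theta.T, axis=0)
    shift_rate = theta @ e                       # <e, theta> for each direction
    per_direction = 2.0 * np.mean(y_proj - x_proj, axis=0) * shift_rate
    return np.mean(per_direction)

def fit_beta(directions, d=50, n=256, steps=100, lr=0.1, beta0=5.0, seed=0):
    """Run gradient descent on beta using either random directions or e itself."""
    rng = np.random.default_rng(seed)
    e = np.zeros(d)
    e[0] = 1.0
    beta = beta0
    for _ in range(steps):
        x = rng.normal(size=(n, d))              # batch from mu = N(0, I)
        z = rng.normal(size=(n, d))              # noise for nu = N(beta e, I)
        if directions == "random":
            theta = rng.normal(size=(10, d))
            theta /= np.linalg.norm(theta, axis=1, keepdims=True)
        else:                                    # the single informative direction
            theta = e[None, :]
        beta -= lr * grad_beta(x, z, beta, e, theta)
    return beta

beta_random = fit_beta("random")
beta_max = fit_beta("max")
print(beta_random, beta_max)
```

With the same number of steps, the run that slices along $e$ drives $\beta$ to roughly zero, whereas random directions in 50 dimensions carry only about $1/d$ of the signal and leave $\beta$ far from the optimum.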
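Max-sliced Wasserstein distance
• max-sliced Wasserstein
$$\text{max-}SW_2(\mu,\nu) := \max_{\theta \in \mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu)$$
It gives a distance between distributions (equivalent to the Wasserstein distance)
• Sample efficiency nearly the same as that of sliced Wasserstein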
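On samples, the maximization over $\theta$ can be attempted directly. The sketch below uses projected gradient ascent on the sphere. This is a heuristic chosen for illustration and not the procedure of the paper, which folds the direction into a discriminator. The step size, iteration count, and test data are assumptions. It relies on the fact that, with the rank-based pairing held fixed, the projected $W_2^2$ is a quadratic form in $\theta$.

```python
import numpy as np

def projected_w2_sq(x, y, theta):
    """Squared 1D W_2 between the projections of x and y on direction theta."""
    return np.mean((np.sort(x @ theta) - np.sort(y @ theta)) ** 2)

def max_sliced_w2(x, y, steps=200, lr=0.1, rng=None):
    """Heuristic projected gradient ascent for max-SW_2 on equal-size samples."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=x.shape[1])
    theta /= np.linalg.norm(theta)
    best_theta, best_val = theta, projected_w2_sq(x, y, theta)
    for _ in range(steps):
        # Pair points by the rank of their projections (optimal in 1D).
        diff = x[np.argsort(x @ theta)] - y[np.argsort(y @ theta)]   # (n, d)
        # With the pairing frozen, W_2^2 = mean(<diff_i, theta>^2).
        grad = 2.0 * diff.T @ (diff @ theta) / len(x)
        theta = theta + lr * grad
        theta /= np.linalg.norm(theta)           # project back onto the sphere
        val = projected_w2_sq(x, y, theta)
        if val > best_val:
            best_theta, best_val = theta, val
    return np.sqrt(best_val), best_theta

rng = np.random.default_rng(0)
d = 20
e = np.zeros(d)
e[0] = 1.0
x = rng.normal(size=(500, d))
y = rng.normal(size=(500, d)) + 3.0 * e          # mean shifted by 3 along e
max_sw, theta_star = max_sliced_w2(x, y, rng=1)
print(max_sw, abs(theta_star[0]))
```

On this toy data the recovered direction aligns with $e$ (up to sign) and the value is close to the size of the mean shift.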
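Max-sliced GAN
• How is the max computed?
Same idea as in the Sliced Wasserstein Generator: feature map + a good direction vector = discriminator
- Parameters of the feature map: somewhat difficult to optimize directly
- A surrogate model is introduced instead, e.g. [formula shown on the slide]
See the paper for generated results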
Generalized Sliced Wasserstein
Distance
KOLOURI, NADJAHI, ŞIMŞEKLI, BADEAU, ROHDE @ NEURIPS 2019
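Radon transform and sliced Wasserstein
• Radon transform (Radon, 1917)
$I \in L^1(\mathbb{R}^d) = \{ I : \mathbb{R}^d \to \mathbb{R} \mid \int_{\mathbb{R}^d} |I(x)| \, dx < \infty \}$, $(t, \theta) \in \mathbb{R} \times \mathbb{S}^{d-1}$
$$I \mapsto \mathcal{R}I(t, \theta) := \int_{\mathbb{R}^d} I(x) \, \delta(t - \langle x, \theta \rangle) \, dx$$
Note: used in tomography, e.g. CT scans
With this, for $\mu, \nu$ with densities $d\mu = I_\mu(x)\,dx$, $d\nu = I_\nu(x)\,dx$, one can write
$$SW_p^p(\mu, \nu) = \int_{\mathbb{S}^{d-1}} W_p^p\big(\mathcal{R}I_\mu(\cdot, \theta), \mathcal{R}I_\nu(\cdot, \theta)\big) \, d\theta$$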
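A short way to see why the Radon form agrees with the projection form of SW: for a fixed $\theta$ and any bounded test function $\varphi$ (introduced here only for the argument), integrating $\varphi$ against the slice $\mathcal{R}I_\mu(\cdot, \theta)$ gives the same result as integrating it against the pushforward $P_\theta \# \mu$. A sketch of the computation, assuming the order of integration can be exchanged:

```latex
\begin{aligned}
\int_{\mathbb{R}} \varphi(t)\, \mathcal{R}I_\mu(t,\theta)\, dt
  &= \int_{\mathbb{R}^d} I_\mu(x) \int_{\mathbb{R}} \varphi(t)\,
     \delta\big(t - \langle x, \theta\rangle\big)\, dt\, dx \\
  &= \int_{\mathbb{R}^d} \varphi\big(\langle x, \theta\rangle\big)\, d\mu(x)
   = \int_{\mathbb{R}} \varphi(t)\, d(P_\theta \# \mu)(t).
\end{aligned}
```

So $\mathcal{R}I_\mu(\cdot, \theta)$ is the density of the projected measure $P_\theta \# \mu$, and the two expressions for SW coincide.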
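Generalized Radon transform
• Generalized Radon Transform (Beylkin, 1984)
$$\mathcal{G}I(t, \theta) = \int_{\mathbb{R}^d} I(x) \, \delta(t - g(x, \theta)) \, dx$$
$g : \mathbb{R}^d \times (\mathbb{R}^n \setminus \{0\}) \to \mathbb{R}$ is a defining function satisfying several conditions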
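Generalized (max-)sliced Wasserstein
• Generalized (max-)sliced Wasserstein distance
$$GSW_p^p(\mu, \nu) := \int_{\Omega_\theta} W_p^p\big(\mathcal{G}I_\mu(\cdot, \theta), \mathcal{G}I_\nu(\cdot, \theta)\big) \, d\theta$$
$$\text{max-}GSW_p(\mu, \nu) := \max_{\theta \in \Omega_\theta} W_p\big(\mathcal{G}I_\mu(\cdot, \theta), \mathcal{G}I_\nu(\cdot, \theta)\big)$$
• Proposition
When $\mathcal{G}$ is injective, $GSW$ and max-$GSW$ give distances between probability distributions
Note: defining functions $g$ such as the circular one and polynomial ones (homogeneous, odd degree only) are known to give injectivity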
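On samples, GSW only changes how points are mapped to the real line before the 1D OT step. The sketch below uses the circular defining function $g(x, \theta) = \|x - r\theta\|_2$ with $\theta$ on the unit sphere. The radius `r`, the number of projections, and the two-ring test data are assumptions made for this illustration.

```python
import numpy as np

def circular_projection(x, theta, r):
    """Circular defining function g(x, theta) = ||x - r * theta||_2."""
    # x: (n, d), theta: (m, d) -> values of shape (n, m)
    return np.linalg.norm(x[:, None, :] - r * theta[None, :, :], axis=-1)

def generalized_sw(x, y, r=2.0, p=2, n_projections=500, rng=None):
    """Monte Carlo estimate of GSW_p with the circular defining function."""
    rng = np.random.default_rng(rng)
    theta = rng.normal(size=(n_projections, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    gx = np.sort(circular_projection(x, theta, r), axis=0)
    gy = np.sort(circular_projection(y, theta, r), axis=0)
    wp_p = np.mean(np.abs(gx - gy) ** p, axis=0)   # 1D W_p^p for each theta
    return np.mean(wp_p) ** (1.0 / p)

rng = np.random.default_rng(0)
# Hypothetical data: two rings of radius 1 and 2 in the plane.
angles = rng.uniform(0.0, 2.0 * np.pi, size=(2, 400))
ring1 = np.stack([np.cos(angles[0]), np.sin(angles[0])], axis=1)
ring2 = 2.0 * np.stack([np.cos(angles[1]), np.sin(angles[1])], axis=1)
gsw_same = generalized_sw(ring1, ring1, rng=1)
gsw_diff = generalized_sw(ring1, ring2, rng=1)
print(gsw_same, gsw_diff)
```

Replacing `circular_projection` with `x @ theta.T` recovers the ordinary sliced Wasserstein estimate, which makes the relationship between SW and GSW explicit.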
Algorithm for computing (max-)GSW
Note that $\theta$ is optimized exactly here
(earlier it was folded into the discriminator)
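Experimental results
• Comparison with SW on a toy example
  - More flexible projections can be computed, so distributions are matched efficiently
• Applied to the Sliced Wasserstein Auto-Encoder (Kolouri et al. 2019)
  - The setting is more experimental than practical (figure on the right)
  - The combination with GANs has not been tried
  - It is doubtful whether $g$ can be built from a neural network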
Augmented Sliced Wasserstein
Distance
CHEN, YANG, LI @ ICLR 2021 REJECTED(6,7,4)
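Spatial Radon Transform
• Spatial Radon Transform
$(t, \theta) \in \mathbb{R} \times \mathbb{S}^{d_\theta - 1}$, $g : \mathbb{R}^d \to \mathbb{R}^{d_\theta}$
$$\mathcal{H}I(t, \theta; g) = \int_{\mathbb{R}^d} I(x) \, \delta(t - \langle g(x), \theta \rangle) \, dx = \mathcal{R}(g_{*}I)(t, \theta)$$
Note: includes the GRT with polynomial defining functions
• Proposition
$g$ injective $\Leftrightarrow$ $\mathcal{H}$ injective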
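Augmented Sliced Wasserstein Distance
• Augmented (max-)sliced Wasserstein distance
$$ASW_p^p(\mu, \nu) := \int_{\mathbb{S}^{d_\theta - 1}} W_p^p\big(\mathcal{H}I_\mu(\cdot, \theta; g), \mathcal{H}I_\nu(\cdot, \theta; g)\big) \, d\theta$$
$$\text{max-}ASW_p(\mu, \nu) := \max_{\theta \in \mathbb{S}^{d_\theta - 1}} W_p\big(\mathcal{H}I_\mu(\cdot, \theta; g), \mathcal{H}I_\nu(\cdot, \theta; g)\big)$$
Since $g$ only has to be injective, it can be represented by a neural network: $g = [x, \phi_{NN}(x)]$
Objective function optimized to obtain a good $g$: [formula shown on the slide]
In the experiments: 1 layer, ReLU
$d = d_\theta$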
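On samples, ASW amounts to applying $g$ to every point and then computing an ordinary sliced Wasserstein distance in the augmented space. The sketch below assumes a one-layer ReLU map with random, untrained weights (the optimization of $g$ is not reproduced) and gives $\phi_{NN}$ as many outputs as inputs, so that $g$ maps into $\mathbb{R}^{2d}$. Both choices are assumptions made for illustration.

```python
import numpy as np

def relu_feature(x, weight, bias):
    """One-layer ReLU network phi_NN(x) = max(0, x W + b)."""
    return np.maximum(0.0, x @ weight + bias)

def augment(x, weight, bias):
    """Injective map g(x) = [x, phi_NN(x)]: x itself is kept, so g cannot merge points."""
    return np.concatenate([x, relu_feature(x, weight, bias)], axis=1)

def augmented_sw(x, y, weight, bias, p=2, n_projections=500, rng=None):
    """Monte Carlo estimate of ASW_p: sliced Wasserstein after applying g."""
    rng = np.random.default_rng(rng)
    gx, gy = augment(x, weight, bias), augment(y, weight, bias)
    theta = rng.normal(size=(n_projections, gx.shape[1]))   # directions in the augmented space
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    px = np.sort(gx @ theta.T, axis=0)
    py = np.sort(gy @ theta.T, axis=0)
    return np.mean(np.abs(px - py) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
d = 5
weight = rng.normal(size=(d, d))      # random, untrained weights (illustration only)
bias = rng.normal(size=d)
x = rng.normal(size=(300, d))
y = rng.normal(size=(300, d)) + 0.5
g_x = augment(x, weight, bias)
asw_same = augmented_sw(x, x, weight, bias, rng=1)
asw_diff = augmented_sw(x, y, weight, bias, rng=1)
print(g_x.shape, asw_same, asw_diff)
```

Keeping $x$ as part of the output is what makes $g$ injective regardless of the network weights.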
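Experimental results
- Toy problem (same as in Kolouri's GSW)
- Starting from a standard normal distribution, move toward other distributions by gradient descent
[Figure annotation: the actual $W_2$ is the smallest]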
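A toy problem of this kind can be sketched as a particle flow: the points of the source sample are moved along the negative gradient of a sliced distance to the target sample. The code below uses plain $SW_2^2$ as a stand-in for the GSW/ASW losses of the papers, and the ring-shaped target, step size, and iteration counts are assumptions chosen for illustration.

```python
import numpy as np

def sw2_sq_and_grad(x, y, n_projections, rng):
    """SW_2^2 estimate between equal-size samples and its gradient w.r.t. the points x."""
    n, d = x.shape
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    x_proj, y_proj = x @ theta.T, y @ theta.T              # (n, n_projections)
    x_rank = np.argsort(np.argsort(x_proj, axis=0), axis=0)
    y_sorted = np.sort(y_proj, axis=0)
    # Each x_i is matched to the y with the same rank along every direction.
    gap = x_proj - np.take_along_axis(y_sorted, x_rank, axis=0)
    value = np.mean(gap ** 2)
    grad = 2.0 * gap @ theta / (n * n_projections)         # (n, d)
    return value, grad

rng = np.random.default_rng(0)
n = 300
angles = rng.uniform(0.0, 2.0 * np.pi, size=n)
target = 2.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)   # ring of radius 2
x = rng.normal(size=(n, 2))                                         # start: N(0, I)

initial, _ = sw2_sq_and_grad(x, target, 200, rng)
for _ in range(200):
    _, grad = sw2_sq_and_grad(x, target, 50, rng)
    x -= 0.5 * n * grad       # rescale by n so each particle takes an O(1) step
final, _ = sw2_sq_and_grad(x, target, 200, rng)
print(initial, final)
```

Swapping the linear projection for a nonlinear one, as in GSW or ASW, changes only the lines that compute `x_proj` and `y_proj` and the corresponding chain-rule factor in `grad`.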
Application to GANs
- CIFAR10 (64*64), CELEBA (64*64)
- Model and loss design apparently the same as in Deshpande 2018(?)
[Table annotations: number of sampled direction vectors; Distributed SWD (NeurIPS 2019)]
Papers covered
1. Deshpande, Zhang, Schwing, “Generative Modeling Using the Sliced Wasserstein Distance”, CVPR 2018.
2. Deshpande, Hu, Sun, Pyrros, Siddiqui, Koyejo, Zhao, Forsyth, Schwing, “Max-Sliced Wasserstein Distance and Its Use for GANs”, CVPR 2019.
3. Kolouri, Nadjahi, Şimşekli, Badeau, Rohde, “Generalized Sliced Wasserstein Distances”, NeurIPS 2019.
4. Chen, Yang, Li, “Augmented Sliced Wasserstein Distances”, arXiv:2006.08812, 2020.
References
1. Arjovsky, Chintala, Bottou, “Wasserstein Generative Adversarial Networks”, ICML 2017.
2. Rabin, Peyré, Delon, Bernot, “Wasserstein Barycenter and Its Application to Texture Mixing”, SSVM’11, 435-446, 2011.
3. Bonneel, Rabin, Peyré, Pfister, “Sliced and Radon Wasserstein Barycenters of Measures”, Journal of Mathematical Imaging and Vision, Springer Verlag, 1 (51), 22-45, 2015.
4. Kolouri, Rohde, Hoffman, “Sliced Wasserstein Distance for Learning Gaussian Mixture Models”, CVPR 2018.
5. Kolouri, Pope, Martin, Rohde, “Sliced Wasserstein Auto-Encoders”, ICLR 2019.
6. Nadjahi, Durmus, Chizat, Kolouri, Shahrampour, Şimşekli, “Statistical and Topological Properties of Sliced Probability Divergences”, arXiv:2003.05783, 2020.
7. Lin, Zheng, Chen, Cuturi, Jordan, “On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification”, arXiv:2006.12301, 2020.
