2021/01/24
Sliced Wasserstein Distances and Generative Models
@ohken322
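
Contents
1. Wasserstein distance and WGAN
2. Sliced Wasserstein distance and the generative model SWG
3. Max-sliced Wasserstein distance
   1. Statistical properties of sliced Wasserstein
   2. Application to generative models
4. Generalized sliced Wasserstein distance
   1. Radon transform
   2. Generalizing SW
5. Augmented sliced Wasserstein distance
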
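Wasserstein distance
$\mu, \nu$: probability measures on $\mathbb{R}^d$ (e.g. $d\mu = f\,dx$, where $f$ is a probability density function)
$\Pi(\mu, \nu) := \{\pi : \pi \text{ is a measure on } \mathbb{R}^d \times \mathbb{R}^d,\ \pi(A \times \mathbb{R}^d) = \mu(A),\ \pi(\mathbb{R}^d \times B) = \nu(B)\}$
• Wasserstein distance
For $p \ge 1$,
$W_p(\mu, \nu) := \left( \min_{\pi \in \Pi(\mu,\nu)} \int_{\mathcal{X} \times \mathcal{X}} \|x - y\|^p \, d\pi(x, y) \right)^{1/p}$, where $\mathcal{X} = \mathbb{R}^d$
Proposition
$W_p$ defines a distance between probability measures on $\mathbb{R}^d$ (with finite $p$-th moment).
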
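To make the definition concrete, here is a minimal sketch for the special case of two uniform empirical measures with the same number of atoms. In that case an optimal coupling can be taken to be a permutation, so a brute-force search over matchings evaluates $W_p$ exactly. The function name `wasserstein_p_bruteforce` is made up for this illustration, and the enumeration is only feasible for very small sample sizes.

```python
import itertools
import math


def wasserstein_p_bruteforce(xs, ys, p=2):
    """W_p between two uniform empirical measures with the same number of atoms.

    Assumption: both measures put mass 1/n on each of their n points, so an
    optimal coupling is a permutation and we can enumerate all n! matchings.
    Intended for tiny n only.
    """
    n = len(xs)
    assert n == len(ys) and n > 0
    best = math.inf
    for perm in itertools.permutations(range(n)):
        # Transport cost of matching x_i with y_perm[i], each pair carrying mass 1/n.
        cost = sum(math.dist(xs[i], ys[j]) ** p for i, j in enumerate(perm)) / n
        best = min(best, cost)
    return best ** (1.0 / p)


if __name__ == "__main__":
    a = [(0.0, 0.0), (1.0, 0.0)]
    b = [(1.0, 0.0), (0.0, 0.0)]  # same points listed in another order
    print(wasserstein_p_bruteforce(a, b, p=1))
    print(wasserstein_p_bruteforce([(0.0, 0.0)], [(3.0, 4.0)], p=2))
```

The factorial cost of this search is one way to see why general-purpose optimal transport is expensive and why the one-dimensional shortcut used by the sliced distances is attractive.
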
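Wasserstein GAN
• Duality
$W_1(\mu, \nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x - y\| \, d\pi(x, y) = \sup_{f \in \mathrm{Lip}_1} \left( \int_{\mathbb{R}^d} f \, d\mu - \int_{\mathbb{R}^d} f \, d\nu \right)$
• Wasserstein GAN (Arjovsky et al., ICML 2017)
With $\mu$ the generated distribution and $\nu$ the data distribution, the generator and $f$ are trained so as to solve $\min W_1(\mu, \nu)$
• Issue: to estimate $W_1$ accurately, the discriminator $f$ has to be reasonably well trained
e.g. $f$ is updated 5 times for every update of the generator parameters
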
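As a sanity check of the duality, the simplest possible case can be worked out by hand. The choice of two Dirac measures $\mu = \delta_a$, $\nu = \delta_b$ with $a \neq b$ is an assumption made only for this illustration.

```latex
\begin{aligned}
&\mu = \delta_a,\ \nu = \delta_b
  \;\Rightarrow\; \Pi(\mu,\nu) = \{\delta_{(a,b)}\}, \qquad
  \inf_{\pi \in \Pi(\mu,\nu)} \int \|x - y\| \, d\pi(x,y) = \|a - b\|, \\
&f(x) = \Big\langle x, \tfrac{a-b}{\|a-b\|} \Big\rangle \in \mathrm{Lip}_1, \qquad
  \int f \, d\mu - \int f \, d\nu = f(a) - f(b) = \|a - b\|, \\
&\text{and for every } f \in \mathrm{Lip}_1:\quad f(a) - f(b) \le \|a - b\|, \\
&\text{so } \sup_{f \in \mathrm{Lip}_1} \Big( \int f \, d\mu - \int f \, d\nu \Big) = \|a - b\| = W_1(\mu,\nu).
\end{aligned}
```

Here the optimal $f$ is known in closed form. In WGAN it has to be approximated by a network, which is where the extra discriminator updates come from.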
Generative Modeling using the Sliced Wasserstein Distance
DESHPANDE, ZHANG, SCHWING @ CVPR 2018

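Sliced Wasserstein distance
Sliced Wasserstein (Rabin 2011, Bonneel 2015)
$SW_2(\mu, \nu)^2 := \int_{\mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu)^2 \, d\theta$
$P_\theta : \mathcal{X} = \mathbb{R}^d \to \mathbb{R}$, $P_\theta(x) = \langle x, \theta \rangle$, is the projection onto the line in direction $\theta \in \mathbb{S}^{d-1} = \{x \in \mathbb{R}^d \mid \|x\| = 1\}$
• Optimal transport in one dimension can be solved explicitly, so the computation is cheap
• The integral is evaluated by sampling direction vectors
• SW is also a distance (and is moreover equivalent to the Wasserstein distance)
• Application to modeling with Gaussian mixtures: Kolouri et al. 2018
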
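The two computational points above (closed-form 1D transport and Monte Carlo integration over directions) can be sketched as follows. This is an illustrative standard-library implementation rather than the code of any of the cited papers. It assumes both sample sets have the same size and that $d\theta$ is the normalized uniform measure on the sphere. The names `sliced_w2`, `n_dirs` and `seed` are invented for the example.

```python
import math
import random


def random_direction(d, rng):
    """Draw theta uniformly on the unit sphere S^{d-1} by normalizing a Gaussian."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def project(points, theta):
    """P_theta(x) = <x, theta> for every sample x."""
    return [sum(xi * ti for xi, ti in zip(x, theta)) for x in points]


def w2_squared_1d(us, vs):
    """Squared W_2 between two 1D empirical measures of equal size.

    In one dimension the optimal coupling matches sorted samples in order.
    """
    assert len(us) == len(vs)
    return sum((u - v) ** 2 for u, v in zip(sorted(us), sorted(vs))) / len(us)


def sliced_w2(xs, ys, n_dirs=200, seed=0):
    """Monte Carlo estimate of SW_2: average W_2^2 over random directions."""
    rng = random.Random(seed)
    d = len(xs[0])
    total = 0.0
    for _ in range(n_dirs):
        theta = random_direction(d, rng)
        total += w2_squared_1d(project(xs, theta), project(ys, theta))
    return math.sqrt(total / n_dirs)


if __name__ == "__main__":
    rng = random.Random(1)
    xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(256)]
    ys = [(rng.gauss(3, 1), rng.gauss(0, 1)) for _ in range(256)]
    print(sliced_w2(xs, ys))
```

Each direction costs one projection and two sorts, so the estimate scales as $O(L\, n \log n)$ for $L$ directions and $n$ samples, with no discriminator involved.
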
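Sliced Wasserstein Generator
Deshpande et al., CVPR 2018
• A generative model that uses the sliced Wasserstein distance as its loss function
• No discriminator has to be trained in order to estimate the distance
• Around 10,000 sampled direction vectors are used (on MNIST)
• A generator update reportedly takes about 1.5 to 2 times as long
(since there is no discriminator, it is still faster than WGAN)
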
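SWG: experimental results
• The value the loss converges to is inversely proportional to the batch size (a statistical property of SW, discussed later)
• Sample quality and diversity are sufficient even with a batch size of 128
• Training is stable regardless of the network architecture
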
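SWG: the high-dimensional case
Issue: the higher the dimension, the more direction vectors are needed to approximate SW
→ we want to choose direction vectors that are as "informative" as possible
The discriminator is brought back, in the role of a map into an "informative low-dimensional space"
[Figure: training diagram. Labels: discriminator $f_{\theta'}$, generated distribution $G_\theta(P_z)$, data distribution $P_d$, sampled sets $\mathcal{D}$ and $\mathcal{F}$, features $f_{\theta'}(\mathcal{D})$ and $f_{\theta'}(\mathcal{F})$, SW loss, discriminator loss]
Heuristic: a space in which discrimination is easy = a space in which SW is easy to estimate
See the paper for generated samples
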
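Statistical properties of (sliced) Wasserstein
Estimating the Wasserstein distance becomes less sample-efficient as the dimension grows
• Weak convergence of the empirical distribution
For the empirical distribution $\mu_n$ of a probability measure $\mu \in \mathcal{P}_p(\mathbb{R}^d)$, $W_p(\mu_n, \mu) \to 0$ a.s.
However, the rate of convergence is (for $\mu$ absolutely continuous and $d > 2p$)
$\mathbb{E}[W_p(\mu_n, \mu)] \simeq n^{-1/d}$
• Sample efficiency of sliced Wasserstein (Nadjahi et al. 2020, Lin et al. 2020)
(under suitable conditions)
$\mathbb{E}[SW_p(\mu_n, \mu)] \simeq n^{-1}$
[Figure: experimental results with SWD (Deshpande 2018)]
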
Max-Sliced Wasserstein Distance and its use for GANs
DESHPANDE, HU, SUN, PYRROS, SIDDIQUI @ CVPR 2019

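Estimation efficiency of SW
• Besides the sample efficiency of sliced Wasserstein, the sample efficiency with respect to direction vectors is examined
→ using only the important directions looks like the better choice
Toy experiment: $\mu = \mathcal{N}(0, I)$ is estimated by $\nu = \mathcal{N}(\beta e, I)$ with the update
$\beta \leftarrow \beta - \alpha \nabla_\beta SW_2(\mu, \nu)$
max-$W$: use $e$ as the direction vector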
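
The toy problem above can be worked out in closed form, which shows why a single well-chosen direction helps. The derivation below assumes $\|e\| = 1$ and takes $d\theta$ to be the normalized uniform measure on $\mathbb{S}^{d-1}$. Neither is stated explicitly on the slide.

```latex
\begin{aligned}
P_\theta \# \mu &= \mathcal{N}(0, 1), \qquad
P_\theta \# \nu = \mathcal{N}(\beta \langle e, \theta \rangle, 1), \\
W_2(P_\theta \# \mu, P_\theta \# \nu) &= |\beta| \, |\langle e, \theta \rangle|, \\
SW_2(\mu, \nu)^2 &= \beta^2 \int_{\mathbb{S}^{d-1}} \langle e, \theta \rangle^2 \, d\theta = \frac{\beta^2}{d}, \\
\max_{\theta \in \mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu) &= |\beta|
  \quad \text{(attained at } \theta = \pm e \text{)}.
\end{aligned}
```

A random direction has $|\langle e, \theta \rangle|$ of order $1/\sqrt{d}$, so most sampled slices carry little signal about $\beta$ in high dimension, whereas the direction $e$ carries all of it.
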
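Max-sliced Wasserstein distance
• max-sliced Wasserstein
$\mathrm{maxSW}_2(\mu, \nu) := \max_{\theta \in \mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu)$
It defines a distance between distributions (equivalent to the Wasserstein distance)
• Nearly the same sample efficiency as sliced Wasserstein
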
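To get a feel for the definition, the maximum over directions can be approximated from below by random search. This is a deliberately naive stand-in chosen for illustration, not the discriminator-based procedure used in the Max-sliced GAN paper. It assumes equal-size sample sets, and the name `max_sliced_w2_random_search` is hypothetical.

```python
import math
import random


def w2_1d(us, vs):
    """W_2 between two 1D empirical measures of equal size (sort and match)."""
    assert len(us) == len(vs)
    sq = sum((u - v) ** 2 for u, v in zip(sorted(us), sorted(vs))) / len(us)
    return math.sqrt(sq)


def max_sliced_w2_random_search(xs, ys, n_dirs=500, seed=0):
    """Lower bound on maxSW_2: the best W_2 over randomly drawn unit directions.

    Returns (value, direction). Random search is an illustrative assumption;
    any direction gives a valid lower bound on the true maximum.
    """
    rng = random.Random(seed)
    d = len(xs[0])
    best_val, best_theta = -1.0, None
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm == 0.0:
            continue  # degenerate draw, skip it
        theta = [c / norm for c in v]
        px = [sum(a * t for a, t in zip(x, theta)) for x in xs]
        py = [sum(a * t for a, t in zip(y, theta)) for y in ys]
        val = w2_1d(px, py)
        if val > best_val:
            best_val, best_theta = val, theta
    return best_val, best_theta


if __name__ == "__main__":
    xs = [(0.0, 0.0), (1.0, 0.0)]
    ys = [(0.0, 5.0), (1.0, 5.0)]  # xs shifted by (0, 5)
    value, theta = max_sliced_w2_random_search(xs, ys)
    print(value, theta)
```

Random search degrades quickly as the dimension grows, which is the practical motivation for learning the direction instead.
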
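Max-sliced GAN
• How is the max computed?
Same idea as for the Sliced Wasserstein Generator: feature map + a good direction vector = discriminator
Annotations on the objective: it involves the parameters of the feature map, which makes the optimization somewhat difficult, so a surrogate model is introduced
[Equation: example surrogate objective; see the paper]
See the paper for generated samples
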
Generalized Sliced Wasserstein Distance
KOLOURI, NADJAHI, ŞIMŞEKLI, BADEAU, ROHDE @ NEURIPS 2019

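Radon transform and sliced Wasserstein
• Radon transform (Radon, 1917)
$I \in L^1(\mathbb{R}^d) = \{ I : \mathbb{R}^d \to \mathbb{R} \mid \int_{\mathbb{R}^d} |I(x)| \, dx < \infty \}$, $(t, \theta) \in \mathbb{R} \times \mathbb{S}^{d-1}$
$I \mapsto \mathcal{R}I(t, \theta) := \int_{\mathbb{R}^d} I(x) \, \delta(t - \langle x, \theta \rangle) \, dx$
Note: used in tomography, e.g. CT scans
With this, for $\mu, \nu$ with densities $d\mu = I_\mu(x)\,dx$, $d\nu = I_\nu(x)\,dx$, the slice $\mathcal{R}I_\mu(\cdot, \theta)$ is the density of $P_\theta \# \mu$, so we can write
$SW_p^p(\mu, \nu) = \int_{\mathbb{S}^{d-1}} W_p^p(\mathcal{R}I_\mu(\cdot, \theta), \mathcal{R}I_\nu(\cdot, \theta)) \, d\theta$
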
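Generalized Radon transform
• Generalized Radon transform (Beylkin, 1984)
$\mathcal{G}I(t, \theta) = \int_{\mathbb{R}^d} I(x) \, \delta(t - g(x, \theta)) \, dx$
$g : \mathbb{R}^d \times (\mathbb{R}^n \setminus \{0\}) \to \mathbb{R}$ is a defining function that satisfies several conditions
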
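Generalized (max-)sliced Wasserstein
• Generalized (max-)sliced Wasserstein distance
$GSW_p^p(\mu, \nu) := \int_{\Omega_\theta} W_p^p(\mathcal{G}I_\mu(\cdot, \theta), \mathcal{G}I_\nu(\cdot, \theta)) \, d\theta$
$\max GSW_p(\mu, \nu) := \max_{\theta \in \Omega_\theta} W_p(\mathcal{G}I_\mu(\cdot, \theta), \mathcal{G}I_\nu(\cdot, \theta))$
• Proposition
When $\mathcal{G}$ is injective, $GSW$ and $\max GSW$ define distances between probability distributions
Note: choices of $g$ such as circular and polynomial (homogeneous, odd degree only) are known to give an injective transform
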
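For empirical measures, taking the generalized slice $\mathcal{G}I_\mu(\cdot, \theta)$ amounts to pushing each sample through $x \mapsto g(x, \theta)$, so GSW can be estimated in the same way as SW with the inner product replaced by $g$. The sketch below assumes the circular defining function $g(x, \theta) = \|x - r\theta\|_2$ with $\Omega_\theta = \mathbb{S}^{d-1}$ sampled uniformly. The radius `r`, the function names and the sampling scheme are choices made for this illustration.

```python
import math
import random


def circular_g(x, theta, r=1.0):
    """Circular defining function g(x, theta) = ||x - r * theta||_2 (r is an assumed radius)."""
    return math.sqrt(sum((xi - r * ti) ** 2 for xi, ti in zip(x, theta)))


def unit_direction(d, rng):
    """Uniform direction on S^{d-1} obtained by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def wpp_1d(us, vs, p):
    """W_p^p between two 1D empirical measures of equal size (sort and match)."""
    assert len(us) == len(vs)
    return sum(abs(u - v) ** p for u, v in zip(sorted(us), sorted(vs))) / len(us)


def generalized_sliced_wp(xs, ys, g=circular_g, p=2, n_dirs=200, seed=0):
    """Monte Carlo estimate of GSW_p with slices x -> g(x, theta)."""
    rng = random.Random(seed)
    d = len(xs[0])
    total = 0.0
    for _ in range(n_dirs):
        theta = unit_direction(d, rng)
        sliced_x = [g(x, theta) for x in xs]
        sliced_y = [g(y, theta) for y in ys]
        total += wpp_1d(sliced_x, sliced_y, p)
    return (total / n_dirs) ** (1.0 / p)


if __name__ == "__main__":
    xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    ys = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]
    print(generalized_sliced_wp(xs, ys))
```

Passing `g=lambda x, theta: sum(a * t for a, t in zip(x, theta))` gives back the ordinary linear slices, which makes it easy to compare the two on the same data.
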
Algorithm for computing (max-)GSW
Note that $\theta$ is optimized directly here
(earlier it was folded into the discriminator)

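Experimental results
• Compared with SW on toy examples
• More flexible projections can be computed, so the distributions are matched more efficiently
• Applied to the Sliced Wasserstein Auto-Encoder (Kolouri et al. 2019)
• The setting is experimental rather than practical (figure on the right)
• The combination with GANs was not tried
• It is unclear whether $g$ can be built from a neural network
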
Augmented Sliced Wasserstein Distance
CHEN, YANG, LI @ ICLR 2021, rejected (scores 6, 7, 4)

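Spatial Radon transform
• Spatial Radon transform
$(t, \theta) \in \mathbb{R} \times \mathbb{S}^{d_\theta - 1}$, $g : \mathbb{R}^d \to \mathbb{R}^{d_\theta}$
$\mathcal{H}I(t, \theta; g) = \int_{\mathbb{R}^d} I(x) \, \delta(t - \langle g(x), \theta \rangle) \, dx = \mathcal{R}(g_* I)(t, \theta)$, where $g_* I$ is the pushforward of $I$ by $g$
Note: this includes GRTs defined by polynomials
• Proposition
$g$ injective ⇔ $\mathcal{H}$ injective
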
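Augmented Sliced Wasserstein Distance
• Augmented (max-)sliced Wasserstein distance
$ASW_p^p(\mu, \nu) := \int_{\mathbb{S}^{d_\theta - 1}} W_p^p(\mathcal{H}I_\mu(\cdot, \theta; g), \mathcal{H}I_\nu(\cdot, \theta; g)) \, d\theta$
$\max ASW_p(\mu, \nu) := \max_{\theta \in \mathbb{S}^{d_\theta - 1}} W_p(\mathcal{H}I_\mu(\cdot, \theta; g), \mathcal{H}I_\nu(\cdot, \theta; g))$
Since $g$ only has to be injective, it can be represented by a neural network: $g(x) = [x, \phi_{NN}(x)]$
Objective used to optimize towards a good $g$: [Equation: see the paper]
In the experiments: one layer, ReLU, $d = d_\theta$
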
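The construction $g(x) = [x, \phi_{NN}(x)]$ can be sketched as follows: each sample is augmented with the output of a small ReLU layer and ordinary linear slicing is applied in the augmented space. In this illustration the layer has fixed random weights and is not trained, so it only demonstrates the mechanics, not the learned $g$ of the paper. The names `make_relu_layer` and `augmented_sliced_wp` and the choice $d_\theta = d + d_{\text{out}}$ are assumptions.

```python
import math
import random


def make_relu_layer(d_in, d_out, rng):
    """Hypothetical one-layer map phi(x) = relu(W x + b) with fixed random weights."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
    bias = [rng.gauss(0.0, 1.0) for _ in range(d_out)]

    def phi(x):
        return [
            max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)
        ]

    return phi


def augment(x, phi):
    """g(x) = [x, phi(x)]; injective for any phi because x itself is kept."""
    return list(x) + phi(x)


def augmented_sliced_wp(xs, ys, phi, p=2, n_dirs=200, seed=0):
    """Monte Carlo estimate of ASW_p: linear slices of the augmented samples."""
    rng = random.Random(seed)
    gx = [augment(x, phi) for x in xs]
    gy = [augment(y, phi) for y in ys]
    assert len(gx) == len(gy)
    d_theta = len(gx[0])
    total = 0.0
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(d_theta)]
        norm = math.sqrt(sum(c * c for c in v)) or 1.0  # guard against a zero draw
        theta = [c / norm for c in v]
        px = sorted(sum(a * t for a, t in zip(z, theta)) for z in gx)
        py = sorted(sum(a * t for a, t in zip(z, theta)) for z in gy)
        # 1D W_p^p: match sorted projections in order.
        total += sum(abs(u - w) ** p for u, w in zip(px, py)) / len(px)
    return (total / n_dirs) ** (1.0 / p)


if __name__ == "__main__":
    phi = make_relu_layer(2, 3, random.Random(1))
    xs = [(0.0, 0.0), (1.0, 0.0)]
    ys = [(0.0, 5.0), (1.0, 5.0)]
    print(augmented_sliced_wp(xs, ys, phi))
```

Keeping the identity block in $g$ is what guarantees injectivity without any constraint on the network, which is the point made on the slide.
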
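Experimental results
• Toy problem (same as in Kolouri's GSW)
• Starting from a standard normal distribution, move towards other distributions by gradient descent
[Figure annotation: the actual $W_2$ is the smallest]
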
Application to GANs
• CIFAR10 (64×64), CELEBA (64×64)
• The model and loss design appear to be the same as in Deshpande 2018 (?)
[Table annotations: number of sampled direction vectors; Distributed SWD (NeurIPS 2019)]

Papers covered
1. Deshpande, Zhang, Schwing, “Generative Modeling Using the Sliced Wasserstein Distance”, CVPR 2018.
2. Deshpande, Hu, Sun, Pyrros, Siddiqui, Koyejo, Zhao, Forsyth, Schwing, “Max-Sliced Wasserstein Distance and its use for GANs”, CVPR 2019.
3. Kolouri, Nadjahi, Şimşekli, Badeau, Rohde, “Generalized Sliced Wasserstein Distances”, NeurIPS 2019.
4. Chen, Yang, Li, “Augmented Sliced Wasserstein Distances”, arXiv:2006.08812, 2020.

References
1. Arjovsky, Chintala, Bottou, “Wasserstein Generative Adversarial Networks”, ICML 2017.
2. Rabin, Peyré, Delon, Bernot, “Wasserstein Barycenter and its Application to Texture Mixing”, SSVM’11, 435-446, 2011.
3. Bonneel, Rabin, Peyré, Pfister, “Sliced and Radon Wasserstein Barycenters of Measures”, Journal of Mathematical Imaging and Vision, Springer Verlag, 1 (51), 22-45, 2015.
4. Kolouri, Rohde, Hoffman, “Sliced Wasserstein Distance for Learning Gaussian Mixture Models”, CVPR 2018.
5. Kolouri, Pope, Martin, Rohde, “Sliced Wasserstein Auto-Encoders”, ICLR 2019.
6. Nadjahi, Durmus, Chizat, Kolouri, Shahrampour, Şimşekli, “Statistical and Topological Properties of Sliced Probability Divergences”, arXiv:2003.05783, 2020.
7. Lin, Zheng, Chen, Cuturi, Jordan, “On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification”, arXiv:2006.12301, 2020.

Sliced Wasserstein距離と生成モデル

  • 2. 目次 1. Wasserstein距離とWGAN 2. Sliced Wasserstein 距離と生成モデルSWG 3. Max-sliced Wasserstein距離 1. Sliced Wassersteinの統計的性質 2. 生成モデルへの応用 4. Generalized sliced Wasserstein距離 1. Radon変換 2. SWの一般化 5. Augmented sliced Wasserstein距離 2
  • 3. Wasserstein距離 𝜇, 𝜈 : ℝ𝑑 上の確率測度 (e.g. d𝜇 = 𝑓d𝑥, 𝑓は確率密度関数) Π 𝜇, 𝜈 ≔ {𝜋: ℝ𝑑 × ℝ𝑑 上の測度, 𝜋 𝐴 × ℝ𝑑 = 𝜇 𝐴 , 𝜋 ℝ𝑑 × 𝐵 = 𝜈 𝐵 }  Wasserstein距離 𝑝 ≥ 1に対し 𝑊 𝑝 𝜇, 𝜈 ≔ min 𝜋∈Π(𝜇,𝜈) 𝒳×𝒳 𝑥 − 𝑦 𝑝 d𝜋 𝑥, 𝑦 1/𝑝 命題 𝑊 𝑝 は ℝ𝑑 上の確率測度間の距離を定める. 3
  • 4. Wasserstein GAN  双対性 𝑊1 𝜇, 𝜈 = inf 𝜋∈Π(𝜇,𝜈) ℝ𝑑 𝑥 − 𝑦 d𝜋 = sup 𝑓∈𝐿𝑖𝑝1 ℝ𝑑 𝑓d𝜇 − ℝ𝑑 𝑓d𝜈  Wasserstein GAN (Arjovsky et al. ICML 2017) 𝜇:生成分布、𝜈: データ分布として、min 𝑊1 を解くように生成器と 𝑓 を学習する  課題: 𝑊1を正確に推定するために識別器 𝑓 がある程度学習できてないといけない e.g. 生成器のパラメータ更新1回に対して 𝑓 を5回更新 4
  • 5. Generative Modeling using the Sliced Wasserstein Distance DESHPANDE, ZHANG, SCHWING @ CVPR 2018 5
  • 6. Sliced Wasserstein distance Sliced Wasserstein (Rabin 2011, Bonneel 2015) SW2 𝜇, 𝜈 2 ≔ 𝕊𝑑−1 𝑊2 𝑃𝜃 #𝜇, 𝑃𝜃 #𝜈 2 d𝜃 𝑃𝜃 ∶ 𝒳 = ℝ𝑑 → ℝ は𝜃 ∈ 𝕊𝑑−1 = {𝑥 ∈ ℝ𝑑 ∣ 𝑥 = 1} 方向の直線への射影  1次元でのOTが陽に解けるので計算が楽  積分は方向ベクトルのサンプリングで行う  SWも距離(しかもWasserstein距離と同値)  混合ガウス分布によるモデリングでの応用 Kolouri, et al. 2018 6
  • 7. Sliced Wasserstein Generator Deshpande, et al. CVPR 2018  Sliced Wasserstein距離を損失関数に用いた生成モデル  距離を推定するために識別器を学習させずに済む  方向ベクトルのサンプル数は10000くらい(MNISTで)  生成器の更新が1.5 ~ 2倍くらいの時間になるらしい (識別機はないのでWGANより高速) 7
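  • 9. SWG: the high-dimensional case. Issue: the higher the dimension, the more direction vectors are needed to approximate SW, so we want to choose direction vectors that are as "informative" as possible. The discriminator is brought back in the role of a map into an "informative low-dimensional space". [Diagram: samples $\mathcal{F}$ from the generated distribution $G_\theta(P_z)$ and samples $\mathcal{D}$ from the data distribution $P_d$ are passed through the discriminator $f_{\theta'}$; the SW loss is computed between $f_{\theta'}(\mathcal{D})$ and $f_{\theta'}(\mathcal{F})$, and $f_{\theta'}$ is trained with a discriminator loss.] Heuristic: a space in which discrimination is easy = a space in which SW is easy to estimate. See the paper for the generated samples.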
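  • 10. Statistical properties of (sliced) Wasserstein. Estimating the Wasserstein distance becomes less sample-efficient as the dimension grows. Weak convergence of the empirical distribution: for the empirical distribution $\mu_n$ of a probability measure $\mu \in \mathcal{P}_p(\mathbb{R}^d)$, $W_p(\mu_n, \mu) \to 0$ a.s. However, the rate of convergence (for $\mu$ absolutely continuous and $d > 2p$) is $\mathbb{E}[W_p(\mu_n, \mu)] \simeq n^{-1/d}$. Sample efficiency of sliced Wasserstein (Nadjahi et al. 2020; Lin et al. 2020): under suitable conditions, $\mathbb{E}[SW_p(\mu_n, \mu)] \simeq n^{-1}$. [Figure: experimental results with SWD (Deshpande 2018).]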
  • 11. Max-Sliced Wasserstein Distance and its use for GANs. Deshpande, Hu, Sun, Pyrros, Siddiqui @ CVPR 2019
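  • 12. Estimation efficiency of SW. In addition to the sample efficiency of sliced Wasserstein, the efficiency with respect to the number of sampled direction vectors is examined → using only the important directions appears to work better. Experiment: $\mu = \mathcal{N}(0, I)$ is estimated by $\nu = \mathcal{N}(\beta e, I)$ with the update $\beta \leftarrow \beta - \alpha \nabla_\beta SW_2(\mu, \nu)$. max-$W$: use $e$ as the direction vector.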
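As a worked check of why a single well-chosen direction is enough in this toy problem, the sliced quantities can be computed in closed form. The derivation below assumes $\|e\| = 1$ and that $d\theta$ is the uniform probability measure on $\mathbb{S}^{d-1}$; the slides do not state the normalization, so this is an assumption.

```latex
\begin{align*}
P_\theta \# \mu &= \mathcal{N}(0, 1), \qquad
P_\theta \# \nu = \mathcal{N}(\beta \langle e, \theta \rangle, 1), \\
W_2(P_\theta \# \mu, P_\theta \# \nu) &= |\beta| \, |\langle e, \theta \rangle|, \\
SW_2(\mu, \nu)^2 &= \beta^2 \int_{\mathbb{S}^{d-1}} \langle e, \theta \rangle^2 \, d\theta = \frac{\beta^2}{d}, \\
\max_{\theta \in \mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu) &= |\beta| \quad \text{(attained at } \theta = \pm e \text{)}.
\end{align*}
```

Under these assumptions a random direction contributes only $\beta^2 \langle e, \theta \rangle^2$, which is $\beta^2 / d$ on average, whereas the direction $e$ alone recovers the full distance $|\beta|$.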
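  • 13. Max-sliced Wasserstein distance. Max-sliced Wasserstein: $\max\text{-}SW_2(\mu, \nu) := \max_{\theta \in \mathbb{S}^{d-1}} W_2(P_\theta \# \mu, P_\theta \# \nu)$. It gives a distance between distributions (equivalent to the Wasserstein distance). Its sample efficiency is almost the same as that of sliced Wasserstein.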
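One way to approximate the maximum over directions for two finite samples is projected gradient ascent on the sphere. The sketch below is an illustration only and is not the procedure of the max-sliced GAN paper; `max_sliced_w2`, the step size, and the number of steps are hypothetical choices, and the result is in general a local maximum, i.e. a lower bound on the true max.

```python
import numpy as np

def w2_squared_1d(a, b):
    """Squared W_2 between two equal-size 1D samples (match order statistics)."""
    return np.mean((np.sort(a) - np.sort(b)) ** 2)

def max_sliced_w2(X, Y, n_steps=200, lr=1.0, seed=0):
    """Projected gradient ascent on theta for W_2^2 of the projected samples."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=X.shape[1])
    theta /= np.linalg.norm(theta)
    for _ in range(n_steps):
        # Optimal 1D matching for the current direction.
        ix, iy = np.argsort(X @ theta), np.argsort(Y @ theta)
        diff = X[ix] - Y[iy]                      # matched pairs, shape (n, d)
        # Gradient of mean((diff @ theta)**2) with the matching held fixed.
        grad = 2.0 * diff.T @ (diff @ theta) / len(X)
        theta = theta + lr * grad
        theta /= np.linalg.norm(theta)            # project back onto the sphere
    return np.sqrt(w2_squared_1d(X @ theta, Y @ theta)), theta

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 5))
e = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
value, theta = max_sliced_w2(X, X + 2.0 * e)
print(value, theta)
```

For a pure translation of the sample, as in the usage lines, every matched pair differs by the same vector, so the ascent direction always points along $\pm e$.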
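  • 14. Max-sliced GAN. How is the max computed? The same idea as for the Sliced Wasserstein Generator: feature map + a good direction vector = discriminator. [Slide annotations: "parameters of the feature map"; optimizing them directly is a bit difficult, so a surrogate model is introduced (the example on the slide is not preserved in the transcript).]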
  • 16. Generalized Sliced Wasserstein Distance. Kolouri, Nadjahi, Şimşekli, Badeau, Rohde @ NeurIPS 2019
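  • 17. The Radon transform and sliced Wasserstein. Radon transform (Radon, 1917): for $I \in L^1(\mathbb{R}^d) = \{ I : \mathbb{R}^d \to \mathbb{R} \mid \int_{\mathbb{R}^d} |I(x)| \, dx < \infty \}$ and $(t, \theta) \in \mathbb{R} \times \mathbb{S}^{d-1}$, $I \mapsto \mathcal{R}I(t, \theta) := \int_{\mathbb{R}^d} I(x) \, \delta(t - \langle x, \theta \rangle) \, dx$. Note: this is used in tomographic imaging such as CT scans. With it, for $\mu, \nu$ having densities $d\mu = I_\mu(x) \, dx$ and $d\nu = I_\nu(x) \, dx$, we can write $SW_p^p(\mu, \nu) = \int_{\mathbb{S}^{d-1}} W_p^p(\mathcal{R}I_\mu(\cdot, \theta), \mathcal{R}I_\nu(\cdot, \theta)) \, d\theta$.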
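For a probability density $I$, the slice $t \mapsto \mathcal{R}I(t, \theta)$ is the density of $\langle x, \theta \rangle$ when $x \sim I$, so it can be estimated from samples with a histogram of the projected points. The sketch below illustrates this; the helper name `radon_slice`, the bin count, and the range are assumptions made for the example.

```python
import numpy as np

def radon_slice(X, theta, bins=40, t_range=(-4.0, 4.0)):
    """Histogram estimate of t -> R I(t, theta) from samples X ~ I of shape (n, d)."""
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)          # direction on the unit sphere
    density, edges = np.histogram(X @ theta, bins=bins, range=t_range, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])       # bin midpoints on the t axis
    return centers, density

# For a standard Gaussian in R^3, every slice should look like the N(0, 1) density.
rng = np.random.default_rng(0)
X = rng.normal(size=(200_000, 3))
t, est = radon_slice(X, [1.0, 2.0, 2.0])
exact = np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)
print(np.max(np.abs(est - exact)))
```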
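  • 18. Generalized Radon transform. Generalized Radon Transform (Beylkin, 1984): $\mathcal{G}I(t, \theta) = \int_{\mathbb{R}^d} I(x) \, \delta(t - g(x, \theta)) \, dx$, where $g : \mathbb{R}^d \times (\mathbb{R}^n \setminus \{0\}) \to \mathbb{R}$ is a defining function satisfying several conditions.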
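  • 19. Generalized (max-)sliced Wasserstein. Generalized (max-)sliced Wasserstein distance: $GSW_p^p(\mu, \nu) := \int_{\Omega_\theta} W_p^p(\mathcal{G}I_\mu(\cdot, \theta), \mathcal{G}I_\nu(\cdot, \theta)) \, d\theta$ and $\max\text{-}GSW_p(\mu, \nu) := \max_{\theta \in \Omega_\theta} W_p(\mathcal{G}I_\mu(\cdot, \theta), \mathcal{G}I_\nu(\cdot, \theta))$. Proposition: when $\mathcal{G}$ is injective, $GSW$ and $\max\text{-}GSW$ give distances between probability distributions. Note: choices of $g$ such as circular and polynomial (homogeneous, odd degree only) are known to yield injectivity.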
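On samples, the generalized distance amounts to replacing the linear projection $\langle x, \theta \rangle$ with $g(x, \theta)$ before the 1D sort. The sketch below makes that substitution explicit. It assumes the circular defining function has the form $g(x, \theta) = \|x - r\theta\|_2$ with $\theta \in \mathbb{S}^{d-1}$ and a fixed radius $r$; the names `linear_g`, `circular_g`, `generalized_sw` and the value of `r` are illustrative choices, not taken from the slides.

```python
import numpy as np

def linear_g(X, thetas):
    """Ordinary slicing: g(x, theta) = <x, theta>. Returns shape (n, L)."""
    return X @ thetas.T

def circular_g(X, thetas, r=1.0):
    """Assumed circular defining function: g(x, theta) = ||x - r * theta||_2."""
    return np.linalg.norm(X[:, None, :] - r * thetas[None, :, :], axis=2)

def generalized_sw(X, Y, g, p=2, n_projections=500, seed=0):
    """Monte Carlo estimate of GSW_p for equal-size samples X, Y of shape (n, d)."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_projections, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    # Generalized projections, then the usual sorted 1D matching per theta.
    Xg = np.sort(g(X, thetas), axis=0)
    Yg = np.sort(g(Y, thetas), axis=0)
    return np.mean(np.abs(Xg - Yg) ** p) ** (1.0 / p)

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
Y = rng.normal(size=(200, 2)) + 3.0
print(generalized_sw(X, Y, linear_g), generalized_sw(X, Y, circular_g))
```

Passing `linear_g` recovers the plain sliced Wasserstein estimate, which makes it easy to compare the two kinds of slices on the same data.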
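  • 21. Experimental results. Comparison with SW on a toy example: because more flexible projections can be computed, distributions can be matched more efficiently. Applied to the Sliced Wasserstein Auto-Encoder (Kolouri et al. 2019): the setting is experimental rather than practical (figure on the right). The combination with GANs was not tried. It is unclear whether $g$ can be constructed with a neural network.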
  • 22. Augmented Sliced Wasserstein Distance. Chen, Yang, Li @ ICLR 2021, rejected (6, 7, 4)
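  • 23. Spatial Radon Transform. For $(t, \theta) \in \mathbb{R} \times \mathbb{S}^{d_\theta - 1}$ and $g : \mathbb{R}^d \to \mathbb{R}^{d_\theta}$, $\mathcal{H}I(t, \theta; g) = \int_{\mathbb{R}^d} I(x) \, \delta(t - \langle g(x), \theta \rangle) \, dx = \mathcal{R}(g_{*} I)(t, \theta)$. Note: this includes the GRT with polynomial defining functions. Proposition: $g$ is injective ⇔ $\mathcal{H}$ is injective.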
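  • 24. Augmented Sliced Wasserstein Distance. Augmented (max-)sliced Wasserstein distance: $ASW_p^p(\mu, \nu) := \int_{\mathbb{S}^{d_\theta - 1}} W_p^p(\mathcal{H}I_\mu(\cdot, \theta; g), \mathcal{H}I_\nu(\cdot, \theta; g)) \, d\theta$ and $\max\text{-}ASW_p(\mu, \nu) := \max_{\theta \in \mathbb{S}^{d_\theta - 1}} W_p(\mathcal{H}I_\mu(\cdot, \theta; g), \mathcal{H}I_\nu(\cdot, \theta; g))$. Since $g$ only needs to be injective, it can also be represented by a neural network: $g(x) = [x, \phi_{NN}(x)]$. Objective function optimized to obtain a good $g$: (formula not preserved in the transcript). In the experiments: one layer, ReLU, $d = d_\theta$.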
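The construction $g(x) = [x, \phi_{NN}(x)]$ can be sketched as follows: concatenating $x$ itself makes $g$ injective whatever $\phi_{NN}$ is, and the augmented points are then sliced linearly as usual. In this sketch $\phi_{NN}$ is a single ReLU layer with fixed random weights `W` and `b`, which is an assumption for illustration; the optimization of $g$ mentioned on the slide is not reproduced, and the names `augment` and `augmented_sw` are hypothetical.

```python
import numpy as np

def augment(X, W, b):
    """g(x) = [x, relu(x W + b)]; keeping x as a coordinate block makes g injective."""
    return np.concatenate([X, np.maximum(X @ W + b, 0.0)], axis=1)

def augmented_sw(X, Y, W, b, p=2, n_projections=500, seed=0):
    """Monte Carlo estimate of ASW_p for equal-size samples X, Y of shape (n, d)."""
    rng = np.random.default_rng(seed)
    GX, GY = augment(X, W, b), augment(Y, W, b)
    # Directions live on the unit sphere of the augmented space.
    thetas = rng.normal(size=(n_projections, GX.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    # Linear slicing of the augmented samples, then sorted 1D matching.
    PX = np.sort(GX @ thetas.T, axis=0)
    PY = np.sort(GY @ thetas.T, axis=0)
    return np.mean(np.abs(PX - PY) ** p) ** (1.0 / p)

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))
Y = rng.normal(size=(200, 2)) + 2.0
W, b = rng.normal(size=(2, 2)), rng.normal(size=2)   # untrained, illustrative weights
print(augmented_sw(X, Y, W, b))
```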
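  • 25. Experimental results. Toy problem (the same as in Kolouri's GSW): starting from the standard normal distribution, move toward another distribution by gradient descent. [Figure annotation: the actual $W_2$ is smallest.]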
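  • 26. Application to GANs. CIFAR10 (64*64), CELEBA (64*64). The model and loss design appear to be the same as in Deshpande 2018 (not confirmed). [Results table labels: number of sampled direction vectors; Distributed SWD (NeurIPS 2019).]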
  • 27. Papers covered. 1. Deshpande, Zhang, Schwing, “Generative Modeling Using the Sliced Wasserstein Distance”, CVPR 2018. 2. Deshpande, Hu, Sun, Pyrros, Siddiqui, Koyejo, Zhao, Forsyth, Schwing, “Max-Sliced Wasserstein Distance and its use for GANs”, CVPR 2019. 3. Kolouri, Nadjahi, Şimşekli, Badeau, Rohde, “Generalized Sliced Wasserstein Distances”, NeurIPS 2019. 4. Chen, Yang, Li, “Augmented Sliced Wasserstein Distances”, arXiv:2006.08812, 2020.
  • 28. References. 1. Arjovsky, Chintala, Bottou, “Wasserstein Generative Adversarial Networks”, ICML 2017. 2. Rabin, Peyré, Delon, Bernot, “Wasserstein Barycenter and its Application to Texture Mixing”, SSVM’11, 435-446, 2011. 3. Bonneel, Rabin, Peyré, Pfister, “Sliced and Radon Wasserstein Barycenters of Measures”, Journal of Mathematical Imaging and Vision, Springer Verlag, 51 (1), 22-45, 2015. 4. Kolouri, Rohde, Hoffmann, “Sliced Wasserstein Distance for Learning Gaussian Mixture Models”, CVPR 2018. 5. Kolouri, Pope, Martin, Rohde, “Sliced Wasserstein Auto-Encoders”, ICLR 2019. 6. Nadjahi, Durmus, Chizat, Kolouri, Shahrampour, Şimşekli, “Statistical and Topological Properties of Sliced Probability Divergences”, arXiv:2003.05783, 2020. 7. Lin, Zheng, Chen, Cuturi, Jordan, “On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification”, arXiv:2006.12301, 2020.