Your Classifier is Secretly an Energy Based Model
and You Should Treat It Like One (ICLR 2020)
Will Grathwohl (1,2), Jackson Wang (1), Jörn Jacobsen (1), David Duvenaud (1),
Mohammad Norouzi (2), Kevin Swersky (2)
1. University of Toronto & Vector Institute
2. Google Research
Conference page: https://iclr.cc/virtual_2020/poster_Hkxzx0NtDB.html
ICLR 2020 reading group @ online. Presenter: Daigo Hirooka (@daigo_hirooka)

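Self-introduction: Daigo Hirooka (@daigo_hirooka)
● Machine learning engineer @ BrainPad
● Interests
○ Probabilistic inference, Bayesian NNs
○ During my master's: GANs, domain adaptation, etc.
● Other
○ 白金鉱業.FM (@shirokane_fm), available as a podcast
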
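Summary
Define an energy-based model (EBM) on top of a discriminative model
Learn the data-generating distribution alongside ordinary classification
= JEM: Joint Energy-based Model
● Applicable to (almost) any discriminative model
● Retains the benefits that come from generative models
○ Calibration of predicted probabilities
○ Out-of-distribution detection
○ Robustness to adversarial examples
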
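Why generative models?
● Benefits
○ Can extract the latent structure of high-dimensional data
○ Can make use of unlabeled data
○ Useful for semi-supervised learning, missing-value imputation, and uncertainty calibration
● Recent trends
○ Much of the research targets sample quality or likelihood
○ Comparatively little attention goes to downstream applications
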
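Hand-tailored solutions work/scale better
Where generative models could be used
● Out-of-distribution detection
● Robust classification
● Semi-supervised learning
In practice, a specialized approach is widely used for each of these instead
● Building a classifier dedicated to out-of-distribution detection
● Adversarial training
● Data augmentation + regularization
Why?
● Deep generative model architectures are far more fragmented than discriminative ones
● Deep generative models are not designed with discriminative tasks in mind
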
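EBM: Energy-based models
Express a probability density function using an energy function $E_\theta(x)$:
$$p_\theta(x) = \frac{\exp(-E_\theta(x))}{Z(\theta)}, \qquad Z(\theta) = \int \exp(-E_\theta(x))\,dx$$
● Example: normal distribution
○ Setting $E_\theta(x) = \frac{(x-\mu)^2}{2\sigma^2}$ gives $Z(\theta) = \sqrt{2\pi\sigma^2}$
● The energy function $E_\theta(x)$ can be modeled flexibly
● The partition function $Z(\theta)$ is intractable in most practical cases
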
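As a rough check of the normal-distribution example, the sketch below uses the energy E(x) = (x - mu)^2 / (2 sigma^2), approximates the partition function with a midpoint Riemann sum, and compares it with the closed form sqrt(2 pi sigma^2). The function names, grid width, and step count are illustrative assumptions and do not come from the slides; the numerical integration only works here because the toy energy is one-dimensional.

```python
import math

def energy(x, mu=0.0, sigma=1.0):
    """Energy of a 1-D normal distribution: E(x) = (x - mu)^2 / (2 sigma^2)."""
    return (x - mu) ** 2 / (2.0 * sigma ** 2)

def partition_function(mu=0.0, sigma=1.0, half_width=10.0, steps=20000):
    """Approximate Z = integral of exp(-E(x)) dx with a midpoint Riemann sum.

    The grid mu +/- half_width * sigma is an assumption that is adequate
    only because this toy energy decays quickly in one dimension.
    """
    lo = mu - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    total = sum(math.exp(-energy(lo + (i + 0.5) * dx, mu, sigma)) for i in range(steps))
    return total * dx

def density(x, mu=0.0, sigma=1.0):
    """Normalized density p(x) = exp(-E(x)) / Z."""
    return math.exp(-energy(x, mu, sigma)) / partition_function(mu, sigma)

z_numeric = partition_function(mu=1.0, sigma=2.0)
z_exact = math.sqrt(2.0 * math.pi * 2.0 ** 2)
print(z_numeric, z_exact)
```

For image-sized inputs no such grid exists, which is why the partition function is treated as intractable in the rest of the talk.
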
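Training EBM
Because the partition function cannot be computed, estimating the parameters by direct maximum likelihood is difficult
The mainstream approach is to learn from the gradient of the log-likelihood, which can be computed from the energy function alone
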
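Approximate likelihood gradient
Train the EBM so that it reproduces the data distribution:
$$\frac{\partial \log p_\theta(x)}{\partial \theta} = \mathbb{E}_{p_\theta(x')}\left[\frac{\partial E_\theta(x')}{\partial \theta}\right] - \frac{\partial E_\theta(x)}{\partial \theta}$$
● Second term: decrease the energy at the data = increase the density of the data
● First term: increase the energy of generated samples = decrease the density of generated samples
● The first term is an expectation under the model distribution → approximate it with samples generated by MCMC
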
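The two-term log-likelihood gradient, an expectation of dE/dtheta under the model minus dE/dtheta at the data, can be sketched on a toy problem. The example assumes a hypothetical 1-D energy E_theta(x) = (x - theta)^2 / 2, for which the model distribution is exactly N(theta, 1); exact sampling therefore stands in for the MCMC step. The data, learning rate, and batch size are made up for illustration.

```python
import random

def dE_dtheta(x, theta):
    """Gradient of the toy energy E_theta(x) = (x - theta)^2 / 2 w.r.t. theta."""
    return -(x - theta)

def log_likelihood_grad(data_batch, model_samples, theta):
    """Monte Carlo estimate of the batch-averaged log-likelihood gradient:
    E_model[dE/dtheta] - E_data[dE/dtheta]."""
    model_term = sum(dE_dtheta(s, theta) for s in model_samples) / len(model_samples)
    data_term = sum(dE_dtheta(x, theta) for x in data_batch) / len(data_batch)
    return model_term - data_term

rng = random.Random(0)
data = [rng.gauss(3.0, 1.0) for _ in range(2000)]  # toy data centred at 3.0
theta = 0.0
for step in range(500):
    batch = rng.sample(data, 64)
    # Stand-in for MCMC: for this toy energy the model is exactly N(theta, 1),
    # so model samples can be drawn directly.
    samples = [rng.gauss(theta, 1.0) for _ in range(64)]
    theta += 0.05 * log_likelihood_grad(batch, samples, theta)  # gradient ascent

print(theta)  # should end up close to the data mean
```

The data term pulls the energy down where the data lies and the model term pushes it up where the model currently samples, so the updates stop once the two sets of samples agree on average.
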
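Your classifier is secretly an EBM
A typical classification task
● Model the conditional distribution $p(y \mid x)$
● Model: $p_\theta(y \mid x) = \frac{\exp(f_\theta(x)[y])}{\sum_{y'} \exp(f_\theta(x)[y'])}$, where $f_\theta(x)[y]$ is the logit for class $y$
● Parameter estimation: maximize the likelihood
Define an EBM of the joint distribution from the outputs of the classifier:
$$p_\theta(x, y) = \frac{\exp(f_\theta(x)[y])}{Z(\theta)}, \qquad E_\theta(x, y) = -f_\theta(x)[y]$$
Marginalizing out $y$ also defines an EBM for $x$:
$$p_\theta(x) = \sum_y p_\theta(x, y) = \frac{\sum_y \exp(f_\theta(x)[y])}{Z(\theta)}, \qquad E_\theta(x) = -\log \sum_y \exp(f_\theta(x)[y])$$
The conditional distribution then follows naturally, because $Z(\theta)$ cancels:
$$p_\theta(y \mid x) = \frac{p_\theta(x, y)}{p_\theta(x)} = \frac{\exp(f_\theta(x)[y])}{\sum_{y'} \exp(f_\theta(x)[y'])}$$
Every one of these distributions is defined without adding any constraint to the classifier
we have found a Joint Energy Model inside of your classifier… a hidden JEM 😏
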
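The reinterpretation of logits as energies can be checked numerically. The sketch below takes a hypothetical logit vector f(x), defines E(x, y) = -f(x)[y] and E(x) = -LogSumExp_y f(x)[y], and confirms that the ratio exp(-E(x, y)) / exp(-E(x)) reproduces the usual softmax. The helper names and the logit values are assumptions for illustration.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def joint_energy(logits, y):
    """E(x, y) = -f(x)[y]."""
    return -logits[y]

def marginal_energy(logits):
    """E(x) = -LogSumExp_y f(x)[y]."""
    return -logsumexp(logits)

def softmax(logits):
    """Standard softmax over the logits."""
    lse = logsumexp(logits)
    return [math.exp(v - lse) for v in logits]

logits = [2.0, -1.0, 0.5]  # hypothetical classifier output f(x) for one input

# p(y|x) = exp(-E(x, y)) / exp(-E(x)); the unknown Z(theta) cancels in the ratio.
cond_from_energy = [
    math.exp(-joint_energy(logits, y) + marginal_energy(logits))
    for y in range(len(logits))
]
print(cond_from_energy)
print(softmax(logits))
```

Because only the ratio is needed for classification, the classifier's predictions are unchanged; the extra information lives in the overall scale of the logits, which softmax ignores.
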
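Optimization
Objective: $\log p_\theta(x, y) = \log p_\theta(x) + \log p_\theta(y \mid x)$
● First term: maximize the likelihood of $x$
○ Generate samples from the model distribution with stochastic gradient Langevin dynamics (SGLD, a kind of MCMC)
○ Use them to approximate the expectation under the model distribution and compute the gradient
● Second term: maximize the conditional likelihood
○ The usual classification loss
Training algorithm (figure from the paper), annotated with: the usual classification loss; SGLD sample generation and maximization of the likelihood of $x$
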
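A minimal sketch of Langevin-style sampling from the marginal energy E(x) = -LogSumExp_y f(x)[y] follows. It assumes a hypothetical two-class toy "classifier" on a scalar input with logits f(x)[y] = -(x - m_y)^2 / 2, so the energy gradient can be written by hand, and it uses the standard Langevin update x <- x - (step/2) dE/dx + N(0, step). The step size, chain length, and initialisation range are illustrative choices, not the settings used in the paper.

```python
import math
import random

MEANS = [-2.0, 2.0]  # hypothetical class centres of the toy "classifier"

def logits(x):
    """Toy logits f(x)[y] = -(x - m_y)^2 / 2 (an assumption for illustration)."""
    return [-(x - m) ** 2 / 2.0 for m in MEANS]

def energy_grad(x):
    """dE/dx for E(x) = -LogSumExp_y f(x)[y], computed analytically."""
    f = logits(x)
    mx = max(f)
    w = [math.exp(v - mx) for v in f]
    total = sum(w)
    # dE/dx = sum_y softmax_y(x) * (x - m_y)
    return sum(wi / total * (x - m) for wi, m in zip(w, MEANS))

def sgld_sample(rng, n_steps=200, step_size=0.1):
    """Run one Langevin chain from a random start and return its final state."""
    x = rng.uniform(-4.0, 4.0)
    for _ in range(n_steps):
        noise = rng.gauss(0.0, math.sqrt(step_size))
        x = x - 0.5 * step_size * energy_grad(x) + noise
    return x

rng = random.Random(0)
samples = [sgld_sample(rng) for _ in range(500)]
mean_abs = sum(abs(s) for s in samples) / len(samples)
print(mean_abs)  # samples should concentrate around the two class centres
```

In a JEM-style training step, samples like these would supply the model-expectation term of the log p(x) gradient, while the classification loss handles log p(y|x).
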
Experiments: Hybrid modeling
● Both classification accuracy and sample quality are high
○ Classifier: Wide-ResNet (with BatchNorm removed)
○ Data: CIFAR-10

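Experiments: Calibration
● In production, it matters that predicted probabilities (predictive uncertainty) agree with the actual accuracy
● ECE (Expected Calibration Error)
○ Measures how well predicted probabilities agree with accuracy
● The experiments confirm that JEM calibrates the predicted probabilities
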
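ECE can be sketched as follows: predictions are grouped into equal-width confidence bins, and the gap between accuracy and mean confidence in each bin is averaged with weights proportional to bin size. The function name, the bin count of 10, and the example predictions are assumptions for illustration; the binning used in the paper may differ.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum_b (|B_b| / n) * |acc(B_b) - conf(B_b)| over equal-width bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes to the last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece

# Hypothetical predictions: a model that claims 90% confidence but is right half the time.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
print(overconfident)
```

A perfectly calibrated model has accuracy equal to mean confidence in every bin, which gives an ECE of zero.
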
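Experiments: Out-of-distribution detection
● Out-of-distribution detection is possible using $p_\theta(x)$
○ Training data: CIFAR-10
○ Out-of-distribution (OOD) data: SVHN, CIFAR-100, CelebA
● Histograms of $\log p_\theta(x)$
○ Ideally CIFAR-10 lies to the right (high likelihood)
○ and the OOD data to the left (low likelihood)
● JEM separates the OOD data better than Glow
○ A quantitative evaluation is also reported in the paper
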
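One way to turn the histogram comparison into a number is sketched below. Each input is scored by LogSumExp_y f(x)[y], which equals log p(x) up to the constant log Z, and AUROC measures how often an in-distribution score exceeds an OOD score. The logits are made-up values, and the brute-force pairwise AUROC is an illustrative choice suited to small lists rather than the paper's evaluation code.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def ood_score(logits):
    """Unnormalized log-density: log p(x) + log Z = LogSumExp_y f(x)[y]."""
    return logsumexp(logits)

def auroc(in_scores, ood_scores):
    """Probability that a random in-distribution score exceeds a random
    OOD score, counting ties as one half."""
    wins = 0.0
    for a in in_scores:
        for b in ood_scores:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(in_scores) * len(ood_scores))

# Hypothetical logits for three in-distribution and three OOD inputs.
in_logits = [[5.0, 0.0, -1.0], [0.5, 4.0, 0.0], [0.0, 0.2, 3.5]]
ood_logits = [[0.1, 0.0, -0.2], [0.3, 0.2, 0.1], [4.5, -3.0, -3.0]]

in_scores = [ood_score(f) for f in in_logits]
ood_scores = [ood_score(f) for f in ood_logits]
separation = auroc(in_scores, ood_scores)
print(separation)
```

An AUROC of 1.0 means the two histograms do not overlap at all, and 0.5 means the score carries no information about whether an input is out of distribution.
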
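Experiments: Adversarial robustness
● Adversarial examples
○ Add a perturbation that causes misclassification, subject to a norm constraint on the distance from the original sample
● JEM is more robust than the baseline
○ Running MCMC sampling starting from the input makes it more robust still (JEM-1, JEM-10)
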
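The idea of a norm-constrained perturbation can be illustrated on a toy model. The sketch assumes a hypothetical linear binary classifier and applies a single-step L-infinity attack, moving every coordinate by eps in the direction that reduces the margin. This is a simplified stand-in chosen for clarity; it does not reproduce the attack setup of the paper, and the weights, input, and eps are made up.

```python
def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def margin(w, b, x, y):
    """Signed margin y * (w.x + b) of a linear classifier with y in {-1, +1}.
    A positive margin means the input is classified correctly."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def linf_attack(w, x, y, eps):
    """Single-step L-infinity attack: shift each coordinate by eps against the margin.
    For a linear model the gradient of the margin w.r.t. x is y * w."""
    return [xi - eps * y * sign(wi) for wi, xi in zip(w, x)]

w, b = [1.0, -2.0, 0.5], 0.1   # hypothetical classifier parameters
x, y = [0.2, -0.1, 0.4], 1     # hypothetical input with label +1
x_adv = linf_attack(w, x, y, eps=0.3)

print(margin(w, b, x, y))      # positive: correctly classified
print(margin(w, b, x_adv, y))  # negative: misclassified after the perturbation
```

No coordinate moves by more than eps, yet the prediction flips, which is the failure mode the robustness experiments measure.
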
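Experiments: Adversarial robustness (distal adversarials)
● Distal adversarials
○ Starting from a random initialization, generate samples that receive a high classification probability
● Samples that each model classifies as "car" with high confidence
○ CNN: almost pure noise
○ ADV (ResNet + adversarial training): some car-like structure, but very noisy
○ JEM: produces more natural images than the other models
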
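Limitations & Discussions
● The normalized likelihood cannot be computed, so it is hard to verify that training is progressing properly
● EBM training is unstable
○ The MCMC sampling parameters need tuning
○ Introducing regularization might stabilize training
● Relying on MCMC tends to make training and evaluation cumbersome
● Follow-up work
○ “Cutting out the Middle-Man: Training and Evaluating Energy-Based Models without Sampling”
○ Addresses the problems of training and evaluating EBMs
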
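References
● All figures related to the content are taken from the paper and the public slides
○ Paper: https://openreview.net/pdf?id=Hkxzx0NtDB
○ Slides: https://iclr.cc/virtual_2020/poster_Hkxzx0NtDB.html
● EBM training and MCMC-based training methods
○ Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
● Stochastic gradient Langevin dynamics (SGLD)
○ 須山敦志 (Atsushi Suyama). ベイズ深層学習 [Bayesian Deep Learning]. 講談社 (Kodansha), 2019.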
