High-Dimensional Bayesian Optimization with Constraints: Application to Powder Weighing (PDPTA'22/MPS139)
Shoki Miyagawa
Slides presented at the conference PDPTA'22 (MPS'139).
1.
© Mitsubishi Electric Corporation
High-Dimensional Bayesian Optimization with Constraints: Application to Powder Weighing
Shoki Miyagawa, Atsuyoshi Yano, Naoko Sawada, Isamu Ogawa (Mitsubishi Electric Corporation)
2.
Background
Bayesian optimization (BO) can find optimal parameters for black-box problems within a limited number of trials.
[Figure: the BO loop — BO proposes input parameters x_new, the black-box model returns the output y_new to be maximized, and the observation is fed back to BO.]
3.
Background
Bayesian optimization (BO) can find optimal parameters for black-box problems within a limited number of trials. However, it does not work for high-dimensional parameters (typically > 10) because the exploration area is too wide.
[Figure: the same BO loop as on the previous slide.]
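Why the exploration area is "too wide" is the curse of dimensionality: covering the parameter space at a fixed resolution requires exponentially many samples. A minimal illustration (the resolution of 10 points per axis is an arbitrary assumption for the sake of the example):

```python
# Number of grid points needed to cover [0, 1]^d at k points per axis.
# With a budget of only tens of trials, even d = 10 is hopeless for
# naive exploration, motivating BO in a low-dimensional latent space.
def grid_points(d: int, k: int = 10) -> int:
    return k ** d

for d in (2, 10, 20):
    print(d, grid_points(d))  # grows as 10^d
```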
4.
Background
Related works explore parameters in a low-dimensional space obtained by one of the following methods:
• Subspace extraction / linear embedding — Dropout [IJCAI'17]: BO in the non-dropped dimensions; LINEBO [ICML'19]: BO along a single dimension; REMBO [IJCAI'13]: BO in randomly embedded dimensions (x = A z with a random matrix). Constraints can be easily introduced, but exploration is not efficient (particularly for image and NLP data).
• Nonlinear embedding — 1. encode the dataset with a DNN encoder, 2. run BO in the latent space (propose z_new), 3. decode the latent parameters with a DNN decoder (z_new → x_new). Exploration is efficient, but constraints cannot be explicitly expressed in the latent space. We tackle this problem!
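The linear-embedding idea (as in REMBO) can be sketched as drawing a random matrix A once and mapping each low-dimensional BO candidate z back to the original space; the dimensions, the clipping to [-1, 1], and the Gaussian matrix are illustrative assumptions, not the exact formulation of any of the cited methods:

```python
import numpy as np

rng = np.random.default_rng(0)
d_high, d_low = 50, 3                       # original / embedded dimensionality
A = rng.standard_normal((d_high, d_low))    # random embedding matrix, drawn once

def embed(z: np.ndarray) -> np.ndarray:
    """Map a low-dimensional BO candidate z to the original parameter space."""
    return np.clip(A @ z, -1.0, 1.0)        # keep x inside the search box

z_new = rng.standard_normal(d_low)          # what BO would propose in the subspace
x_new = embed(z_new)
print(x_new.shape)                          # (50,)
```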
5.
Proposed method: Key idea
This study focuses on two types of constraints:
• Known equality constraints → decomposing the parameters x into variable parameters x_v (without equality constraints, explored by BO) and fixed parameters x_f (set to the values required by the equality-constraint condition) is useful.
• Unknown inequality constraints → introducing disentangled representation learning (DRL) into the nonlinear embedding (1. encode the dataset, 2. run BO in the latent space, 3. decode the latent parameters) is useful. With DRL, each latent parameter is interpretable and independent; DRL is commonly used to control generative models (VAE, GAN, ...).
[Figure: latent traversals along "rotation" and "smile" axes, from β-VAE [ICLR'17].]
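The decomposition for known equality constraints can be sketched as pinning the constrained coordinates and letting BO touch only the free ones; the mask, the dimensionality, and the pinned values below are hypothetical:

```python
import numpy as np

# Hypothetical 6-dimensional parameter vector: coordinates 2 and 5 are
# pinned by known equality constraints (x_f); the rest are variable (x_v).
fixed_mask = np.array([False, False, True, False, False, True])
x_f_values = np.array([0.7, 0.2])   # values required by the condition

def assemble(x_v: np.ndarray) -> np.ndarray:
    """Rebuild the full parameter vector from the BO-explored part x_v."""
    x = np.empty(fixed_mask.size)
    x[~fixed_mask] = x_v            # explored by Bayesian optimization
    x[fixed_mask] = x_f_values      # set by the equality constraints
    return x

print(assemble(np.array([0.1, 0.2, 0.3, 0.4])))
```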
8.
Proposed method: Key idea
• Unknown inequality constraints — the problem with the previous methods. If the inequality constraints were known, we could simply run BO inside the region of the original parameter space that satisfies them. After a nonlinear embedding, however, the feasible region has no explicit shape in the latent space, so BO may generate parameters that do not satisfy the constraints.
[Figure: original parameter space divided into a feasible and an infeasible region; after the nonlinear embedding, the two regions are mixed in the latent space.]
9.
Proposed method: Key idea
• Unknown inequality constraints → introducing DRL into the nonlinear embedding is useful, because users only need to check whether the constraints are satisfied for the decoded data along each axis.
Example 1: generating faces under the constraint that the face be a man's. The rotation axis is not related to the constraint, so the region along that axis locally satisfies it.
[Figure: faces decoded along latent axes #1 and #2.]
10.
Proposed method: Key idea
Example 1, continued: the smiling axis is likewise not related to the constraint (being a man's face).
[Figure: faces decoded along the smiling axis.]
11.
Proposed method: Key idea
Example 1, continued: mixed features (points combining both axes) also satisfy the constraint.
[Figure: faces decoded at combinations of the two axes.]
12.
Proposed method: Key idea
Example 2: generating faces under the constraint that the face be smiling. Here one axis is related to the constraint, and the region along it possibly does not satisfy the constraint, so the exploration area is restricted to axis #1.
[Figure: faces decoded along the constraint-related axis.]
13.
Proposed method: Key idea
We can control the exploration area in the latent space even when the inequality constraints are unknown.
[Figure: the exploration areas used in Example 1 and Example 2.]
14.
Proposed method: Overview
[Figure: overview of the proposed pipeline.]
15.
Step 1: Dimensionality reduction
For the variable parameters, we use β-VAE to introduce DRL into a VAE and obtain the latent space z_v ∈ R^{d_v}. (For the fixed parameters, we use PCA for simplicity.)
β-VAE loss = reconstruction loss + β · KL-divergence loss
The hyperparameters (latent dimensionality d_v and coefficient β) control the trade-off between the two losses:
• Large β: features are more disentangled, but BO generates coarse-grained features (hard to optimize the parameters).
• Small β: BO generates fine-grained features, but they are less disentangled (hard to take the constraints into account).
[Figure: the encoder maps x to z with a KL loss against N(0, 1); the decoder reconstructs x′ with a reconstruction loss.]
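The β-VAE objective above can be written down directly. This sketch assumes a mean-squared reconstruction error (the slide does not specify the reconstruction term) and uses the closed-form KL divergence between a diagonal Gaussian posterior N(μ, σ²) and the prior N(0, 1):

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta):
    """Reconstruction loss + beta * KL( N(mu, sigma^2) || N(0, 1) )."""
    recon = np.mean((x - x_recon) ** 2)
    # Closed-form KL for a diagonal Gaussian posterior against N(0, 1).
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + beta * kl
```

When the posterior equals the prior (μ = 0, log σ² = 0) the KL term vanishes and only the reconstruction error remains; increasing β penalizes posteriors that stray from the prior more heavily, which is what drives the disentanglement trade-off described on the slide.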
16.
Step 2: Bayesian optimization
We use Gaussian process regression (GPR) and maximize the UCB (upper confidence bound) acquisition function a(z_v, z_f = z_f^target):
a_UCB(z_v, z_f) = μ(z_v, z_f) + α · σ(z_v, z_f)
where μ is the GPR predictive mean and σ the predictive standard deviation. We generate three candidate parameters z_v^new and let the user select one of them:
• exploitation-oriented (α = 0.001)
• intermediate (α = 0.5)
• exploration-oriented (α = 1.0)
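A minimal numpy sketch of this step: an RBF-kernel GP posterior and the UCB rule above, evaluated over a random candidate set. The kernel, candidate grid, and toy objective are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def rbf(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between row-vector sets A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / length_scale ** 2)

def gp_predict(Z, y, Z_star, noise=1e-6):
    """GPR predictive mean and standard deviation at the points Z_star."""
    K = rbf(Z, Z) + noise * np.eye(len(Z))
    K_s = rbf(Z, Z_star)
    L = np.linalg.cholesky(K)
    w = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = K_s.T @ w
    v = np.linalg.solve(L, K_s)
    var = 1.0 - (v ** 2).sum(axis=0)     # k(z, z) = 1 for this kernel
    return mu, np.sqrt(np.clip(var, 0.0, None))

rng = np.random.default_rng(0)
Z = rng.uniform(-1, 1, size=(8, 2))      # observed latent points z_v
y = -(Z ** 2).sum(axis=1)                # toy objective to maximize

cand = rng.uniform(-1, 1, size=(300, 2))
mu, sigma = gp_predict(Z, y, cand)
# Three suggestions, one per exploration level, as on the slide.
for alpha in (0.001, 0.5, 1.0):
    z_new = cand[np.argmax(mu + alpha * sigma)]
    print(alpha, z_new)
```

Small α trusts the mean (exploitation); large α rewards uncertain regions (exploration), matching the three user-facing options.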
17.
Usage scenario: Powder weighing system
System overview: the system needs to precisely weigh a powder by changing the valve opening degree v_i → v_{i+1} whenever the scale value reaches the corresponding switching weight s_{i+1} (0 ≤ i ≤ 8), over 9 steps from v_0 (start) to v_9 (end).
18.
Usage scenario: Powder weighing system
Two types of inequality constraints:
• Non-negativity constraints: v_i > 0, s_i > 0
• Monotonicity constraints: v_i > v_{i+1} (the valve opening degrees decrease), s_i < s_{i+1} (the switching weights increase)
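These two constraint families are straightforward to check programmatically; a sketch (the function name and the example sequences are ours):

```python
def satisfies_constraints(v, s):
    """Check the powder-weighing constraints: positivity of v_i and s_i,
    strictly decreasing valve opening degrees v, strictly increasing
    switching weights s."""
    positive = all(vi > 0 for vi in v) and all(si > 0 for si in s)
    v_decreasing = all(a > b for a, b in zip(v, v[1:]))
    s_increasing = all(a < b for a, b in zip(s, s[1:]))
    return positive and v_decreasing and s_increasing

print(satisfies_constraints([9, 7, 4, 2], [1, 3, 6, 8]))  # True
print(satisfies_constraints([9, 7, 8, 2], [1, 3, 6, 8]))  # False: v not decreasing
```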
19.
Usage scenario: Powder weighing system
Preprocessing (two pipelines, one per parameter group):
• Normalization; outlier removal; duplication removal (to prevent imbalanced learning); train/test split.
• Normalization; outlier removal; data filtering (to restrict the exploration area locally); train/test split.
20.
Usage scenario: Powder weighing system
Datasets: 60 types of powder, 1,792 trials in total (31.33 ± 19.48 trials on average).
• Parameters x_f with equality constraints: used to train PCA and GPR.
• Parameters x_v without equality constraints: used to train the β-VAE and GPR.
• Objective value y, the error between the measured and required weight: used to train GPR.
21.
Experiments overview
• Experiments 1-1 and 1-2: we verify the effect of the β-VAE hyperparameters (d_v and β) on how well the inequality constraints are taken into account.
• Experiment 2: we verify whether the proposed method can determine optimum parameters (weighing error y below 1% of the required weight) within a reasonable number of trials; manual tuning typically needs about 20 trials in practice.
23.
Experiment 1-1: Quantitative evaluation of hyperparameter effects
Hyperparameters: d_v ∈ {2, 4, 6, 8, 10}, β ∈ {0.1, 0.2, ..., 1.5} (75 combinations in total).
Procedure (repeated for every combination):
1. Select hyperparameter values and train the β-VAE.
2. Sample n = 1000 points randomly in the latent space R^{d_v} and decode them with the DNN decoder. (As a reference, n = 1000 points are also sampled randomly in the original space.)
3. Check whether each decoded parameter satisfies the constraints (suitable/unsuitable) and count the unsuitable data, i.e., the samples that do not satisfy the constraints.
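The counting step of this procedure can be sketched as a Monte-Carlo loop; `decode` is a placeholder for the trained β-VAE decoder and `satisfies` is a schematic stand-in for the real constraint check, so both are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)
d_v = 4
W = rng.standard_normal((d_v, 6))   # stands in for trained decoder weights

def decode(z):
    """Placeholder for the trained beta-VAE decoder (affine, for illustration)."""
    return z @ W + 1.0

def satisfies(x):
    """Schematic stand-in for the real inequality-constraint check."""
    return bool(np.all(x > 0))

Z = rng.standard_normal((1000, d_v))              # n = 1000 latent samples
unsuitable = sum(not satisfies(decode(z)) for z in Z)
print(f"{unsuitable} / 1000 samples violate the constraints")
```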
24.
Experiment 1-1: Quantitative evaluation of hyperparameter effects
Result and findings:
• Larger β decreases the number of unsuitable data points. We conjecture that DRL enables us to take the inequality constraints into account.
• Larger d_v increases the number of unsuitable data points. We conjecture that samples far from the origin of the latent space tend to be unsuitable, because fine-grained features are emphasized in the area far from the origin.
[Figure: number of unsuitable data points per hyperparameter setting; regions of the exploration area far from the origin emphasize undesirable features.]
25.
Experiment 1-2: Qualitative evaluation of hyperparameter effects
Hyperparameters: d_v = 2, β ∈ {0.1, 0.5, 1.0} (3 combinations).
Procedure (repeated for every combination):
1. Select hyperparameter values and train the β-VAE.
2. Sample n = 15 points at equal intervals along the axes of the latent space and decode them. (As a reference, n = 15 points are also sampled at equal intervals along the axes of the original space.)
3. Visualize the decoded parameters, interpret the meaning of the disentangled features, and check whether they satisfy the constraints.
26.
Experiment 1-2: Qualitative evaluation of hyperparameter effects
Result: along one latent axis the initial point changes; along the other, both the initial point and the gradient change. Small β yields rich diversity but lacks consideration of the constraints; large β sufficiently considers the constraints but yields poor diversity.
[Figure: decoded valve-opening-degree and switching-weight profiles along the latent axes for each β.]
27.
Experiment 1-2: Qualitative evaluation of hyperparameter effects (continued)
We used the rich-diversity setting (β = 0.1) in the next experiment (Experiment 2).
[Figure: the same visualization as on the previous slide, with the selected setting highlighted.]
28.
Experiment 1: Discussion
• Can DRL take inequality constraints into account? → Yes.
• How should we set the hyperparameter values? → Visualizing their effect quantitatively and qualitatively is helpful. We recommend determining d_v first, because the suitable value of β depends on d_v: if β is too small, the constraints are not sufficiently considered; if β is too large, the reconstruction loss is too high and the parameters have poor diversity.
[Figure: acceptable hyperparameter region in the (d_v, β) plane.]
30.
Experiment 2
Procedure:
• We used three types of powder (A, B, and C) not included in the dataset; a PCA visualization of the fixed parameters shows that they are not outliers.
• Based on the result of Experiment 1, we set d_v = 2 and β = 0.1, which leads to rich diversity and low reconstruction error.
Result:
• The proposed method reduces the number of trials from 20 (the manual-tuning baseline) to around 5.
• Features of the generated parameters: for powders B and C, the constraints were satisfied in all trials. For powder A, one trial produced unsuitable data, presumably because the method explored areas far from the origin of the latent space (consistent with the observation in Experiment 1).
31.
Limitations
• The relationship between the hyperparameters (d_v, β) and the number of required trials is still unclear: Experiment 1 connected the hyperparameters to constraint satisfaction, and Experiment 2 measured the number of required trials, but not the link between them.
• The exploration area (a bounding box in the latent space) must be set manually. If it is too small, the optimal parameters cannot be explored; if it is too large, parameters that do not satisfy the constraints may be generated. Finding the best size is future work.
32.
Conclusion
• We proposed methods to handle two types of constraints in Bayesian optimization, even after a nonlinear embedding:
  - Known equality constraints: parameter decomposition is useful.
  - Unknown inequality constraints: disentangled representation learning is useful.
• We conducted two experiments:
  - Experiment 1 showed the effect of the hyperparameters on how well the inequality constraints are considered, and that visualization helps determine their values.
  - Experiment 2 demonstrated that the proposed method reduces the number of trials by approximately 66% compared to manual tuning.
33.
Do you have any questions?