Semi-supervised Learning with
Variational Bayesian Inference and
Maximum Uncertainty Regularization
Kien Do, Truyen Tran, Svetha Venkatesh
Applied AI Institute (A2I2), Deakin University, Australia
Introduction
• Many large learning systems need a lot of labeled data to learn well.
• However, manual label annotation is expensive and time-consuming.
• Semi-supervised learning (SSL) mitigates the need for labels by
leveraging similar patterns in unlabeled data to improve classification.
• Recent state-of-the-art (SOTA) methods for SSL are mainly based on consistency
regularization.
Consistency Regularization for SSL
Two types of perturbation
• data perturbation
• weight perturbation
Existing consistency regularization (CR) based methods focus mainly on data perturbation.
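Some well-known CR-based methods
• Pi-model: penalizes the difference between the classifier's predictions on two random perturbations of the same input.
• Mean Teacher: the teacher's weights are the exponential moving average of the student's weights.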
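The exponential moving average behind Mean Teacher can be sketched as follows. This is a minimal illustration rather than the authors' code: the dictionary-of-arrays weight layout, the function name `ema_update`, and the decay value are assumptions made for the example.

```python
import numpy as np

def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an exponential moving average.

    teacher, student: dicts mapping a (hypothetical) parameter name to an array.
    decay: how much of the old teacher is kept at each step (assumed value).
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }
```

Calling `ema_update` after every student update makes the teacher a smoothed copy of the student, which is what the consistency target is computed from.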
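Can we achieve a better perturbation of data?
• Under weak data perturbation, the perturbed input is often close to the original input.
The classifier can then only learn a locally smooth mapping from inputs to class predictions.
• We want the perturbed point to be: i) not too close to the original input, and ii) difficult for the
classifier to predict correctly.
• We choose the perturbed point to be the most uncertain (w.r.t. the classifier's prediction) virtual point near the original input.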
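Approximating the most uncertain virtual point
• Recall that the most uncertain virtual point is defined as the solution of an optimization problem.
• However, optimizing this objective directly is difficult since it usually
has multiple local minima. To address this problem, we approximate
the virtual point by optimizing the first-order Taylor expansion of the objective, which depends only on
the gradient of the objective at the original input.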
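A minimal sketch of the first-order idea, under assumptions the slides do not confirm: uncertainty is measured by the entropy of a softmax prediction, the search region is an L2 ball of radius `radius` around the original input `x0`, and the classifier is a toy linear model `softmax(W @ x + b)` so that the gradient can be written by hand. Under these assumptions, maximizing the linearized objective over the ball puts the solution on the boundary, in the direction of the gradient. All names here are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def entropy(p):
    """Shannon entropy of a probability vector (the assumed uncertainty measure)."""
    return float(-np.sum(p * np.log(p + 1e-12)))

def entropy_grad(W, b, x):
    """Gradient of the prediction entropy w.r.t. the input x for the
    hypothetical linear classifier p = softmax(W @ x + b)."""
    p = softmax(W @ x + b)
    log_p = np.log(p + 1e-12)
    h = -np.sum(p * log_p)
    dh_dz = -p * (log_p + h)   # derivative of the entropy w.r.t. the logits
    return W.T @ dh_dz         # chain rule through z = W @ x + b

def first_order_virtual_point(x0, grad, radius):
    """Maximize the linearized objective grad . (x - x0) over the L2 ball
    ||x - x0|| <= radius; the maximizer lies on the boundary along grad."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return x0.copy()       # flat linearization: stay at x0
    return x0 + radius * grad / norm
```

The single gradient evaluation is the appeal of this approximation: it replaces an inner optimization loop with one backward pass.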
Approximating the most uncertain virtual point (cont.)
• We can also approximate the virtual point using projected gradient descent, applying an
update formula at each step t+1.
• Solving the update equations gives the updated point in closed form.
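The iterative alternative can be illustrated with a textbook projected-gradient loop. This is a generic sketch, not the update formula from the slide (which is not recoverable here): it assumes an entropy objective, an L2 ball of radius `radius` around `x0`, and a toy linear softmax classifier. Because the entropy is maximized, each step moves along the gradient (equivalently, descends the negative entropy) and then projects back onto the ball. The step size and number of steps are arbitrary illustrative values.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def entropy(p):
    """Shannon entropy of a probability vector."""
    return float(-np.sum(p * np.log(p + 1e-12)))

def entropy_grad(W, b, x):
    """Gradient of the prediction entropy w.r.t. x for p = softmax(W @ x + b)."""
    p = softmax(W @ x + b)
    log_p = np.log(p + 1e-12)
    h = -np.sum(p * log_p)
    return W.T @ (-p * (log_p + h))

def project_onto_ball(x, center, radius):
    """Euclidean projection of x onto the L2 ball around center."""
    delta = x - center
    norm = np.linalg.norm(delta)
    if norm <= radius:
        return x
    return center + radius * delta / norm

def projected_gradient_ascent(W, b, x0, radius, step_size=0.05, num_steps=10):
    """Repeat: gradient step on the entropy, then project back onto the ball."""
    x = x0.copy()
    for _ in range(num_steps):
        x = x + step_size * entropy_grad(W, b, x)
        x = project_onto_ball(x, x0, radius)
    return x
```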
Maximum Uncertainty Regularization
• The maximum uncertainty regularization (MUR) loss is defined on the most uncertain virtual point: it forces the class prediction at this point to be similar to that at the original input.
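As a rough illustration of such a loss, the sketch below penalizes the squared difference between the class probabilities at the original input and at the virtual point. The squared-error distance and the function name `mur_loss` are assumptions; the slides do not show which distance is actually used.

```python
import numpy as np

def mur_loss(p_original, p_virtual):
    """Squared distance between two class-probability vectors.

    p_original: prediction at the original input (treated as the target).
    p_virtual:  prediction at the most uncertain virtual point.
    The choice of squared error is an assumption for illustration.
    """
    p_original = np.asarray(p_original, dtype=float)
    p_virtual = np.asarray(p_virtual, dtype=float)
    return float(np.sum((p_virtual - p_original) ** 2))
```

The loss is zero when the two predictions agree and grows as the prediction at the virtual point drifts away from the one at the original input.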
Weight Perturbation via Variational Bayesian Inference
• Unlike data perturbation, weight perturbation is not straightforward.
• We need some way to generate random weights. Variational Bayesian Inference (VBI) is a principled way to do that.
• VBI objective: one term forces the weights to match the prior, and the other ensures faithful reconstruction.
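The two terms of a VBI objective can be sketched for the common textbook case of a factorized Gaussian weight distribution and a zero-mean Gaussian prior. That choice is an assumption made for illustration; the slides do not state which distributions are used. The KL term pulls the weight distribution toward the prior, and the negative log-likelihood term (passed in here as `nll`, an average per example) rewards fitting the data. All function names are hypothetical.

```python
import numpy as np

def sample_weights(mu, sigma, rng):
    """Reparameterized sample w = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps

def kl_to_prior(mu, sigma, prior_sigma=1.0):
    """KL( N(mu, sigma^2) || N(0, prior_sigma^2) ), summed over all weights."""
    per_weight = (
        np.log(prior_sigma / sigma)
        + (sigma ** 2 + mu ** 2) / (2.0 * prior_sigma ** 2)
        - 0.5
    )
    return float(np.sum(per_weight))

def negative_elbo(nll, mu, sigma, num_examples, prior_sigma=1.0):
    """Average negative log-likelihood plus the KL term scaled per example."""
    return nll + kl_to_prior(mu, sigma, prior_sigma) / num_examples
```

Sampling through `sample_weights` is what provides the random weights needed for weight perturbation.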
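Consistency under Weight Perturbation
• The consistency loss under weight perturbation forces the classifier with randomly sampled weights to produce outputs consistent with those obtained using the mean of the weight distribution.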
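A minimal sketch of a consistency loss under weight perturbation, assuming a Gaussian weight distribution with mean `mu` and standard deviation `sigma`, a toy linear softmax classifier, and a squared-error distance. None of these choices is confirmed by the slides, and the names are hypothetical. The loss compares predictions made with sampled weights against predictions made with the mean weights.

```python
import numpy as np

def softmax_rows(z):
    """Row-wise numerically stable softmax."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def weight_consistency_loss(mu, sigma, x, rng, num_samples=4):
    """Average squared gap between noisy-weight and mean-weight predictions.

    mu, sigma: arrays of shape (classes, dim) describing the weight distribution.
    x:         inputs of shape (batch, dim).
    """
    p_mean = softmax_rows(x @ mu.T)               # prediction with mean weights
    total = 0.0
    for _ in range(num_samples):
        w = mu + sigma * rng.standard_normal(mu.shape)   # sampled weights
        p_noisy = softmax_rows(x @ w.T)
        total += np.mean(np.sum((p_noisy - p_mean) ** 2, axis=-1))
    return float(total / num_samples)
```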
Final Objective
The final objective combines weight perturbation (via VBI) and
data perturbation (via MUR) with a base loss, which can come from an arbitrary consistency regularization based
method like Pi-model, Mean Teacher or ICT.
Results on CIFAR-10/100 and SVHN
Ablation Study
[Figure: different coefficient values]
Ablation Study (cont.)
[Figure: performance with different radii; random perturbation vs. MUR]
Visualization of most uncertain samples
Conclusion
• We have proposed two new consistency regularization based
methods: MUR and CWP.
• MUR finds the most uncertain virtual point and forces its class
prediction to be similar to that of the original input.
• CWP leverages Variational Bayesian Inference to perturb weights and
forces a noisy classifier to produce consistent outputs.
• Both MUR and CWP lead to better performance on SSL.
Thank you for your attention!

Editor's Notes

  1. The error of MT+VD is always smaller than the error of MT