1. Clustering by Maximizing Mutual Information Across Views
Kien Do, Truyen Tran, Svetha Venkatesh
Applied AI Institute (A2I2), Deakin University, Australia
4. Existing Clustering Methods
• Autoencoder-based methods (e.g., DCN, VaDE, DGG) cluster the latent code of an autoencoder.
• The latent code should only capture semantic information from the input.
[Figure: DCN [1] — Enc → latent code → Dec; clustering is performed on the latent code, so samples from the same cluster lie closer in the latent space of the AE]
[1] Towards k-means-friendly spaces: Simultaneous deep learning and clustering, Yang et al., ICML 2017
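As a concrete illustration of this family, DCN [1] (written roughly, with generic Enc/Dec notation rather than the paper's exact symbols) jointly minimizes a reconstruction loss and a k-means-style penalty that pulls each latent code toward its assigned centroid:

```latex
\min_{\theta,\, M,\, \{s_i\}} \;
\sum_i \Big( \big\| x_i - \mathrm{Dec}_\theta\!\big(\mathrm{Enc}_\theta(x_i)\big) \big\|_2^2
\;+\; \lambda \,\big\| \mathrm{Enc}_\theta(x_i) - M s_i \big\|_2^2 \Big)
```

Here M holds the cluster centroids, s_i is a one-hot assignment vector, and λ trades off reconstruction quality against clustering structure in the latent space.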
5. Existing Clustering Methods (cont.)
• Methods that only use the cluster-assignment probability (e.g., IIC [1], PICA).
• Problem: they may not capture enough useful information from the data, so over-clustering is often required.
[1] Invariant Information Clustering for Unsupervised Image Classification and Segmentation, Ji et al., ICCV 2019
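For context, IIC [1] maximizes the mutual information between the soft cluster assignments of two views of the same image; in the cited paper's notation, Φ outputs a probability vector over C clusters and the MI is computed from the empirical joint matrix P:

```latex
\max_{\Phi}\; I\big(\Phi(x),\, \Phi(x')\big), \qquad
P \;=\; \frac{1}{n}\sum_{i=1}^{n} \Phi(x_i)\, \Phi(x_i')^{\top}, \qquad
I \;=\; \sum_{c=1}^{C}\sum_{c'=1}^{C} P_{c c'} \log \frac{P_{c c'}}{P_{c}\, P_{c'}}
```

In practice P is symmetrized and P_c, P_{c'} are its marginals. Because only this C-dimensional assignment is used, instance-level detail can be discarded, which is the limitation noted above.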
6. Motivation
• We need a method that can model the cluster-level and the instance-level semantics.
• The InfoMax/Contrastive Learning principle can be applied to this scenario.
7. Overview of InfoMax/Contrastive Learning
• A principle for learning view-invariant representations; these representations often capture the data semantics.
• The idea is to maximize the mutual information (MI) between two different views.
• Since direct computation of the MI is hard, we maximize a variational lower bound of it instead.
8. The InfoNCE bound
• InfoNCE [1] is a lower bound of MI
• It is biased but has low variance
• Maximizing InfoNCE is equivalent to minimizing a contrastive loss:
[1] On Variational Bounds of Mutual Information, Poole et al., ICML 2019
f(x, y) is a “critic” measuring the similarity between the two views x and y.
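For reference, the standard statement of the bound and the contrastive loss it induces (following Poole et al. [1]; the batch size K and the critic symbol f are the usual notation, not taken verbatim from the slide):

```latex
I(X; Y) \;\ge\; \mathbb{E}\!\left[\frac{1}{K}\sum_{i=1}^{K}
  \log \frac{e^{f(x_i, y_i)}}{\frac{1}{K}\sum_{j=1}^{K} e^{f(x_i, y_j)}}\right]
  \;=\; \log K \;-\; \mathcal{L}_{\text{contrastive}},
\qquad
\mathcal{L}_{\text{contrastive}} \;=\;
  -\,\mathbb{E}\!\left[\frac{1}{K}\sum_{i=1}^{K}
  \log \frac{e^{f(x_i, y_i)}}{\sum_{j=1}^{K} e^{f(x_i, y_j)}}\right]
```

The bound is capped at log K, which is why it is biased yet low-variance: a minibatch of K samples can certify at most log K nats of MI.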
11. Choosing an optimal critic
• A critic is optimal if it leads to the tightest InfoNCE bound.
• It can be shown that:
• In continuous cases, cosine similarity is the optimal critic.
• In discrete cases, the “log-of-dot-product” is the optimal critic.
(Both critics are sketched in code below.)
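A minimal PyTorch-style sketch of the two critics and the contrastive loss from the previous slide; the function names, the temperature, and the eps constant are illustrative choices, not values from the paper:

```python
import torch
import torch.nn.functional as F

def cosine_critic(z1, z2, temperature=0.5):
    # Continuous case: cosine similarity between L2-normalized feature vectors.
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    return z1 @ z2.t() / temperature          # (N, N) pairwise similarities

def log_dot_product_critic(p1, p2, eps=1e-8):
    # Discrete case: "log-of-dot-product" between cluster-assignment
    # probability vectors (each row of p1, p2 sums to 1).
    return torch.log(p1 @ p2.t() + eps)       # (N, N) critic scores

def contrastive_loss(scores):
    # Minimizing this cross-entropy (positives on the diagonal, i.e. the two
    # views of the same sample) maximizes the InfoNCE lower bound on MI.
    targets = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, targets)

# Usage: given two augmented views encoded to features z1, z2 (continuous)
# or to softmax cluster probabilities p1, p2 (discrete):
#   loss = contrastive_loss(cosine_critic(z1, z2))
#   loss = contrastive_loss(log_dot_product_critic(p1, p2))
```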
12. A Simple extension to Semi-supervised Learning
Assume that we also have access to some labeled set. The training loss is:
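The equation itself was not captured in the extraction; purely as an illustrative sketch, a common way to write such an extension adds a cross-entropy term on the labeled set (called D_L here, a hypothetical name) to the unsupervised objective, weighted by a coefficient λ:

```latex
\mathcal{L} \;=\; \mathcal{L}_{\text{unsup}}
  \;+\; \lambda \,\mathbb{E}_{(x,\, y) \in \mathcal{D}_L}\big[-\log p(y \mid x)\big]
```

The supervised term reuses the same cluster-assignment head, so labels simply anchor some clusters to known classes.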