SlideShare a Scribd company logo
1 of 17
Download to read offline
MCSE: Multimodal Contrastive Learning of
Sentence Embeddings
Miaoran Zhang, Marius Mosbach, David Ifeoluwa Adelani, Michael A. Hedderich, and Dietrich Klakow,
2022
Experiments
Conclusions
Introduction
Related Work
01
02
03
04
Introduction
• MCSE : Multimodal Contrastive Learning of Sentence Embeddings
 Background: Unsupervised SimCSE (Gao et al., 2021)
 Extend a multimodal contrastive objective
 Experiments on standard Semantic Textual Similarity (STS)
• Architecture of MCSE
Introduction
f v (·
) is a pre-trained image encoder such as ResNet
• Contrastive learning : background Unsupervised SimCSE
 data augmentation strategy : dropout noise
 pulling positive sentences closer and pushing apart negatives
Related Work
Cosine similarity
• Multimodal Contrastive Learning
 sentence-image pairs , sentence xi and image yi
• f v (·
) : pre-trained image encoder such as ResNet
• fθ(·
) : pre-trained language encoder such as BERT
 pull semantically close image-sentence pairs together and push away non-related pairs
Related Work
• Dataset
 Multimodal datasets : Flickr30k (29,783 images) and MS-COCO (82,783 images)
 text corpus : Wiki1M (English Wikipedia : 106 sentences)
• Encoder
 Language encoders : BERT and RoBERTa
 Image encoder : ResNet-50
Single layer MLPs
• Evaluation
 7 Semantic Textual Similarity (STS) : STS 2012-2016, STS Benchmark, SICK-Relatedness
 Spearman’s correlation
Experiments
Results
• MCSE : Wiki1M, Flickr30k
BERT (76.3 → 77.3)
RoBERTa (76.6 → 78.3)
• STS16 MCSE-BERT
-> the domain discrepancy
Performance comparison on STS tasks
Results
• the performances decrease
(without the large text-only corpus)
• MCSE models (0.9 – 3.8 points improvement)
• Spearman’s correlation(0.8 – 5.0 points reduction)
-> validating the efficacy of visual semantics
Average Spearman’s correlation on 7 STS tasks
Results
• Alignment-Uniformity
 Alignment : paired instance 사이의 거리
(짧을수록 좋음)
Similar samples have similar features
 Uniformity : embedding이 얼만큼 균일하게
분포하는지 (균일 할수록 좋음)
Preserve maximal information
* 참고 논문 : Understanding Contrastive Representation Learning through
Alignment and Uniformity on the Hypersphere (ICML 2020)
• Embedding space가 넓고, 고르게
분포하여 각 단어가 고유한 의미를
보존하는 것이 중요함.
• Contrastive learning을 통해
Negative Pair를 Positive Pair와 멀게
강제하는 과정에서 embedding space를
균일하게 분포하게 함.
Results
• Alignment-Uniformity
 PPOS : positive pairs distribution
 Pdata : data distribution
MCSE models : visually grounding
enhance by improving the
alignment property
The alignment-uniformity plot of models (BERT)
Results
• Improvements on Different Subsets
 different degrees from the visually grounding
because of domain discrepancy
Results
• SimCSE는 구문이 유사한 문장을 검색하는 반면
MCSE는 구문이 다양하고 의미 체계를 공유하는 문장을 검색
Results
• Cross-Modal Retrieval : metric Recall@K
 Recall@K : k개 추천 결과에 대한 recall
Conclusion
• MCSE 제안 : sentence embedding learning
• MCSE consistently improves the performance on STS tasks
• the superiority of method : by analyzing the alignment and uniformity properties
of the embedding space.
• SimCSE는 limited SAMPLE에서 MCSE 보다 나은 성능을 보임
MCSE는 큰 데이터에서는 SimCSE 성능을 능가함.
-> multimodal weight training 관련
MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정

More Related Content

Similar to MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정

Zeroshot multimodal named entity disambiguation for noisy social media posts
Zeroshot multimodal named entity disambiguation for noisy social media postsZeroshot multimodal named entity disambiguation for noisy social media posts
Zeroshot multimodal named entity disambiguation for noisy social media postsSyo Kyojin
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesMatthew Lease
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...Antonio Tejero de Pablos
 
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learningIEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learningIEEEBEBTECHSTUDENTPROJECTS
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer betterLEE HOSEONG
 
Dexa2007 Orsi V1.5
Dexa2007 Orsi V1.5Dexa2007 Orsi V1.5
Dexa2007 Orsi V1.5Giorgio Orsi
 
Transfer Learning for Natural Language Processing
Transfer Learning for Natural Language ProcessingTransfer Learning for Natural Language Processing
Transfer Learning for Natural Language ProcessingSebastian Ruder
 
Distilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model CompressionDistilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model CompressionGyeongman Kim
 
Distilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model CompressionDistilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model CompressionGeonDoPark1
 
A reconstruction error based framework for multi label and multi-view learning
A reconstruction error based framework for multi label and multi-view learningA reconstruction error based framework for multi label and multi-view learning
A reconstruction error based framework for multi label and multi-view learningieeepondy
 
2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categoriesWarNik Chow
 
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Symeon Papadopoulos
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information RetrievalBhaskar Mitra
 
Action Sequence Mining and Behavior Pattern Analysis for User Modeling
Action Sequence Mining and Behavior Pattern Analysis for User ModelingAction Sequence Mining and Behavior Pattern Analysis for User Modeling
Action Sequence Mining and Behavior Pattern Analysis for User ModelingPeter Brusilovsky
 
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding ApproachText Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding ApproachAhmed Hani Ibrahim
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPindico data
 
A neural image caption generator
A neural image caption generatorA neural image caption generator
A neural image caption generatorheedaeKwon
 

Similar to MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정 (20)

Zeroshot multimodal named entity disambiguation for noisy social media posts
Zeroshot multimodal named entity disambiguation for noisy social media postsZeroshot multimodal named entity disambiguation for noisy social media posts
Zeroshot multimodal named entity disambiguation for noisy social media posts
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
CVPR2022 paper reading - Balanced multimodal learning - All Japan Computer Vi...
 
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learningIEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
IEEE 2014 MATLAB IMAGE PROCESSING PROJECTS Scale adaptive dictionary learning
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer better
 
Dexa2007 Orsi V1.5
Dexa2007 Orsi V1.5Dexa2007 Orsi V1.5
Dexa2007 Orsi V1.5
 
Transfer Learning for Natural Language Processing
Transfer Learning for Natural Language ProcessingTransfer Learning for Natural Language Processing
Transfer Learning for Natural Language Processing
 
Distilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model CompressionDistilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model Compression
 
Distilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model CompressionDistilling Linguistic Context for Language Model Compression
Distilling Linguistic Context for Language Model Compression
 
A reconstruction error based framework for multi label and multi-view learning
A reconstruction error based framework for multi label and multi-view learningA reconstruction error based framework for multi label and multi-view learning
A reconstruction error based framework for multi label and multi-view learning
 
VAEs for multimodal disentanglement
VAEs for multimodal disentanglementVAEs for multimodal disentanglement
VAEs for multimodal disentanglement
 
2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories2010 PACLIC - pay attention to categories
2010 PACLIC - pay attention to categories
 
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
Compact and Distinctive Visual Vocabularies for Efficient Multimedia Data Ind...
 
Neural Models for Information Retrieval
Neural Models for Information RetrievalNeural Models for Information Retrieval
Neural Models for Information Retrieval
 
Video + Language: Where Does Domain Knowledge Fit in?
Video + Language: Where Does Domain Knowledge Fit in?Video + Language: Where Does Domain Knowledge Fit in?
Video + Language: Where Does Domain Knowledge Fit in?
 
Video + Language: Where Does Domain Knowledge Fit in?
Video + Language: Where Does Domain Knowledge Fit in?Video + Language: Where Does Domain Knowledge Fit in?
Video + Language: Where Does Domain Knowledge Fit in?
 
Action Sequence Mining and Behavior Pattern Analysis for User Modeling
Action Sequence Mining and Behavior Pattern Analysis for User ModelingAction Sequence Mining and Behavior Pattern Analysis for User Modeling
Action Sequence Mining and Behavior Pattern Analysis for User Modeling
 
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding ApproachText Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
Text Representation & Fixed-Size Ordinally-Forgetting Encoding Approach
 
ODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLPODSC East: Effective Transfer Learning for NLP
ODSC East: Effective Transfer Learning for NLP
 
A neural image caption generator
A neural image caption generatorA neural image caption generator
A neural image caption generator
 

More from taeseon ryu

OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...taeseon ryu
 
3D Gaussian Splatting
3D Gaussian Splatting3D Gaussian Splatting
3D Gaussian Splattingtaeseon ryu
 
Hyperbolic Image Embedding.pptx
Hyperbolic  Image Embedding.pptxHyperbolic  Image Embedding.pptx
Hyperbolic Image Embedding.pptxtaeseon ryu
 
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdfLLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdftaeseon ryu
 
Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories taeseon ryu
 
Packed Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation ExtractionPacked Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation Extractiontaeseon ryu
 
MOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement LearningMOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement Learningtaeseon ryu
 
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language ModelsScaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Modelstaeseon ryu
 
Visual prompt tuning
Visual prompt tuningVisual prompt tuning
Visual prompt tuningtaeseon ryu
 
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdfvariBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdftaeseon ryu
 
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdfReinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdftaeseon ryu
 
The Forward-Forward Algorithm
The Forward-Forward AlgorithmThe Forward-Forward Algorithm
The Forward-Forward Algorithmtaeseon ryu
 
Towards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural NetworksTowards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural Networkstaeseon ryu
 
BRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive SummarizationBRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive Summarizationtaeseon ryu
 
ProximalPolicyOptimization
ProximalPolicyOptimizationProximalPolicyOptimization
ProximalPolicyOptimizationtaeseon ryu
 

More from taeseon ryu (20)

VoxelNet
VoxelNetVoxelNet
VoxelNet
 
OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...OpineSum Entailment-based self-training for abstractive opinion summarization...
OpineSum Entailment-based self-training for abstractive opinion summarization...
 
3D Gaussian Splatting
3D Gaussian Splatting3D Gaussian Splatting
3D Gaussian Splatting
 
JetsonTX2 Python
 JetsonTX2 Python  JetsonTX2 Python
JetsonTX2 Python
 
Hyperbolic Image Embedding.pptx
Hyperbolic  Image Embedding.pptxHyperbolic  Image Embedding.pptx
Hyperbolic Image Embedding.pptx
 
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdfLLaMA Open and Efficient Foundation Language Models - 230528.pdf
LLaMA Open and Efficient Foundation Language Models - 230528.pdf
 
YOLO V6
YOLO V6YOLO V6
YOLO V6
 
Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories Dataset Distillation by Matching Training Trajectories
Dataset Distillation by Matching Training Trajectories
 
RL_UpsideDown
RL_UpsideDownRL_UpsideDown
RL_UpsideDown
 
Packed Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation ExtractionPacked Levitated Marker for Entity and Relation Extraction
Packed Levitated Marker for Entity and Relation Extraction
 
MOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement LearningMOReL: Model-Based Offline Reinforcement Learning
MOReL: Model-Based Offline Reinforcement Learning
 
Scaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language ModelsScaling Instruction-Finetuned Language Models
Scaling Instruction-Finetuned Language Models
 
Visual prompt tuning
Visual prompt tuningVisual prompt tuning
Visual prompt tuning
 
mPLUG
mPLUGmPLUG
mPLUG
 
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdfvariBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
variBAD, A Very Good Method for Bayes-Adaptive Deep RL via Meta-Learning.pdf
 
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdfReinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs.pdf
 
The Forward-Forward Algorithm
The Forward-Forward AlgorithmThe Forward-Forward Algorithm
The Forward-Forward Algorithm
 
Towards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural NetworksTowards Robust and Reproducible Active Learning using Neural Networks
Towards Robust and Reproducible Active Learning using Neural Networks
 
BRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive SummarizationBRIO: Bringing Order to Abstractive Summarization
BRIO: Bringing Order to Abstractive Summarization
 
ProximalPolicyOptimization
ProximalPolicyOptimizationProximalPolicyOptimization
ProximalPolicyOptimization
 

Recently uploaded

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 

Recently uploaded (20)

Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 

MCSE_Multimodal Contrastive Learning of Sentence Embeddings_변현정

  • 1. MCSE: Multimodal Contrastive Learning of Sentence Embeddings Miaoran Zhang, Marius Mosbach, David Ifeoluwa Adelani, Michael A. Hedderich, and Dietrich Klakow, 2022
  • 3. Introduction • MCSE : Multimodal Contrastive Learning of Sentence Embeddings  Background: Unsupervised SimCSE (Gao et al., 2021)  Extend a multimodal contrastive objective  Experiments on standard Semantic Textual Similarity (STS)
  • 4. • Architecture of MCSE Introduction f v (· ) is a pre-trained image encoder such as ResNet
  • 5. • Contrastive learning : background Unsupervised SimCSE  data augmentation strategy : dropout noise  pulling positive sentences closer and pushing apart negatives Related Work Cosine similarity
  • 6. • Multimodal Contrastive Learning  sentence-image pairs , sentence xi and image yi • f v (· ) : pre-trained image encoder such as ResNet • fθ(· ) : pre-trained language encoder such as BERT  pull semantically close image-sentence pairs together and push away non-related pairs Related Work
  • 7. • Dataset  Multimodal datasets : Flickr30k (29,783 images) and MS-COCO (82,783 images)  text corpus : Wiki1M (English Wikipedia : 106 sentences) • Encoder  Language encoders : BERT and RoBERTa  Image encoder : ResNet-50 Single layer MLPs • Evaluation  7 Semantic Textual Similarity (STS) : STS 2012-2016, STS Benchmark, SICK-Relatedness  Spearman’s correlation Experiments
  • 8. Results • MCSE : Wiki1M, Flickr30k BERT (76.3 → 77.3) RoBERTa (76.6 → 78.3) • STS16 MCSE-BERT -> the domain discrepancy Performance comparison on STS tasks
  • 9. Results • the performances decrease (without the large text-only corpus) • MCSE models (0.9 – 3.8 points improvement) • Spearman’s correlation(0.8 – 5.0 points reduction) -> validating the efficacy of visual semantics Average Spearman’s correlation on 7 STS tasks
  • 10. Results • Alignment-Uniformity  Alignment : paired instance 사이의 거리 (짧을수록 좋음) Similar samples have similar features  Uniformity : embedding이 얼만큼 균일하게 분포하는지 (균일 할수록 좋음) Preserve maximal information * 참고 논문 : Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere (ICML 2020) • Embedding space가 넓고, 고르게 분포하여 각 단어가 고유한 의미를 보존하는 것이 중요함. • Contrastive learning을 통해 Negative Pair를 Positive Pair와 멀게 강제하는 과정에서 embedding space를 균일하게 분포하게 함.
  • 11. Results • Alignment-Uniformity  PPOS : positive pairs distribution  Pdata : data distribution MCSE models : visually grounding enhance by improving the alignment property The alignment-uniformity plot of models (BERT)
  • 12. Results • Improvements on Different Subsets  different degrees from the visually grounding because of domain discrepancy
  • 13. Results • SimCSE는 구문이 유사한 문장을 검색하는 반면 MCSE는 구문이 다양하고 의미 체계를 공유하는 문장을 검색
  • 14. Results • Cross-Modal Retrieval : metric Recall@K  Recall@K : k개 추천 결과에 대한 recall
  • 15.
  • 16. Conclusion • MCSE 제안 : sentence embedding learning • MCSE consistently improves the performance on STS tasks • the superiority of method : by analyzing the alignment and uniformity properties of the embedding space. • SimCSE는 limited SAMPLE에서 MCSE 보다 나은 성능을 보임 MCSE는 큰 데이터에서는 SimCSE 성능을 능가함. -> multimodal weight training 관련