A Single Domain Generalization
for Object Detection
[Figure: source and target domains in SDG, DG, and DA]
Background (1/3)
Single Domain Generalization (SDG)
Generalize from a single source domain to unseen target domains.
Still under-explored in object detection.
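[Figure: image classification vs. object detection]
Single Domain Generalization: data augmentation [NeurIPS’18], adversarial methods [CVPR’20], normalization approaches [CVPR’21]
Domain Adaptation (DA): adversarial methods [PMLR’15]
Domain Generalization (DG): adversarial methods [CVPR’18]
Object detection estimates both location and class, so these methods cannot be used as-is.
Only a single domain is available at training time.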
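Background (2/3)
Approach for object detection
[Figure: features and detection]
Train the object detector so that, from only a single source domain, it extracts features that cope with a variety of domains.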
Background (3/3)
A survey of Single Domain Generalization for object detection (OD)
• Single-Domain Generalized Object Detection in Urban Scene via Cyclic-Disentangled Self-Distillation,
Aming Wu+ (Xidian) [CVPR2022]
• (Learning Transferable Visual Models from Natural Language Supervision,
A. Radford+ (OpenAI) [ICML’21])
• CLIP the Gap: A Single Domain Generalization Approach for Object Detection,
V. Vidit+ (EPFL) [arXiv’23]
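Single-Domain Generalized Object Detection in Urban Scene via Cyclic-Disentangled Self-Distillation,
Aming Wu+ (Xidian) [CVPR2022]
Overall loss: $\mathcal{L} = \mathcal{L}_{rpn} + \mathcal{L}_{loc} + \mathcal{L}_{cls} + \lambda(\mathcal{L}_{cd} + \mathcal{L}_{sd})$
($\mathcal{L}_{rpn}$, $\mathcal{L}_{loc}$, and $\mathcal{L}_{cls}$ are the Faster R-CNN losses.)
Overall concept: disentangle the extracted features into domain-invariant representations (DIR) and domain-specific representations (DSR), and train using the DIR information.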
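The weighted sum in the overall loss can be sketched as a small function. This is only an illustration of how the terms combine: the function name `total_loss` and the default value of `lam` are assumptions, since the slide does not state the value of λ.

```python
def total_loss(l_rpn, l_loc, l_cls, l_cd, l_sd, lam=0.1):
    """Combine the detector losses with the two auxiliary terms.

    lam is a hypothetical default; the slides do not give the value used.
    """
    detection = l_rpn + l_loc + l_cls   # Faster R-CNN terms
    auxiliary = l_cd + l_sd             # disentanglement + self-distillation
    return detection + lam * auxiliary


# Toy values chosen only to show the arithmetic.
print(total_loss(0.5, 0.25, 0.75, 1.0, 2.0, lam=0.5))
```

Setting `lam=0` reduces the objective to the plain detection loss, which is a convenient sanity check when adding the auxiliary terms to an existing detector.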
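Trained modules: $E_{DSR}$, $E_{DIR}$
$\mathcal{L}_{cd} = \mathcal{L}_{gc} + \mathcal{L}_{ic}$
$\mathcal{L}_{gc}$: at the global level, $E_{DIR}$ learns more domain-invariant features and $E_{DSR}$ learns more domain-specific features.
$\mathcal{L}_{ic}$: at the instance level, $E_{DIR}$ learns more domain-invariant features.
Result: the domain-invariant feature $F_{di}$ is obtained.
$\mathcal{L}_{sd}$ = [equation image]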
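Trained modules: $E_n$, $T_n$, $Cls_n$
[Equation image: a sum of two terms, described as $\mathcal{L}_{fc}$ and $\mathcal{L}_{ic}$]
$\mathcal{L}_{fc}$: pulls $F_n$ toward the domain-invariant feature $F_{di}$.
$\mathcal{L}_{ic}$: in classification, pulls the prediction toward the domain-invariant $F_{di}$.
Result: $E_n$ produces domain-invariant features.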
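A feature-level term and a classification-level term of this kind can be sketched as follows. The slide does not give the formulas, so the specific choices here (mean squared error for features, KL divergence between softmax outputs for classification) are assumptions, as are all function names; this is not the paper's exact loss.

```python
import math


def softmax(logits):
    m = max(logits)                      # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def feature_distill(f_n, f_di):
    """Mean squared error between the student feature F_n and teacher F_di."""
    return sum((a - b) ** 2 for a, b in zip(f_n, f_di)) / len(f_n)


def logit_distill(logits_n, logits_di):
    """KL(teacher || student) over class probabilities."""
    p = softmax(logits_di)               # teacher, from the invariant branch
    q = softmax(logits_n)                # student
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def self_distill(f_n, f_di, logits_n, logits_di):
    """Sum of the feature-level and classification-level terms."""
    return feature_distill(f_n, f_di) + logit_distill(logits_n, logits_di)
```

Both terms are zero when the student already matches the teacher, so the loss only pushes the student where its features or predictions differ from the domain-invariant branch.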
CLIP the Gap: A Single Domain Generalization Approach for Object Detection,
V.Vidit+ (EPFL)[arXiv’23]
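Overall concept: learn semantic feature augmentations that match text features obtained with CLIP, and use CLIP to train the detector.
Overall loss: $\mathcal{L} = \mathcal{L}_{rpn} + \mathcal{L}_{reg} + \mathcal{L}_{clip-t}$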
Learning Transferable Visual Models from Natural Language Supervision,
A. Radford+ (OpenAI) [ICML’21]
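CLIP (Contrastive Language-Image Pre-training)
Learns the relationship between natural-language representations and image representations.
Takes 400 million text-image pairs as input and measures their similarity through Transformers.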
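The similarity measurement can be illustrated with plain vectors: embed the image and each candidate text, compare them by cosine similarity, and turn the scaled similarities into probabilities. The embeddings below are toy values, the function names are hypothetical, and the temperature of 0.07 is an assumed setting rather than something stated on the slide.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def match_probs(image_emb, text_embs, temperature=0.07):
    """Softmax over temperature-scaled image-text similarities."""
    sims = [cosine(image_emb, t) / temperature for t in text_embs]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 2-D embeddings: the image lies closest to the first text.
probs = match_probs([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```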
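CLIP the Gap: A Single Domain Generalization Approach for Object Detection,
V. Vidit+ (EPFL) [arXiv’23]
Semantic Augmentations
Concept: using the pre-trained CLIP, obtain feature augmentations $\mathcal{A} = \{\mathcal{A}_n\}_{n=1}^{M}$ that move features toward those of text describing a domain.
Loss: [equation image]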
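One way to express "moving features toward the text describing a domain" is a direction-matching loss: the shift caused by the augmentation in image-embedding space should point the same way as the shift between a source-domain prompt and a target-domain prompt in text-embedding space. The sketch below assumes that formulation. The slide's actual loss is not recoverable here, so the function `direction_loss` and its arguments are hypothetical.

```python
import math


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)       # undefined if either shift is zero


def direction_loss(z, z_aug, t_source, t_target):
    """1 - cosine between the image-side shift and the text-side shift.

    z, z_aug: image embeddings before and after the augmentation.
    t_source, t_target: text embeddings of the source and target prompts.
    """
    return 1.0 - cosine(sub(z_aug, z), sub(t_target, t_source))
```

The loss is 0 when the two shifts are parallel and 2 when they point in opposite directions, so minimizing it over an augmentation drives the augmented embedding along the text-defined domain direction.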
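Concept: use the learned feature augmentations to improve the detector's generalization.
Loss: $\mathcal{L} = \mathcal{L}_{rpn} + \mathcal{L}_{reg} + \mathcal{L}_{clip-t}$
Training step: classify by matching the detection features to the text features.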
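Classification by matching detection features to text features can be sketched as a cross-entropy over similarities between a region feature and one text embedding per class. This is a minimal illustration under assumptions: the names `clip_text_loss`, `roi_feat`, and `class_text_embs` are hypothetical, the temperature is an assumed value, and the actual $\mathcal{L}_{clip-t}$ may differ in detail.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def clip_text_loss(roi_feat, class_text_embs, label, temperature=0.07):
    """Cross-entropy where the logits are feature-to-text similarities."""
    logits = [cosine(roi_feat, t) / temperature for t in class_text_embs]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[label]       # negative log-probability of label


# Toy example: the region feature matches the text embedding of class 0.
loss_correct = clip_text_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], label=0)
loss_wrong = clip_text_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], label=1)
```

Because the class weights are text embeddings rather than learned parameters tied to the source domain, the classifier depends only on how well region features align with the text space.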
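Experimental results of the two papers
Improvement of 2% to 6% over the baseline.
Accuracy also improves over existing DG methods (the kind that use the target domain during training).
Comparing S-DGOD and CLIP the Gap, the former is more accurate on the source domain and the latter on the target domains.
Dataset: Diverse Weather Dataset
(clear and rainy: BDD-100k [CVPR’20];
foggy: Foggy Cityscapes [IJCV’17]
+ Adverse-Weather [T-ITS’21])
[Results table: CLIP the Gap]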
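Summary
• Overview
  • A survey of Single Domain Generalization for object detection.
• Technical approach
  • To handle multiple domains using only one domain, the methods either learn domain-invariant features or learn to produce features of various domains.
• Trends and outlook
  • The topic has drawn attention since last year's CVPR, and only two papers have been published so far.
  • Methods that augment toward various domains perform better, and new proposals are expected.
  • Both papers are based on Faster R-CNN, so high-accuracy methods using Detection Transformer may also appear.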
Editor's Notes

  1. DIR: domain-invariant representation; DSR: domain-specific representation