Paper introduction: DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection

Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum, "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection" arXiv2022 https://arxiv.org/abs/2203.03605

DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Hao Zhang, Feng Li, Shilong Liu, Lei Zhang, Hang Su, Jun Zhu, Lionel M. Ni, Heung-Yeung Shum
arXiv2022
2023/6/8

◼Proposes DINO, an improved DETR [Carion+, ECCV2020]
• DETR with Improved deNoising anchOr boxes
• End-to-End object detector based on a Transformer

◼
• HTC [Chen+, IEEE2019], Dyhead [Dai+, IEEE2021]
• DETR End-to-End
◼
•
• DETR

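◼Deformable DETR [Zhu+, ICLR2021]
• introduces deformable attention
◼Efficient DETR [Yao+, arXiv2021]
• selects the top-K positions from the encoder output
◼DAB-DETR [Liu+, arXiv2022]
• extends queries from 2D anchor points to 4D anchor boxes
◼DN-DETR [Li+, arXiv2022]
• introduces denoising training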

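Contrastive DeNoising Training
◼2 types of denoising queries: positive and negative
◼Two hyperparameters 𝜆1, 𝜆2 (𝜆1 < 𝜆2)
• Positive queries: noise scale smaller than 𝜆1
• expected to reconstruct the ground-truth bounding box
• Negative queries: noise scale between 𝜆1 and 𝜆2
• expected to predict "no object"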
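
The role of 𝜆1 and 𝜆2 can be illustrated with a minimal sketch. It assumes corner-format boxes (x1, y1, x2, y2) and measures noise relative to half the box width or height. The helper names `noised_box` and `cdn_queries` are hypothetical, and this is a simplified illustration rather than the authors' implementation.

```python
import random


def noised_box(box, low, high, rng):
    """Shift each corner coordinate by a relative noise scale drawn from roughly [low, high)."""
    x1, y1, x2, y2 = box
    half_w, half_h = (x2 - x1) / 2, (y2 - y1) / 2
    out = []
    for value, half in ((x1, half_w), (y1, half_h), (x2, half_w), (y2, half_h)):
        scale = rng.uniform(low, high)   # noise magnitude, relative to the box size
        sign = rng.choice((-1.0, 1.0))   # random direction of the shift
        out.append(value + sign * scale * half)
    return tuple(out)


def cdn_queries(gt_boxes, lambda1, lambda2, seed=0):
    """Build one positive and one negative denoising query per ground-truth box."""
    assert 0 < lambda1 < lambda2
    rng = random.Random(seed)
    queries = []
    for idx, box in enumerate(gt_boxes):
        # Positive: small noise (< lambda1); the target is the ground-truth box itself.
        queries.append({"box": noised_box(box, 0.0, lambda1, rng), "target": idx})
        # Negative: larger noise (lambda1..lambda2); the target is "no object" (None here).
        queries.append({"box": noised_box(box, lambda1, lambda2, rng), "target": None})
    return queries


queries = cdn_queries([(0.2, 0.2, 0.6, 0.8)], lambda1=0.4, lambda2=1.0)
```

Each ground-truth box yields a pair: the lightly noised box is trained to recover the original box, and the heavily noised one is trained to be rejected.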

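Mixed Query Selection
◼Decoder queries consist of 2 parts
◼positional queries
◼content queries
◼Existing query initialization
• Static Queries
• DETR, DN-DETR
• Pure Query Selection
• Deformable DETR
• both positional queries and content queries are taken from the selected encoder features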

Mixed Query Selection
• Mixed Query Selection
• positional queries
• initialized from the top-k selected encoder features
• content queries
• kept static (learnable), as before
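
A minimal sketch of the top-k selection step, under stated assumptions: each encoder token is assumed to come with a confidence score and a predicted box, and the content queries are assumed to be a separate set of learnable embeddings that the selection leaves untouched. The function name `mixed_query_selection` and the data layout are hypothetical.

```python
def mixed_query_selection(scores, boxes, content_embeddings):
    """Pick positional queries from the top-k encoder tokens; keep content queries as-is.

    scores: one confidence score per encoder token (assumed input)
    boxes: one predicted box per encoder token (assumed input)
    content_embeddings: k learnable content vectors, independent of the image
    """
    k = len(content_embeddings)
    # Indices of the k highest-scoring encoder tokens.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    positional = [boxes[i] for i in top]             # selected from encoder output
    content = [list(e) for e in content_embeddings]  # not selected, stays static
    return positional, content


positional, content = mixed_query_selection(
    scores=[0.1, 0.9, 0.4, 0.7],
    boxes=[(0, 0, 1, 1), (2, 2, 3, 3), (4, 4, 5, 5), (6, 6, 7, 7)],
    content_embeddings=[[0.0, 0.0], [0.0, 0.0]],
)
```

Only the positional part depends on the input image here; with Pure Query Selection the content part would also be derived from the selected tokens.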

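Look Forward Twice
◼Previous iterative box refinement (Deformable DETR)
• gradients are blocked between decoder layers (Detach) (a)
◼DINO: the parameters of layer i are also updated by the loss of layer (i+1) (b)
• later-layer box information helps correct the box predicted by the earlier layer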
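
The difference in gradient flow can be shown with a toy sketch. It assumes a scalar "box", a per-layer offset that is simply the layer's parameter, and a squared-error loss at every layer, with gradients derived by hand. None of this comes from the slides; `layer_gradients` is a hypothetical helper, not the authors' code.

```python
def layer_gradients(thetas, init_box, target, look_forward_twice):
    """Return d(sum of per-layer losses)/d(theta_j) for a scalar toy box.

    Layer i predicts pred_i = box_{i-1} + theta_i with loss 0.5 * (pred_i - target)**2.
    The box handed to the next layer is always detached.
    """
    grads = [0.0] * len(thetas)
    box = init_box  # detached box entering the first layer
    for i, delta in enumerate(thetas):
        pred = box + delta        # the predicted value is the same in both schemes
        residual = pred - target  # derivative of the layer loss w.r.t. pred
        grads[i] += residual      # the loss of layer i always reaches theta_i
        if look_forward_twice and i > 0:
            # The undetached previous box lets the loss of layer i
            # also reach the parameter of layer i-1.
            grads[i - 1] += residual
        box = pred                # treated as detached before the next layer
    return grads


once = layer_gradients([0.1, 0.2, 0.3], init_box=0.0, target=1.0, look_forward_twice=False)
twice = layer_gradients([0.1, 0.2, 0.3], init_box=0.0, target=1.0, look_forward_twice=True)
```

In `twice`, every parameter except the last one receives an extra gradient term from the following layer, which is the (i+1) dependence noted above.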

◼Dataset
• COCO 2017
◼Evaluation metric
• Average Precision (AP)
• IoU
◼Backbones
• ResNet-50 [He+, CVPR2016]
• ImageNet-1k [Deng+, CVPR2009]
• COCO
• SwinL [Liu+, ICCV2021]
• ImageNet-22k 9
• Objects365 [33]
• COCO
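
For reference, IoU between two boxes can be computed as in the sketch below, assuming corner-format boxes (x1, y1, x2, y2). The threshold list follows the standard COCO convention of averaging AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05; the function name `iou` is illustrative.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width/height of the overlap, clamped to zero when the boxes are disjoint.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95
coco_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A detection counts as a true positive at a given threshold only if its IoU with a ground-truth box of the same class reaches that threshold.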

ResNet-50
◼4-scale and 5-scale settings

ResNet-50

SOTA
◼SwinL
• DETR

Ablation Study
◼3 techniques

◼DINO with 3 techniques
• Contrastive DeNoising Training
• Mixed Query Selection
• Look Forward Twice