Paper introduction: Unsupervised Hierarchical Semantic Segmentation With Multiview Cosegmentation and Clustering Transformers


Tsung-Wei Ke, Jyh-Jing Hwang, Yunhui Guo, Xudong Wang, Stella X. Yu, "Unsupervised Hierarchical Semantic Segmentation With Multiview Cosegmentation and Clustering Transformers" CVPR2022 https://openaccess.thecvf.com/content/CVPR2022/html/Ke_Unsupervised_Hierarchical_Semantic_Segmentation_With_Multiview_Cosegmentation_and_Clustering_Transformers_CVPR_2022_paper.html

Unsupervised Hierarchical Semantic Segmentation With Multiview Cosegmentation and Clustering Transformers
Tsung-Wei Ke, Jyh-Jing Hwang, Yunhui Guo, Xudong Wang, Stella X. Yu, CVPR2022
2023/05/11

◼Hierarchical Segment Grouping (HSG)
• Multiview Cosegmentation and Clustering Transformer

Pixel-Segment Contrastive Feature Learning
1. Pixel features $V$
2. OWT-UCM [Arbelaez+, IEEE PAMI 2010]
3. $u \propto \mathrm{mean}\{v_i : i \in \text{segment } s\}$, $\|u\| = 1$
4.
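The segment feature in step 3 can be sketched in a few lines of Python. This is a minimal illustration, assuming pixel features are given as plain lists and each pixel carries a segment id; the function name `segment_feature` and the toy data are hypothetical and do not come from the slides.

```python
import math

def segment_feature(pixel_features, segment_ids, s):
    """Unit-length mean of the pixel features whose segment id equals s."""
    members = [v for v, sid in zip(pixel_features, segment_ids) if sid == s]
    if not members:
        raise ValueError(f"segment {s} has no pixels")
    dim = len(members[0])
    # u is proportional to the mean of v_i over the pixels i in segment s ...
    mean = [sum(v[d] for v in members) / len(members) for d in range(dim)]
    # ... and is rescaled so that ||u|| = 1.
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        raise ValueError("mean feature is zero; cannot normalise")
    return [x / norm for x in mean]

# Toy data (assumed): four pixels with 2-D features, two segments (ids 0 and 1).
V = [[3.0, 0.0], [3.0, 8.0], [0.0, 1.0], [0.0, 3.0]]
seg = [0, 0, 1, 1]
u0 = segment_feature(V, seg, 0)  # mean (3, 4) normalised to (0.6, 0.8)
u1 = segment_feature(V, seg, 1)  # mean (0, 2) normalised to (0.0, 1.0)
```

In practice the segment ids would come from an over-segmentation such as OWT-UCM, and the features from the pixel-embedding network.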

Consistent Segmentations by View & Hierarchy
• fine-to-coarse levels
• Grouping $G_e$
• OWT-UCM $G_0$
• Grouping $G_{l+1}$
• $G_l$
• Base-cluster feature $X_0$
• $V$
• Base-cluster feature $X_{l+1}$
• $X_l$, Clustering transformer

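Next-level cluster feature $X_{l+1}$ and grouping $G_{l+1}$
◼$P_{l+1}$
1. Level $l$, feature $x$, cluster $a$
• $P_l(a) = \mathrm{Prob}(G_l = a \mid x)$
2. Cluster $a$ at level $l$, cluster $b$ at level $l+1$
• $C_l^{l+1}(a, b) = \mathrm{Prob}(G_{l+1} = b \mid G_l = a)$
• $P_{l+1}(b) = \sum_a P_l(a) \cdot C_l^{l+1}(a, b)$
3. $X_0$, $P_{l+1}$
• $P_{l+1} = P_l \times C_l^{l+1} = P_0 \times C_0^{1} \times C_1^{2} \times \cdots \times C_l^{l+1}$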
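The chain of products on this slide can be checked with a small numeric example. The sketch below assumes each row of $P_0$ is one feature's distribution over base clusters and each transition matrix has rows that sum to 1; the helper names `matmul` and `propagate` and all numbers are made up for illustration.

```python
def matmul(A, B):
    """Plain-Python matrix product of two row-major nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def propagate(P0, transitions):
    """Return [P_0, P_1, ..., P_L], where P_{l+1} = P_l x C_l^{l+1}."""
    levels = [P0]
    for C in transitions:
        levels.append(matmul(levels[-1], C))
    return levels

# P0 (assumed): 2 features, 4 base clusters; each row is a distribution.
P0 = [[0.5, 0.5, 0.0, 0.0],
      [0.0, 0.0, 0.25, 0.75]]
# C01[a][b] = Prob(G_1 = b | G_0 = a): 4 base clusters merged into 2.
C01 = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]
# C12[a][b] = Prob(G_2 = b | G_1 = a): 2 clusters merged into 1.
C12 = [[1.0], [1.0]]

_, P1, P2 = propagate(P0, [C01, C12])
# P1 == [[0.75, 0.25], [0.0, 1.0]] and P2 == [[1.0], [1.0]]
```

Because every transition matrix is row-stochastic, each $P_l$ stays a valid distribution, so the coarser groupings remain consistent with the finer ones.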

Clustering Transformer
◼$X_{l+1}$, $C_{l+1}$
◼Statistical feature mappings
• $Y_l$, fc
• $Q_l$
◼Softmax
• $C_l^{l+1} = \mathrm{softmax}\left(\frac{1}{m} Y_l^{\top} Z_{l+1}\right)$
• $m$:
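The softmax assignment can be illustrated as follows. This is only a sketch under stated assumptions: the slide's definition of $m$ did not survive extraction, so the scaling is passed in as a parameter (`scale`), and the example assumes $m$ is the feature dimension. The softmax is taken over the next-level clusters $b$ so that each row reads as $\mathrm{Prob}(G_{l+1} = b \mid G_l = a)$. The function names and toy vectors are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax of a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def soft_assignment(Y, Z, scale):
    """C[a][b] = softmax over b of scale * <Y[a], Z[b]>.

    Y: one feature vector per level-l cluster (the columns of Y_l).
    Z: one feature vector per level-(l+1) cluster (the columns of Z_{l+1}).
    """
    return [
        softmax([scale * sum(ya * zb for ya, zb in zip(y, z)) for z in Z])
        for y in Y
    ]

# Toy data (assumed): three level-l clusters, two level-(l+1) clusters, m = 2.
Y = [[2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
Z = [[1.0, 0.0], [0.0, 1.0]]
C = soft_assignment(Y, Z, scale=1.0 / 2)
# Cluster 0 leans towards Z[0], cluster 1 towards Z[1], cluster 2 is split evenly.
```

Each row of `C` sums to 1, which is what allows it to be multiplied into the level-wise probabilities $P_l$.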

Goodness of Grouping
• Modularity maximization [Newman, Phys. Rev. 2006]
• Collapse regularization
• maximize centroid separation
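For reference, the standard Newman modularity of a hard partition on an undirected graph is $Q = \frac{1}{2M} \sum_{ij} \left(A_{ij} - \frac{k_i k_j}{2M}\right) \delta(c_i, c_j)$, where $M$ is the number of edges and $k_i$ the degree of node $i$. The sketch below computes that textbook definition only; how HSG applies modularity to its soft assignments is not recoverable from the slide text, so this is not a reproduction of the paper's loss. The graph and the function name are made up for illustration.

```python
def modularity(A, labels):
    """Newman modularity Q of a hard partition of an undirected graph.

    A: symmetric adjacency matrix (nested lists), labels: community id per node.
    """
    degrees = [sum(row) for row in A]
    two_m = sum(degrees)  # equals 2 * (number of edges)
    q = 0.0
    for i in range(len(A)):
        for j in range(len(A)):
            if labels[i] == labels[j]:
                # Observed edge weight minus the weight expected by chance.
                q += A[i][j] - degrees[i] * degrees[j] / two_m
    return q / two_m

# Toy graph (assumed): two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

q_split = modularity(A, [0, 0, 0, 1, 1, 1])  # 5/14: one community per triangle
q_merged = modularity(A, [0, 0, 0, 0, 0, 0])  # 0: everything in one community
```

Putting everything into a single community gives $Q = 0$, which is why maximizing modularity rewards groupings that follow the graph's structure rather than merging everything.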

Training and Testing
• $G_l$
• pixel feature CNN, clustering transformers
• OWT-UCM

◼Methods
• MoCo [He+, CVPR2020]
• DenseCL [Wang+, CVPR2021]
• Revisit [Gansbeke+, arXiv2021]
• SegSort [Hwang+, ICCV2019]
• DeepCluster 2018 [Caron+, ECCV2018]
• Doersch 2015 [Doersch+, ECCV2015]
• [Isola+, arXiv2016]
• IIC [Ji+, ICCV2019]
• AC [Ouali+, ECCV2020] 47
◼Datasets
• Pascal VOC 2012 [Everingham+, IJCV2010]
• MSCOCO [Lin+, ECCV2014]
• Cityscapes [Cordts+, CVPR2016]
• KITTI-STEP [Weber+, arXiv2021]
• COCO-Stuff [Caesar+, CVPR2018]
• Potsdam [Gerke+, 2014] 17

Results on unsupervised semantic segmentation
• mIoU, Accuracy
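The two metrics can be sketched on flattened label maps as below. The example assumes the predicted cluster ids have already been matched to ground-truth class ids; for unsupervised methods that matching step is needed first and is left out here. The function name and the toy labels are hypothetical.

```python
def miou_and_accuracy(pred, gt, num_classes):
    """Mean IoU over classes that occur, plus overall pixel accuracy."""
    inter = [0] * num_classes
    union = [0] * num_classes
    correct = 0
    for p, g in zip(pred, gt):
        if p == g:
            inter[p] += 1
            union[p] += 1
            correct += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[g] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious), correct / len(gt)

# Toy labels (assumed): six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
gt = [0, 1, 1, 1, 2, 0]
miou, acc = miou_and_accuracy(pred, gt, num_classes=3)
# Per-class IoU is 1/3, 2/3 and 1/2, so mIoU = 0.5; accuracy = 4/6.
```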

Results on hierarchical segmentation
◼Normalized Foreground Covering
• $s_{fg}$: ground truth

◼HSG
• Multiview Cosegmentation and Clustering Transformer
