2021 CVPR
Jaemin Jeong Seminar 2
BiSeNet
BiSeNet V2
Abstract
• BiSeNet has proven to be a popular two-stream network for real-time segmentation.
• However, its principle of adding an extra path to encode spatial information is time-consuming, and backbones borrowed from pretrained tasks such as image classification may be inefficient for image segmentation because they lack task-specific design.
• The authors propose a novel and efficient structure, the Short-Term Dense Concatenate network (STDC network), which removes structural redundancy.
• They gradually reduce the dimension of the feature maps and use the aggregation of those maps for image representation.
• In the decoder, they propose a Detail Aggregation module that integrates the learning of spatial information into low-level layers in a single-stream manner.
• Finally, the low-level features and deep features are fused to predict the final segmentation results.
• On Cityscapes, they achieve 71.9% mIoU on the test set at 250.4 FPS on an NVIDIA GTX 1080Ti, which is 45.2% faster than the latest methods, and 76.8% mIoU at 97.0 FPS when inferring on higher-resolution images.
Performance
Introduction
• They design a Short-Term Dense Concatenate module (STDC module) to extract deep features with a scalable receptive field and multi-scale information. This module improves the performance of the STDC network at an affordable computational cost.
• They propose a Detail Aggregation module for learning the decoder, which preserves spatial details in low-level layers more precisely without extra computation cost at inference time.
• They conduct extensive experiments to demonstrate the effectiveness of the method. The results show that STDC networks achieve new state-of-the-art results on ImageNet, Cityscapes and CamVid.
• Specifically, STDC1-Seg50 achieves 71.9% mIoU on the Cityscapes test set at 250.4 FPS on one NVIDIA GTX 1080Ti card. Under the same experimental setting, STDC2-Seg75 achieves 76.8% mIoU at 97.0 FPS.
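The headline numbers above are mean intersection-over-union (mIoU) scores. As a reminder of what that metric measures, here is a minimal NumPy sketch that computes mIoU from integer label maps via a confusion matrix. The function name `mean_iou` and the toy arrays are illustrative assumptions; this is not the authors' evaluation code, and the official Cityscapes evaluation also handles ignored labels, which are omitted here.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target (hypothetical helper)."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(target * num_classes + pred,
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    valid = union > 0  # skip classes absent from both maps
    return float((intersection[valid] / union[valid]).mean())

# Toy 2x3 label maps with three classes.
target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
score = mean_iou(pred, target, num_classes=3)
print(score)
```

Per-class IoU is averaged with equal weight, so a rare class counts as much as a frequent one.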
Design of Encoding Network
• Short-Term Dense Concatenate Module
$x_i = \mathrm{ConvX}_i(x_{i-1}, k_i)$
$x_{output} = F(x_1, x_2, \ldots, x_n)$
ConvX consists of one convolutional layer, one batch normalization layer and a ReLU activation layer, and $k_i$ is the kernel size of the convolutional layer.
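To make the two module equations concrete, the sketch below runs a toy STDC-style module in NumPy. It assumes several details that the slide does not state: the fusion F is channel concatenation, the first block uses a 1×1 kernel and later blocks use 3×3, block i outputs N/2^i channels with the last block matching the one before it, the stride is 1, and batch normalization is omitted. All function names are hypothetical and the weights are random, so this is an illustration rather than the authors' implementation.

```python
import numpy as np

def conv_relu(x, weight):
    """Stride-1 'same' convolution followed by ReLU (batch norm omitted for brevity).

    x: (C_in, H, W); weight: (C_out, C_in, k, k) with odd k.
    """
    c_out, c_in, k, _ = weight.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((c_out, h, w))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + w]  # shifted view, shape (C_in, H, W)
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], patch)
    return np.maximum(out, 0.0)

def stdc_channels(n_out, num_blocks=4):
    """Assumed widths: N/2, N/4, ..., with the last block repeating the previous width."""
    widths = [n_out // 2 ** (i + 1) for i in range(num_blocks - 1)]
    return widths + [widths[-1]]

def stdc_module(x, n_out, num_blocks=4, seed=0):
    rng = np.random.default_rng(seed)
    feats, c_in = [], x.shape[0]
    for i, c_out in enumerate(stdc_channels(n_out, num_blocks)):
        k = 1 if i == 0 else 3  # k_i: kernel size of block i (assumed)
        weight = rng.normal(scale=0.1, size=(c_out, c_in, k, k))
        x = conv_relu(x, weight)  # x_i = ConvX_i(x_{i-1}, k_i)
        feats.append(x)
        c_in = c_out
    return np.concatenate(feats, axis=0)  # x_output = F(x_1, ..., x_n)

x = np.ones((8, 16, 16))       # M = 8 input channels
y = stdc_module(x, n_out=64)   # N = 64 output channels
print(stdc_channels(64), y.shape)
```

Under these assumptions the block widths sum to N, so concatenation yields exactly N output channels, and the stacked 1×1, 3×3, 3×3, 3×3 kernels give the blocks receptive fields of 1, 3, 5 and 7 pixels, which is one way to read the "multi-scale information" claim.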
• S: stride
• R: number of repeats
• C: output channels
• M: input channels
• N: output channels
Network Architecture
• The Seg Head consists of a 3×3 Conv-BN-ReLU operator followed by a 1×1 convolution that produces the output dimension N, which is set to the number of classes.
• A Detail Head produces the detail map, which guides the shallow layers to encode spatial information.
• The Detail Head consists of a 3×3 Conv-BN-ReLU operator followed by a 1×1 convolution that produces the output detail map.
• To generate the detail ground truth, the detail feature maps are upsampled to the original size and fused with a trainable 1×1 convolution for dynamic re-weighting.
• Finally, a threshold of 0.1 converts the fused details into the final binary detail ground truth with boundary and corner information.
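The ground-truth generation steps above (multi-scale detail maps, upsampling to the original size, fusion, thresholding at 0.1) can be sketched in NumPy as follows. Several pieces are assumptions not stated on the slide: details are extracted with a 3×3 Laplacian kernel at strides 1, 2 and 4, upsampling is nearest-neighbour, and fixed fusion weights stand in for the trainable 1×1 convolution. The function names are hypothetical.

```python
import numpy as np

LAPLACIAN = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=float)

def laplacian_response(mask, stride):
    """3x3 Laplacian filtering with edge padding, sampled every `stride` pixels."""
    h, w = mask.shape
    padded = np.pad(mask.astype(float), 1, mode="edge")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += LAPLACIAN[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.abs(out)[::stride, ::stride]

def upsample_nearest(x, factor):
    """Repeat each pixel `factor` times along both axes."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def detail_ground_truth(mask, strides=(1, 2, 4),
                        fuse_weights=(0.6, 0.3, 0.1), threshold=0.1):
    h, w = mask.shape
    fused = np.zeros((h, w))
    for stride, weight in zip(strides, fuse_weights):
        detail = laplacian_response(mask, stride)
        detail = upsample_nearest(detail, stride)[:h, :w]  # back to the original size
        fused += weight * np.clip(detail, 0.0, 1.0)        # stand-in for the 1x1 conv fusion
    return (fused > threshold).astype(np.uint8)            # binary detail map

mask = np.zeros((8, 8), dtype=int)
mask[2:6, 2:6] = 1  # one square "object"
detail = detail_ground_truth(mask)
print(detail)
```

Because this map is only needed as a training target, none of the computation runs at inference time, which matches the claim that the detail guidance adds no inference cost.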
Detail Loss
• Since there are far fewer detail pixels than non-detail pixels, detail prediction is a class-imbalance problem.
• They adopt binary cross-entropy and dice loss to jointly optimize detail learning.
• Dice loss measures the overlap between the predicted maps and the ground truth. It is also insensitive to the number of foreground/background pixels, which means it can alleviate the class-imbalance problem.
$L_{detail}(p_d, g_d) = L_{dice}(p_d, g_d) + L_{bce}(p_d, g_d)$
$L_{dice}(p_d, g_d) = 1 - \dfrac{2\sum_{i}^{H \times W} p_d^i g_d^i + \epsilon}{\sum_{i}^{H \times W} (p_d^i)^2 + \sum_{i}^{H \times W} (g_d^i)^2 + \epsilon}$
Here $p_d$ is the predicted detail map, $g_d$ is the detail ground truth, and the sums run over all $H \times W$ pixels.
• $\epsilon = 1$
Detail Ground-truth Generation
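The detail loss defined above (dice plus binary cross-entropy, with ε = 1) can be written in a few lines of NumPy. This is a minimal sketch assuming `p_d` already holds probabilities in (0, 1), for example after a sigmoid, and `g_d` is a binary map of the same shape. The function names and the clipping constant are illustrative choices, and the BCE term is averaged over pixels, which the slide does not specify.

```python
import numpy as np

def dice_loss(p_d, g_d, eps=1.0):
    """L_dice = 1 - (2 * sum(p * g) + eps) / (sum(p^2) + sum(g^2) + eps)."""
    p, g = p_d.ravel(), g_d.ravel()
    return 1.0 - (2.0 * np.sum(p * g) + eps) / (np.sum(p ** 2) + np.sum(g ** 2) + eps)

def bce_loss(p_d, g_d, tiny=1e-7):
    """Binary cross-entropy averaged over pixels (averaging is an assumption)."""
    p = np.clip(p_d, tiny, 1.0 - tiny)  # avoid log(0)
    return float(-np.mean(g_d * np.log(p) + (1.0 - g_d) * np.log(1.0 - p)))

def detail_loss(p_d, g_d):
    """L_detail = L_dice + L_bce."""
    return dice_loss(p_d, g_d) + bce_loss(p_d, g_d)

g_d = np.array([[0.0, 1.0], [0.0, 0.0]])   # one detail pixel out of four
good = np.array([[0.1, 0.9], [0.1, 0.1]])  # prediction close to the ground truth
bad = np.array([[0.9, 0.1], [0.9, 0.9]])   # prediction far from the ground truth
print(detail_loss(good, g_d), detail_loss(bad, g_d))
```

The dice term depends on the ratio of overlap to total mass rather than on per-pixel counts, which is why it is less affected by the scarcity of detail pixels than BCE alone.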
Experiments
We use 50 and 75 after the method name to represent the
input size 512×1024 and 768×1536 respectively.
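The suffix convention can be read as a scale factor applied to the full Cityscapes resolution of 1024×2048: 50 corresponds to 0.5× and 75 to 0.75×. A small sketch of that reading, where the helper name `input_size` is hypothetical:

```python
def input_size(model_name, full_size=(1024, 2048)):
    """Return (height, width) implied by a name such as 'STDC1-Seg50'."""
    suffix = model_name.rsplit("Seg", 1)[1]  # '50' or '75'
    scale = int(suffix) / 100
    return tuple(int(round(side * scale)) for side in full_size)

print(input_size("STDC1-Seg50"))  # (512, 1024)
print(input_size("STDC2-Seg75"))  # (768, 1536)
```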

2022-01-17-Rethinking_Bisenet.pptx

  • 3. Jaemin Jeong Seminar 3 Bisenet-V2
  • 4. Abstract
      - BiSeNet has proven to be a popular two-stream network for real-time segmentation.
      - However, its principle of adding an extra path to encode spatial information is time-consuming, and backbones borrowed from pretrained tasks such as image classification may be inefficient for image segmentation because they lack task-specific design.
      - The authors propose a novel and efficient structure named Short-Term Dense Concatenate network (STDC network) by removing structural redundancy.
      - They gradually reduce the dimension of the feature maps and use their aggregation for image representation.
      - In the decoder, they propose a Detail Aggregation module that integrates the learning of spatial information into low-level layers in a single-stream manner.
      - Finally, the low-level features and deep features are fused to predict the final segmentation results.
      - On Cityscapes, the method achieves 71.9% mIoU on the test set at 250.4 FPS on an NVIDIA GTX 1080Ti, which is 45.2% faster than the latest methods, and 76.8% mIoU at 97.0 FPS when inferring on higher-resolution images.
  • 5. Performance
  • 6. Introduction
      - The authors design a Short-Term Dense Concatenate module (STDC module) to extract deep features with a scalable receptive field and multi-scale information. This module improves the performance of the STDC network at an affordable computational cost.
      - They propose the Detail Aggregation module to learn the decoder, leading to more precise preservation of spatial details in low-level layers without extra computation cost at inference time.
      - They conduct extensive experiments to demonstrate the effectiveness of their methods. The results show that STDC networks achieve new state-of-the-art results on ImageNet, Cityscapes and CamVid.
      - Specifically, STDC1-Seg50 achieves 71.9% mIoU on the Cityscapes test set at 250.4 FPS on one NVIDIA GTX 1080Ti card. Under the same experimental setting, STDC2-Seg75 achieves 76.8% mIoU at 97.0 FPS.
  • 7. Design of Encoding Network
      Short-Term Dense Concatenate Module:
      $x_i = \mathrm{ConvX}_i(x_{i-1}, k_i)$
      $x_{\mathrm{output}} = F(x_1, x_2, \ldots, x_n)$
      ConvX includes one convolutional layer, one batch normalization layer and a ReLU activation layer, and $k_i$ is the kernel size of the convolutional layer.
  • 8. Design of Encoding Network
      - S: stride
      - R: repeat
      - C: output channels
      - M: input channel
      - N: output channel
  • 9. Network Architecture
  • 10. Network Architecture
      - The Seg Head includes a 3 × 3 Conv-BN-ReLU operator followed by a 1 × 1 convolution to get the output dimension N, which is set to the number of classes.
      - The detail feature maps are upsampled to the original size and fused with a trainable 1 × 1 convolution for dynamic re-weighting.
      - Finally, a threshold of 0.1 converts the predicted details into the final binary detail ground truth with boundary and corner information.
      - A Detail Head produces the detail map, which guides the shallow layers to encode spatial information.
      - The Detail Head includes a 3 × 3 Conv-BN-ReLU operator followed by a 1 × 1 convolution to get the output detail map.
  • 11. Detail Loss
      - Since detail pixels are far fewer than non-detail pixels, detail prediction is a class-imbalance problem.
      - Binary cross-entropy and dice loss are adopted to jointly optimize the detail learning.
      - Dice loss measures the overlap between the predicted maps and the ground truth. It is also insensitive to the number of foreground/background pixels, which means it can alleviate the class-imbalance problem.
      $L_{\mathrm{detail}}(p_d, g_d) = L_{\mathrm{dice}}(p_d, g_d) + L_{\mathrm{bce}}(p_d, g_d)$
      $L_{\mathrm{dice}}(p_d, g_d) = 1 - \dfrac{2\sum_{i}^{H \times W} p_d^i g_d^i + \epsilon}{\sum_{i}^{H \times W} (p_d^i)^2 + \sum_{i}^{H \times W} (g_d^i)^2 + \epsilon}$
      - $\epsilon = 1$
      Detail Ground-truth Generation
  • 12. Experiments
  • 13. Experiments
  • 14. Experiments
  • 15. Experiments
  • 16. Experiments: the suffixes 50 and 75 after the method name denote input sizes of 512 × 1024 and 768 × 1536, respectively.
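
The detail ground-truth steps on slide 10 (Laplacian convolution, upsampling, fusion, threshold 0.1) can be sketched in plain Python. This is only an illustration under several assumptions that the slides do not state: the 3 × 3 kernel values, the strides (1, 2, 4), nearest-neighbour upsampling, clamping negative responses to zero, and the fixed fusion weights, which stand in for the trainable 1 × 1 convolution. All function names are hypothetical.

```python
# Assumed 3x3 Laplacian kernel; the slides only say "Laplacian Conv".
LAPLACIAN = [[-1, -1, -1],
             [-1,  8, -1],
             [-1, -1, -1]]


def laplacian_conv(label, stride=1):
    """Zero-padded 3x3 Laplacian convolution over a 2D label map."""
    h, w = len(label), len(label[0])
    out = []
    for r in range(0, h, stride):
        row = []
        for c in range(0, w, stride):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += LAPLACIAN[dr + 1][dc + 1] * label[rr][cc]
            row.append(max(acc, 0.0))  # assumption: keep positive responses only
        out.append(row)
    return out


def upsample_nearest(fmap, h, w):
    """Nearest-neighbour upsampling of a 2D map to h x w."""
    fh, fw = len(fmap), len(fmap[0])
    return [[fmap[r * fh // h][c * fw // w] for c in range(w)] for r in range(h)]


def detail_ground_truth(label, strides=(1, 2, 4), weights=(0.6, 0.3, 0.1),
                        threshold=0.1):
    """Fuse multi-stride Laplacian responses and binarize them.

    `weights` is a hypothetical stand-in for the trainable 1x1 fusion conv.
    """
    h, w = len(label), len(label[0])
    maps = [upsample_nearest(laplacian_conv(label, s), h, w) for s in strides]
    fused = [[sum(wt * m[r][c] for wt, m in zip(weights, maps))
              for c in range(w)] for r in range(h)]
    return [[1 if v > threshold else 0 for v in row] for row in fused]


# A 6x6 binary mask with a 4x4 object in the middle.
label = [[1 if 1 <= r <= 4 and 1 <= c <= 4 else 0 for c in range(6)]
         for r in range(6)]
for row in detail_ground_truth(label):
    print(row)
```

Coarser strides produce thicker responses after upsampling, so the fused map marks a band around the object boundary rather than a one-pixel edge.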

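The detail loss from slide 11 can be checked numerically with a short sketch. It follows the two formulas on the slide with ε = 1; the mean reduction in the binary cross-entropy term and the clamping constant `tiny` are assumptions made here so that the logarithm stays finite, and the function names are hypothetical.

```python
import math


def dice_loss(pred, target, eps=1.0):
    """Dice loss over flattened maps, with eps = 1 as on the slide."""
    inter = sum(p * g for p, g in zip(pred, target))
    denom = sum(p * p for p in pred) + sum(g * g for g in target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def bce_loss(pred, target, tiny=1e-7):
    """Mean binary cross-entropy; predictions are clamped to avoid log(0)."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, tiny), 1.0 - tiny)
        total -= g * math.log(p) + (1.0 - g) * math.log(1.0 - p)
    return total / len(pred)


def detail_loss(pred, target):
    """L_detail = L_dice + L_bce."""
    return dice_loss(pred, target) + bce_loss(pred, target)


# One detail pixel among many background pixels (class imbalance).
target = [1.0] + [0.0] * 9
good = [0.9] + [0.1] * 9
bad = [0.1] * 10  # predicts "no detail" everywhere
print(detail_loss(good, target), detail_loss(bad, target))
```

In the imbalanced example, the all-background prediction still gets a low cross-entropy on the nine background pixels, while the dice term penalizes the missed detail pixel, which is the motivation given on the slide for combining the two losses.
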
Editor's Notes

  1. Hello. The paper to be presented is "Rethinking BiSeNet For Real-time Semantic Segmentation", which was accepted at CVPR 2021.
  2. First, we need to know the model structure of BiSeNet. BiSeNet is a semantic segmentation model that consists of a Spatial Path and a Context Path.
     Spatial Path: There are three layers, each consisting of a convolution with stride 2, batch normalization and ReLU. The feature map therefore becomes 1/8 the size of the original image and can retain rich spatial information.
     Context Path: A lightweight model (Xception) performs downsampling quickly, so a large receptive field can be obtained, and global average pooling provides the maximum receptive field with global context information.
     Attention Refinement Module: A module for combining the features obtained from the Context Path. It obtains global context using global average pooling and computes attention refinement. It also has the advantage of integrating global context information easily, without upsampling.
     Feature Fusion Module (FFM): A module for combining the two features. They cannot simply be combined, because the Spatial Path carries low-level features and the Context Path carries high-level features. To combine this information, the two features are concatenated and scaled with batch normalization, and then global pooling is used to compute a weight vector.
  3. BiSeNetV2 is an improved version of BiSeNet.
     Changes: BiSeNetV2 simplifies the structure by eliminating time-consuming cross-layer connections. The overall architecture is more compact and uses well-designed components. The Detail Path is deepened so that it can encode more detail, and the Semantic Path uses light-weight components based on depth-wise convolution. A comprehensive ablation study was performed, and performance improved considerably.
     Detail Branch: Since it must carry low-level information, it needs a large number of channels, so the number of layers is reduced while the number of channels is increased. Because its spatial size is large, no residual connections are used. It consists of three layers, each including convolution, batch normalization and ReLU activation, and finally produces a feature map 1/8 the size of the input.
     Semantic Branch: In contrast to the Detail Branch, it has few channels and stacks layers deeply. Fast downsampling is used to enlarge the receptive field and raise the level of feature representation. It uses a Stem Block, a Context Embedding Block, and Gather-and-Expansion (GE) Layers.
     Stem Block: Downsamples in two different ways and then concatenates the two branches. This strategy reduces computation cost and extracts features effectively.
     Context Embedding Block: Uses average pooling and a residual connection to obtain global contextual information efficiently.
     Gather-and-Expansion Layer: Uses depth-wise convolution, followed by a 1 × 1 convolution that projects the depth-wise output. Unlike the MobileNetV2 structure, the GE layer uses one more 3 × 3 convolution and thereby obtains better feature quality.
     Aggregation Layer: Merges the two branches. Because the Semantic Branch uses fast downsampling, its output is smaller than that of the Detail Branch, so the output feature map of the Semantic Branch is upsampled. The features are then combined by element-wise product and added.
     Booster Training Strategy: Introduced to improve segmentation accuracy. It enhances the feature representation during training and adds little computation cost at inference. The boosters are inserted at intermediate positions of the Semantic Branch.
  4. STDC is the method proposed in this paper. It achieves a higher mIoU and inference speed than BiSeNet, and it clearly outperforms the other methods as well.
  5. Short-Term Dense Concatenate Module. If downsampling takes place in Block2, as shown in figure (c), the output of the preceding layer is average-pooled with stride 2 before fusion. Kernel size: 1 for the first block, 3 for the rest. The design focuses on a scalable receptive field and multi-scale information. Low-level layers need enough channels to encode fine-grained information with a small receptive field, while high-level layers with a large receptive field focus more on high-level information induction, so giving them the same number of channels as the low-level layers may cause information redundancy. Downsampling happens only in Block2.
  6. S, R, C, M and N denote stride, repeat count, output channels, input channels and output channels, respectively (C in the architecture table; M and N in the STDC module figure).
  7. Feature map scales: Stage 3 is 1/8, Stage 4 is 1/16 and Stage 5 is 1/32. In the figure, the blue area is used for both training and inference, and the green area is used for training only. The Laplacian Conv is just a 2D convolution.
  8. STDC is faster than other methods and has a high mean IOU.
  9. This figure compares the results with and without Detail Guidance. When Detail Guidance is used, finer details are preserved.
  10. This table compares using the Spatial Path with using Detail Guidance. STDC removes the Spatial Path, and with Detail Guidance it gains slightly in both accuracy and speed over STDC with the Spatial Path.
  11. From this table, we can see that STDC1 is faster while keeping accuracy similar to the previous methods, and STDC2 is more accurate while running at a speed similar to the previous methods.
  12. This table reports the mIoU and FPS for each backbone network, showing higher accuracy together with higher FPS. The suffix 50 means the input size is 512 × 1024, and 75 means the input size is 768 × 1536.
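
As a small worked example for note 5, the per-block layout of one STDC module can be tabulated. The kernel sizes (1 for the first block, 3 for the rest) come from the note. The channel rule is an assumption taken from my reading of the paper and is not stated in these notes: block i uses N / 2^i filters, except the last block, which repeats the previous width, so that the concatenated output has N channels. The function name and the choice of four blocks are hypothetical.

```python
def stdc_block_plan(out_channels, num_blocks=4):
    """Return (kernel_size, filters) for each block of one STDC module.

    Assumed rule: block i has out_channels / 2**i filters, and the last
    block repeats the width of the block before it.
    """
    plan = []
    for i in range(1, num_blocks + 1):
        kernel = 1 if i == 1 else 3            # first block 1x1, rest 3x3
        exponent = min(i, num_blocks - 1)      # last block repeats previous width
        plan.append((kernel, out_channels // 2 ** exponent))
    return plan


plan = stdc_block_plan(256)
for index, (kernel, filters) in enumerate(plan, start=1):
    print(f"Block{index}: {kernel}x{kernel} ConvX, {filters} filters")

# Concatenating all block outputs restores the module's output width N.
print("concatenated channels:", sum(filters for _, filters in plan))
```

With N = 256 this gives widths 128, 64, 32 and 32, which illustrates the point in note 5: channels shrink as the receptive field grows, and the fused output still has N channels.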