SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing
Xue Yang, Junchi Yan (Member, IEEE), Xiaokang Yang (Fellow, IEEE), Jin Tang, Wenlong Liao, Tao He
Neha
SCI Research Lab
neha@kent.edu
Contents:
Object detection
Instance level denoising (InLD) in the Feature Map
The pipeline
How Instance-level Feature Map Denoising works
Mathematical foundation to remove instance level noise
Rotated object detection
Horizontal vs Rotated object detection
Datasets
Experiment
Effect of Instance-Level Denoising
Results
Object detection
When humans look at images or video, they can recognize and locate objects of interest within moments.
Object detection is a computer vision technique for locating instances of objects in images or videos. Its goal is to replicate this human ability using a computer.
Limitations of current detectors
- Objects with small size, cluttered arrangement, and arbitrary orientations remain difficult.
1) Small objects
– Overwhelmed by complex surroundings.
2) Cluttered arrangement
– Densely arranged objects
– Inter-class feature coupling and intra-class feature boundary blur
3) Arbitrary orientations
– Rotation detection > axis-aligned detection
The horizontal bounding box of a rotated object is looser than a rotated box aligned with the object, so it contains a large portion of background or nearby cluttered objects as disturbance.
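To make the "looser box" point concrete, here is a minimal sketch that compares the area of a rotated rectangle with the area of its enclosing horizontal box. The function names and the example dimensions are illustrative assumptions, not taken from the slides.

```python
import math

def hbb_of_rotated(w, h, theta_deg):
    """Width and height of the axis-aligned box enclosing a w x h
    rectangle rotated by theta_deg about its centre."""
    t = math.radians(theta_deg)
    bw = w * abs(math.cos(t)) + h * abs(math.sin(t))
    bh = w * abs(math.sin(t)) + h * abs(math.cos(t))
    return bw, bh

def background_fraction(w, h, theta_deg):
    """Share of the horizontal box that is NOT covered by the object."""
    bw, bh = hbb_of_rotated(w, h, theta_deg)
    return 1.0 - (w * h) / (bw * bh)

# Hypothetical elongated object (e.g. a ship) of 100 x 20 pixels.
frac_axis_aligned = background_fraction(100, 20, 0)
frac_diagonal = background_fraction(100, 20, 45)
```

For the diagonal case, most of the horizontal box is background, which is the disturbance the slide refers to.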
A way to dismiss the noisy interference from both the background and other foreground objects
Types of noise
1. Image-level noise
2. Instance-level noise
– Mutual interference between objects
– Interference between object and background
Denoising is performed on the raw image for the purpose of image enhancement, and it also improves detection performance for small objects.
Instance-level denoising (InLD) in the feature map
InLD is realized by supervised segmentation.
InLD is applied to decouple the features of different object categories into their respective channels. At the same time, in the spatial domain, the features of objects are enhanced and those of the background are weakened.
• Rotated objects = smooth L1 loss + IoU constant factor
• > five-parameter regression
• Discontinuous boundaries
• Periodicity of angular (PoA)
• Exchangeability of edges (EoE)
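The boundary problem can be illustrated with a small sketch. It assumes a five-parameter box (x, y, w, h, θ) with θ in degrees, where swapping w and h and shifting θ by 90° describes the same rectangle; the helper names and the example boxes are hypothetical.

```python
import math

def corners(cx, cy, w, h, theta_deg):
    """Sorted corner points of a rotated rectangle."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    pts = []
    for dx, dy in ((w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)):
        pts.append((cx + dx * c - dy * s, cy + dx * s + dy * c))
    return sorted(pts)

def same_box(a, b, tol=1e-9):
    """True if two parameter tuples describe the same rectangle."""
    return all(
        math.isclose(p, q, abs_tol=tol)
        for pa, pb in zip(corners(*a), corners(*b))
        for p, q in zip(pa, pb)
    )

box_a = (0, 0, 100, 25, 2)    # angle just outside an assumed [-90, 0) range
box_b = (0, 0, 25, 100, -88)  # same rectangle, edges exchanged, angle wrapped

identical = same_box(box_a, box_b)
param_gap = [abs(p - q) for p, q in zip(box_a, box_b)]
```

The two tuples describe one and the same rectangle, yet their parameters differ by 75 in w, 75 in h and 90 in θ. A loss computed directly on the five parameters therefore sees a large error where the geometric error is zero.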
The pipeline
SCRDet++ mainly consists of four
modules:
– Feature extraction
– Image-level denoising module
– Instance-level denoising module
– ‘class+box’ branches for predicting the class score and the bounding box
Fig 1.
Instance-level Feature Map Denoising
Instance-level noise has adverse effects on the feature map, such as:
– A non-object with an object-like shape has a higher response in the feature map, especially for small objects (see the top row of Fig. 2).
– Cluttered objects that are densely arranged tend to suffer from inter-class feature coupling and intra-class feature boundary blurring.
– The response of an object is not prominent enough when it is surrounded by background.
Fig. 2. Images (left) and their feature maps before (middle) and after (right) the instance-level denoising operation. First row: non-object with object-like shape. Second row: inter-class feature coupling and intra-class feature boundary blurring.
Mathematical foundation to remove instance-level noise
– Reweight the convolutional response maps [10].
– Highlight important parts and suppress uninformative ones.
Fig 3.
- X, Y ∈ R^(C×H×W) are two feature maps of the input image
- A(X) is an attention function
- ⊙ is the element-wise product
- W_s ∈ R^(H×W) and W_c ∈ R^C denote the spatial weight and channel weight
- w_c^i indicates the weight of the i-th channel
- ∪ is the concatenation operation for connecting tensors along the channels of the feature map
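The symbols listed above can be tied together in a short sketch of attention-style reweighting, assuming the output is Y = W_s ⊙ X ⊙ W_c with the spatial weight broadcast over channels and the channel weight broadcast over positions. NumPy and the function name `reweight` are choices made for this illustration only.

```python
import numpy as np

def reweight(x, w_s, w_c):
    """Y = W_s ⊙ X ⊙ W_c using broadcasting.

    x:   feature map, shape (C, H, W)
    w_s: spatial weight, shape (H, W)
    w_c: channel weight, shape (C,)
    """
    return w_s[None, :, :] * x * w_c[:, None, None]

# Toy example: 2 channels on a 2 x 2 grid.
x = np.ones((2, 2, 2))
w_s = np.array([[1.0, 0.0], [0.5, 1.0]])  # suppress one spatial position
w_c = np.array([2.0, 0.0])                # suppress the second channel
y = reweight(x, w_s, w_c)
```

Each channel of `y` is the corresponding channel of `x` scaled by its scalar channel weight, and every channel shares the same spatial mask.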
The new formulation considers the total number I of object categories plus one additional category for the background.
During the implementation of InLD, the learned weights are regarded as the result of a semantic segmentation task: the feature responses of each category in the layers preceding the output layer are separated in the channel dimension, and the feature responses of the foreground and background are polarized in the spatial dimension.
• Channel dimension = inter-class features
• Spatial dimension = intra-class features
• Original feature map + Denoised feature map = Decoupled feature map
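The decoupling idea can be sketched in an idealized form. This is not the authors' implementation: it assumes the C channels are split evenly into I+1 groups (one per category, including background) and that a hard segmentation label map is available, so each group keeps its response only where its own category is present. The names `inld_reweight`, `seg_labels` and `num_groups` are hypothetical.

```python
import numpy as np

def inld_reweight(x, seg_labels, num_groups):
    """Idealized instance-level denoising with hard masks.

    x:          feature map, shape (C, H, W)
    seg_labels: integer category per position, shape (H, W),
                values in [0, num_groups)
    num_groups: I + 1 (object categories plus background)
    """
    c = x.shape[0]
    assert c % num_groups == 0, "assumes an even channel split"
    per_group = c // num_groups
    y = np.zeros_like(x)
    for i in range(num_groups):
        mask = (seg_labels == i).astype(x.dtype)       # spatial weight for group i
        sl = slice(i * per_group, (i + 1) * per_group)  # channels of group i
        y[sl] = x[sl] * mask
    return y

# Toy example: 4 channels, 2 groups (background = 0, one object class = 1).
x = np.ones((4, 2, 2))
seg_labels = np.array([[0, 1], [1, 0]])
y = inld_reweight(x, seg_labels, num_groups=2)
```

In a trained network the masks would be soft, learned weights supervised by a segmentation loss rather than ground-truth labels applied directly.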
Rotated object detection
– Ideal case: the blue box rotates counterclockwise to the red box.
– Limitation: this ideal regression incurs a high loss because of the periodicity of angular (PoA) and the exchangeability of edges (EoE). The model instead has to rotate the bounding box clockwise while scaling w and h, which is a more complex regression.
– Thus, an IoU constant factor is added to the traditional smooth L1 loss.
– In the new regression loss:
  – the smooth L1 term determines the direction of gradient propagation
  – the IoU factor determines the magnitude of the gradient
Fig 4.
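A value-level sketch of the idea follows, assuming the loss takes the form smooth_L1 / |smooth_L1| · |−log(IoU)|. The IoU is passed in as a number because computing rotated-box IoU is outside the scope of this sketch; the function names and the example values are hypothetical.

```python
import math

def smooth_l1(diff, beta=1.0):
    """Standard smooth L1 on a single residual."""
    a = abs(diff)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta

def plain_smooth_l1(pred, target):
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))

def iou_smooth_l1(pred, target, iou, eps=1e-6):
    """Smooth L1 normalised to unit size, rescaled by |-log(IoU)|.

    In an autodiff framework the denominator |smooth_L1| would be treated
    as a constant, so the gradient direction comes from smooth L1 while
    the magnitude comes from the IoU factor.
    """
    base = plain_smooth_l1(pred, target)
    if base == 0.0:
        return 0.0
    magnitude = abs(-math.log(max(iou, eps)))
    return (base / abs(base)) * magnitude

# Boundary case: parameters differ a lot, but the boxes overlap well.
pred = [0.0, 0.0, 10.0, 20.0, -0.1]
target = [0.0, 0.0, 20.0, 10.0, -1.5]
loss_plain = plain_smooth_l1(pred, target)
loss_iou = iou_smooth_l1(pred, target, iou=0.9)
```

With a high IoU the rescaled loss stays small even though the plain parameter-wise loss is large, which is the smoothing effect at the angular boundary.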
Horizontal vs Rotated Object detection
Horizontal Object detection
Uses: Multi-task Loss
Rotated object detection
Uses: smooth L1 loss + IoU constant factor
Datasets
DOTA
• 2,806 aerial images
• 15 object classes
• 188,282 instances
DIOR
• 23,463 aerial images
• 20 object classes
• 190,288 instances
UCAS-AOD
• 1,510 aerial images
• 2 object classes
• 14,596 instances
BSTLD
• 13,427 camera images
• Few instances of many categories
S2TLD
• 5,786 images
• 5 object categories
• 14,130 instances
In addition to the above datasets, the authors also use the natural image dataset COCO [8] and the scene text dataset ICDAR2015 [28] for further evaluation.
Experiment
– Server with GeForce RTX 2080 Ti and 11 GB memory.
– Initialization by ResNet50 [14] by default.
– The weight decay and momentum for all experiments are set to 0.0001 and 0.9, respectively.
– A momentum optimizer is employed over 8 GPUs with a total of 8 images per minibatch.
– The standard COCO evaluation protocol is followed for COCO; for the other datasets, the anchors of the RetinaNet-based method use seven aspect ratios {1, 1/2, 2, 1/3, 3, 5, 1/5} and three scales {2^0, 2^(1/3), 2^(2/3)}.
– For the rotating anchor-based method (RetinaNet-R), the angle is set by an arithmetic progression from −90° to −15° with an interval of 15 degrees.
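The anchor settings above can be enumerated in a few lines. This is a sketch only: the base size of 32 and the convention that the aspect ratio means h / w are assumptions made for illustration, and `anchor_shapes` is a hypothetical helper.

```python
import math
from itertools import product

ASPECT_RATIOS = [1, 1 / 2, 2, 1 / 3, 3, 5, 1 / 5]
SCALES = [2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3)]
ANGLES = list(range(-90, 0, 15))  # -90, -75, ..., -15 degrees

def anchor_shapes(base_size, rotated=False):
    """Return (w, h, angle) for every anchor at one feature-map location.

    angle is None for horizontal anchors.
    """
    angles = ANGLES if rotated else [None]
    anchors = []
    for ratio, scale, angle in product(ASPECT_RATIOS, SCALES, angles):
        side = base_size * scale
        w = side / math.sqrt(ratio)  # assumption: ratio = h / w
        h = side * math.sqrt(ratio)
        anchors.append((w, h, angle))
    return anchors

horizontal = anchor_shapes(32)
rotating = anchor_shapes(32, rotated=True)
```

Seven ratios times three scales give 21 horizontal anchors per location; the six angles multiply that to 126 for the rotating variant, which is one reason rotating-anchor detectors are more expensive.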
Effect of Instance-Level Denoising
– Improved accuracy
Effect of IoU-Smooth L1 Loss
– Eliminates the boundary effects of the angle.
– The model regresses the object coordinates more easily.
– The new loss improves the accuracy of three detectors (RetinaNet-R [4], SCRDet [3], FPN [15]) to 69.83%, 68.65% and 76.20%, respectively.
Effect of Data Augmentation and Backbone
– With ResNet101 and data augmentation, performance improves from 69.81% to 72.98%.
– Using ResNet152 as the backbone further improves the final performance from 72.98% to 74.41%.
Results
– Compared with state-of-the-art algorithms on two datasets, DOTA [16] and DIOR [17], InLD outperforms all other models and achieves the best performance, 76.56% and 76.81%, respectively.
– The proposed methods achieve 77.80% and 75.11% mAP on FPN-based and RetinaNet-based methods.
– Table 1 compares performance on the UCAS-AOD dataset.
– The method achieves 96.95% on the OBB task, the best among all existing published methods.
TABLE 1: Performance by accuracy (%) on the UCAS-AOD dataset.
Method                     mAP     Plane   Car
YOLOv2 [18]                87.90   96.60   79.20
R-DFPN [12]                89.20   95.90   82.50
DRBox [19]                 89.95   94.90   85.00
S2ARN [20]                 94.90   97.60   92.20
RetinaNet-H [4]            95.47   97.34   93.60
ICN [21]                   95.67   -       -
FADet [22]                 95.71   98.69   92.72
R3Det [4]                  96.17   98.20   94.14
SCRDet++ (R3Det-based)     96.95   98.93   94.97
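As a quick reading aid for Table 1, the gain of SCRDet++ over its R3Det baseline can be computed directly from the reported numbers. The dictionary below copies only the two relevant rows; the variable names are illustrative.

```python
# (mAP, Plane, Car) accuracy in percent, copied from Table 1.
table = {
    "R3Det": (96.17, 98.20, 94.14),
    "SCRDet++ (R3Det-based)": (96.95, 98.93, 94.97),
}

baseline = table["R3Det"]
proposed = table["SCRDet++ (R3Det-based)"]

# Per-column improvement in percentage points, rounded to two decimals.
gains = tuple(round(p - b, 2) for p, b in zip(proposed, baseline))
```

The improvement is under one percentage point in each column, which is expected on a dataset where the baseline already exceeds 96% mAP.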
References:
[1] S. M. Azimi, E. Vig, R. Bahmanyar, M. Körner, and P. Reinartz, “Towards multi-class object detection in unconstrained remote sensing imagery,” in Asian Conference on
Computer Vision. Springer, 2018, pp. 150–165.
[2] J. Ding, N. Xue, Y. Long, G.-S. Xia, and Q. Lu, “Learning roi transformer for oriented object detection in aerial images,” in Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), June 2019.
[3] X. Yang, J. Yang, J. Yan, Y. Zhang, T. Zhang, Z. Guo, X. Sun, and K. Fu, “Scrdet: Towards more robust detection for small, cluttered and rotated objects,” in Proceedings of the
IEEE International Conference on Computer Vision (ICCV), October 2019.
[4] X. Yang, Q. Liu, J. Yan, and A. Li, “R3det: Refined single-stage detector with feature refinement for rotating object,” arXiv preprint arXiv:1908.05612, 2019.
[5] W. Qian, X. Yang, S. Peng, Y. Guo, and C. Yan, “Learning modulated loss for rotated object detection,” arXiv preprint arXiv:1911.08299, 2019.
[6] Y. Xu, M. Fu, Q. Wang, Y. Wang, K. Chen, G.-S. Xia, and X. Bai, “Gliding vertex on the horizontal bounding box for multi-oriented object detection,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, 2020.
[7] H. Wei, L. Zhou, Y. Zhang, H. Li, R. Guo, and H. Wang, “Oriented objects as pairs of middle lines,” arXiv preprint arXiv:1912.10694, 2019.
[8] Z. Xiao, L. Qian, W. Shao, X. Tan, and K. Wang, “Axis learning for orientated objects detection in aerial images,” Remote Sensing, vol. 12, no. 6, p. 908, 2020.
[9] X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018,
pp. 7794–7803.
[10] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp.
7132–7141.
[11] X. Yang, Q. Liu, J. Yan, and A. Li, “R3det: Refined single-stage detector with feature refi
[12] X. Yang, H. Sun, K. Fu, J. Yang, X. Sun, M. Yan, and Z. Guo, “Automatic ship detection in remote sensing images from google earth of complex scenes based on multiscale
rotation dense feature pyramid networks,” Remote Sensing, vol. 10, no. 1, p. 132, 2018.
[13] X. Yang, H. Sun, X. Sun, M. Yan, Z. Guo, and K. Fu, “Position detection and direction prediction for arbitrary-oriented ships via multitask
rotation region convolutional neural network,” IEEE Access, vol. 6, pp. 50 839–50 849, 2018.
[14] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[15] T.-Y. Lin, P. Dollár, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie, “Feature pyramid networks for object detection,” in
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, no. 2, 2017, p. 4.
[16] G.-S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang, “Dota: A large-scale dataset for object detection
in aerial images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[17] K. Li, G. Wan, G. Cheng, L. Meng, and J. Han, “Object detection in optical remote sensing images: A survey and a new benchmark,” ISPRS
Journal of Photogrammetry and Remote Sensing, vol. 159, pp. 296–307, 2020.
[18] J. Redmon and A. Farhadi, “Yolo9000: better, faster, stronger,” in Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2017, pp. 7263–7271.
[19] L. Liu, Z. Pan, and B. Lei, “Learning a rotation invariant detector with rotatable bounding box,” arXiv preprint arXiv:1711.09405, 2017.
[20] S. Bao, X. Zhong, R. Zhu, X. Zhang, Z. Li, and M. Li, “Single shot anchor refinement network for oriented object detection in optical remote
sensing imagery,” IEEE Access, vol. 7, pp. 87150–87161, 2019.
[21] S. M. Azimi, E. Vig, R. Bahmanyar, M. Körner, and P. Reinartz, “Towards multi-class object detection in unconstrained remote sensing
imagery,” in Asian Conference on Computer Vision. Springer, 2018, pp. 150–165.
[22] C. Li, C. Xu, Z. Cui, D. Wang, T. Zhang, and J. Yang, “Feature-attentioned object detection in remote sensing imagery,” in 2019 IEEE
International Conference on Image Processing (ICIP). IEEE, 2019, pp. 3886–3890.

More Related Content

What's hot

A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv...
 A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv... A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv...
A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv...
Chennai Networks
 

What's hot (20)

YOLO
YOLOYOLO
YOLO
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detection
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
 
A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv...
 A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv... A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv...
A Genetic Algorithm-Based Moving Object Detection For Real-Time Traffic Surv...
 
motion and feature based person tracking in survillance videos
motion and feature based person tracking in survillance videosmotion and feature based person tracking in survillance videos
motion and feature based person tracking in survillance videos
 
YOLO9000 - PR023
YOLO9000 - PR023YOLO9000 - PR023
YOLO9000 - PR023
 
Yolov3
Yolov3Yolov3
Yolov3
 
Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421Yolo v2 ai_tech_20190421
Yolo v2 ai_tech_20190421
 
AN ADAPTIVE MESH METHOD FOR OBJECT TRACKING
AN ADAPTIVE MESH METHOD FOR OBJECT TRACKING AN ADAPTIVE MESH METHOD FOR OBJECT TRACKING
AN ADAPTIVE MESH METHOD FOR OBJECT TRACKING
 
Detection and Tracking of Moving Object: A Survey
Detection and Tracking of Moving Object: A SurveyDetection and Tracking of Moving Object: A Survey
Detection and Tracking of Moving Object: A Survey
 
Background subtraction
Background subtractionBackground subtraction
Background subtraction
 
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image SegmentationDeconvNet, DecoupledNet, TransferNet in Image Segmentation
DeconvNet, DecoupledNet, TransferNet in Image Segmentation
 
Overview Of Video Object Tracking System
Overview Of Video Object Tracking SystemOverview Of Video Object Tracking System
Overview Of Video Object Tracking System
 
TRACKING OF PARTIALLY OCCLUDED OBJECTS IN VIDEO SEQUENCES
TRACKING OF PARTIALLY OCCLUDED OBJECTS IN VIDEO SEQUENCESTRACKING OF PARTIALLY OCCLUDED OBJECTS IN VIDEO SEQUENCES
TRACKING OF PARTIALLY OCCLUDED OBJECTS IN VIDEO SEQUENCES
 
Visual object tracking using particle clustering - ICITACEE 2014
Visual object tracking using particle clustering - ICITACEE 2014Visual object tracking using particle clustering - ICITACEE 2014
Visual object tracking using particle clustering - ICITACEE 2014
 
K-Means Clustering in Moving Objects Extraction with Selective Background
K-Means Clustering in Moving Objects Extraction with Selective BackgroundK-Means Clustering in Moving Objects Extraction with Selective Background
K-Means Clustering in Moving Objects Extraction with Selective Background
 
Deep learning for object detection
Deep learning for object detectionDeep learning for object detection
Deep learning for object detection
 
Real-time Object Tracking
Real-time Object TrackingReal-time Object Tracking
Real-time Object Tracking
 
IRJET - Real Time Object Detection using YOLOv3
IRJET - Real Time Object Detection using YOLOv3IRJET - Real Time Object Detection using YOLOv3
IRJET - Real Time Object Detection using YOLOv3
 
Presentation of Visual Tracking
Presentation of Visual TrackingPresentation of Visual Tracking
Presentation of Visual Tracking
 

Similar to Scrdet++ analysis

NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
paperpublications3
 
Stixel based real time object detection for ADAS using surface normal
Stixel based real time object detection for ADAS using surface normalStixel based real time object detection for ADAS using surface normal
Stixel based real time object detection for ADAS using surface normal
TaeKang Woo
 

Similar to Scrdet++ analysis (20)

最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
Blurclassification
BlurclassificationBlurclassification
Blurclassification
 
物件偵測與辨識技術
物件偵測與辨識技術物件偵測與辨識技術
物件偵測與辨識技術
 
When Remote Sensing Meets Artificial Intelligence
When Remote Sensing Meets Artificial IntelligenceWhen Remote Sensing Meets Artificial Intelligence
When Remote Sensing Meets Artificial Intelligence
 
Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331Recent Progress on Object Detection_20170331
Recent Progress on Object Detection_20170331
 
Object Recogniton Based on Undecimated Wavelet Transform
Object Recogniton Based on Undecimated Wavelet TransformObject Recogniton Based on Undecimated Wavelet Transform
Object Recogniton Based on Undecimated Wavelet Transform
 
Gesture Recognition using Principle Component Analysis & Viola-Jones Algorithm
Gesture Recognition using Principle Component Analysis &  Viola-Jones AlgorithmGesture Recognition using Principle Component Analysis &  Viola-Jones Algorithm
Gesture Recognition using Principle Component Analysis & Viola-Jones Algorithm
 
Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]Objects as points (CenterNet) review [CDM]
Objects as points (CenterNet) review [CDM]
 
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
NUMBER PLATE IMAGE DETECTION FOR FAST MOTION VEHICLES USING BLUR KERNEL ESTIM...
 
LIDAR- Light Detection and Ranging.
LIDAR- Light Detection and Ranging.LIDAR- Light Detection and Ranging.
LIDAR- Light Detection and Ranging.
 
IRJET- Identification of Missing Person in the Crowd using Pretrained Neu...
IRJET-  	  Identification of Missing Person in the Crowd using Pretrained Neu...IRJET-  	  Identification of Missing Person in the Crowd using Pretrained Neu...
IRJET- Identification of Missing Person in the Crowd using Pretrained Neu...
 
J017426467
J017426467J017426467
J017426467
 
Stixel based real time object detection for ADAS using surface normal
Stixel based real time object detection for ADAS using surface normalStixel based real time object detection for ADAS using surface normal
Stixel based real time object detection for ADAS using surface normal
 
Fn2611681170
Fn2611681170Fn2611681170
Fn2611681170
 
Object Detection and Tracking AI Robot
Object Detection and Tracking AI RobotObject Detection and Tracking AI Robot
Object Detection and Tracking AI Robot
 
mini prjt
mini prjtmini prjt
mini prjt
 
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
AUTO AI 2021 talk Real world data augmentations for autonomous driving : B Ra...
 
Hardware Unit for Edge Detection with Comparative Analysis of Different Edge ...
Hardware Unit for Edge Detection with Comparative Analysis of Different Edge ...Hardware Unit for Edge Detection with Comparative Analysis of Different Edge ...
Hardware Unit for Edge Detection with Comparative Analysis of Different Edge ...
 
Intelligent Auto Horn System Using Artificial Intelligence
Intelligent Auto Horn System Using Artificial IntelligenceIntelligent Auto Horn System Using Artificial Intelligence
Intelligent Auto Horn System Using Artificial Intelligence
 
E0333021025
E0333021025E0333021025
E0333021025
 

Recently uploaded

CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
shambhavirathore45
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
shivangimorya083
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 

Recently uploaded (20)

Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 

Scrdet++ analysis

  • 2. Contents: Object detection Instance level denoising (InLD) in the Feature Map The pipeline How Instance-level Feature Map Denoising works Mathematical foundation to remove instance level noise Rotated object detection Horizontal vs Rotated object detection Datasets Experiment Effect of Instance-Level Denoising Results
  • 3. Object detection When humans look at images or video, they can recognize and locate objects of interest within a matter of moments. Similarly, Object detection is a computer vision technique for locating instances of objects in images or videos and The goal of object detection is to replicate this intelligence using a computer. Limitations of Current detectors - small size, cluttered arrangement, and arbitrary orientations
  • 4. 1) Small objects Overwhelmed by complex surrounding. 2) Cluttered arrangement – Densely arranged objects – inter- class feature coupling and intraclass feature boundary blur 3) Arbitrary orientations. Rotation detection > Axis aligned detection The horizontal bounding box for a rotated object is more loose than an aligned rotated one, such that the box contains a large portion of background or nearby cluttered objects as disturbance.
  • 5. A way to dismiss the noisy interference from both background and other foreground objects Types of noises 1. Image level noise. 2. Instance level noise – Mutual interference between objects – interference between object and background Denoising is performed on raw image for the purpose of image enhancement, and it also improves the detection performance of small objects.
  • 6. Instance level denoising (InLD) in the Feature Map (InLD) is realized by supervised segmentation. Instance Level Denoising ( InLD) is applied to decouple the features of different object categories into their respective channels. At the same time features of the object and background are enhanced and weakened, respectively in the spatial domain. • Rotated objects = Smooth L1 Loss + IoU constant factor • > five parameter regression • discountinous boundaries • Periodicity of angular • Exchangeability of edges
  • 7. The pipeline SCRDet++ mainly consists of four modules: – Feature extraction – Image-level denoising module – Instance-level denoising module – ‘class+box’ Fig 1.
  • 8. Instance-level Feature Map Denoising Instance-Level Noise has adversary effects on feature map., such as: – The non-object with object-like shape has a higher response in the feature map, especially for small objects (see the top row of Fig. 2). – Clutter objects that are densely arranged tend to suffer the issue for inter-class feature coupling and intra-class feature boundary blurring – The response of object is not prominent enough surrounded by the background Fig. 2. Images (left) and their feature maps before (middle) and after (right) the instance-level denoising operation. First row: non-object with object-like shape. Second row: inter-class feature coupling and intra- class feature boundary blurring. Fig 2.
  • 9. Mathematical foundation to remove instance level noise – Reweight the convolutional response maps [10]. – Important parts > uninformative ones Fig 3. - X, Y ∈ R^(C ×H ×W) are two feature maps of input image - A(X) is an attention function - ⊙ is the element-wise product - Ws ∈ R^H×W and Wc ∈ R^C denote the spatial weight and channel weight - Wci indicates the weight of the i-th channel - U, concatenation operation for connecting tensor among the feature map
  • 10. The new formulation which considers the total I number of object categories with one additional category for background is as follows: During the implementation of InLD, learned weights are regarded as a result of semantic segmentation task, where the feature responses of each category on the previous layers of the output layer are separated in the channel dimension, and the feature responses of the foreground and background in the spatial dimension are also polarized. s • Channel dimension = inter class features • Spatial dimension = intra class features • Original feature map + Denoised feature map = Decoupled feature map
  • 11. Rotated object detection – Ideal case: The blue box rotates counterclockwise to the red box. Limitations: Higher loss due to periodicity of angular (PoA) and exchangeability of edges (EoE) whereas, rotating the bounding box clockwise while scaling w and h adds more complexity – Thus, Add IoU constant factor in the traditional smooth L1 loss – The new regression loss – determines the direction of gradient propagation – And magnitude of gradient Fig 4.
  • 12. Horizontal vs Rotated Object detection Horizontal Object detection Uses: Multi-task Loss Rotated object detection Uses: smooth L1 loss + IoU constant factor
  • 13. Datasets DOTA DIOR UCAS-AOD BSTLD S2 TLD • 2806 Aerial images • 15 object classes • 188,282 instances • 23,463 Aerial images • 20 object classes • 190,288 instances • 1510 Aerial images • 2 object classes • 14,596 instances • 13,427 camera images • Few instances of many categories • 5,786 images • 5 object categories • 14,130 instances In addition to the above datasets, they also use natural image dataset COCO [8] and scene text dataset ICDAR2015 [28] for further evaluation.
  • 14. Experiment Server with a GeForce RTX 2080 Ti and 11G memory. – Initialization by ResNet50 [14] by default. – The weight decay and momentum for all experiments are set 0.0001 and 0.9, respectively. – A Momentum Optimizer was employed over 8 GPUs with a total of 8 images per minibatch. – Standard evaluation protocol of COCO, while for other datasets, the anchors of RetinaNet-based method were used with seven aspect ratios {1, 1/2, 2, 1/3, 3, 5, 1/5} and three scales {20 , 21/3 , 22/3 }. – For rotating anchor-based method (RetinaNet-R), the angle is set by an arithmetic progression from −90◦ to −15◦ with an interval of 15 degrees.
  • 15. Effect of Instance-Level Denoising
    – Improved accuracy
    Effect of IoU-Smooth L1 Loss
    – Eliminates the boundary effects of the angle
    – The model regresses the object coordinates more easily
    – The new loss improves the accuracy of three detectors (RetinaNet-R [4], SCRDet [3], FPN [15]) to 69.83%, 68.65% and 76.20%, respectively
    Effect of Data Augmentation and Backbone
    – Used ResNet101
    – Improvement from 69.81% to 72.98%
    – Final performance of the model was improved from 72.98% to 74.41% by using ResNet152 as backbone
  • 16. Results:
    – InLD, compared with the state-of-the-art algorithms on two datasets, DOTA [16] and DIOR [17], outperforms all other models and achieves the best performance, 76.56% and 76.81% respectively.
    – The methods achieve 77.80% and 75.11% mAP on FPN-based and RetinaNet-based methods.
    – Table 1 compares performance on the UCAS-AOD dataset.
    – The method achieves 96.95% for the OBB task, the best among all existing published methods.

    Method                      mAP     Plane   Car
    YOLOv2 [18]                 87.90   96.60   79.20
    R-DFPN [12]                 89.20   95.90   82.50
    DRBox [19]                  89.95   94.90   85.00
    S2ARN [20]                  94.90   97.60   92.20
    RetinaNet-H [4]             95.47   97.34   93.60
    ICN [21]                    95.67   -       -
    FADet [22]                  95.71   98.69   92.72
    R3Det [4]                   96.17   98.20   94.14
    SCRDet++ (R3Det-based)      96.95   98.93   94.97
    TABLE 1: Performance by accuracy (%) on UCAS-AOD dataset.
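As a quick consistency check on Table 1, the mAP column can be recomputed as the mean of the two per-class values. The sketch below assumes mAP is the unweighted mean over the Plane and Car classes; ICN is skipped because the table gives no per-class numbers for it. The names `UCAS_AOD`, `mean_ap` and `check_table` are illustrative.

```python
# (reported mAP, Plane AP, Car AP) copied from Table 1
UCAS_AOD = {
    "YOLOv2": (87.90, 96.60, 79.20),
    "R-DFPN": (89.20, 95.90, 82.50),
    "DRBox": (89.95, 94.90, 85.00),
    "S2ARN": (94.90, 97.60, 92.20),
    "RetinaNet-H": (95.47, 97.34, 93.60),
    "FADet": (95.71, 98.69, 92.72),
    "R3Det": (96.17, 98.20, 94.14),
    "SCRDet++ (R3Det-based)": (96.95, 98.93, 94.97),
}

def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)

def check_table(table, tol=0.01):
    """Map each method to True if its reported mAP matches the class mean."""
    return {
        name: abs(mean_ap(row[1:]) - row[0]) <= tol
        for name, row in table.items()
    }

consistent = check_table(UCAS_AOD)
```

Every row passes within a rounding tolerance of 0.01, so the reported mAP values agree with the per-class columns.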
  • 17. References:
    [1] S. M. Azimi, E. Vig, R. Bahmanyar, M. Körner, and P. Reinartz, “Towards multi-class object detection in unconstrained remote sensing imagery,” in Asian Conference on Computer Vision. Springer, 2018, pp. 150–165.
    [2] J. Ding, N. Xue, Y. Long, G.-S. Xia, and Q. Lu, “Learning roi transformer for oriented object detection in aerial images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
    [3] X. Yang, J. Yang, J. Yan, Y. Zhang, T. Zhang, Z. Guo, X. Sun, and K. Fu, “Scrdet: Towards more robust detection for small, cluttered and rotated objects,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2019.
    [4] X. Yang, Q. Liu, J. Yan, and A. Li, “R3det: Refined single-stage detector with feature refinement for rotating object,” arXiv preprint arXiv:1908.05612, 2019.
    [5] W. Qian, X. Yang, S. Peng, Y. Guo, and C. Yan, “Learning modulated loss for rotated object detection,” arXiv preprint arXiv:1911.08299, 2019.
    [6] Y. Xu, M. Fu, Q. Wang, Y. Wang, K. Chen, G.-S. Xia, and X. Bai, “Gliding vertex on the horizontal bounding box for multi-oriented object detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
    [7] H. Wei, L. Zhou, Y. Zhang, H. Li, R. Guo, and H. Wang, “Oriented objects as pairs of middle lines,” arXiv preprint arXiv:1912.10694, 2019.
    [8] Z. Xiao, L. Qian, W. Shao, X. Tan, and K. Wang, “Axis learning for orientated objects detection in aerial images,” Remote Sensing, vol. 12, no. 6, p. 908, 2020.
    [9] X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7794–7803.
    [10] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7132–7141.
    [11] X. Yang, Q. Liu, J. Yan, and A. Li, “R3det: Refined single-stage detector with feature refinement for rotating object,” arXiv preprint arXiv:1908.05612, 2019.
    [12] X. Yang, H. Sun, K. Fu, J. Yang, X. Sun, M. Yan, and Z. Guo, “Automatic ship detection in remote sensing images from google earth of complex scenes based on multiscale rotation dense feature pyramid networks,” Remote Sensing, vol. 10, no. 1, p. 132, 2018.
  • 18.
    [13] X. Yang, H. Sun, X. Sun, M. Yan, Z. Guo, and K. Fu, “Position detection and direction prediction for arbitrary-oriented ships via multitask rotation region convolutional neural network,” IEEE Access, vol. 6, pp. 50839–50849, 2018.
    [14] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
    [15] T.-Y. Lin, P. Dollár, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie, “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, no. 2, 2017, p. 4.
    [16] G.-S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang, “Dota: A large-scale dataset for object detection in aerial images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
    [17] K. Li, G. Wan, G. Cheng, L. Meng, and J. Han, “Object detection in optical remote sensing images: A survey and a new benchmark,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 159, pp. 296–307, 2020.
    [18] J. Redmon and A. Farhadi, “Yolo9000: better, faster, stronger,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 7263–7271.
    [19] L. Liu, Z. Pan, and B. Lei, “Learning a rotation invariant detector with rotatable bounding box,” arXiv preprint arXiv:1711.09405, 2017.
    [20] S. Bao, X. Zhong, R. Zhu, X. Zhang, Z. Li, and M. Li, “Single shot anchor refinement network for oriented object detection in optical remote sensing imagery,” IEEE Access, vol. 7, pp. 87150–87161, 2019.
    [21] S. M. Azimi, E. Vig, R. Bahmanyar, M. Körner, and P. Reinartz, “Towards multi-class object detection in unconstrained remote sensing imagery,” in Asian Conference on Computer Vision. Springer, 2018, pp. 150–165.
    [22] C. Li, C. Xu, Z. Cui, D. Wang, T. Zhang, and J. Yang, “Feature-attentioned object detection in remote sensing imagery,” in 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019, pp. 3886–3890.