This presentation analyzes the paper "SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing".
2. Contents:
Object detection
Instance level denoising (InLD) in the Feature Map
The pipeline
How Instance-level Feature Map Denoising works
Mathematical foundation to remove instance level noise
Rotated object detection
Horizontal vs Rotated object detection
Datasets
Experiment
Effect of Instance-Level Denoising
Results
3. Object detection
When humans look at images or video, they can recognize and locate objects of interest within moments. Object detection is the computer vision technique for locating instances of objects in images or videos; its goal is to replicate this human ability with a computer.
Limitations of current detectors:
- small size, cluttered arrangement, and arbitrary orientations
4. 1) Small objects: overwhelmed by complex surroundings.
2) Cluttered arrangement:
– densely arranged objects
– inter-class feature coupling and intra-class feature boundary blur
3) Arbitrary orientations:
Rotation detection > axis-aligned detection.
The horizontal bounding box of a rotated object is much looser than a rotated box aligned with it, so the horizontal box contains a large portion of background or nearby cluttered objects as disturbance.
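To see how loose the horizontal box can be, consider the axis-aligned box that encloses a rotated w × h rectangle. The sketch below is illustrative only; the sizes and angle are hypothetical, chosen to mimic an elongated aerial object such as a ship:

```python
import math

def enclosing_hbb_area(w, h, theta_deg):
    """Area of the axis-aligned (horizontal) box that encloses a
    w x h rectangle rotated by theta degrees."""
    t = math.radians(theta_deg)
    W = w * abs(math.cos(t)) + h * abs(math.sin(t))
    H = w * abs(math.sin(t)) + h * abs(math.cos(t))
    return W * H

obb_area = 100 * 10                          # area of the rotated box itself
hbb_area = enclosing_hbb_area(100, 10, 45)   # enclosing horizontal box
excess = hbb_area / obb_area                 # how much larger the HBB is
print(round(excess, 2))
```

At 45°, the enclosing horizontal box covers roughly six times the area of the rotated box, so most of what it contains is background or neighboring objects.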
5. A way to dismiss the noisy interference from both the background and other foreground objects.
Types of noise:
1. Image-level noise
2. Instance-level noise
– mutual interference between objects
– interference between object and background
Denoising is performed on the raw image for image enhancement, and it also improves the detection performance for small objects.
6. Instance-level denoising (InLD) in the Feature Map
InLD is realized by supervised segmentation.
InLD is applied to decouple the features of different object categories into their respective channels.
At the same time, the features of objects and background are enhanced and weakened, respectively, in the spatial domain.
• Rotated objects = smooth L1 loss + IoU constant factor
• > five-parameter regression, which suffers from discontinuous boundaries caused by:
• periodicity of angle (PoA)
• exchangeability of edges (EoE)
7. The pipeline
SCRDet++ mainly consists of four modules:
– feature extraction
– image-level denoising module
– instance-level denoising module
– ‘class+box’ prediction
Fig 1.
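The four modules can be sketched as a simple composition. Every function below is a trivial placeholder standing in for a learned network; all names and internals are hypothetical and shown only to make the module ordering concrete:

```python
import numpy as np

def image_level_denoising(image):
    return image                      # placeholder for a learned pixel-level denoiser

def feature_extraction(image):
    return image.mean(axis=-1)        # placeholder for backbone + FPN features

def instance_level_denoising(feat):
    # Segmentation-like attention: suppress weak (background-like) responses.
    mask = (feat > feat.mean()).astype(feat.dtype)
    return feat * mask

def class_and_box_head(feat):
    return {"score": float(feat.max()), "box": [0, 0, 1, 1]}  # dummy prediction

def scrdet_pp(image):
    # Module order as in the pipeline above.
    x = image_level_denoising(image)
    x = feature_extraction(x)
    x = instance_level_denoising(x)
    return class_and_box_head(x)

detection = scrdet_pp(np.random.rand(32, 32, 3))
```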
8. Instance-level Feature Map Denoising
Instance-level noise has adverse effects on the feature map, such as:
– Non-objects with object-like shapes get a high response in the feature map, especially for small objects (see the top row of Fig. 2).
– Densely arranged, cluttered objects tend to suffer from inter-class feature coupling and intra-class feature boundary blurring.
– The response of an object surrounded by background is not prominent enough.
Fig. 2. Images (left) and their feature maps before (middle) and after (right) the instance-level denoising operation. First row: non-object with object-like shape. Second row: inter-class feature coupling and intra-class feature boundary blurring.
9. Mathematical foundation to remove instance-level noise
– Reweight the convolutional response maps [10].
– Important parts > uninformative ones.
Fig 3.
- X, Y ∈ R^(C×H×W) are two feature maps of the input image
- A(X) is an attention function
- ⊙ is the element-wise product
- Ws ∈ R^(H×W) and Wc ∈ R^C denote the spatial and channel weights
- Wc_i indicates the weight of the i-th channel
- ⋃ is the concatenation operation connecting tensors along the channel dimension of the feature map
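A plausible reconstruction of the reweighting formula summarized by these symbols (the exact equation is in Fig. 3 of the paper; this LaTeX version is inferred from the symbol list above and may differ in detail):

```latex
% Attention-based reweighting of the feature map X into Y:
Y = A(X) \odot X
% with the attention decomposed into channel and spatial weights,
% applied per channel X_i \in \mathbb{R}^{H \times W} and
% concatenated back along the channel dimension:
Y = \bigcup_{i=1}^{C} \left( W_{c_i} \cdot \left( W_s \odot X_i \right) \right)
```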
10. The new formulation, which considers all I object categories plus one additional category for background, is as follows:
In the implementation of InLD, the learned weights are obtained from a semantic segmentation task: the feature responses of each category in the layers before the output layer are separated in the channel dimension, and the feature responses of foreground and background are polarized in the spatial dimension.
• Channel dimension = inter-class features
• Spatial dimension = intra-class features
• Original feature map + denoised feature map = decoupled feature map
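A minimal numerical sketch of this decoupling, assuming (hypothetically) that each feature channel is assigned to one category and that the segmentation output supplies the spatial weights; the residual sum mirrors "original + denoised = decoupled":

```python
import numpy as np

C, H, W, I = 8, 16, 16, 3   # channels, spatial dims, object categories

rng = np.random.default_rng(0)
feat = rng.random((C, H, W))

# Hypothetical segmentation output: per-pixel probabilities over
# I object categories + 1 background class (softmax-normalized).
seg_logits = rng.random((I + 1, H, W))
seg = np.exp(seg_logits) / np.exp(seg_logits).sum(axis=0, keepdims=True)

# Assign each channel to one category, so categories are decoupled
# along the channel dimension (a simplifying assumption).
channel_to_cat = np.arange(C) % I
mask = seg[channel_to_cat]          # (C, H, W): spatial weight per channel

denoised = feat * mask              # weaken background, keep foreground
decoupled = feat + denoised         # original + denoised feature map
```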
11. Rotated object detection
– Ideal case: the blue box rotates counterclockwise to the red box.
Limitation: a high loss arises from the periodicity of angle (PoA) and the exchangeability of edges (EoE), whereas rotating the bounding box clockwise while scaling w and h adds more complexity.
– Thus, an IoU constant factor is added to the traditional smooth L1 loss.
– In the new regression loss:
– the smooth L1 term determines the direction of gradient propagation,
– the IoU term determines the magnitude of the gradient.
Fig 4.
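The IoU-smooth L1 regression loss sketched above can be written as follows; this is a reconstruction based on SCRDet [3], with v denoting the predicted box parameters and v* the regression targets:

```latex
L_{reg}(v, v^{*}) =
  \underbrace{\frac{\mathrm{smooth}_{L1}(v - v^{*})}
                   {\left|\mathrm{smooth}_{L1}(v - v^{*})\right|}}
             _{\text{direction of the gradient}}
  \cdot
  \underbrace{\left| -\ln\big(\mathrm{IoU}\big) \right|}
             _{\text{magnitude of the gradient}}
```

The first factor keeps only the sign of the smooth L1 term, while the IoU factor vanishes as the predicted and ground-truth boxes overlap perfectly, removing the boundary discontinuity caused by PoA and EoE.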
12. Horizontal vs Rotated object detection
Horizontal object detection uses: multi-task loss.
Rotated object detection uses: smooth L1 loss + IoU constant factor.
13. Datasets
DOTA: 2,806 aerial images, 15 object classes, 188,282 instances
DIOR: 23,463 aerial images, 20 object classes, 190,288 instances
UCAS-AOD: 1,510 aerial images, 2 object classes, 14,596 instances
BSTLD: 13,427 camera images, few instances of many categories
S2TLD: 5,786 images, 5 object categories, 14,130 instances
In addition to the above datasets, the authors also use the natural image dataset COCO [8] and the scene text dataset ICDAR2015 [28] for further evaluation.
14. Experiment
Server with a GeForce RTX 2080 Ti with 11 GB memory.
– Initialization with ResNet50 [14] by default.
– Weight decay and momentum are set to 0.0001 and 0.9, respectively, for all experiments.
– A momentum optimizer was employed over 8 GPUs with a total of 8 images per minibatch.
– The standard COCO evaluation protocol is used; for the other datasets, the anchors of the RetinaNet-based method use seven aspect ratios {1, 1/2, 2, 1/3, 3, 5, 1/5} and three scales {2^0, 2^(1/3), 2^(2/3)}.
– For the rotating anchor-based method (RetinaNet-R), the angles form an arithmetic progression from −90° to −15° with an interval of 15°.
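The anchor configuration above can be enumerated directly; the ratios, scales, and angle progression are from the slide, while the base anchor size and the width/height formula are hypothetical illustration:

```python
ratios = [1, 1/2, 2, 1/3, 3, 5, 1/5]    # seven aspect ratios
scales = [2**0, 2**(1/3), 2**(2/3)]     # three scales
angles = list(range(-90, -14, 15))      # -90 to -15, step 15 (RetinaNet-R)

base = 32  # hypothetical base anchor size at one pyramid level
# w/h equals the aspect ratio r; area scales with s**2.
anchors = [(base * s * r**0.5, base * s / r**0.5, a)
           for s in scales for r in ratios for a in angles]

print(len(ratios) * len(scales))  # 21 anchors per location (RetinaNet-H)
print(len(anchors))               # 126 with the six rotation angles
```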
15. Effect of Instance-Level Denoising
– Improved accuracy.
Effect of IoU-Smooth L1 Loss
– Eliminates the boundary effects of the angle, so the model easily regresses the object coordinates.
– The new loss improves the accuracy of three detectors (RetinaNet-R [4], SCRDet [3], FPN [15]) to 69.83%, 68.65% and 76.20%, respectively.
Effect of Data Augmentation and Backbone
– With ResNet101, improvement from 69.81% → 72.98%.
– The final performance was further improved from 72.98% to 74.41% by using ResNet152 as the backbone.
16. Results
– InLD combined with state-of-the-art algorithms outperforms all other models on the two datasets DOTA [16] and DIOR [17], achieving the best performance of 76.56% and 76.81%, respectively.
– The FPN- and RetinaNet-based methods reach 77.80% and 75.11% mAP, respectively.
– Table 1 compares performance on the UCAS-AOD dataset.
– The method achieves 96.95% on the OBB task, the best among all published methods.

TABLE 1: Performance by accuracy (%) on the UCAS-AOD dataset.
Method                  | mAP   | Plane | Car
YOLOv2 [18]             | 87.90 | 96.60 | 79.20
R-DFPN [12]             | 89.20 | 95.90 | 82.50
DRBox [19]              | 89.95 | 94.90 | 85.00
S2ARN [20]              | 94.90 | 97.60 | 92.20
RetinaNet-H [4]         | 95.47 | 97.34 | 93.60
ICN [21]                | 95.67 | -     | -
FADet [22]              | 95.71 | 98.69 | 92.72
R3Det [4]               | 96.17 | 98.20 | 94.14
SCRDet++ (R3Det-based)  | 96.95 | 98.93 | 94.97
17. References:
[1] S. M. Azimi, E. Vig, R. Bahmanyar, M. Körner, and P. Reinartz, “Towards multi-class object detection in unconstrained remote sensing imagery,” in Asian Conference on Computer Vision. Springer, 2018, pp. 150–165.
[2] J. Ding, N. Xue, Y. Long, G.-S. Xia, and Q. Lu, “Learning RoI transformer for oriented object detection in aerial images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[3] X. Yang, J. Yang, J. Yan, Y. Zhang, T. Zhang, Z. Guo, X. Sun, and K. Fu, “SCRDet: Towards more robust detection for small, cluttered and rotated objects,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 2019.
[4] X. Yang, Q. Liu, J. Yan, and A. Li, “R3Det: Refined single-stage detector with feature refinement for rotating object,” arXiv preprint arXiv:1908.05612, 2019.
[5] W. Qian, X. Yang, S. Peng, Y. Guo, and C. Yan, “Learning modulated loss for rotated object detection,” arXiv preprint arXiv:1911.08299, 2019.
[6] Y. Xu, M. Fu, Q. Wang, Y. Wang, K. Chen, G.-S. Xia, and X. Bai, “Gliding vertex on the horizontal bounding box for multi-oriented object detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.
[7] H. Wei, L. Zhou, Y. Zhang, H. Li, R. Guo, and H. Wang, “Oriented objects as pairs of middle lines,” arXiv preprint arXiv:1912.10694, 2019.
[8] Z. Xiao, L. Qian, W. Shao, X. Tan, and K. Wang, “Axis learning for orientated objects detection in aerial images,” Remote Sensing, vol. 12, no. 6, p. 908, 2020.
[9] X. Wang, R. Girshick, A. Gupta, and K. He, “Non-local neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7794–7803.
[10] J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7132–7141.
[11] X. Yang, Q. Liu, J. Yan, and A. Li, “R3Det: Refined single-stage detector with feature refinement for rotating object,” arXiv preprint arXiv:1908.05612, 2019.
[12] X. Yang, H. Sun, K. Fu, J. Yang, X. Sun, M. Yan, and Z. Guo, “Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks,” Remote Sensing, vol. 10, no. 1, p. 132, 2018.
18. [13] X. Yang, H. Sun, X. Sun, M. Yan, Z. Guo, and K. Fu, “Position detection and direction prediction for arbitrary-oriented ships via multitask rotation region convolutional neural network,” IEEE Access, vol. 6, pp. 50839–50849, 2018.
[14] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[15] T.-Y. Lin, P. Dollár, R. B. Girshick, K. He, B. Hariharan, and S. J. Belongie, “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, no. 2, 2017, p. 4.
[16] G.-S. Xia, X. Bai, J. Ding, Z. Zhu, S. Belongie, J. Luo, M. Datcu, M. Pelillo, and L. Zhang, “DOTA: A large-scale dataset for object detection in aerial images,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[17] K. Li, G. Wan, G. Cheng, L. Meng, and J. Han, “Object detection in optical remote sensing images: A survey and a new benchmark,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 159, pp. 296–307, 2020.
[18] J. Redmon and A. Farhadi, “YOLO9000: Better, faster, stronger,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 7263–7271.
[19] L. Liu, Z. Pan, and B. Lei, “Learning a rotation invariant detector with rotatable bounding box,” arXiv preprint arXiv:1711.09405, 2017.
[20] S. Bao, X. Zhong, R. Zhu, X. Zhang, Z. Li, and M. Li, “Single shot anchor refinement network for oriented object detection in optical remote sensing imagery,” IEEE Access, vol. 7, pp. 87150–87161, 2019.
[21] S. M. Azimi, E. Vig, R. Bahmanyar, M. Körner, and P. Reinartz, “Towards multi-class object detection in unconstrained remote sensing imagery,” in Asian Conference on Computer Vision. Springer, 2018, pp. 150–165.
[22] C. Li, C. Xu, Z. Cui, D. Wang, T. Zhang, and J. Yang, “Feature-attentioned object detection in remote sensing imagery,” in 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019, pp. 3886–3890.