Satellite Image Object Detection Competition - 1st Place

Satellite Image Object Detection
{green669, sbson0621, rnans33, karma1002, ohhs}@koreatech.ac.kr
DICE Lab
KOREATECH
박주찬 손성빈 정준욱 오흥선 이선훈

DICE Lab
• Deep Intelligence for Cognitive Environment (DICE) Lab
• in School of Computer Science and Engineering, KOREATECH
• Focuses on deep intelligence systems that understand various cognitive environments through language and vision technologies.
• https://sites.google.com/view/dice-lab/home
• Looking for self-motivated graduate students!

Contents
• Problem Definition
• Challenges in Aerial Object Detection
• The Approaches to Challenges
• Model
• Experiments
• Conclusion

Problem Definition
• Develop an algorithm that recognizes ship types and detects ship locations in satellite imagery
• Object Detection
• Classification
• Localization
• Horizontal bounding boxes
• Oriented bounding boxes

Problem Definition
• Classification
• Localization

Problem Definition – Object Detection
• Object Detection = Classification + Localization (horizontal bounding box)
• Example output: Cat ( x, y, w, h )

Problem Definition – Aerial Object Detection
• Aerial Object Detection = Classification + Localization (oriented bounding box)
• Example output: Ship[1] ( x1, y1, x2, y2, x3, y3, x4, y4 )
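
The two localization formats can be related with a minimal sketch: the smallest horizontal box that encloses an oriented box. The function name `obb_to_hbb` and the sample coordinates are illustrative assumptions, not part of the competition code, and (x, y) is assumed here to be the top-left corner, which the slides do not specify.

```python
from typing import Sequence, Tuple


def obb_to_hbb(obb: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return the enclosing horizontal box (x, y, w, h) of an oriented box.

    `obb` is (x1, y1, x2, y2, x3, y3, x4, y4), the four corner points.
    Assumption: (x, y) is the top-left corner of the horizontal box.
    """
    if len(obb) != 8:
        raise ValueError("expected 8 values: x1, y1, ..., x4, y4")
    xs = obb[0::2]  # x1, x2, x3, x4
    ys = obb[1::2]  # y1, y2, y3, y4
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


# A hypothetical ship rotated by 45 degrees (a diamond-shaped box).
ship = (10, 0, 20, 10, 10, 20, 0, 10)
print(obb_to_hbb(ship))  # (0, 0, 20, 20)
```

For this rotated box the horizontal box covers 400 square pixels while the oriented box covers 200, which illustrates why oriented boxes fit rotated ships more tightly.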

Challenges in Aerial Object Detection

Challenges
1. Scarcity of data
2. Characteristics of the aerial view
3. Data imbalance

Challenge 1
Scarcity of data
<Aerial Data vs. general object detection datasets>
Dataset           Classes   Images
ILSVRC 2014       200       516,840
COCO2017          80        163,957
PASCAL VOC 2012   20        22,531
Aerial Data       4         2,676

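Challenge 2
Characteristics of the aerial view
High resolution; complex backgrounds and objects
<Figure: aerial image vs. ordinary image>
• Aerial image resolution: 3000 x 3000
• Backgrounds and objects are more complex than in ordinary photos
• The high resolution makes it hard to process the original image directly
Dataset    Average resolution
ImageNet   482 x 415
COCO       480 x 640
VOC2012    469 x 387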

Challenge 2
Characteristics of the aerial view
• Dense objects: densely packed objects are hard to detect
• Varied rotation angles: it is hard to account for every angle

Challenge 3
Data imbalance
<Number of objects per class in the Aerial Data>
Class              Objects   Ratio
Maritime vessels   12,018    67.3 %
Container          3,986     22.3 %
Oil tanker         1,807     10.2 %
Aircraft carrier   39        0.20 %

Approaches to the Challenges

Our Approaches
• Characteristics of the aerial view
• High-resolution processing → Image patches
• Varied rotation angles and dense objects → RoI Transformer
• Scarcity of data
→ Multi-scale augmentation
→ Appearance augmentation
• Data imbalance
→ Data uncertainty approach
+ A model robust to data noise and model uncertainty
→ Bayesian deep learning approach

Image Patches
<Figure: a 1024 x 1024 patch placed at offset (0, 0) of the 3000 x 3000 image>
• A single 3000 x 3000 image
→ Split into patches of 1024 x 1024, the model's input size, with stride : 512
→ 25 image patches in total
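
The patch count can be reproduced with a minimal sketch. It assumes that when the last stride step would run past the image border, one extra patch is placed flush with the border; the slides do not state how the border is handled, and the function names are illustrative.

```python
from typing import List, Tuple


def patch_origins(image_size: int, patch_size: int, stride: int) -> List[int]:
    """Offsets of the patches along one axis.

    Assumption: if the regular stride grid does not end exactly at the
    border, one extra border-aligned patch is appended.
    """
    if patch_size >= image_size:
        return [0]
    last = image_size - patch_size
    origins = list(range(0, last + 1, stride))
    if origins[-1] != last:
        origins.append(last)  # border-aligned patch
    return origins


def split_into_patches(image_size: int, patch_size: int, stride: int) -> List[Tuple[int, int]]:
    """Top-left (x, y) corner of every patch of a square image."""
    axis = patch_origins(image_size, patch_size, stride)
    return [(x, y) for y in axis for x in axis]


print(patch_origins(3000, 1024, 512))            # [0, 512, 1024, 1536, 1976]
print(len(split_into_patches(3000, 1024, 512)))  # 25
```

Under this assumption there are five offsets per axis, so a 3000 x 3000 image yields 5 x 5 = 25 overlapping patches.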

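Data Augmentation
<Figure: a cat image under geometric transformation and under data attribute transformation>
Augmentation falls into two broad types: geometric transformation and data attribute (appearance) transformation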

Data Augmentation
• Geometric transformation
• Multi-scale augmentation
• Data attribute transformation (Appearance augmentation)
• Color tone change
• Fog
• Brightness change
• Gaussian blurring

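Data Augmentation
• Multi-scale augmentation: training on the training data at a variety of scales
• Advantage
• Produces a model that is robust to objects of different sizes
• Ours
• Image patches are generated at scales of 1024 x 1024, 1500 x 1500, and 3000 x 3000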

Data Augmentation
• Ex) One 3000 x 3000 image
-> 25 images of 1024 x 1024, stride : 512
<Figure: 1024 x 1024 patches on the 3000 x 3000 image; the first patch spans 0 to 1024 and the next spans 512 to 1536>

Data Augmentation
-> 16 images of 1500 x 1500, stride : 700
<Figure: 1500 x 1500 patches on the 3000 x 3000 image; the first patch spans 0 to 1500 and the next spans 700 to 2200>

Data Augmentation
-> 25 image patches of 1024 x 1024, stride : 512
-> 16 image patches of 1500 x 1500, stride : 700
-> 1 image patch of 3000 x 3000
• 42 image patches are produced per image
• Total number of training images : 2,646 x 42 = 111,132
• The training images vary in size
• Each patch is resized to a fixed 1024 x 1024 before it is fed to the model
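
Resizing a patch to the fixed input size also rescales its box annotations. The sketch below shows one way to map an image-space polygon into the resized patch; the function name, argument layout, and sample box are assumptions made for illustration, not the team's code.

```python
from typing import List, Sequence


def to_model_input(polygon: Sequence[float], origin_x: float, origin_y: float,
                   patch_size: int, input_size: int = 1024) -> List[float]:
    """Map an image-space polygon into a patch resized to input_size.

    The polygon is (x1, y1, ..., x4, y4) in full-image pixels; the patch
    starts at (origin_x, origin_y) and is patch_size pixels wide and tall.
    """
    out: List[float] = []
    for i in range(0, len(polygon), 2):
        out.append((polygon[i] - origin_x) * input_size / patch_size)
        out.append((polygon[i + 1] - origin_y) * input_size / patch_size)
    return out


# Patch counts per scale as listed on the slide.
patches_per_image = 25 + 16 + 1
print(patches_per_image, 2646 * patches_per_image)  # 42 111132

# A hypothetical box that exactly fills the 1500 x 1500 patch at (700, 700).
box = [700, 700, 2200, 700, 2200, 2200, 700, 2200]
print(to_model_input(box, 700, 700, patch_size=1500))
```

A 1500 x 1500 patch is shrunk by a factor of 1024/1500, and the 3000 x 3000 patch by 1024/3000, so the same ship appears at several apparent sizes during training.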

Data Augmentation
• Geometric transformation
• Data attribute transformation (Appearance augmentation)
• Color tone change
• Fog
• Brightness change
• Gaussian blurring

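Data Augmentation
• Analysis of aerial data attributes
• Time of day (lighting changes with the position of the sun)
• Satellite performance (higher-performance satellites give cleaner images, lower-performance ones give blurrier images)
• Fog and clouds
• Electromagnetic band captured by the satellite (infrared, visible light)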

Data Augmentation
<Appearance augmentation combination ratios>
Color tone      Normal (65%), Cool (15%), Warm (15%), Infrared (5%)
Fog             Normal (76%), options 1~5 (24%)
Gaussian blur   Normal (76%), Blurring (24%): filter size (5,5), standard deviation (1~10)
Brightness      Normal (38%), Bright (62%): -20 ~ 50
Image = Color tone × Fog × Gaussian blur × Brightness
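
The combination can be sketched as four independent draws, one per attribute. This is only an illustration of the sampling ratios in the table: the uniform sampling inside each range, the equal weighting of fog options 1 to 5, and the integer brightness offset are assumptions, and the image operations themselves are left out.

```python
import random
from typing import Dict


def sample_appearance(rng: random.Random) -> Dict[str, object]:
    """Draw one appearance configuration using the ratios from the table."""
    color = rng.choices(["normal", "cool", "warm", "infrared"],
                        weights=[65, 15, 15, 5])[0]
    # Fog: 24% of images get one of the options 1..5 (assumed equally likely).
    fog_option = rng.randint(1, 5) if rng.random() < 0.24 else None
    # Blur: 24% of images, (5, 5) filter, sigma assumed uniform in [1, 10].
    blur_sigma = rng.uniform(1.0, 10.0) if rng.random() < 0.24 else None
    # Brightness: 62% of images, offset assumed to be an integer in [-20, 50].
    brightness_offset = rng.randint(-20, 50) if rng.random() < 0.62 else 0
    return {
        "color": color,
        "fog_option": fog_option,
        "blur_kernel": (5, 5) if blur_sigma is not None else None,
        "blur_sigma": blur_sigma,
        "brightness_offset": brightness_offset,
    }


rng = random.Random(0)
for _ in range(3):
    print(sample_appearance(rng))
```

Because the four draws are independent, a single image can receive several effects at once, for example a warm tone plus fog plus blur.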

Data Augmentation
Example 1
Left: original photo, right: image after appearance augmentation (sampled according to the ratios)

Data Augmentation
Example 2
Left: original photo, right: image after appearance augmentation (sampled according to the ratios)

RoI Transformer
<HRoI vs RRoI [2]>
• HRoI (Horizontal Region of Interest)
• Takes the RoI regardless of the orientation of the bus
• RRoI (Rotated Region of Interest)
• Rotates the RoI itself to follow the orientation of the bus, which makes it robust on densely packed images

RoI Transformer
<Figure: example anchors on the feature map>
Anchor scale : 3, Anchor ratio : 1:1, Anchor angle : 0
Anchor scale : 3, Anchor ratio : 1:2, Anchor angle : 0
Anchor scale : 3, Anchor ratio : 1:1, Anchor angle : π/6
• What is an anchor?
• A candidate needed to select an RRoI
• Anchors must be obtained before RRoIs can be computed
• Each anchor has a scale, a ratio, and an angle
• Scale controls the size of the anchor
• Ratio controls the aspect ratio of the anchor
• Angle controls the rotation angle of the anchor
• Formula for the number of anchors
→ (num_scales × num_aspect_ratios × num_angles)[3,4,5,6]

RoI Transformer
• Problems with the conventional RRoI (Rotated Region of Interest)
• Formula for the number of anchors
→ (num_scales × num_aspect_ratios × num_angles)[3,4,5,6]
• If the anchor angles are made n times finer
• Computation grows n times
• Memory grows n times
• The matching efficiency between proposals also drops

RoI Transformer
• RoI transformer[2]
• Infers the angle itself, which drastically reduces computation
• number of anchors = (num_scale * num_aspect_ratio * num_angles)
→ number of anchors = (num_scale * num_aspect_ratio * 1)
• Extends the set of representable angles to infinitely many
• angle $\in \{\frac{\pi}{2}, \frac{\pi}{3}, \frac{\pi}{4}, \dots, \frac{\pi}{n}\}$, finite set
→ angle $\in \mathbb{R}$, infinite set
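
The effect of the formula can be checked with a few lines of arithmetic. The scale, ratio, and angle counts below are hypothetical values chosen only to illustrate the growth; they are not the settings used by the team.

```python
def anchors_per_location(num_scales: int, num_aspect_ratios: int, num_angles: int = 1) -> int:
    """Number of anchors generated at each feature-map location."""
    return num_scales * num_aspect_ratios * num_angles


# Hypothetical configuration: 3 scales and 3 aspect ratios.
rotated = anchors_per_location(3, 3, num_angles=6)     # rotated anchors, 6 angles
finer = anchors_per_location(3, 3, num_angles=12)      # angles made 2x finer
horizontal_only = anchors_per_location(3, 3)           # angle is inferred instead

print(rotated, finer, horizontal_only)  # 54 108 9
```

Doubling the angular resolution doubles the anchor count, while inferring the angle keeps the count fixed at num_scales × num_aspect_ratios.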

Data Uncertainty
<Balanced data vs Imbalanced data class feature space view[7]>
• The problem with imbalanced data
• With balanced data, the boundaries between classes are ideal
• With imbalanced data, the class boundary is skewed toward the class with less data (more false positives)

Data Uncertainty
<Regression and classification uncertainty with data frequency[7]>
• Correlation between imbalanced data and uncertainty
• The less data a class has, the higher its uncertainty

Data Uncertainty
• How uncertainty is measured
• Uses a technique that captures aleatoric uncertainty and epistemic uncertainty at the same time [7]
→ An improvement over earlier uncertainty measures
• How data imbalance is addressed
• Category-level solution[7]
• The uncertainty is measured and a margin of 0.5 × uncertainty is given to the rare class, which recovers the rare-class region
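
One way to picture the margin is an additive-margin cross entropy, sketched below. This is a simplified illustration and may differ from the exact formulation in [7]: the margin 0.5 × uncertainty is subtracted from the true-class logit, so a class with high uncertainty must be predicted with a larger gap before its loss becomes small. The logits and per-class uncertainty values are hypothetical.

```python
import math
from typing import Sequence


def margin_cross_entropy(logits: Sequence[float], label: int,
                         class_uncertainty: Sequence[float],
                         scale: float = 0.5) -> float:
    """Cross entropy with a per-class margin of scale * uncertainty.

    Assumption: the margin is subtracted from the true-class logit.
    """
    adjusted = list(logits)
    adjusted[label] -= scale * class_uncertainty[label]
    peak = max(adjusted)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in adjusted))
    return log_sum - adjusted[label]


# Hypothetical values; index 3 stands for the rare class (aircraft carrier).
logits = [2.0, 1.0, 0.5, 1.5]
uncertainty = [0.1, 0.2, 0.3, 1.0]
no_margin = [0.0, 0.0, 0.0, 0.0]

print(margin_cross_entropy(logits, 3, no_margin))
print(margin_cross_entropy(logits, 3, uncertainty))
```

With the margin, the same logits give a larger loss for the rare class, which pushes the decision boundary away from it during training.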

Our Approaches
+ A model robust to data noise and overfitting

Bayesian deep learning inference[8]
Bayesian deep learning (predicts a distribution): $y^* \sim p(y^* \mid x^*, X, Y)$
Ordinary deep learning (predicts a point): $y^* = f(x^*; \theta)$
• Bayesian deep learning offers techniques for handling noise in the data
• Because it infers a predictive distribution rather than a single predicted point, it is robust to model overfitting
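
The difference between a point prediction and a predictive distribution can be sketched with Monte Carlo dropout, one common approximation discussed in [8]. The slides do not say which approximation the team used, so this is an assumption; the tiny linear model and its weights are hypothetical.

```python
import random
import statistics
from typing import List, Sequence, Tuple

WEIGHTS = [0.8, -0.5, 0.3, 1.2]  # hypothetical trained weights


def point_prediction(x: Sequence[float]) -> float:
    """Ordinary deep learning: a single output y* = f(x*; theta)."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x))


def stochastic_forward(x: Sequence[float], rng: random.Random,
                       drop_prob: float = 0.5) -> float:
    """One forward pass with dropout kept active at test time."""
    keep = 1.0 - drop_prob
    total = 0.0
    for w, xi in zip(WEIGHTS, x):
        if rng.random() < keep:
            total += w * xi / keep  # inverted-dropout rescaling
    return total


def predictive_distribution(x: Sequence[float], num_samples: int = 200,
                            seed: int = 0) -> Tuple[float, float]:
    """Mean and standard deviation over repeated stochastic passes."""
    rng = random.Random(seed)
    samples: List[float] = [stochastic_forward(x, rng) for _ in range(num_samples)]
    return statistics.mean(samples), statistics.pstdev(samples)


x_star = [1.0, 2.0, 3.0, 4.0]
print(point_prediction(x_star))
print(predictive_distribution(x_star))
```

The spread of the sampled outputs serves as an uncertainty estimate, which a single point prediction cannot provide.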

Contents
• Model
• Experiments
• Conclusion

Overall Process
<Figure: overall pipeline>
Input image I_n (3x3000x3000)
→ Augmentation f_aug(I_n): image patches (3x1024x1024), patch 1, patch 2, …, patch K, …, patch N
→ Model: box regression and classification on each patch
→ Merge f_merge([patch 1, …, patch N])
→ Output image (3x3000x3000) with detections (e.g. oil tanker)
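
The merge step can be sketched as mapping each patch-level detection back to full-image coordinates. The slides do not describe the internals of f_merge, so everything below is an assumption: the data layout, the function names, and the omission of duplicate removal (overlapping patches detect the same ship more than once, so a rotated non-maximum suppression step would normally follow).

```python
from typing import Dict, List, Sequence, Tuple

Detection = Tuple[str, float, List[float]]  # (class name, score, polygon)
PatchKey = Tuple[int, int, int]             # (origin_x, origin_y, patch_size)


def patch_to_image(polygon: Sequence[float], origin_x: int, origin_y: int,
                   patch_size: int, input_size: int = 1024) -> List[float]:
    """Undo the resize to input_size, then shift by the patch origin."""
    out: List[float] = []
    for i in range(0, len(polygon), 2):
        out.append(polygon[i] * patch_size / input_size + origin_x)
        out.append(polygon[i + 1] * patch_size / input_size + origin_y)
    return out


def merge_patches(patch_detections: Dict[PatchKey, List[Detection]]) -> List[Detection]:
    """Collect all detections in image coordinates, highest score first."""
    merged: List[Detection] = []
    for (ox, oy, size), detections in patch_detections.items():
        for name, score, polygon in detections:
            merged.append((name, score, patch_to_image(polygon, ox, oy, size)))
    merged.sort(key=lambda det: det[1], reverse=True)
    return merged


# Hypothetical detections from two patches of different scales.
results = merge_patches({
    (512, 0, 1024): [("oil tanker", 0.9, [100, 100, 200, 100, 200, 150, 100, 150])],
    (700, 700, 1500): [("oil tanker", 0.8, [0, 0, 1024, 0, 1024, 1024, 0, 1024])],
})
for det in results:
    print(det)
```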

Our Model
<Figure: model architecture>
Image patch (3x1024x1024)
→ Backbone (ResNeXt101): deep features
→ FPN + BFN: feature pyramids
→ RPN:0, RPN:1, RPN:2, RPN:3, RPN:4
→ merge
→ Detections (e.g. oil tanker)

Result Analysis
Model              Public mAP
ResNet50 + FPN     0.561
ResNet101 + FPN    0.608
ResNeXt101 + FPN   0.706

Model                         Public mAP
ResNeXt101 + FPN              0.706
ResNeXt101 + FPN + BFN        0.734
ResNeXt101 + BiFPN[9] + BFN   0.687
Larger backbones give better performance.
Adding the BFN module improves performance.
The BiFPN[9] module actually lowers performance.

Result Analysis
<Ablation table with columns Model, Appearance, Multi-scale training, Uncertainty, Public mAP, Private mAP; the marks in the Appearance, Multi-scale training, and Uncertainty columns are not recoverable>
Model                                  Public mAP   Private mAP
ResNeXt101 + FPN + BFN                 0.762        -
ResNeXt101 + FPN + BFN                 0.812        -
ResNeXt101 + FPN + BFN*                0.840        -
ResNeXt101 + FPN + BFN*                0.849        0.824
Cascaded[10] ResNeXt101 + FPN + BFN*   0.861        -
Cascaded[10] ResNeXt152 + FPN + BFN*   0.838        -
* : multi-scale test applied
Applying multi-scale testing improves performance
Using the Bayesian model and handling data imbalance improves performance
Cascaded[10] ResNeXt101 + FPN + BFN scored higher on Public mAP but lower on Private mAP
Cascaded[10] ResNeXt152 + FPN + BFN did not reach its full performance because of limited time

Result Analysis
Team (rank)                  Provisional score   Final score   Score change
DICE (1st)                   0.849               0.825         -0.024
Top secret (2nd)             0.844               0.815         -0.029
박태현_1579495977001 (3rd)   0.813               0.765         -0.048
Lowest variance in model results among the top 3 teams
➔ A model robust to overfitting and noise

Result Analysis
• With multi-scale applied, large ships are detected better
<Figure: before multi-scale vs. after multi-scale>

Result Analysis
• Most small ships are detected well

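Conclusion
• How the challenges were solved
➔ Characteristics of the aerial view: solved by splitting images into patches and using RoI Transformer
➔ Scarcity of data: solved with multi-scale augmentation and appearance augmentation
➔ Data imbalance: solved by measuring uncertainty and assigning a per-category margin
• A model robust to noise and overfitting
➔ A Bayesian model was used to build a model robust to noise and overfitting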

Further Discussion
• CBNet: A Novel Composite Backbone Network Architecture for Object Detection[10]
• CBNet: combines multiple backbones
• State-of-the-art model for COCO object detection (as of 2020-04-09)
• Outperforms the plain ResNeXt on the COCO dataset

References
[1] https://commons.wikimedia.org/wiki/File:Aerial_photograph_of_a_cargo_ship.jpg
[2] Ding, J., Xue, N., Long, Y., Xia, G. S., & Lu, Q. (2019). Learning RoI Transformer for oriented object detection in aerial images. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June, 2844–2853. https://doi.org/10.1109/CVPR.2019.00296
[3] Azimi, S. M., Vig, E., Bahmanyar, R., Körner, M., & Reinartz, P. (2018). Towards multi-class object detection in unconstrained remote sensing imagery. arXiv:1807.02700
[4] Ma, J., Shao, W., Ye, H., Wang, L., Wang, H., Zheng, Y., & Xue, X. (2018). Arbitrary-oriented scene text detection via rotation proposals. TMM.
[5] Zhang, Z., Guo, W., Zhu, S., & Yu, W. (2018). Toward arbitrary-oriented ship detection with rotated region proposal and discrimination networks. IEEE Geosci. Remote Sensing Lett., (99), 1–5.
[6] Yang, X., Sun, H., Fu, K., Yang, J., Sun, X., Yan, M., & Guo, Z. (2018). Automatic ship detection in remote sensing images from Google Earth of complex scenes based on multiscale rotation dense feature pyramid networks. Remote Sensing, 10(1), 132.
[7] Khan, S., Hayat, M., Zamir, S. W., Shen, J., & Shao, L. (2019). Striking the right balance with uncertainty. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019-June, 103–112. https://doi.org/10.1109/CVPR.2019.00019
[8] Yarin Gal. (2017). Uncertainty in Deep Learning. Phd Thesis, 1(1), 1–11.
https://doi.org/10.1371/journal.pcbi.1005062
[9] Tan, M., Pang, R., & Le, Q. V. (2019). EfficientDet: Scalable and Efficient Object Detection.
http://arxiv.org/abs/1911.09070
[10] Liu, Y., Wang, Y., Wang, S., Liang, T., Zhao, Q., Tang, Z., & Ling, H. (2019). CBNet: A Novel Composite
Backbone Network Architecture for Object Detection. http://arxiv.org/abs/1909.03625