4_image_detection.pdf

國立臺北護理健康大學 NTUNHS
Image Detection
Orozco Hsu
2022-05-09

About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist

Tutorial Content
• Image Detection (YOLOX)
• Useful dataset introduction
• Labeling images and creating your own model
• Homework

Code
• Download code
• https://github.com/orozcohsu/ntunhs_2022_01.git
• Folder/file
• 20220509_inter_master/run.ipynb
• 20220509_inter_master/split_voc.py

Code
• Click the button
• Open it with Colab
• Copy it to your Google Drive
• Check your Google Drive

Object Detection Milestones
Ref: https://www.researchgate.net/figure/A-road-map-of-object-detection-Milestone-detectors-in-this-figure-VJ-Det-10-11-HOG_fig2_333077580

Input Image
Backbone: Feature extraction from a pre-trained model (ImageNet), e.g. VGG16, ResNet, EfficientNet, CSPDarknet53
Neck: Integrates the feature maps from each layer, using SPP, FPN, or PAN
Head: Predicts bboxes from the integrated feature maps
• One-stage: predicts whether there is a bbox in each grid-cell (dense)
• Two-stage: predicts ROI only (sparse)
https://medium.com/mdbros/%E4%BA%94%E5%88%86%E9%90%98%E8%AA%8D%E8%AD%98%E7%94%A8-yolo-%E5%81%9A%E8%A1%80%E6%B6%B2%E6%8A%B9%E7%89%87%E7%89%A9%E4%BB%B6%E5%81%B5%E6%B8%AC-object-detection-be3c77e13228

Proposal-based
Two-Stage Detector

About YOLOX
• YOLOX is a high-performance anchor-free YOLO
• Outperforms YOLOv3, v4, and v5
• Supports deployment with ONNX, TensorRT, and OpenVINO
Jetson Nano + TensorRT
Raspberry Pi + OpenVINO

YOLOX GitHub
Ref: https://github.com/Megvii-BaseDetection/YOLOX

You Only Look Once
• YOLO divides the feature-map into small grid-cells
• Each grid-cell is responsible for detecting the targets inside it, all in a single pass
• Each grid-cell predicts a bounding box:
• Location
• Confidence of the target
• Probability of each label

Merge the grid-cells that share the same label and outline the resulting box
Positive sample (bbox offset/ Regression)
Negative sample (background)
Ignore sample (not in the training labels)

The length of the prediction vector depends on the number of labels in the dataset
X, Y: center of the predicted box, relative to the grid-cell
W, H: width and height of the predicted object
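
The layout of one grid-cell prediction can be sketched as follows. This is a minimal illustration, assuming the common ordering [x, y, w, h, objectness, class probabilities]; the function name `split_prediction` and the sample values are hypothetical and not taken from the YOLOX code.

```python
import numpy as np

def split_prediction(vec, num_classes):
    """Split one grid-cell prediction into box, objectness, and class scores.

    Assumed layout (a common convention): [x, y, w, h, objectness, p_0 ... p_{C-1}].
    """
    assert vec.shape == (5 + num_classes,)
    box = vec[:4]          # x, y, w, h
    objectness = vec[4]    # confidence that the cell contains a target
    class_probs = vec[5:]  # one probability per label
    return box, objectness, class_probs

num_classes = 20                      # e.g. PASCAL VOC has 20 labels
vec = np.zeros(5 + num_classes)       # vector length grows with the label count
vec[:4] = [0.5, 0.5, 0.2, 0.3]        # made-up box values
vec[4] = 0.9                          # made-up objectness
vec[5 + 7] = 0.8                      # made-up probability for label index 7

box, objectness, class_probs = split_prediction(vec, num_classes)
best_label = int(np.argmax(class_probs))
score = objectness * class_probs[best_label]  # combined detection score
```

With 2 labels instead of 20, the same vector would have 7 entries instead of 25.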

Anchor-Base/ Anchor-Free
Anchor-based:
Provides some pre-defined boxes (large and small)
Scales and adjusts the anchor sizes to fit the predicted objects
https://medium.com/%E8%BB%9F%E9%AB%94%E4%B9%8B%E5%BF%83/cv-object-detection-1-anchor-free%E5%A4%A7%E7%88%86%E7%99%BC%E7%9A%842019%E5%B9%B4-e3b4271cdf1a
Anchor-free:
Uses FPN and focal loss
https://medium.com/%E8%BB%9F%E9%AB%94%E4%B9%8B%E5%BF%83/%E9%80%9A%E5%BE%80anchor-free%E7%9A%84%E7%9C%9F%E7%9B%B8-object-detection%E7%9A%84%E6%AD%A3%E8%B2%A0%E6%A8%A3%E6%9C%AC%E5%AE%9A%E7%BE%A9-83f2fe36167f
https://arxiv.org/pdf/1708.02002.pdf
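
The focal loss from the paper linked above can be sketched in NumPy as below. This is only an illustration of the paper's binary formula, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); it is not taken from the YOLOX code, and the function name and sample probabilities are assumptions.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss per sample.

    p: predicted probability of the positive class, y: 0/1 ground-truth label.
    alpha=0.25 and gamma=2.0 are the defaults reported in the paper.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

easy = focal_loss([0.9], [1])[0]  # well-classified positive: strongly down-weighted
hard = focal_loss([0.1], [1])[0]  # misclassified positive: keeps a large loss
```

The (1 - p_t)^gamma factor is what lets a dense detector cope with the many easy background cells: they contribute very little to the total loss.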

Feature Pyramid Network (FPN)
Merges the feature maps from different levels for prediction
https://arxiv.org/pdf/1612.03144.pdf
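
The top-down merging idea of FPN can be sketched with plain NumPy arrays. This is a simplified illustration under stated assumptions: the 1x1 lateral convolutions and 3x3 smoothing convolutions of the paper are omitted, every level is assumed to have the same channel count, and the names `upsample2x` and `top_down_merge` are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def top_down_merge(features):
    """Merge feature maps ordered fine -> coarse.

    Each map is assumed to be half the spatial size of the previous one.
    The coarsest map is kept as-is; every finer map is added to the
    upsampled result of the level above it.
    """
    merged = [features[-1]]
    for fine in reversed(features[:-1]):
        merged.insert(0, fine + upsample2x(merged[0]))
    return merged

c3 = np.ones((8, 16, 16))  # fine level (made-up sizes)
c4 = np.ones((8, 8, 8))
c5 = np.ones((8, 4, 4))    # coarse level
p3, p4, p5 = top_down_merge([c3, c4, c5])
```

Each output level keeps its own resolution but now also carries information from the coarser levels.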

Loss function
• Classification loss
• Localization loss (the difference between the ground-truth (G.T.) box and the predicted box)
• Confidence loss (objectness: whether the box contains an object, based on IoU)
• Total loss:
• Classification loss + Localization loss + Confidence loss
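
The sum above can be sketched as follows. This is a minimal example, assuming boxes in (x1, y1, x2, y2) form and using 1 - IoU as the localization loss; the weighting parameter `loc_weight` and the sample loss values are hypothetical and not taken from the YOLOX code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def total_loss(cls_loss, loc_loss, conf_loss, loc_weight=1.0):
    """Classification + localization + confidence loss (loc_weight is an assumed knob)."""
    return cls_loss + loc_weight * loc_loss + conf_loss

gt = (0, 0, 10, 10)      # ground-truth box
pred = (5, 0, 15, 10)    # predicted box, shifted right by half its width
loc_loss = 1.0 - iou(gt, pred)  # assumed IoU-based localization loss
loss = total_loss(cls_loss=0.2, loc_loss=loc_loss, conf_loss=0.1)
```

Here the two boxes overlap on half of each box, so the IoU is 50 / 150 and the localization term dominates the total.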

Non-Maximum Suppression (NMS)
NMS runs in the testing process (not in the training process), because no G.T. bbox is available at test time
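
A greedy NMS pass can be sketched in pure Python. This is a minimal version, assuming boxes in (x1, y1, x2, y2) form and a single class; the threshold of 0.5 and the sample boxes are made up, and real pipelines usually run this per class.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)  # highest-scoring remaining box
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

The second box overlaps the first with an IoU of 81 / 119, so it is suppressed, while the third box does not overlap anything and is kept.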

YOLOX download: https://drive.google.com/file/d/1P7CkENy1qqoQqeZALFkEP_D4LUVZ9eH-/view?usp=sharing

PASCAL VOC Datasets
• PASCAL VOC
• Pattern Analysis, Statistical Modelling and Computational Learning (Visual Object Classes)
• The dataset of the object detection challenge held between 2005 and 2012
• In total, there are 20 labels
• PASCAL VOC 2007
• 9,963 images
• 24,640 annotated objects
• PASCAL VOC 2012
• 11,530 images
• 27,450 annotated objects

PASCAL VOC Datasets
• Person: person
• Animal: bird, cat, cow, dog, horse, sheep
• Vehicle: aero-plane, bicycle, boat, bus, car, motorbike, train
• Indoor: bottle, chair, dining-table, potted-plant, sofa, tv
20 labels
Ref: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html

MS COCO Datasets
• MS COCO
• Microsoft Common Objects in Context
Ref: https://cocodataset.org/#home

Download LabelImg (For Windows)
• LabelImg
• Windows_v1.8.0
• Download: https://tzutalin.github.io/labelImg/
• Tutorial: https://tw.leaderg.com/article/index?sn=11159
Another annotation tool: http://www.jinglingbiaozhu.com/

LabelImg PASCAL VOC
• PASCAL VOC (xml) annotation format
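
A PASCAL VOC annotation file can be read with the standard library as sketched below. The sample XML is a hand-written, trimmed-down example (the file name, label, and coordinates are made up); files written by LabelImg contain additional fields that are ignored here.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the PASCAL VOC layout (values are made up).
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""

def parse_voc(xml_text):
    """Return (width, height, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.find("name").text,) + coords)
    return width, height, objects

width, height, objects = parse_voc(SAMPLE_XML)
```

Boxes in this format are stored as absolute pixel corners, one `object` element per labeled box.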

LabelImg YOLO
• YOLO annotation format
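
The YOLO text format stores one line per box as `class_id cx cy w h`, with the center and size normalized by the image dimensions. A minimal conversion from a PASCAL VOC box can be sketched as follows; the class list and box values are made up for illustration.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized (cx, cy, w, h)."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h

classes = ["cat", "dog"]  # hypothetical label list; the line index is the class id
cx, cy, w, h = voc_to_yolo((100, 120, 300, 360), img_w=640, img_h=480)
line = f"{classes.index('cat')} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"
# line == "0 0.312500 0.500000 0.312500 0.500000"
```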

LabelImg
• Define the labels in windows_v1.8.0\data\predefined_classes.txt
• Open the LabelImg tool and go to "Open Dir"
• Zoom in/out to scale your image and click "Create RectBox"
• Choose the label and click "Save"
• Click "Next Image" to continue labeling the next image

LabelImg
• Go to your label folder and check the generated files (PASCAL VOC)
• Double-check the VOCdevkit\VOC2007 folder
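
Splitting the labeled images into trainval and test lists can be sketched as below. This is not the course's split_voc.py; it is a minimal sketch that assumes the standard VOC layout (an `Annotations` folder with one XML per image and an `ImageSets/Main` folder holding the split lists), and the 20% test ratio is an arbitrary choice.

```python
import random
import tempfile
from pathlib import Path

def split_voc(voc_root, test_ratio=0.2, seed=0):
    """Write ImageSets/Main/trainval.txt and test.txt from the XML files found."""
    voc_root = Path(voc_root)
    ids = sorted(p.stem for p in (voc_root / "Annotations").glob("*.xml"))
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_test = int(len(ids) * test_ratio)
    splits = {"test": sorted(ids[:n_test]), "trainval": sorted(ids[n_test:])}
    out_dir = voc_root / "ImageSets" / "Main"
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, members in splits.items():
        (out_dir / f"{name}.txt").write_text("\n".join(members) + "\n")
    return splits

# Demo on a temporary folder with ten empty placeholder annotations.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "VOCdevkit" / "VOC2007"
    (root / "Annotations").mkdir(parents=True)
    for i in range(10):
        (root / "Annotations" / f"{i:06d}.xml").write_text("<annotation/>")
    splits = split_voc(root)
    trainval_lines = (root / "ImageSets" / "Main" / "trainval.txt").read_text().split()
```

The `trainval` list written here is the split that an `image_sets` entry such as `('2007', 'trainval')` refers to.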

YOLOX training
• Open run.ipynb and continue in Colab (CoLab.ipynb)
• Make sure the runtime type is set to GPU

Config YOLOX
• Modify the YOLOX/exps/example/yolox_voc/yolox_voc_s.py
• Check YOLOX/exps/default for the depth and width that match each model size
self.num_classes = 2
image_sets=[('2007', 'trainval')],
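
The relation between model size, depth, and width can be sketched as a small lookup. The multipliers below are recalled from the files in YOLOX/exps/default and should be verified against the repository; the helper `exp_overrides` is hypothetical and only collects the values to copy into the config file by hand.

```python
# Depth/width multipliers as recalled from YOLOX/exps/default
# (assumption: verify against the repository before relying on them).
MODEL_SCALES = {
    "yolox_nano": (0.33, 0.25),
    "yolox_tiny": (0.33, 0.375),
    "yolox_s": (0.33, 0.50),
    "yolox_m": (0.67, 0.75),
    "yolox_l": (1.00, 1.00),
    "yolox_x": (1.33, 1.25),
}

def exp_overrides(model_name, num_classes):
    """Hypothetical helper: the attributes to set in the experiment file."""
    depth, width = MODEL_SCALES[model_name]
    return {"num_classes": num_classes, "depth": depth, "width": width}

overrides = exp_overrides("yolox_s", num_classes=2)
# overrides == {"num_classes": 2, "depth": 0.33, "width": 0.5}
```

The depth and width should match the pretrained weight file being loaded, so yolox_s.pth goes with the yolox_s values.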

Config YOLOX
• Modify the YOLOX/yolox/data/datasets/voc_classes.py and
YOLOX/yolox/data/datasets/coco_classes.py
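
The class list in voc_classes.py is a Python tuple named VOC_CLASSES (coco_classes.py uses COCO_CLASSES in the same way). A minimal sketch of an edited list follows, with "cat" and "dog" as made-up labels; the order defines the class ids, and the number of entries should match self.num_classes in the config file.

```python
# Hypothetical two-label dataset; replace with your own label names.
VOC_CLASSES = (
    "cat",
    "dog",
)

num_classes = len(VOC_CLASSES)  # should equal self.num_classes in the config
class_to_id = {name: i for i, name in enumerate(VOC_CLASSES)}
```

Keep the trailing comma: with a single label, `("cat")` would be a plain string rather than a one-element tuple.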

Consider reducing the max epoch (max_epoch) in YOLOX/yolox/exp/yolox_base.py

Config YOLOX
• Add executable path to train.py

YOLOX training
• Training command
python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 16 --fp16 -o -c yolox_s.pth
-f: YOLOX configuration file
-d: number of GPUs
-b: batch size (adjust it to fit the available memory)
--fp16: half-precision floating-point format, refer to https://en.wikipedia.org/wiki/Half-precision_floating-point_format
-o: occupy GPU memory first for training
-c: network weight file

More
• Deploy the model to an edge device for real-time detection
• Edge device
• With an acceleration device on the edge device
• TensorRT, OpenVINO
• Without an acceleration device on the edge device
• https://medium.com/ching-i/yolo-fastest-ncnn-on-raspberry-pi-4-f44143b44e45
• https://chtseng.wordpress.com/2020/09/27/%E8%A6%BA%E5%BE%97yolo-tiny%E4%B8%8D%E5%A4%A0%E5%BF%AB%E5%97%8E%EF%BC%9F%E8%A9%A6%E8%A9%A6yolo-fastest/

More
• YOLOX resume from epoch
• https://chowdera.com/2021/11/20211119022033342w.html
• YOLOX documentation
• https://yolox.readthedocs.io/_/downloads/en/latest/pdf/

Homework
• Label other classes (more than 2) and retrain your YOLOX model, using at least 1,500 images for the training and testing data
• https://blog.csdn.net/nan355655600/article/details/119519294
• If you want to use the YOLO label format:
• https://d246810g2000.medium.com/%E5%A6%82%E4%BD%95%E4%BD%BF%E7%94%A8%E8%87%AA%E5%B7%B1%E7%9A%84%E8%B3%87%E6%96%99%E9%9B%86%E8%A8%93%E7%B7%B4-yolox-c02548734a48
