2. About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data / ML / AIoT / AI columnist
9. 9
Input Image
Backbone: extracts features with a model pre-trained on ImageNet, e.g. VGG16, ResNet, EfficientNet, CSPDarknet53
Neck: integrates the feature maps from different layers, using SPP, FPN, or PAN…
Head: predicts bounding boxes (bbox) from the integrated feature maps
• One-stage: predicts whether there is a bbox in each grid cell (dense)
• Two-stage: predicts only on regions of interest (ROI) (sparse)
https://medium.com/mdbros/%E4%BA%94%E5%88%86%E9%90%98%E8%AA%8D%E8%AD%98%E7%94%A8-yolo-%E5%81%9A%E8%A1%80%E6%B6%B2%E6%8A%B9%E7%89%87%E7%89%A9%E4%BB%B6%E5%81%B5%E6%B8%AC-object-detection-be3c77e13228
14. You Only Look Once
• YOLO divides the feature map into small grid cells
• Each grid cell detects its targets in a single pass
• Each cell predicts a bounding box with:
• Location
• Confidence that a target is present
• Probability of each label
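As a rough illustration of the grid idea, the sketch below maps an object's center point to a grid cell. It assumes the cell containing the center is the one responsible for the object (as in the original YOLO paper); the function name and the 448x448 image with a 7x7 grid are illustrative choices, not values from these slides.

```python
def responsible_cell(cx, cy, img_w, img_h, grid_size):
    """Return (row, col) of the grid cell that contains the point (cx, cy).

    Assumption: the cell containing the object's center is responsible
    for detecting that object.
    """
    # Scale pixel coordinates to grid units, clamping points on the far edge.
    col = min(int(cx / img_w * grid_size), grid_size - 1)
    row = min(int(cy / img_h * grid_size), grid_size - 1)
    return row, col


# A 7x7 grid over a 448x448 image: each cell covers 64x64 pixels.
print(responsible_cell(200, 100, 448, 448, 7))  # (1, 3)
```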
15. 15
Merge grid cells that share the same label and draw the outline of the resulting box
Positive sample (bbox offset / regression)
Negative sample (background)
Ignore sample (not in the training labels)
16. 16
The length of the output vector depends on the number of labels in the dataset
X, Y: the center of the box, relative to the grid cell
W, H: the predicted width and height of the object
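A minimal sketch of how one prediction vector could be split up. The ordering `[x, y, w, h, confidence, class scores...]` and the helper name `split_prediction` are assumptions for illustration; the actual layout depends on the YOLO version and implementation.

```python
def split_prediction(vec, num_classes):
    """Split one prediction vector into box, confidence, and class scores.

    Assumed layout (hypothetical): [x, y, w, h, confidence, p_class_0, ...].
    """
    expected = 5 + num_classes  # 4 box values + 1 confidence + one score per label
    if len(vec) != expected:
        raise ValueError(f"expected {expected} values, got {len(vec)}")
    box = tuple(vec[:4])
    confidence = vec[4]
    class_probs = list(vec[5:])
    return box, confidence, class_probs


# With 3 labels in the dataset, each prediction has 5 + 3 = 8 values.
box, conf, probs = split_prediction([0.5, 0.5, 0.2, 0.3, 0.9, 0.1, 0.7, 0.2], 3)
print(box, conf, probs)
```

With more labels the vector simply grows: a dataset with 80 labels would give 85 values per prediction under this layout.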
18. Anchor-Based / Anchor-Free
Anchor-based:
Provide a set of pre-defined boxes (anchors) of different sizes, large and small
Shift and scale an anchor to fit the predicted object
https://medium.com/%E8%BB%9F%E9%AB%94%E4%B9%8B%E5%BF%83/cv-object-detection-1-anchor-free%E5%A4%A7%E7%88%86%E7%99%BC%E7%9A%842019%E5%B9%B4-e3b4271cdf1a
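The anchor adjustment described above can be sketched as follows. This uses the YOLOv2/YOLOv3-style parameterization (sigmoid offsets for the center, exponential scaling for the size); the slides do not state which parameterization is meant, so treat that choice, the function name, and the example numbers as assumptions.

```python
import math


def decode_anchor(t, anchor_wh, cell_xy, stride):
    """Turn raw network outputs into a box, YOLOv2/v3 style (assumed).

    t         -- raw outputs (tx, ty, tw, th)
    anchor_wh -- pre-defined anchor width and height in pixels
    cell_xy   -- (column, row) index of the grid cell
    stride    -- pixels per grid cell
    """
    tx, ty, tw, th = t
    pw, ph = anchor_wh
    cx, cy = cell_xy

    def sigmoid(v):
        return 1.0 / (1.0 + math.exp(-v))

    # The center stays inside its cell because sigmoid is in (0, 1).
    bx = (cx + sigmoid(tx)) * stride
    by = (cy + sigmoid(ty)) * stride
    # The anchor size is scaled by a positive factor.
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return bx, by, bw, bh


# Zero outputs leave the anchor size unchanged and place it at the cell center.
print(decode_anchor((0, 0, 0, 0), (100, 50), (3, 1), 32))  # (112.0, 48.0, 100.0, 50.0)
```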
Anchor-free:
Use FPN and focal loss
https://medium.com/%E8%BB%9F%E9%AB%94%E4%B9%8B%E5%BF%83/%E9%80%9A%E5%BE%80anchor-free%E7%9A%84%E7%9C%9F%E7%9B%B8-object-detection%E7%9A%84%E6%AD%A3%E8%B2%A0%E6%A8%A3%E6%9C%AC%E5%AE%9A%E7%BE%A9-83f2fe36167f
https://arxiv.org/pdf/1708.02002.pdf
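For reference, a minimal sketch of the binary focal loss from the paper linked above, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The defaults alpha = 0.25 and gamma = 2 are the values reported in that paper; the scalar, pure-Python form and the `eps` clamp are simplifications for illustration.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for one prediction.

    p -- predicted probability of the positive class
    y -- ground-truth label, 1 (object) or 0 (background)
    """
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9, 1)   # confident and correct
hard = focal_loss(0.1, 1)   # confident and wrong
print(easy, hard)
```

The easy example contributes far less loss than the hard one, which is how focal loss keeps the many easy background cells from dominating training.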
19. Feature Pyramid Network (FPN)
Merge feature maps from different scales for prediction
https://arxiv.org/pdf/1612.03144.pdf
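A simplified sketch of the top-down merge in FPN: the coarser feature map is upsampled 2x and added element-wise to the finer one. It works on single-channel maps stored as nested lists and leaves out the 1x1 lateral and 3x3 smoothing convolutions that the paper uses, so it only illustrates the merge step; the function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def merge_top_down(coarse, fine):
    """Upsample the coarse map and add it element-wise to the fine map."""
    up = upsample2x(coarse)
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


coarse = [[1, 2],
          [3, 4]]                       # 2x2 map from a deep layer
fine = [[0] * 4 for _ in range(4)]      # 4x4 map from a shallower layer
for row in merge_top_down(coarse, fine):
    print(row)
```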
21. Loss function
• Classification loss
• Localization loss (difference between the ground-truth box and the predicted box)
• Confidence loss (objectness, based on the IoU of the predicted box)
• Total loss:
• Classification loss + Localization loss + Confidence loss
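A small sketch of the two pieces above: IoU between a predicted box and a ground-truth box, and the total loss as a plain sum. Boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates, and the unweighted sum follows the slide; real implementations often weight the three terms, which is not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # zero if boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def total_loss(cls_loss, loc_loss, conf_loss):
    """Unweighted sum of the three loss terms."""
    return cls_loss + loc_loss + conf_loss


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))   # overlap 1, union 7
print(total_loss(0.5, 1.0, 0.25))        # 1.75
```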
43. LabelImg
• Define the labels in windows_v1.8.0\data\predefined_classes.txt
• Open the LabelImg tool and click "Open Dir"
• Zoom in or out to scale your image, then click "Create RectBox"
• Choose the label and click "Save"
• Click "Next Image" to continue labeling the next image
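Once images are labeled, the saved annotations can be read back for checking. The sketch below assumes LabelImg was saving in Pascal VOC XML mode; the inline sample file and the helper name `read_voc_boxes` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (values are made up).
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) from VOC XML text."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(int(float(bndbox.findtext(key)))
                       for key in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), *coords))
    return boxes


print(read_voc_boxes(SAMPLE_XML))  # [('cat', 100, 120, 300, 360)]
```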
58. More
• Deploy the model to an edge device for real-time detection
• Edge device
• With an acceleration device on the edge device
• TensorRT, OpenVINO
• Without an acceleration device on the edge device
• https://medium.com/ching-i/yolo-fastest-ncnn-on-raspberry-pi-4-f44143b44e45
• https://chtseng.wordpress.com/2020/09/27/%E8%A6%BA%E5%BE%97yolo-tiny%E4%B8%8D%E5%A4%A0%E5%BF%AB%E5%97%8E%EF%BC%9F%E8%A9%A6%E8%A9%A6yolo-fastest/
59. More
• Resuming YOLOX training from a given epoch
• https://chowdera.com/2021/11/20211119022033342w.html
• YOLOX documentation
• https://yolox.readthedocs.io/_/downloads/en/latest/pdf/
60. Homework
• Try labeling other classes (more than 2) and rebuild your YOLOX model (at least 1,500 images for training and testing data)
• https://blog.csdn.net/nan355655600/article/details/119519294
• If you want to use the YOLO label format
• https://d246810g2000.medium.com/%E5%A6%82%E4%BD%95%E4%BD%BF%E7%94%A8%E8%87%AA%E5%B7%B1%E7%9A%84%E8%B3%87%E6%96%99%E9%9B%86%E8%A8%93%E7%B7%B4-yolox-c02548734a48
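For the YOLO label format, each object becomes one text line: `class_id cx cy w h`, with the box center and size normalized by the image width and height. A minimal sketch of converting a corner-style box (xmin, ymin, xmax, ymax, as in Pascal VOC) into such a line; the function name and example numbers are illustrative, and the linked article may organize the conversion differently.

```python
def voc_to_yolo(box, img_w, img_h, class_id):
    """Convert (xmin, ymin, xmax, ymax) in pixels to a YOLO label line."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2 / img_w   # normalized box center
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w        # normalized box size
    h = (ymax - ymin) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


# A 200x240 box in a 640x480 image, labeled as class 0.
print(voc_to_yolo((100, 120, 300, 360), 640, 480, 0))
# 0 0.312500 0.500000 0.312500 0.500000
```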