49. YOLO Concept
“We reframe object detection as a single regression problem, straight
from image pixels to bounding box coordinates and class probabilities.”
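As a sketch of what "a single regression problem" means, assuming the original YOLOv1 settings (a 7×7 grid, 2 boxes per cell, 20 classes), the network regresses one fixed-shape tensor per image:

```python
# YOLOv1 predicts one tensor per image: S x S grid cells, each with
# B boxes (x, y, w, h, confidence) plus C class probabilities.
def output_shape(S, B, C):
    # Each cell regresses B * 5 box values and C class scores.
    return (S, S, B * 5 + C)

print(output_shape(7, 2, 20))  # (7, 7, 30), the YOLOv1 output tensor
```

Every bounding box and class score comes out of this one tensor in a single forward pass, which is the "straight from pixels" reframing the quote describes.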
60. What is a frame
A video is a sequence of images called frames. Each frame is a two-dimensional grid of pixels. A set of connected pixels is called a "blob." In this case, the blob is a gull flying across the camera's FOV from left to right.
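The frame/blob idea can be shown with a tiny pure-Python sketch: the frame is a 2D grid (1 = foreground pixel), and a blob is the set of connected foreground pixels reachable from a seed pixel (a minimal flood fill, names assumed):

```python
def blob_at(frame, start):
    # Collect all 4-connected foreground pixels reachable from `start`.
    rows, cols = len(frame), len(frame[0])
    stack, blob = [start], set()
    while stack:
        r, c = stack.pop()
        if (r, c) in blob or not (0 <= r < rows and 0 <= c < cols):
            continue
        if frame[r][c] == 0:      # background pixel: not part of the blob
            continue
        blob.add((r, c))
        stack += [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return blob

frame = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(sorted(blob_at(frame, (1, 1))))  # [(1, 1), (1, 2), (2, 1)]
```

A real pipeline would get the binary mask from detection or background subtraction, but the definition of "blob" is exactly this connectivity.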
63. Pseudocode (i.e., the algorithm)
• Step 1: Detect objects with a bounding box in the frame at time t-1.
• Step 2: Calculate the centroids of the objects detected in the frame at time t-1.
• Step 3: Detect objects with a bounding box in the frame at time t, and assign a unique ID to each object.
• Step 4: Calculate the centroids of the objects detected in the frame at time t.
• Step 5: Calculate the Euclidean distance between the centroids of all objects detected in frames t-1 and t.
• Step 6: If the distance between the centroids at times t-1 and t is less than the threshold, it is the same object in motion; reuse the existing object ID and update the object's bounding box coordinates to the new values.
• Step 7: If the distance between the centroids at times t-1 and t exceeds the threshold, add a new object ID.
• Step 8: When objects tracked in the previous frame cannot be matched to any detection in the current frame, remove their object IDs from tracking.
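The steps above can be sketched as a minimal centroid tracker. This is pure Python; the function names and the 50-pixel distance threshold are assumptions, not part of the slides:

```python
import math
from itertools import count

_ids = count(1)  # global ID generator for newly seen objects

def centroid(box):
    # box = (x1, y1, x2, y2)
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def update_tracks(tracks, detections, threshold=50.0):
    """tracks: {object_id: box at t-1}; detections: boxes at time t."""
    new_tracks = {}
    unmatched = dict(tracks)
    for box in detections:
        cx, cy = centroid(box)
        # Step 5: Euclidean distance to every centroid from frame t-1.
        best_id, best_d = None, threshold
        for oid, old_box in unmatched.items():
            ox, oy = centroid(old_box)
            d = math.hypot(cx - ox, cy - oy)
            if d < best_d:
                best_id, best_d = oid, d
        if best_id is not None:
            # Step 6: same object in motion; keep its ID, update the box.
            new_tracks[best_id] = box
            del unmatched[best_id]
        else:
            # Step 7: too far from every known centroid: new object, new ID.
            new_tracks[next(_ids)] = box
    # Step 8: tracks left unmatched drop out of the returned dict.
    return new_tracks
```

Calling `update_tracks` once per frame with the fresh detections carries IDs forward: a box that moves only a few pixels keeps its ID, while a detection far from every known centroid gets a new one.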
80. • What makes this tricky is that there are several different coordinate systems that we’re dealing with:
• the full image, before resizing and cropping, e.g. 1920×1080 pixels
• the input image that the Core ML model sees, e.g. 416×416 pixels
• normalized coordinates relative to the crop region
• normalized coordinates relative to the full input image
• the UI view that displays the image and the bounding boxes
• Whenever we’re talking about bounding box coordinates, it’s important to understand the reference frame in which these coordinates live.
Run YOLOv7 model on iOS devices
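One of these conversions can be sketched concretely: mapping a box given in normalized coordinates relative to the crop region back to full-image pixels. The function name and the example crop are assumptions for illustration:

```python
def crop_norm_to_full_pixels(nx, ny, nw, nh, crop):
    """Map a box in normalized [0, 1] crop-relative coordinates back to
    full-image pixel coordinates.

    crop = (cx, cy, cw, ch): the crop region's origin and size, in pixels
    of the full image.
    """
    cx, cy, cw, ch = crop
    return (cx + nx * cw, cy + ny * ch, nw * cw, nh * ch)

# Say the model saw a centered 1080x1080 crop of a 1920x1080 frame.
crop = (420, 0, 1080, 1080)
print(crop_norm_to_full_pixels(0.5, 0.5, 0.25, 0.25, crop))
# (960.0, 540.0, 270.0, 270.0): box origin and size in full-frame pixels
```

The other hops (full image → view coordinates, Vision's bottom-left origin vs. UIKit's top-left) are each one more affine transform of the same shape; the point of the slide is that you must know which reference frame a box is in before applying any of them.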
87. Camera in OpenCV is easy?
https://gitlab.com/nilvana-ai/toolbox/hikrobot-py/-/blob/main/hikcam.py
https://gitlab.com/nilvana-ai/nilvana-x-counting/frame-processor/-/blob/main/pkg/capturer/capturer.go#L50
1. enumerate devices
2. select a device and create a handle
3. open the device
4. set the trigger mode
5. more settings [optional]
6. start grabbing
7. grab in a worker thread, or fire a software trigger
8. stop grabbing
9. close the device
10. destroy the handle