This document provides a summary of image classification using deep learning techniques. It begins with an introduction to the speaker and their background. It then discusses the main types of image AI tasks like classification, detection, and segmentation. The document reviews the history and timeline of deep learning, important datasets like ImageNet, and algorithms such as convolutional neural networks. It presents the typical process flow for image-based deep learning including feature extraction using convolutional and pooling layers, classification layers, and different network architectures. The document concludes by discussing a homework assignment on building a multi-class image classification model using a dataset of dog, cat, and bird images.
Computer Vision, abbreviated as CV, aims to teach computers to achieve human-level vision capabilities. Applications of CV in self-driving cars, robotics, healthcare, education, and the multitude of apps that let customers use smartphone cameras to convey information have made it one of the most popular fields in Artificial Intelligence. Recent advances in Deep Learning, data storage, and computing capabilities have led to the huge success of CV. There are several tasks in computer vision, such as classification, object detection, image segmentation, optical character recognition, scene reconstruction, and many others.
In this presentation I will talk about applying Transfer Learning, image classification, object detection, and the metrics required to measure them on still images. The increase in accuracy of CV tasks over the past decade is due to Convolutional Neural Networks (CNNs), which form the base of architectures such as ResNet and VGGNet. I will go through how to use these pre-trained models for image classification and feature extraction. One of the breakthroughs in object detection has come with single-shot detection, where the bounding box and the class of the object are predicted simultaneously. This leads to low latency during inference (155 frames per second) and high accuracy. This is the framework behind object detection using YOLO; I will explain how to use YOLO for specific use cases.
Improving computer vision models at scale, Dr. Mirko Kämpf
Rigorous improvement of an image recognition model often requires multiple iterations of eyeballing outliers, inspecting statistics of the output labels, then modifying and retraining the model. When testing data is present at the petabyte scale, the ability to seamlessly access all the images that have been assigned specific labels poses a technical challenge by itself.
Marton Balassi, Mirko Kämpf, and Jan Kunigk share a solution that automates the process of running the model on the testing data and populating an index of the labels so they become searchable. Images and labels are stored in HBase. The model is encapsulated in a PySpark program, while the images are indexed with Solr and can be accessed from a Hue dashboard.
Improving computer vision models at scale, Jan Kunigk
We developed a solution that automatically adds tags to images via neural networks running at scale with TensorFlow on Spark. Images are stored in HBase and tags become searchable via Solr.
Part 2 of the Deep Learning Fundamentals Series, this session discusses Tuning Training (including hyperparameters, overfitting/underfitting), Training Algorithms (including different learning rates, backpropagation), Optimization (including stochastic gradient descent, momentum, Nesterov Accelerated Gradient, RMSprop, Adaptive algorithms - Adam, Adadelta, etc.), and a primer on Convolutional Neural Networks. The demos included in these slides are running on Keras with TensorFlow backend on Databricks.
Object extraction from satellite imagery using deep learning, Aly Abdelkareem
Presentation on extracting objects from satellite imagery using deep learning techniques. It includes a comparison of state-of-the-art approaches in computer vision.
A Distributed Deep Learning Approach for the Mitosis Detection from Big Medic..., Databricks
The strongest indicator of a cancer patient's prognosis is the number of mitotic bodies that a pathologist manually counts from high-resolution whole-slide histopathology images. Obviously, it is not efficient to count the mitosis number manually. But it is still challenging to automate the process of mitosis detection due to the limited training datasets and the intensive computing involved in model training and inference. This presentation introduces a large-scale deep learning approach to train a two-stage CNN-based model with high accuracy to detect the mitosis locations directly from the high-resolution whole-slide images. In detail, we first train a nuclei detection model to remove the background information from the raw whole-slide histopathology images. Second, a customized ResNet-50 model is trained on the dataset cleaned in the first step. The first step saves training time while improving the model performance in the second step. A false-positive oversampling approach is used to further improve the model performance. With these models, the inference process is conducted to detect the mitosis locations from the large volume of histopathology images in parallel. Meanwhile, the whole pipeline, including data preprocessing, model training, hyperparameter tuning, and inference, is parallelized by utilizing distributed TensorFlow, Apache Spark, and HDFS. The experiences and techniques in this project can be applied to other large-scale deep learning problems as well.
Speaker: Fei Hu
Transfer learning (TL) is a research problem in machine learning (ML) that focuses on applying knowledge gained while solving one task to a related task
LesFurets.com: From 0 to Cassandra on AWS in 30 days - Tsunami Alerting Syste..., DataStax Academy
An earthquake occurs in the Sea of Japan. A tsunami is likely to hit the coast. The population must be warned by SMS. A datacenter has been damaged by the earthquake. Will the alerting system still work?
Building this simple alerting system is a great way to start with Cassandra, as we discovered while teaching a hands-on big data class at a French university.
What were the reasons that led a majority of students to choose Cassandra to implement a fast, resilient, highly available big data system deployed on AWS?
What were the common pitfalls, the modeling alternatives, and their performance impact?
Separating Hype from Reality in Deep Learning with Sameer Farooqui, Databricks
Deep Learning is all the rage these days, but where does the reality of what Deep Learning can do end and the media hype begin? In this talk, I will dispel common myths about Deep Learning that are not necessarily true and help you decide whether you should practically use Deep Learning in your software stack.
I’ll begin with a technical overview of common neural network architectures like CNNs, RNNs, GANs and their common use cases like computer vision, language understanding or unsupervised machine learning. Then I’ll separate the hype from reality around questions like:
• When should you prefer traditional ML systems like scikit-learn or Spark.ML instead of Deep Learning?
• Do you no longer need to do careful feature extraction and standardization if using Deep Learning?
• Do you really need terabytes of data when training neural networks or can you ‘steal’ pre-trained lower layers from public models by using transfer learning?
• How do you decide which activation function (like ReLU, leaky ReLU, ELU, etc) or optimizer (like Momentum, AdaGrad, RMSProp, Adam, etc) to use in your neural network?
• Should you randomly initialize the weights in your network or use more advanced strategies like Xavier or He initialization?
• How easy is it to overfit/overtrain a neural network and what are the common techniques to avoid overfitting (like l1/l2 regularization, dropout, and early stopping)?
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2..., pchutichetpong
M Capital Group ("MCG") expects demand to grow and supply to evolve, facilitated by institutional investment rotating out of offices and into work-from-home ("WFH") infrastructure, while the need for data storage keeps expanding as global internet usage grows, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ..., Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
2. About me
• Education
• NCU (MIS), NCCU (CS)
• Work Experience
• Telecom big data Innovation
• AI projects
• Retail marketing technology
• User Group
• TW Spark User Group
• TW Hadoop User Group
• Taiwan Data Engineer Association Director
• Research
• Big Data/ ML/ AIOT/ AI Columnist
3. Tutorial Content
Image in AI process types
Overall technologies
Homework
Deep Learning history timeline
Exercise in image classification
6. Deep Learning history timeline
• From 1943-2019
Ref: https://machinelearningknowledge.ai/brief-history-of-deep-learning/
7. Deep Learning history timeline
• 2018 Turing Award
• Bengio, Hinton, and LeCun are sometimes referred to as the "Godfathers of AI" and "Godfathers of Deep Learning"
Ref: https://awards.acm.org/about/2018-turing
8. Deep Learning history timeline
• ImageNet dataset
• Over 15 million images with more than 22,000 categories
• ILSVRC
Ref: https://image-net.org/about.php
Ref: https://www.cs.princeton.edu/courses/archive/spr18/cos598B/slides/cos598b_7feb18_imagenet.pdf
9. Deep Learning history timeline
• ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
Ref: https://medium.com/nanonets/how-to-automate-surveillance-with-deep-learning-c8dea1d6387f
Ref: https://image-net.org/index.php
10. Deep Learning history timeline
• Object detection
• val top-1 error: the model's single highest-probability prediction is wrong
• val top-5 error: none of the model's five highest-probability predictions matches the true label
• VGG16 has about 138 million parameters and a roughly 500 MB model size (13 Conv + 3 FC = 16 weight layers)
• GPT-3, the model behind ChatGPT, has 175 billion parameters and was trained on about 45 TB of text data
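The top-k idea above can be sketched in a few lines of plain Python; the class scores and class indices below are made up for illustration:

```python
def topk_correct(scores, true_label, k):
    """Return True if true_label is among the k highest-scoring classes."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return true_label in ranked[:k]

# Hypothetical class scores for one image; suppose class index 2 is the true label.
scores = [0.05, 0.10, 0.60, 0.15, 0.04, 0.03, 0.02, 0.01]
top1_error = not topk_correct(scores, 2, 1)  # top-1 prediction is correct here
top5_error = not topk_correct(scores, 2, 5)  # true label is within the top 5
```

Averaging these booleans over a validation set gives the top-1 and top-5 error rates reported for ILSVRC models.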
12. Deep Learning history timeline
• Object detection from video
• IoU (Intersection over Union)
• IoU ≥ 0.5: we count the detection as a true positive (TP)
• IoU < 0.5: we count it as a false positive (FP)
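A minimal IoU function, assuming axis-aligned boxes given as (x1, y1, x2, y2) corners (a common but not universal convention):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two partially overlapping boxes: IoU = 25 / 175 ≈ 0.143, so this
# detection would be counted as a false positive at the 0.5 threshold.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```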
14. More
• AP (Average Precision): used for object detection; the area under one class's precision-recall curve
• mAP (mean Average Precision): the AP averaged over all object classes
Example: rank the predictions by "is dog" probability and plot the precision-recall curve; AP is the area under that curve (AUC). With 8 dogs in total: taking the top 4 predictions (3 correct), precision = 3/4 = 0.75 and recall = 3/8 = 0.375; taking the top 10 (5 correct), precision = 5/10 = 0.5 and recall = 5/8 = 0.625.
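The slide's numbers can be reproduced with a small pure-Python sketch; the ranked label list below is one hypothetical ordering consistent with those counts (8 dogs in total):

```python
def precision_recall_at_k(ranked_labels, total_positives, k):
    """Precision and recall after taking the top-k ranked predictions."""
    tp = sum(ranked_labels[:k])
    return tp / k, tp / total_positives

# 1 = dog, 0 = not dog, ranked by predicted "is dog" probability.
labels = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
print(precision_recall_at_k(labels, 8, 4))   # (0.75, 0.375)
print(precision_recall_at_k(labels, 8, 10))  # (0.5, 0.625)
```

Sweeping k over the whole list traces out the precision-recall curve; AP is the area under it.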
15. Deep Learning history timeline
• CS231n:
• Convolutional Neural Networks for Visual Recognition
Ref: http://cs231n.stanford.edu/index.html
16. AI-Generated Content
• The year 2022 is considered the first year of AI-generated content (AIGC) production.
• In 2023, there will be an explosion of AI applications. Experts predict that within two years, hundreds of thousands of AI application apps may be created, with a myriad of new species of AI applications emerging. While this is exciting, it also presents potential risks and challenges, such as a complete rewriting of the definition and process of personal productivity. AIGC also poses challenges to current societal norms, including issues related to copyright law, academic ethics, and the proliferation of deepfakes and fake news.
25. The input is fed to a network of stacked Conv, Pool, and Dense layers
• The output can be a Softmax layer
Ref: https://learnopencv.com/image-classification-using-convolutional-neural-networks-in-keras/
34. Improving model prediction accuracy
• Different geometric transformations
Ref: https://medium.com/analytics-vidhya/data-augmentation-is-it-really-necessary-b3cb12ab3c3f
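Two of the simplest geometric transformations can be sketched in plain Python on a toy 2D "image" (real pipelines would use a library such as Keras's augmentation layers; this is only to illustrate the idea):

```python
def hflip(img):
    """Horizontal flip: reverse each row of a 2D image (list of lists)."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate the image 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]

img = [[1, 2],
       [3, 4]]
print(hflip(img))  # [[2, 1], [4, 3]]
print(rot90(img))  # [[2, 4], [1, 3]]
```

Each transformed copy keeps its original label, so augmentation cheaply multiplies the effective training-set size.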
35. Feature extractor
• Kernel maps: filters such as edge-detection or sharpening kernels, also called image filters
• Convolutional: the convolutional and pooling layers act as the feature extractor
• Feature maps: the outputs of the kernel-map process
Feature map 1
Feature map 2
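The kernel-map idea can be illustrated with a plain-Python "valid" convolution; the edge kernel below is a standard Laplacian-style filter, and the toy images are made up:

```python
def conv2d(img, kernel):
    """'Valid' 2D convolution (really cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + u][j + v] * kernel[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]

# Laplacian-style edge kernel: responds only to intensity changes.
edge = [[0, -1, 0],
        [-1, 4, -1],
        [0, -1, 0]]
flat = [[5] * 4 for _ in range(4)]  # constant image: no edges anywhere
print(conv2d(flat, edge))           # all zeros: [[0, 0], [0, 0]]
```

The output grid is exactly the "feature map": large values where the filter's pattern (here, an edge) is present, zeros where it is not.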
38. More
• What if you want the feature map to be the same size as the input image? Use zero padding ("same" padding)
Ref: https://towardsdatascience.com/convolution-neural-networks-a-beginners-guide-implementing-a-mnist-hand-written-digit-8aa60330d022
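The output-size arithmetic behind that choice is simple; a small sketch of the standard formula, floor((n - k + 2p) / s) + 1:

```python
def conv_output_size(n, k, padding=0, stride=1):
    """Spatial output size of a convolution over an n x n input
    with a k x k kernel, `padding` zeros per side, and `stride`."""
    return (n - k + 2 * padding) // stride + 1

print(conv_output_size(28, 3))             # 26: 'valid', the feature map shrinks
print(conv_output_size(28, 3, padding=1))  # 28: 'same', zero padding keeps the size
```

For a 3x3 kernel at stride 1, one ring of zeros (p = 1) is exactly what keeps the output the same size as the input.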
39. Feature extractor
• Pooling
• Max
• Average
Ref: https://www.researchgate.net/figure/Toy-example-illustrating-the-drawbacks-of-max-pooling-and-average-pooling_fig2_300020038
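Max and average pooling can be sketched in plain Python on a toy feature map (non-overlapping windows, stride equal to the window size):

```python
def pool2d(img, size=2, mode="max"):
    """Non-overlapping max or average pooling over a 2D image."""
    agg = max if mode == "max" else (lambda w: sum(w) / len(w))
    return [[agg([img[i + u][j + v] for u in range(size) for v in range(size)])
             for j in range(0, len(img[0]) - size + 1, size)]
            for i in range(0, len(img) - size + 1, size)]

img = [[1, 2, 5, 6],
       [3, 4, 7, 8],
       [0, 0, 1, 1],
       [0, 4, 1, 1]]
print(pool2d(img, 2, "max"))  # [[4, 8], [4, 1]]
print(pool2d(img, 2, "avg"))  # [[2.5, 6.5], [1.0, 1.0]]
```

Max pooling keeps the strongest activation per window; average pooling smooths them, which is the trade-off the referenced figure illustrates.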
41. Classifier
• Flatten Layer
• It is used to convert the data into a 1D array, creating a single feature vector. After flattening, we forward the data to a fully connected layer for final classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
42. Classifier
• Dense Layer
• It is a fully connected layer: each node in this layer is connected to every node in the previous layer
• This layer is used at the final stage of a CNN to perform classification
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
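The Flatten → Dense → Softmax chain can be sketched in plain Python (the weights below are toy values, not a trained model):

```python
import math

def flatten(img):
    """Convert a 2D feature map into a 1D feature vector."""
    return [x for row in img for x in row]

def dense(x, weights, biases):
    """Fully connected layer: one output per row of weights."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b
            for w, b in zip(weights, biases)]

def softmax(z):
    """Turn raw scores into class probabilities that sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

# Toy 2-class head on a flattened 2x2 feature map.
x = flatten([[1.0, 2.0], [3.0, 4.0]])
probs = softmax(dense(x, [[0.1, 0.1, 0.1, 0.1],
                          [0.2, 0.2, 0.2, 0.2]], [0.0, 0.0]))
print(probs)  # two probabilities summing to 1
```

This is exactly what `Flatten()` followed by `Dense(n, activation="softmax")` computes in Keras, minus the learned weights.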
45. Classifier
• Dropout Layer
• It is used to prevent the network from overfitting
Ref: https://data-flair.training/blogs/keras-convolution-neural-network/
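A sketch of inverted dropout in plain Python; the scaling by 1/(1 - rate) keeps the expected activation unchanged, and frameworks such as Keras handle this internally:

```python
import random

def dropout(x, rate, training=True, rng=random):
    """Inverted dropout: zero each unit with probability `rate` during
    training and scale the survivors by 1/(1 - rate); identity at inference."""
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]
```

Because random subsets of units are silenced on every training step, the network cannot rely on any single co-adapted feature, which is why dropout reduces overfitting.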
46. Keras framework
• Keras is a deep learning API written in Python, running on top of the
machine learning platform TensorFlow.
• It was developed with a focus on enabling fast experimentation
• Integrates with TensorFlow2
• Efficiently executing low-level tensor operations on CPU, GPU, or TPU
• Faster developing for deep-learning networks
• Provides fully connected, convolutional, pooling, RNN, LSTM… layers
• The latest version: 2.12.0 (2023-3-24)
Ref: https://keras.io/about/
47. Dog and cat image dataset
• Cat and Dog: 23,422 images
• Training: 18,738 images
• Validation: 4,684 images
• Image sizes are not fixed!
• 350×320
• 448×329
• …
48. More
• Data imbalance
• If a dataset consists of 100 cat and 900 dog images and we train the neural network on this data, it will just learn to predict "dog" every time
• In this case, we can easily balance the data using sampling techniques
• Down-sampling
• By removing some dog examples
• Up-sampling
• By creating more cat examples using image augmentation or any other method
Ref: Multi-Label Image Classification with Neural Network | Keras | by Shiva Verma | Towards Data Science
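The up-sampling idea can be sketched in plain Python; the `balance` helper is hypothetical, and the cat/dog counts mirror the slide's 100-vs-900 example (here duplicating items stands in for augmentation):

```python
import random

def balance(dataset, rng=random):
    """Up-sample minority classes to the majority class count."""
    by_label = {}
    for item, label in dataset:
        by_label.setdefault(label, []).append(item)
    target = max(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        balanced += [(i, label) for i in items + extra]
    return balanced

data = [(f"cat_{i}", "cat") for i in range(100)] + \
       [(f"dog_{i}", "dog") for i in range(900)]
balanced = balance(data, random.Random(0))
# Both classes now have 900 examples.
```

Down-sampling is the mirror image: trim every class down to the minority count instead.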
49. Dog and cat image dataset
• Network Calculator
Ref: https://madebyollin.github.io/convnet-calculator/
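The kind of arithmetic such a calculator performs can be sketched in plain Python; the VGG-style layer list below is illustrative, not the full VGG-16:

```python
def trace_shapes(size, layers):
    """Trace the spatial size through conv/pool layers, like a network calculator.
    Each layer is (kind, kernel, padding, stride)."""
    sizes = [size]
    for kind, k, pad, stride in layers:
        size = (size - k + 2 * pad) // stride + 1
        sizes.append(size)
    return sizes

# VGG-style prefix: 3x3 'same' convs keep the size, 2x2 pools halve it.
vgg_prefix = [("conv", 3, 1, 1), ("conv", 3, 1, 1), ("pool", 2, 0, 2),
              ("conv", 3, 1, 1), ("conv", 3, 1, 1), ("pool", 2, 0, 2)]
print(trace_shapes(224, vgg_prefix))  # [224, 224, 224, 112, 112, 112, 56]
```

Tracing shapes like this before training catches mismatched layer sizes early, which is exactly what the linked calculator automates.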
51. Homework
• Try to build a VGG-16 network for dog and cat classification
• https://blog.51cto.com/u_15351425/3727442
• Try to add bird images as a third image label
• Download the bird images
• https://drive.google.com/file/d/1NgmjVrRug_qPqlfU_Zb2O-kyd5Hblaat/view?usp=sharing
• Modify the code to build a multi-class prediction model
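For the multi-class homework, the binary dog/cat head (one sigmoid unit) becomes a three-unit softmax layer, and the labels need one-hot encoding. A plain-Python sketch of that encoding (Keras's `to_categorical` does the numeric equivalent for integer labels):

```python
def one_hot(labels, classes):
    """Encode string labels as one-hot vectors for multi-class training."""
    index = {c: i for i, c in enumerate(classes)}
    return [[1 if index[label] == i else 0 for i in range(len(classes))]
            for label in labels]

classes = ["dog", "cat", "bird"]
print(one_hot(["dog", "bird"], classes))  # [[1, 0, 0], [0, 0, 1]]
```

With three-element targets like these, the model's loss also changes from binary to categorical cross-entropy.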