The "Deep Learning for Computer Vision" material covers deep learning fundamentals with a focus on transfer learning using TensorFlow, model evaluation, and deployment. In essence, transfer learning reuses the knowledge of a previously trained model to improve performance on a new task. Model evaluation assesses the quality and reliability of the trained model, while deployment covers moving the model into a production environment for practical use. The material provides a holistic view of applying deep learning to computer vision, spanning the essential stages from development to deploying models in real-world applications.
10. “Deep learning allows computational models of multiple processing layers to learn and represent data with multiple levels of abstraction, mimicking how the brain perceives and understands multimodal information, thus implicitly capturing intricate structures of large-scale data.”
12. Annisa Darmawahyuni
Computer vision is a field of artificial intelligence (AI) that enables
computers and systems to derive meaningful information from digital
images, videos and other visual inputs — and take actions or make
recommendations based on that information. If AI enables computers to
think, computer vision enables them to see, observe and understand.
COMPUTER
VISION
18. OBJECT DETECTION
Object detection is the process of detecting instances of semantic objects of a certain class (such as humans,
airplanes, or birds) in digital images and video.
[Figure: ground truth, bounding box with region approach, and bounding box with region and semantic segmentation approach]
19. OBJECT DETECTION
You can choose from two key approaches to get started with object detection using deep learning:
Create and train a custom object detector.
To train a custom object detector from scratch, you need to design a network architecture to learn
the features for the objects of interest. You also need to compile a very large set of labeled data to
train the CNN. The results of a custom object detector can be remarkable. That said, you need to
manually set up the layers and weights in the CNN, which requires a lot of time and training data.
Use a pretrained object detector.
Many object detection workflows using deep learning leverage transfer learning, an approach that
enables you to start with a pretrained network and then fine-tune it for your application. This
method can provide faster results because the object detectors have already been trained on
thousands, or even millions, of images.
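The pretrained-detector idea can be sketched in miniature with plain NumPy: a frozen "pretrained" feature extractor whose weights never change, plus a small new head trained on the target task. Everything here (weights, data, task) is synthetic and purely illustrative:

```python
import numpy as np

# Toy sketch of transfer learning: a frozen "pretrained" feature
# extractor (W_feat) plus a new classification head trained from
# scratch. All weights and data are synthetic illustrations.

rng = np.random.default_rng(0)

# Frozen pretrained extractor: maps 8-dim inputs to 4-dim features.
W_feat = rng.normal(size=(8, 4))

def features(x):
    """Frozen feature extractor (its weights are never updated)."""
    return np.maximum(x @ W_feat, 0.0)  # ReLU

# Synthetic binary task: the label depends on the first feature.
X = rng.normal(size=(200, 8))
F = features(X)
y = (F[:, 0] > F[:, 0].mean()).astype(float)

# Append a bias column and train only the head (logistic regression).
Fb = np.hstack([F, np.ones((len(F), 1))])
W_head = np.zeros(Fb.shape[1])
lr = 0.1
for _ in range(1000):
    p = 1.0 / (1.0 + np.exp(-(Fb @ W_head)))  # sigmoid
    grad = Fb.T @ (p - y) / len(y)            # log-loss gradient
    W_head -= lr * grad                       # only the head moves

acc = (((Fb @ W_head) > 0) == y.astype(bool)).mean()
print(f"head-only training accuracy: {acc:.2f}")
```

In a real TensorFlow workflow, the frozen part would be a pretrained backbone (e.g., from `tf.keras.applications`) with its layers set to non-trainable, and only the new head would be optimized.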
26. SEMANTIC SEGMENTATION
Semantic segmentation is a deep learning technique that associates a label or category with every pixel in an image. It is used to recognize a collection of pixels that form distinct categories.
A simple example of semantic segmentation is separating the images into two classes. For example, in Figure 1, an image showing a person
at the beach is paired with a version showing the image's pixels segmented into two separate classes: person and background.
27. HOW DOES SEMANTIC SEGMENTATION DIFFER FROM OBJECT DETECTION?
Semantic segmentation can be a useful alternative to object detection because it allows the object of interest to span
multiple areas in the image at the pixel level. This technique cleanly detects objects that are irregularly shaped, in
contrast to object detection, where objects must fit within a bounding box (Figure 2).
Figure 2. Object detection, showing bounding boxes to identify objects.
29. SEMANTIC SEGMENTATION
The process of training a semantic segmentation network to
classify images follows these steps:
Analyze a collection of pixel-labeled images.
Create a semantic segmentation network.
Train the network to classify images into pixel categories.
Assess the accuracy of the network.
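The four steps above can be mimicked end-to-end on synthetic data, substituting a single intensity threshold for the segmentation network (a deliberately minimal stand-in, not a real model):

```python
import numpy as np

# Minimal sketch of the four segmentation steps, using a toy
# per-pixel classifier instead of a real network. All data is synthetic.

rng = np.random.default_rng(1)

# Step 1: analyze pixel-labeled images — a synthetic 16x16 "image"
# whose foreground pixels are brighter on average than the background.
labels = np.zeros((16, 16), dtype=int)
labels[4:12, 4:12] = 1                      # square "object" mask
image = rng.normal(0.2, 0.1, (16, 16))
image[labels == 1] += 0.5                   # foreground is brighter

# Step 2: "create a network" — here just a single intensity threshold,
# the simplest possible per-pixel classifier.
# Step 3: "train" — place the threshold midway between class means.
threshold = (image[labels == 0].mean() + image[labels == 1].mean()) / 2
pred = (image > threshold).astype(int)

# Step 4: assess accuracy — fraction of correctly labeled pixels.
pixel_acc = (pred == labels).mean()
print(f"pixel accuracy: {pixel_acc:.2f}")
```

A real pipeline would replace the threshold with a segmentation network trained on the pixel-labeled images, but the analyze / create / train / assess loop is the same.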
31. DATASETS FOR COMPUTER VISION
Grayscale Images. The most widely used grayscale image dataset is MNIST (https://www.kaggle.com/datasets/hojjatk/mnist-dataset) and its variations, i.e., NIST and perturbed NIST. The typical application scenario is handwritten digit recognition.
RGB Natural Images. The Caltech RGB image datasets (https://euclid.caltech.edu/image/euclid20231107b-ngc-6822) and the CIFAR datasets (https://www.cs.toronto.edu/~kriz/cifar.html) consist of thousands of 32 × 32 color images in various classes.
Hyperspectral Images. The SCIEN hyperspectral image data and AVIRIS sensor-based datasets, for example, contain hyperspectral images.
Facial Characteristics Images. The Adience benchmark dataset is commonly used for age and gender estimation from unconstrained face images.
Medical Images. The Chest X-ray dataset (https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia) comprises 112,120 frontal-view X-ray images of 30,805 unique patients.
Video Streams. The WR datasets can be used for video-based activity recognition in assembly lines.
YouTube-8M is a dataset of 8 million YouTube video URLs, along with video-level labels from a diverse set of 4,800 Knowledge Graph entities.
33. HYPERPARAMETER TUNING (DL)
Learning rate (LR). If the learning rate is too small, overfitting can occur. Large learning rates help to regularize the training, but if the learning rate is too large, training will diverge.
Number of hidden layers.
Number of nodes/neurons per layer.
Optimizer
Batch Size
Epochs
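The learning-rate trade-off above can be illustrated on a toy 1-D quadratic loss, where the stability threshold is easy to see (all values here are illustrative, not tied to any real network):

```python
import numpy as np

# Gradient descent on L(w) = w^2: a small LR converges slowly, a
# well-chosen LR converges fast, and an LR above 1.0 (for this
# particular loss) makes the iterates grow without bound.

def run_gd(lr, steps=50, w0=1.0):
    """Return |w| after gradient-descent steps on L(w) = w^2."""
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w          # dL/dw = 2w
    return abs(w)

small = run_gd(lr=0.01)   # converges, but slowly
good = run_gd(lr=0.5)     # converges in one step for this loss
large = run_gd(lr=1.1)    # diverges: |w| grows every step

print(small, good, large)
```

The 1.0 stability bound is specific to this quadratic; real networks have no such closed-form bound, which is why the learning rate is tuned empirically.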
Scientific Articles on Computer Vision and Deep Learning
Intelligent System Research Group
https://docs.google.com/spreadsheets/d/13MLJnecd32B3H-f342M-Uoqd_y5wRVgGDK1aT-bQg3w/edit#gid=0