Image analytics - A Primer

GOPI KRISHNA NUTI
VICE PRESIDENT, MUST RESEARCH
VP@MUST.CO.IN, NGOPIKRISHNA@GMAIL.COM
COMPUTER VISION AND IMAGE ANALYTICS
– A PRIMER

© 2021 MUST Research
MUST Research – Our publications
• Masters in Data Science from State University of New York at Buffalo, MBA
from Amrita University, Bangalore
• A book introducing Machine Learning from basics through Supervised and
Unsupervised learning for beginners
https://www.amazon.in/Machine-Learning-Engineers-Gopi-Krishna/dp/9389024870/ref=sr_1_2?dchild=1&keywords=machine+learning+for+engineers&qid=1616195333&sr=8-2
• Multiple publications and patents
• https://www.linkedin.com/in/ngopikrishna/

MUST Research
MUST Research is dedicated to promoting excellence and competence in data science, cognitive computing, artificial intelligence,
machine learning and advanced analytics for the benefit of mankind - it’s a must.
Our vision is to build an ecosystem that enables interaction between academia and enterprise, helps them resolve problems and makes them
aware of the latest developments in the cognitive era: providing solutions, guidance and training, organizing lectures, seminars and workshops,
and collaborating on scientific programs and societal missions.
• India’s largest AI community with 500+ data scientists
• Award winning robots – Softie built in collaboration with Microsoft®
https://www.youtube.com/watch?v=jQ8Gq2HWxiA
• Multiple demonstrations of our robots MUSTie and MUSTani
https://www.youtube.com/watch?v=AewM3TsjoBk
• Letter of appreciation from Govt of Telangana for our contributions

INTRODUCTION TO COMPUTER VISION
• Branch of Machine Learning which deals with Images
• Unstructured Data
• Everywhere
• Captured from cameras
• Created by software like MSPaint, Coreldraw, Adobe Photoshop etc
• Created by software like AutoCAD, Catia, Adobe Acrobat, MS Word, Powerpoint
• Can contain text, regular shapes, irregular shapes
• Contain a treasure of information

IMAGE BASICS - IMAGE REPRESENTATION
• Unstructured Data
• Array of Pixels
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 255 255 255 255 255
(all remaining rows are 0)
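To make the array-of-pixels idea concrete, here is a minimal sketch, assuming NumPy is available, that builds the L-shaped pattern from the grid above as an 8-bit grayscale array. The total height of 15 rows and the variable names are assumptions made for illustration.

```python
import numpy as np

# Assumed size: 15 rows x 7 columns, 0 = black, 255 = white.
image = np.zeros((15, 7), dtype=np.uint8)

image[0:8, 1] = 255   # vertical stroke of the "L" (rows 0-7, column 1)
image[7, 1:] = 255    # horizontal stroke of the "L" (row 7, columns 1-6)

# Coordinates (row, column) of every white pixel.
white_pixels = np.argwhere(image == 255)

print(image.shape)
print(image[7])
print(len(white_pixels))
```

Every image operation later in the deck ultimately reads or rewrites numbers in an array like this one.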

COLOUR SPACES
RGB Red, Green and Blue (each channel 0-255)
CMYK Cyan, Magenta, Yellow and Key (black)
HSV Hue, Saturation and Value
Grayscale Shades of gray between black and white
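The colour spaces listed above are different encodings of the same pixel. As a rough sketch, the functions below convert an 8-bit RGB value to CMYK and to grayscale. The CMYK formula is the naive one that ignores printer colour profiles, and the grayscale weights are the common ITU-R BT.601 luma weights; both choices, and the function names, are assumptions made for illustration.

```python
def rgb_to_cmyk(r, g, b):
    """Naive conversion of 0-255 RGB to CMYK fractions in the range 0-1."""
    if (r, g, b) == (0, 0, 0):
        return 0.0, 0.0, 0.0, 1.0          # pure black: key only
    r_, g_, b_ = r / 255, g / 255, b / 255
    k = 1 - max(r_, g_, b_)                # key (black) component
    c = (1 - r_ - k) / (1 - k)
    m = (1 - g_ - k) / (1 - k)
    y = (1 - b_ - k) / (1 - k)
    return c, m, y, k


def rgb_to_gray(r, g, b):
    """Weighted sum of the channels (BT.601 luma weights), rounded to 0-255."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


print(rgb_to_cmyk(255, 0, 0))
print(rgb_to_gray(255, 0, 0))
```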

COLOUR SPACES
Traditional Colours
•Described by Isaac Newton in 1672.
•Primary colours are Red, Yellow and Blue
•Commonly referred to as "Painter's Colours".
•Not all colours can be generated.
Subtractive Colours
•Called "Printer's Colours".
•The colour we see is the part of white light that is not absorbed, i.e. the other frequencies are subtracted
•Primary colours are Cyan, Yellow, and Magenta.
Additive Colours
•Adds primary colours (Red, Green and Blue) together to produce the desired colour
•Displays work like this

WHAT IS IMAGE
PROCESSING?
• Extract quantifiable and meaningful
information out of an image
• Objects present in the image
• Location in the image
• Background or Foreground
• Distance from the viewer

IS IMAGE PROCESSING NEW TO COMPUTERS?
No. My grandmother used it without ever seeing a computer.
Remember the days before the internet?
Features in a Cathode Ray Tube Television
• Brightness
• Contrast
• Colour
• Sharpness
• How was this done?
Convolution

CONVOLUTION IN DIGITAL WORLD
• Process of adding each element of an image to its local neighbours, weighted by a kernel (the convolution matrix)
• NOT the same as matrix multiplication
• Used for blurring, sharpening, up/down sampling, spherical distortion, de-noising, noise filtering etc

CONVOLUTION IN DIGITAL WORLD
• Depending on the convolution matrix, step size (stride) and operation chosen, the resulting image varies.
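The effect of the convolution matrix can be sketched with a small NumPy implementation. This is a minimal, unoptimized version assuming a single-channel image, no padding ("valid" mode) and an integer stride. The function name and the two kernels (a 3x3 box blur and a common sharpening kernel) are illustrative choices, not taken from the slides.

```python
import numpy as np


def convolve2d(image, kernel, stride=1):
    """2-D convolution without padding; the kernel is flipped, as convolution requires."""
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = image[r:r + kh, c:c + kw]
            # Element-wise product and sum: not a matrix multiplication.
            out[i, j] = np.sum(window * kernel)
    return out


box_blur = np.full((3, 3), 1.0 / 9.0)
sharpen = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=float)

# A 5x5 test image with a single bright pixel in the centre.
image = np.zeros((5, 5))
image[2, 2] = 9.0

blurred = convolve2d(image, box_blur)      # spreads the bright pixel out
sharpened = convolve2d(image, sharpen)     # exaggerates it against its neighbours
strided = convolve2d(image, box_blur, stride=2)

print(blurred)
print(sharpened)
print(strided.shape)
```

The same image gives a flat patch with one kernel and a sharp peak with negative surroundings with the other, while a larger stride shrinks the output.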

WHERE IS WALDO
Locate this gentleman in the next slide

ON TO COMPUTER VISION ALGORITHMS

COMPUTER VISION – THE (AGE) OLD PROBLEMS
• What should a robot do in “Scene understanding”?
• Identify colours, brightness etc
• Identify objects a.k.a Image Segmentation
• Different things
• Multiple occurrences of the same thing
• Stuff other than things
• Distance of things and stuff
• Relative and absolute

COLOUR AND BRIGHTNESS
Colour spaces
• Grayscale, RGB, CMY
• Transparency/Opacity using a fourth attribute
Limitations
• Does not represent all colours in nature
• Colour perception highly susceptible to lighting changes
New Solutions
• Colour spaces have been expanded greatly
• With micro and macro level differences, ~250 colour spaces are in vogue
• HSV, HSL/HSI, YUV, YPbPr, YCbCr etc
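One reason colour spaces such as HSV help with lighting changes can be sketched using Python's standard `colorsys` module. The example assumes an idealized, uniform dimming in which every RGB channel is halved; real lighting changes are rarely this clean, and the pixel values are made up for illustration.

```python
import colorsys


def to_hsv(rgb):
    """Convert an 8-bit (R, G, B) tuple to (hue, saturation, value), each in 0-1."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


bright_pixel = (200, 100, 50)   # an orange-ish pixel in good light
dim_pixel = (100, 50, 25)       # the same surface with half the light

h1, s1, v1 = to_hsv(bright_pixel)
h2, s2, v2 = to_hsv(dim_pixel)

# All three RGB channels changed, but in HSV only the value channel moved.
print(h1, s1, v1)
print(h2, s2, v2)
```

A rule written against hue and saturation would therefore accept both pixels, whereas a rule written against raw RGB values would need to account for all three channels shifting.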

OLD PROBLEM –
IMAGE
SEGMENTATION
• Panoptic Segmentation – Not a
technique. A metric

OLD PROBLEM –
IMAGE
SEGMENTATION
An image is a matrix of numbers.
How to identify the edges of each object?
How to recognize the object correctly?
How to differentiate between “things” (foreground) and “stuff” (background)?

IMAGE SEGMENTATION – OLD SOLUTIONS
Solution Family: Algorithms and Drawbacks
Thresholding
• Algorithms: Otsu thresholding; adaptive local thresholding (Mean, Gaussian)
• Drawbacks: For reasonably simple scenarios only
Edges and Corners
• Algorithms: Canny edges, Sobel, Hough, Laplace algorithms; Harris Corner detection; convolution of kernels
• Drawbacks: Unsuitable for noisy/blurry images
Region Growing
• Algorithms: Watershed
• Note: Relatively strong at detecting overlapping/touching objects
Super Pixels
• Algorithms: SLIC algorithm
• Drawbacks: Susceptible to noise; steep increase in algorithmic complexity
Clustering
• Algorithms: K-means; Fuzzy C-Means (FCM); Expectation Maximization (EM)
• Drawbacks: Relies on low level features like colour etc.; poor performance on complicated images
Clustering
• Algorithms: Image Pyramid
• Drawbacks: Carefully controlled environments only; cannot handle transformations like rotation, reflection etc.; occlusions are a big no-no; compute intensive
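As an illustration of the thresholding family, here is a minimal NumPy sketch of Otsu thresholding, which picks the gray level that maximizes the variance between the two resulting classes. It assumes an 8-bit grayscale image; the function name and the tiny test image are illustrative. Libraries such as OpenCV provide tuned implementations, so this is only meant to show the idea.

```python
import numpy as np


def otsu_threshold(gray):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels <= t form one class and pixels > t the other.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    sum_all = float(np.dot(np.arange(256), hist))
    best_t, best_var = 0, -1.0
    weight0, sum0 = 0.0, 0.0
    for t in range(256):
        weight0 += hist[t]              # number of pixels at or below t
        sum0 += t * hist[t]             # their summed intensity
        weight1 = total - weight0
        if weight0 == 0 or weight1 == 0:
            continue                    # one class is empty, nothing to compare
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        between_var = weight0 * weight1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Dark left half (50) and bright right half (200).
gray = np.array([[50, 50, 200, 200]] * 4, dtype=np.uint8)
t = otsu_threshold(gray)
mask = gray > t                         # True marks the "foreground" pixels

print(t)
print(mask)
```

The drawback noted in the table shows up quickly: a single global threshold only separates the two regions because this scene is simple and evenly lit.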

IMAGE SEGMENTATION
–
CONVOLUTIONAL NEURAL NETWORKS
• Specialized kind of neural network
• Process data with a known grid-like spatial structure
• Composed of a large number of layers such as convolution, pooling and fully connected layers
• Usually very deep, i.e. many layers and many weight parameters
• Non-linear activation functions are mandatory for learning complex features

http://cs231n.github.io/convolutional-networks/#overview
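The layer types named above can be sketched as a single forward pass in NumPy: one convolution, a non-linear activation (ReLU), max pooling, and a fully connected layer followed by softmax. This is a toy with random, untrained weights, one channel and one filter; all sizes and names are assumptions for illustration, and real networks stack many such layers and learn the weights from data.

```python
import numpy as np


def conv2d(x, k):
    """Sliding-window filter without padding (no kernel flip, as is usual in CNNs)."""
    kh, kw = k.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out


def relu(x):
    """Non-linear activation: negative responses are set to zero."""
    return np.maximum(x, 0.0)


def max_pool(x, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h * size, :w * size]
    return x.reshape(h, size, w, size).max(axis=(1, 3))


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


rng = np.random.default_rng(0)
image = rng.random((8, 8))                  # stand-in for a grayscale image
kernel = rng.standard_normal((3, 3))        # one untrained 3x3 filter

features = max_pool(relu(conv2d(image, kernel)))   # (8,8) -> (6,6) -> (3,3)
weights = rng.standard_normal((3, features.size))  # fully connected layer, 3 classes
bias = np.zeros(3)
probabilities = softmax(weights @ features.ravel() + bias)

print(features.shape)
print(probabilities)
```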

EVOLUTION OF CNN CLASSIFIERS
2014
• Regions with CNN Features
2015
• Fast R-CNN
• Faster R-CNN
• Inception V3
2016
• YOLO
• SSD
• UberNet
2017
• Mask R-CNN
• Pixel wise Instance Segmentation

SOME SALIENT POINTS
R-CNN: Regions with CNN Features
•Uses Selective Search
•Significantly reduced the search space to ~2000 region proposals
•Very slow and very complicated
Fast R-CNN: Designed to solve the problems with R-CNN
•Region Of Interest is treated as a pooling layer
•Jointly trains feature extractor, classifier and bounding box regression in a single model
•Almost 25 times faster than R-CNN
Faster R-CNN: Replaces Selective Search with a Region Proposal Network
•10 times faster than Fast R-CNN
YOLO: You Only Look Once
•Detection is considered as a regression problem
•Extremely fast but less accurate. Struggles with small objects that appear in groups
SSD: Single Shot MultiBox Detector
•Faster than YOLO and more accurate as well.
Mask R-CNN: Extension of Faster R-CNN
•Predicts the object masks as well as the bounding box
•Impressive results
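The "Region Of Interest is treated as a pooling layer" idea from Fast R-CNN can be sketched as follows: whatever the size of a proposed region, it is cut into a fixed grid and each cell is max-pooled, so the next layer always receives the same shape. This is a simplified single-channel NumPy version; the function name, the end-exclusive `(row0, col0, row1, col1)` box convention and the 2x2 output size are assumptions for illustration, and it expects the region to be at least as large as the output grid.

```python
import numpy as np


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool one region of a 2-D feature map down to a fixed output size."""
    r0, c0, r1, c1 = roi
    region = feature_map[r0:r1, c0:c1]
    out_h, out_w = output_size
    # Split the region into out_h x out_w roughly equal cells.
    row_edges = np.linspace(0, region.shape[0], out_h + 1).astype(int)
    col_edges = np.linspace(0, region.shape[1], out_w + 1).astype(int)
    pooled = np.empty(output_size)
    for i in range(out_h):
        for j in range(out_w):
            cell = region[row_edges[i]:row_edges[i + 1],
                          col_edges[j]:col_edges[j + 1]]
            pooled[i, j] = cell.max()
    return pooled


feature_map = np.arange(36).reshape(6, 6)

square = roi_max_pool(feature_map, (0, 0, 4, 4))   # a 4x4 proposal
tall = roi_max_pool(feature_map, (1, 2, 6, 5))     # a 5x3 proposal

print(square)
print(tall.shape)
```

Both proposals end up as 2x2 arrays, which is what lets a single classifier and bounding box regressor be trained jointly on regions of any size.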

OLD PROBLEM - DEPTH PERCEPTION
Normal vision and depth perception expectation
Relative depth
Optical illusion based on depth
Picture of a picture. All pixels have same depth

OLD SOLUTIONS - DEPTH PERCEPTION
• Stereo cameras spaced a fixed distance apart capture the same scene.
• Remember trigonometry?
• Algorithm Families
• Triangulation
• Interferometry
• Time of Flight
• Many Limitations
• Cost
• Complexity
• Controlled environments only
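The trigonometry behind two of these families fits in a few lines. For an idealized, rectified stereo pair, similar triangles give depth = focal length x baseline / disparity; for time of flight, distance is half the round trip travelled at the speed of light. The function names and the sample numbers below are assumptions for illustration, and real systems also need calibration and matching steps that are not shown.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Triangulation for a rectified stereo pair.

    focal_length_px: focal length in pixels
    baseline_m:      distance between the two cameras in metres
    disparity_px:    horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (point at infinity or bad match)")
    return focal_length_px * baseline_m / disparity_px


def depth_from_time_of_flight(round_trip_s):
    """Distance to the surface given the round-trip time of a light pulse."""
    return SPEED_OF_LIGHT * round_trip_s / 2.0


print(depth_from_disparity(800, 0.5, 40))      # metres
print(depth_from_time_of_flight(20e-9))        # metres
```

Because disparity sits in the denominator, distant objects with tiny disparities are the hardest to measure accurately.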

NEW SOLUTIONS - DEPTH PERCEPTION
• Furious research in progress
• Single camera moving between two fixed positions
• Monocular Depth perception
• Some interesting proposals
• Train NN with depth information and semantically segmented
image
• Use the models for predicting depth in new images
• Solutions are almost mainstream
• Anyone heard of Kinect?

OLD PROBLEM –
PROGRAMMER’S
DILEMMA

OLD PROBLEM - PROGRAMMER’S DILEMMA
• Which image format should I use?
• Which image file format should I code for? Do I have to learn to read and write image files?
• MATLAB is expensive

NEW SOLUTION - OPENCV, PYTHON, PILLOW ETC
• OpenCV
• Democratized image processing
• A large number of functionalities provided as APIs
• Impressive Python bindings and native support for C, Java
• Python
• Pillow and many other libraries for reading images
• Vectorization and NumPy arrays
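Vectorization is a large part of why Python is practical for image work. As a small sketch, the two functions below brighten an 8-bit grayscale image, once with explicit Python loops and once with a single NumPy expression. The function names and the random test image are illustrative; both versions clip the result to the 0-255 range.

```python
import numpy as np


def brighten_loop(image, delta):
    """Pixel-by-pixel version: easy to read, slow on large images."""
    out = image.copy()
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = max(0, min(255, int(image[r, c]) + delta))
    return out


def brighten_vectorized(image, delta):
    """Whole-array version: widen to int16 to avoid uint8 wrap-around, then clip."""
    return np.clip(image.astype(np.int16) + delta, 0, 255).astype(np.uint8)


rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)

print(brighten_vectorized(image, 60))
print(np.array_equal(brighten_loop(image, 60), brighten_vectorized(image, 60)))
```

The vectorized form does the same arithmetic inside compiled NumPy code, which matters once images have millions of pixels.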

NEW SOLUTIONS
–
NEW PROBLEMS

NEURAL
NETWORKS
• Data hungry. Lots and lots of training data.
• Resource hungry and compute intensive.
• Overfitting, Underfitting, Stochasticity
• Black box

SOME SOLUTIONS
• Transfer Learning to reduce training time
• Hyperparameter tuning
• Hardware based solutions for improving performance
• On-going research for explainability
• On-going research for reducing the training data requirement, e.g. 3rd generation neural networks
