GOPI KRISHNA NUTI
VICE PRESIDENT, MUST RESEARCH
VP@MUST.CO.IN, NGOPIKRISHNA@GMAIL.COM
COMPUTER VISION AND IMAGE ANALYTICS
– A PRIMER
INTRODUCTION
© 2021 MUST Research
MUST Research – Our publications
• Masters in Data Science from State University of New York at Buffalo, MBA
from Amrita University, Bangalore
• A book introducing Machine Learning from basics through Supervised and
Unsupervised learning for beginners
https://www.amazon.in/Machine-Learning-Engineers-Gopi-Krishna/dp/9389024870/ref=sr_1_2?dchild=1&keywords=machine+learning+for+engineers&qid=1616195333&sr=8-2
• Multiple publications and patents
• https://www.linkedin.com/in/ngopikrishna/
MUST Research
MUST Research is dedicated to promoting excellence and competence in data science, cognitive computing, artificial intelligence,
machine learning and advanced analytics for the benefit of mankind - it’s a must.
Our vision is to build an ecosystem that enables interaction between academia and enterprise, helps them resolve problems and keeps them
aware of the latest developments in the cognitive era: providing solutions, guidance and training, organizing lectures, seminars and workshops,
and collaborating on scientific programs and societal missions.
• India’s largest AI community with 500+ data scientists
• Award winning robots – Softie built in collaboration with Microsoft®
https://www.youtube.com/watch?v=jQ8Gq2HWxiA
• Multiple demonstrations of our robots MUSTie and MUSTani
https://www.youtube.com/watch?v=AewM3TsjoBk
• Letter of appreciation from Govt of Telangana for our contributions
INTRODUCTION TO COMPUTER VISION
• Branch of Machine Learning which deals with images
• Unstructured Data
• Everywhere
• Captured from cameras
• Created by software like MSPaint, CorelDRAW, Adobe Photoshop etc.
• Created by software like AutoCAD, Catia, Adobe Acrobat, MS Word, PowerPoint
• Can contain text, regular shapes, irregular shapes
• Contain a treasure of information
• Unstructured Data
• Array of Pixels
IMAGE BASICS - IMAGE REPRESENTATION
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 0 0 0 0 0
0 255 255 255 255 255 255
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0
COLOUR SPACES
RGB Red, Green and Blue (each channel 0-255)
CMYK Cyan, Magenta, Yellow and Key
HSV Hue, Saturation and Value
Grayscale Shades of gray from black to white
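A rough sketch of how one RGB pixel maps into two of the other colour spaces listed here. It uses Python's standard `colorsys` module for HSV; the grayscale weights are the common BT.601 luma weights, which is an assumption since the slides do not say which formula is meant. The function names are hypothetical.

```python
import colorsys

def rgb_to_gray(r, g, b):
    """Weighted sum of the channels (BT.601 luma weights, an assumed choice)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def rgb_to_hsv_degrees(r, g, b):
    """Hue in degrees, saturation and value in percent."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s * 100), round(v * 100)

print(rgb_to_gray(255, 0, 0))
print(rgb_to_hsv_degrees(0, 255, 0))
```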
COLOUR SPACES
Traditional Colors
•Described by Isaac Newton in 1672.
•Primary colors are Red, Yellow and Blue
•Commonly referred to as "Painter's Colors".
•Not all colors can be generated.
Subtractive Colors
•Called "Printer's Colors".
•The colour we see is the part of white light that is not absorbed, i.e. the other frequencies are subtracted
•Primary colors are Cyan, Yellow, and Magenta.
Additive Colors
•Adds primary colours together to get a choice of colour
•Displays work like this
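The difference between additive and subtractive mixing can be sketched in a few lines. This is a simplified illustration, not a colour-managed conversion: `add_rgb` and `rgb_to_cmyk` are hypothetical helper names, and the CMYK formula is the naive one that ignores ink and device profiles.

```python
def add_rgb(a, b):
    """Additive mixing: sum the light of two sources, clipped to 0-255."""
    return tuple(min(x + y, 255) for x, y in zip(a, b))

def rgb_to_cmyk(r, g, b):
    """Naive subtractive equivalent: each ink removes its complementary light."""
    if (r, g, b) == (0, 0, 0):
        return (0.0, 0.0, 0.0, 1.0)  # pure black needs only the Key ink
    c, m, y = 1 - r / 255, 1 - g / 255, 1 - b / 255
    k = min(c, m, y)
    return tuple((x - k) / (1 - k) for x in (c, m, y)) + (k,)

yellow = add_rgb((255, 0, 0), (0, 255, 0))  # red light + green light
print(yellow)
print(rgb_to_cmyk(*yellow))
```

On a display, red and green light add up to yellow; on paper the same yellow is produced by a single ink that absorbs blue.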
WHAT IS IMAGE PROCESSING?
• Extract quantifiable and meaningful
information out of an image
• Objects present in the image
• Location in the image
• Background or Foreground
• Distance from the viewer
IS IMAGE PROCESSING NEW TO COMPUTERS?
No. My grandmother used it without ever seeing a computer.
Remember the days before the internet?
Features in a Cathode Ray Tube Television
• Brightness
• Contrast
• Colour
• Sharpness
• How was this done?
Convolution
CONVOLUTION IN DIGITAL WORLD
• Process of adding each element of an image to its local neighbours, weighted by a kernel
• NOT the same as matrix multiplication
• Used for blurring, sharpening, up/down sampling, spherical distortion, de-noising, noise filtering etc.
CONVOLUTION IN DIGITAL WORLD
• Depending on the convolution matrix, step size and operation chosen, the resulting image will vary.
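A minimal sketch of 2D convolution, assuming a single-channel image, no padding and a step of one pixel. The two kernels (box blur and a common sharpen kernel) are illustrative choices, not taken from the slides, and `convolve2d` is a hypothetical name.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the flipped kernel over the image; each output is a weighted sum."""
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product then sum: not a matrix multiplication.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

box_blur = np.full((3, 3), 1 / 9)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)

image = np.zeros((5, 5))
image[2, 2] = 9.0  # a single bright pixel

blurred = convolve2d(image, box_blur)
sharpened = convolve2d(image, sharpen)
print(blurred)
print(sharpened)
```

The same loop produces a smeared spot with one kernel and an exaggerated spot with the other, which is the point of the bullet above: the kernel decides the effect.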
WHERE IS WALDO
Locate this gentleman in the next slide
WHERE IS THE NUMBER PLATE?
HOW MUCH DAMAGE
ON TO COMPUTER VISION ALGORITHMS
COMPUTER VISION – THE (AGE) OLD PROBLEMS
• What should a robot do in “Scene understanding”?
• Identify colours, brightness etc
• Identify objects a.k.a Image Segmentation
• Different things
• Multiple occurrences of the same thing
• Stuff other than things
• Distance of things and stuff
• Relative and absolute
COLOUR AND BRIGHTNESS
Colour spaces
• Grayscale, RGB, CMY
• Transparency/Opacity using a fourth attribute
Limitations
• Does not represent all colours in nature
• Colour perception is highly susceptible to lighting changes
New Solutions
• Colour spaces have been expanded greatly.
• With micro and macro level differences, ~250 colour spaces are in vogue
• HSV, HSL/HSI, YUV, YPbPr, YCbCr etc.
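One reason spaces such as HSV help with lighting changes can be shown with two pixels. The sketch assumes an idealised case where dimmer lighting scales all three RGB channels by the same factor; the pixel values and the name `hue_and_value` are made up for illustration.

```python
import colorsys

def hue_and_value(rgb):
    """Return (hue in degrees, value in 0-1) for an 8-bit RGB triple."""
    r, g, b = (c / 255 for c in rgb)
    h, _, v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360, v

bright = (200, 100, 50)  # hypothetical surface in good light
dim = (100, 50, 25)      # same surface at half the brightness (assumed)

hue_bright, value_bright = hue_and_value(bright)
hue_dim, value_dim = hue_and_value(dim)
print(hue_bright, value_bright)
print(hue_dim, value_dim)
```

All three RGB numbers change between the two pixels, while in HSV only the value channel moves and the hue stays put.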
OLD PROBLEM – IMAGE SEGMENTATION
• Panoptic Segmentation – not a technique, but a task formulation evaluated with its own metric (Panoptic Quality)
OLD PROBLEM – IMAGE SEGMENTATION
An image is a matrix of numbers.
How to identify the edges of each object?
How to recognize the object correctly?
How to differentiate between “things” (foreground) and “stuff” (background)?
IMAGE SEGMENTATION – OLD SOLUTIONS
Solution Family | Algorithm | Drawbacks
Thresholding | Otsu thresholding; Adaptive local thresholding (Mean, Gaussian) | For reasonably simple scenarios only
Edges and Corners | Canny edges, Sobel, Hough, Laplace algorithms; Harris Corner detection; Convolution of kernels | Unsuitable for noisy/blurry images
Region Growing | Watershed | Relatively strong at detecting overlapping/touching objects
Super Pixels | SLIC Algorithm | Susceptible to noise; Steep increase in algorithmic complexity
Clustering | K-means; Fuzzy C-Means (FCM); Expectation Maximization (EM) | Relies on low level features like colour etc.; Poor performance on complicated images
Clustering | Image Pyramid | Carefully controlled environments only; Cannot handle transformations like rotation, reflection etc.; Occlusions are a big no-no; Compute intensive
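As an illustration of the thresholding family, here is a small sketch of Otsu's method written directly with NumPy. It assumes an 8-bit grayscale image; the function name and the synthetic test image are hypothetical, and a library routine would normally be used instead.

```python
import numpy as np

def otsu_threshold(gray):
    """Pick the threshold that maximises the between-class variance."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(255):
        w0 = prob[:t + 1].sum()   # weight of the class with values <= t
        w1 = 1.0 - w0             # weight of the class with values > t
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t + 1] * prob[:t + 1]).sum() / w0
        mu1 = (levels[t + 1:] * prob[t + 1:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Synthetic image: dark background (50) with a bright square (200).
gray = np.full((10, 10), 50, dtype=np.uint8)
gray[3:7, 3:7] = 200

t = otsu_threshold(gray)
mask = gray > t
print(t, int(mask.sum()))
```

This works because the synthetic image has two clean intensity groups; the "simple scenarios only" drawback shows up as soon as lighting is uneven or the histogram is not bimodal.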
IMAGE SEGMENTATION
–
CONVOLUTIONAL NEURAL NETWORKS
• Specialized kind of neural network
• Processes data with a known grid-like spatial structure
• Composed of a large number of layers such as convolution, pooling and fully connected layers
• Usually very deep, i.e. many layers and many weight parameters
• Non-linear activation functions are essential for learning complex features
http://cs231n.github.io/convolutional-networks/#overview
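The layer types named above can be chained in a toy forward pass. This is a sketch under strong assumptions: one channel, one 3x3 filter, random untrained weights, and hypothetical function names. Note that CNN frameworks usually compute cross-correlation (no kernel flip) under the name convolution, which is what is done here.

```python
import numpy as np

def conv2d(x, k):
    """'Valid' cross-correlation of a 2D input with a 2D filter."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x):
    """Non-linear activation: negative responses become zero."""
    return np.maximum(x, 0)

def max_pool2x2(x):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def forward(image, kernel, fc_weights, fc_bias):
    """Convolution -> ReLU -> pooling -> fully connected layer."""
    features = max_pool2x2(relu(conv2d(image, kernel)))
    return features.ravel() @ fc_weights + fc_bias

rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.standard_normal((3, 3))
fc_weights = rng.standard_normal((4, 2))  # 2x2 pooled map -> 2 class scores
fc_bias = np.zeros(2)

scores = forward(image, kernel, fc_weights, fc_bias)
print(scores)
```

A real network stacks many such filters and layers and learns the weights from data; the shapes here (6x6 to 4x4 to 2x2 to 2 scores) only show how the pieces connect.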
EVOLUTION OF CNN CLASSIFIERS
2014
• Regions with CNN Features (R-CNN)
2015
• Fast R-CNN
• Faster R-CNN
• Inception V3
2016
• YOLO
• SSD
• UberNet
2017
• Mask R-CNN
• Pixel wise Instance Segmentation
SOME SALIENT POINTS
R-CNN: Regions with CNN Features
•Uses Selective Search
•Significantly reduced the search space to ~2000 region proposals
•Very slow and very complicated
Fast R-CNN: Designed to solve the problems with R-CNN
•Region Of Interest is treated as a pooling layer
•Jointly trains feature extractor, classifier and bounding box regression in a single model
•Almost 25 times faster than R-CNN
Faster R-CNN: Replaces Selective Search with a Region Proposal Network
•10 times faster than Fast R-CNN
YOLO: You Only Look Once
•Detection is considered as a regression problem
•Extremely fast but less accurate. Struggles with small objects that appear in groups
SSD: Single Shot MultiBox Detector
•Faster than YOLO and more accurate as well.
Mask R-CNN: Extension of Faster R-CNN
•Predicts the object masks as well as bounding box
•Impressive results
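The idea of treating a Region Of Interest as a pooling layer can be sketched as follows. This is a simplified illustration, not the implementation from any of the papers: it assumes a single-channel feature map, a region given in feature-map coordinates as `(x0, y0, x1, y1)` with exclusive end, and a region at least as large as the output grid. The name `roi_max_pool` is hypothetical.

```python
import math
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool an arbitrary region down to a fixed output grid."""
    x0, y0, x1, y1 = roi
    region = feature_map[y0:y1, x0:x1]
    h, w = region.shape
    out_h, out_w = output_size
    pooled = np.empty(output_size, dtype=feature_map.dtype)
    for i in range(out_h):
        r0, r1 = (i * h) // out_h, math.ceil((i + 1) * h / out_h)
        for j in range(out_w):
            c0, c1 = (j * w) // out_w, math.ceil((j + 1) * w / out_w)
            pooled[i, j] = region[r0:r1, c0:c1].max()
    return pooled

feature_map = np.arange(36).reshape(6, 6)
square = roi_max_pool(feature_map, (0, 0, 4, 4))
wide = roi_max_pool(feature_map, (1, 1, 6, 4))
print(square)
print(wide)
```

Regions of different sizes come out with the same shape, so one shared classifier and box regressor can sit on top of every proposal.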
OLD PROBLEM - DEPTH PERCEPTION
Normal vision and depth perception expectation
Relative depth
Optical illusion based on depth
Picture of a picture. All pixels have same depth
OLD SOLUTIONS - DEPTH PERCEPTION
• Stereo cameras spaced a fixed distance apart capture the same scene.
• Remember trigonometry? :-)
• Algorithm Families
• Triangulation
• Interferometry
• Time of Flight
• Many Limitations
• Cost
• Complexity
• Controlled environments only
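The trigonometry behind two of these families fits in a few lines. The sketch assumes an ideal, rectified stereo pair for triangulation and a direct round-trip measurement for time of flight; the function names and the example numbers are made up for illustration.

```python
SPEED_OF_LIGHT_M_S = 299_792_458

def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Triangulation: depth = focal length * baseline / disparity."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

def depth_from_time_of_flight(round_trip_s):
    """Time of flight: the pulse travels to the object and back."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2

# Hypothetical rig: 700 px focal length, cameras 12 cm apart,
# the same point appears 42 px apart in the two images.
print(depth_from_disparity(700, 0.12, 42))
print(depth_from_time_of_flight(2e-8))
```

Depth is inversely proportional to disparity, so distant objects produce tiny disparities and small matching errors turn into large depth errors, which is one source of the limitations listed above.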
NEW SOLUTIONS - DEPTH PERCEPTION
• Furious research in progress
• Single camera moving between two fixed positions
• Monocular Depth perception
• Some interesting proposals
• Train NN with depth information and semantically segmented images
• Use the models for predicting depth in new images
• Solutions are almost mainstream
• Anyone heard of Kinect?
OLD PROBLEM – PROGRAMMER’S DILEMMA
• Which image format should I use?
• Which image file format should I code for? Do I have to learn reading and writing image files?
• MATLAB is expensive :-(
NEW SOLUTION - OPENCV, PYTHON, PILLOW ETC
• OpenCV
• Democratized image processing
• A large number of functionalities provided as APIs
• Impressive Python bindings and native support for C, Java
• Python
• PILLOW and many other libraries for reading images
• Vectorization and Numpy Arrays
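Once an image is loaded as a NumPy array, vectorization replaces explicit pixel loops. The sketch below assumes an 8-bit grayscale array; it skips file reading altogether, and the function names are hypothetical.

```python
import numpy as np

def invert_loop(image):
    """Pixel-by-pixel negative: easy to read, slow on large images."""
    out = image.copy()
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            out[r, c] = 255 - image[r, c]
    return out

def invert_vectorized(image):
    """Same negative as one array expression."""
    return 255 - image

def brighten(image, amount):
    """Add a constant, widening first so values clip instead of wrapping."""
    widened = image.astype(np.int16) + amount
    return np.clip(widened, 0, 255).astype(np.uint8)

image = np.array([[0, 64], [128, 250]], dtype=np.uint8)
print(invert_vectorized(image))
print(brighten(image, 50))
```

The widening step in `brighten` matters because `uint8` arithmetic wraps around: 250 + 50 would otherwise become 44 rather than 255.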
NEW SOLUTIONS – NEW PROBLEMS
NEURAL NETWORKS
• Data hungry. Lots and lots of training data.
• Resource hungry and compute intensive.
• Overfitting, Underfitting, Stochasticity
• Black box
SOME SOLUTIONS
• Transfer Learning to reduce training time
• Hyperparameter tuning
• Hardware based solutions for improving performance
• On-going research for explainability
• On-going research for reducing the training data requirement (3rd generation neural networks)
THANKS