6. Definition of Computer Vision
• Computer vision is a scientific field that deals with how
computers can gain high-level understanding from digital
images or videos.
• From the perspective of engineering, it seeks to understand
and automate tasks that the human visual system can do.
• It develops the theoretical and algorithmic basis for
automatically extracting and analyzing useful information from
an observed image, image set, or image sequence, using
computations made by special-purpose or general-purpose computers.
8. Why Evaluation?
• Computer vision algorithms are complex and difficult to
analyse mathematically.
• Evaluation is usually done by measuring the
algorithm’s performance on test images:
o Use of a range of images to establish the performance
envelope,
o Comparison with existing algorithms,
o Performance on degraded (noise-added) images
(robustness),
o Sensitivity to algorithm parameter settings
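The evaluation steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `is_bright` is a hypothetical stand-in for the algorithm under test, images are plain lists of grayscale rows, and the helper names (`add_gaussian_noise`, `accuracy`, `robustness_curve`, `sensitivity_curve`) are invented for this example.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise (std. dev. sigma) to a grayscale image.

    The image is a list of rows of intensities in [0, 255]; results are clipped.
    """
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in image]


def is_bright(image, threshold=128.0):
    """Hypothetical algorithm under test: is the mean intensity above threshold?"""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels) > threshold


def accuracy(algorithm, images, labels, **params):
    """Fraction of test images for which the algorithm matches the label."""
    hits = sum(algorithm(img, **params) == label
               for img, label in zip(images, labels))
    return hits / len(images)


def robustness_curve(algorithm, images, labels, sigmas, seed=0):
    """Accuracy on degraded (noise-added) copies of the test set, per noise level."""
    rng = random.Random(seed)
    return {sigma: accuracy(algorithm,
                            [add_gaussian_noise(img, sigma, rng) for img in images],
                            labels)
            for sigma in sigmas}


def sensitivity_curve(algorithm, images, labels, param_name, values):
    """Accuracy on the clean test set for each setting of one parameter."""
    return {v: accuracy(algorithm, images, labels, **{param_name: v})
            for v in values}


# Toy test set: two dark and two bright 4x4 images with known labels.
images = [[[60.0] * 4 for _ in range(4)], [[90.0] * 4 for _ in range(4)],
          [[170.0] * 4 for _ in range(4)], [[200.0] * 4 for _ in range(4)]]
labels = [False, False, True, True]

print(robustness_curve(is_bright, images, labels, sigmas=[0.0, 10.0, 40.0, 80.0]))
print(sensitivity_curve(is_bright, images, labels, "threshold", [50.0, 128.0, 180.0]))
```

Comparison with an existing algorithm would reuse the same two curve functions with a different `algorithm` argument, so that both methods are scored on identical test images and noise levels.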
10. Why do computer vision algorithms need new benchmarks?
1. In recent years, the creation of large data sets of labeled
images has helped in the development of highly efficient
computer vision systems.
2. Artificial intelligence models trained and tested on
repositories like ImageNet (approx. 14 million photos) and
OpenImages (approx. 9 million photos) can match and
sometimes exceed human performance at detecting specific
classes of objects.
15. • Additive noise is generated by different sources,
including:
o Camera sensor,
o Detector sensitivity variation,
o Surrounding environmental effects,
o Discrete nature of radiation,
o Dust on the optics,
o Quantization errors,
o Transmission data errors
Measurements of Noise
16. The Mean Square Error (MSE) measures the difference between the probe face image and a
gallery image. Let I(i,j) be the noisy image, T(i,j) the true image, and N(i,j) the
noise image, so that I(i,j) = T(i,j) + N(i,j), as given in Equation (3-11).
Mean Square Error (in its standard form):
\[ \mathrm{MSE} = \frac{1}{N} \sum_{i} \sum_{j} \left[ I(i,j) - G(i,j) \right]^{2} \]
where N is the image size (the number of pixels), I(i,j) is the probe image, and G(i,j) is the gallery image. Note that the probe image includes
additive white Gaussian noise (AWGN). The MSE decreases toward zero as the noise approaches zero and the difference
between the probe and gallery images approaches zero.
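As a minimal sketch of this measure, the snippet below computes the MSE as the mean of the squared pixel differences between a probe image and a gallery image. It assumes that definition of MSE, images stored as lists of grayscale rows, and noise drawn with Python's `random.gauss`. The function names `mse` and `add_awgn` are hypothetical, chosen only for this example.

```python
import random


def mse(probe, gallery):
    """Mean of the squared pixel differences between two same-sized images."""
    if len(probe) != len(gallery) or any(
            len(p_row) != len(g_row) for p_row, g_row in zip(probe, gallery)):
        raise ValueError("probe and gallery must have the same dimensions")
    n_pixels = sum(len(row) for row in probe)  # image size N
    if n_pixels == 0:
        raise ValueError("images must not be empty")
    total = sum((p - g) ** 2
                for p_row, g_row in zip(probe, gallery)
                for p, g in zip(p_row, g_row))
    return total / n_pixels


def add_awgn(true_image, sigma, seed=0):
    """Return I = T + N, where N is zero-mean Gaussian noise with std. dev. sigma."""
    rng = random.Random(seed)
    return [[t + rng.gauss(0.0, sigma) for t in row] for row in true_image]


# The gallery image plays the role of the noise-free reference.
gallery = [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]]
for sigma in (0.0, 5.0, 20.0):
    probe = add_awgn(gallery, sigma)
    print(sigma, mse(probe, gallery))
```

With a fixed seed the printed MSE grows with `sigma` and is exactly zero when no noise is added, which matches the behaviour described in the text.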
19. Confusion matrix for the genuine and imposter subjects.
• True positive (TP): GA
• False negative (FN): IR
• False positive (FP): IA
• True negative (TN): GR
\[ \mathrm{TPR} = \frac{TP}{TP + FN} \]
\[ \mathrm{TNR} = \frac{TN}{TN + FP} \]
\[ \mathrm{PPR} = \frac{TP}{TP + FP} \]
\[ \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \]
\[ F_{1} = \frac{2\,TP}{2\,TP + FP + FN} \]
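The rates on this slide follow directly from the four confusion-matrix counts. The sketch below assumes the usual definitions: true positive rate TP / (TP + FN), true negative rate TN / (TN + FP), positive predictive rate TP / (TP + FP), accuracy (TP + TN) / (TP + TN + FP + FN), and F1 = 2TP / (2TP + FP + FN). The function name `confusion_metrics` and the example counts are hypothetical.

```python
import math


def _ratio(numerator, denominator):
    """Divide, returning NaN when the rate is undefined (empty denominator)."""
    return numerator / denominator if denominator else math.nan


def confusion_metrics(tp, fn, fp, tn):
    """Compute the summary rates from the four confusion-matrix counts."""
    return {
        "TPR": _ratio(tp, tp + fn),                  # true positive rate
        "TNR": _ratio(tn, tn + fp),                  # true negative rate
        "PPR": _ratio(tp, tp + fp),                  # positive predictive rate
        "ACC": _ratio(tp + tn, tp + tn + fp + fn),   # accuracy
        "F1": _ratio(2 * tp, 2 * tp + fp + fn),      # F1 score
    }


# Illustrative counts only; they do not come from any reported experiment.
for name, value in confusion_metrics(tp=90, fn=10, fp=5, tn=95).items():
    print(f"{name}: {value:.4f}")
```

Returning NaN for an empty denominator is a design choice for this sketch; raising an error would be equally reasonable when a test set lacks one of the two classes.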
21. TOP 10 COMPUTER VISION TOOLS FOR 2020
OpenCV
The best-known computer vision library: multi-platform and simple to use. It covers the fundamental techniques and algorithms needed for a range of
image and video processing tasks, and works well with C++ and Python.
TensorFlow
This is the best-known machine learning and deep learning library today. Its popularity grew quickly and it overtook existing
libraries because of the simplicity of its API. TensorFlow is a free, open-source library for dataflow and differentiable programming. It is a
symbolic math library that is also used for machine learning applications such as neural networks.
TensorFlow 2.0 supports running pre-trained models that are tuned for image and speech recognition, object detection,
recommendations, reinforcement learning, and so forth. Such reference models let you apply established best practices and serve as starting
points for building your own high-performance solutions.
Matlab
Matlab is an excellent tool for building image processing applications and is widely used in research because it allows quick prototyping.
Another interesting aspect is that Matlab code is very concise compared with C++, making it easier to read and debug.
It catches errors before execution and suggests ways to make the code faster.
CUDA
NVIDIA’s platform for parallel computing, which is easy to program, efficient, and fast. By using the power of GPUs it delivers
very high performance. Its toolkit includes the NVIDIA Performance Primitives library, which contains a set of image, signal, and video
processing functions.
22. Theano
Theano is a fast Python numerical library that can run on a CPU or GPU. It was created by the LISA group (now MILA) at the University of Montreal in Canada.
Theano is an optimizing compiler for manipulating and evaluating mathematical expressions, especially matrix-valued ones.
SimpleCV
SimpleCV is a framework for building computer vision applications. It gives you access to a large number of computer vision tools from libraries such as OpenCV, pygame,
and so forth. If you would rather not get into the depths of image processing and simply need to get your work done, this is the tool to reach for. If you
need to do some quick prototyping, SimpleCV will serve you well.
Keras
Keras is a deep learning Python library that builds on other libraries such as TensorFlow, Theano, and CNTK. Keras has an advantage
over competitors such as Scikit-learn and PyTorch in that it runs on top of TensorFlow.
Keras can run on TensorFlow, Microsoft Cognitive Toolkit, Theano, or PlaidML. Designed for quick experimentation with deep neural networks, it focuses on
user-friendliness, modularity, and extensibility. Keras follows best practices for reducing cognitive load: it offers consistent and simple APIs and minimizes the number of user
actions required for common use cases.
GPUImage
It is a framework based on OpenGL ES 2.0 that allows GPU-accelerated effects and filters to be applied to live video, images, and movies. Running custom
filters on a GPU otherwise demands a lot of code to set up and maintain.
YOLO
“You Only Look Once”, or YOLO, is an advanced object detection system designed specifically for real-time processing. It was
created by Joseph Redmon and Ali Farhadi from the University of Washington. Their algorithm applies a neural network to the whole image; the network
divides the image into a grid and marks the regions that contain detected objects.
BoofCV
BoofCV is an open-source Java library for real-time robotics and computer vision applications, released under an Apache 2.0 license for both academic and commercial
use. Its functionality covers a wide range of subjects, including optimized low-level image processing routines, camera calibration, feature detection/tracking,
structure-from-motion, and recognition.
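The grid partitioning mentioned for YOLO above can be illustrated with a small sketch. It shows only the bookkeeping step of mapping an object's centre point to the grid cell that contains it, not the neural network itself. The function name `grid_cell` is hypothetical, and the default 7 x 7 grid is an assumption taken from the original YOLO configuration.

```python
def grid_cell(center_x, center_y, image_width, image_height, grid_size=7):
    """Return the (row, col) of the grid cell containing a point in pixel coordinates."""
    if not (0 <= center_x < image_width and 0 <= center_y < image_height):
        raise ValueError("the point must lie inside the image")
    col = int(center_x * grid_size / image_width)
    row = int(center_y * grid_size / image_height)
    return row, col


# Example: an object centred at (224, 100) in a 448 x 448 image.
print(grid_cell(224, 100, image_width=448, image_height=448))
```

In a grid-based detector of this kind, the cell returned here is the one associated with the detected object, which is why each region of the image can be marked independently.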
23. Negative results in computer vision
A negative result is when the outcome of an experiment or a model is not
what is expected or when a hypothesis does not hold.
Despite often being overlooked in the scientific community, negative results
are results and they carry value.
While this topic has been extensively discussed in other fields such as the social
sciences and biosciences, less attention has been paid to it in the computer
vision community.
The unique characteristics of computer vision, particularly its experimental
aspect, call for a special treatment of this matter.