2. ● Robot vision is the ability of a robot to see and recognize objects by collecting the light reflected by those objects
into an image, and then processing and interpreting that image. Robot vision employs optical or visual sensors
(cameras) and appropriate electronic equipment to process/analyze visual images and recognize the objects of concern in
each robotic application.
● Robot (computer) vision can be grouped into the following principal subareas:
3. Sensing
The sensing subarea of robot vision includes the following.
Camera calibration, the first requirement in sensing, is concerned with correcting the image displacements that occur due
to the characteristics of the camera’s interior orientation. Lens distortion is one cause of the displacement of image
points from their ideal “pinhole model” positions on the sensor plane.
(refer to the Reference slides for the pinhole model and lens distortion)
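As a minimal sketch of the calibration idea, consider a hypothetical first-order radial correction that moves an observed image point back toward its ideal pinhole position. The single coefficient `k1` and the approximate inversion are assumptions for illustration; real calibration procedures estimate several radial and tangential coefficients together with the interior-orientation parameters.

```python
def undistort_radial(x, y, k1, cx=0.0, cy=0.0):
    """Approximately undo first-order radial lens distortion.

    Assumes the forward model distorted = ideal * (1 + k1 * r^2) and
    inverts it approximately (valid for small k1), pulling the observed
    point (x, y) back toward its ideal pinhole position. (cx, cy) is
    the principal point.
    """
    dx, dy = x - cx, y - cy
    r2 = dx * dx + dy * dy          # squared radius from the principal point
    scale = 1.0 / (1.0 + k1 * r2)   # approximate inverse of the radial term
    return cx + dx * scale, cy + dy * scale
```

With `k1 = 0` the point is unchanged; with a positive `k1` (barrel distortion pushed the point outward) the correction moves it back toward the image center.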
4. Sensing
Image acquisition
Image acquisition and digitization is accomplished using a digital camera and a digitizing system to store the image data for
subsequent analysis. The camera is focused on the subject of interest, and an image is obtained by dividing the viewing
area into a matrix of discrete picture elements (called pixels), in which each element has a value that is proportional to the
light intensity of that portion of the scene. The intensity value for each pixel is converted into its equivalent digital value by
an ADC (analog-to-digital converter). The operation of viewing a scene consisting of a simple object that contrasts substantially with its background,
and dividing the scene into a corresponding matrix of picture elements is depicted in Figure 22.10.
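The digitization step described above can be sketched in a few lines. This is an idealized model, not a real camera driver: the scene is assumed to arrive as normalized light intensities in [0, 1], and the `digitize` function plays the role of the ADC, mapping each pixel to one of 2^bits discrete levels.

```python
def digitize(intensities, bits=8):
    """Quantize normalized light intensities (0.0-1.0), as an ADC would.

    Each pixel of the matrix is mapped to an integer gray level in
    [0, 2**bits - 1]; out-of-range values are clamped.
    """
    levels = (1 << bits) - 1  # 255 for an 8-bit converter
    return [[min(levels, max(0, round(v * levels))) for v in row]
            for row in intensities]

# A tiny 2x2 "viewing area" divided into pixels, one intensity per element
scene = [[0.0, 0.5],
         [1.0, 0.25]]
image = digitize(scene)
```

Each element of `image` is now a digital gray value proportional to the light intensity of that portion of the scene.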
6. Sensing
Illumination
Another important aspect of machine vision is illumination. The scene viewed by the vision
camera must be well illuminated, and the illumination must be constant over time. This
almost always requires that special lighting be installed for a machine vision application
rather than relying on ambient light in the facility.
7. Preprocessing
The aim of preprocessing is to improve the quality of the image so that it can be analyzed more effectively. Preprocessing
suppresses undesired distortions and enhances features that are important for the particular application at hand; the
relevant features vary from application to application.
Image preprocessing can be performed by two general methodologies:
1. Spatial domain methodology
2. Frequency domain methodology
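A simple instance of the spatial-domain methodology is neighborhood averaging. The sketch below smooths an image by replacing each interior pixel with the mean of its 3x3 neighborhood; the border handling (leaving edge pixels unchanged) is a simplifying assumption.

```python
def mean_filter(image):
    """Spatial-domain smoothing: replace each interior pixel with the
    average of its 3x3 neighborhood. Border pixels are left unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(image[r + dr][c + dc]
                            for dr in (-1, 0, 1)
                            for dc in (-1, 0, 1)) / 9
    return out
```

An isolated noise spike is spread out and attenuated: a lone pixel of value 9 on a zero background drops to 1 after one pass of the filter.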
8. Preprocessing
Frequency domain preprocessing methods use the Fourier transform of an image, which converts the image into an
aggregate of complex-valued frequency components. To reduce noise and other spurious effects resulting from the
operations of sampling, quantization, and transmission, and from any other disturbances of the environment, appropriate
smoothing operations are employed.
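The frequency-domain smoothing idea can be sketched with NumPy's FFT routines: transform the image, zero out the high-frequency coefficients (where most sampling and transmission noise lives), and transform back. The rectangular low-pass window and the `keep` fraction are illustrative choices, not a prescribed filter design.

```python
import numpy as np

def fft_lowpass(image, keep=0.25):
    """Smooth a 2-D image by zeroing high-frequency Fourier coefficients.

    `keep` is the fraction of the spectrum (per axis, around the zero
    frequency) retained; everything outside that window is discarded.
    """
    F = np.fft.fft2(image)        # complex-valued frequency representation
    Fs = np.fft.fftshift(F)       # move the zero frequency to the center
    h, w = image.shape
    ch, cw = h // 2, w // 2
    rh, rw = max(1, int(h * keep)), max(1, int(w * keep))
    mask = np.zeros((h, w), dtype=bool)
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = True
    Fs[~mask] = 0                 # discard high-frequency components
    return np.fft.ifft2(np.fft.ifftshift(Fs)).real
```

A constant image passes through unchanged (all of its energy is at the zero frequency), while a noisy image comes out with reduced variance.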
Other preprocessing operations include the following:
● Image enhancement (automatic adaptation to illumination variations)
● Edge detection ( Edge detection is concerned with determining the location of boundaries between an object
and its surroundings in an image. This is accomplished by identifying the contrast in light intensity that exists
between adjacent pixels at the borders of the object. A number of software algorithms have been developed
for following the border around the object)
● Image thresholding (i.e., the selection of a threshold T that separates the intensity modes, e.g., in images whose
intensities are grouped into two dominant modes, viz., light objects on a dark background)
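The thresholding bullet above can be made concrete with a small sketch: an iterative, ISODATA-style selection of T that settles between the two intensity modes, followed by binarization. The stopping tolerance `eps` is an assumed convergence parameter.

```python
def select_threshold(pixels, eps=0.5):
    """Iteratively pick a threshold T between two intensity modes.

    Starts at the global mean, then repeatedly sets T midway between the
    mean of the dark class and the mean of the light class until T stops
    moving (a classic ISODATA-style scheme).
    """
    T = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p < T]
        high = [p for p in pixels if p >= T]
        new_T = ((sum(low) / len(low) if low else 0) +
                 (sum(high) / len(high) if high else 0)) / 2
        if abs(new_T - T) < eps:
            return new_T
        T = new_T

def binarize(image, T):
    """Light object pixels (>= T) become 1, dark background pixels 0."""
    return [[1 if p >= T else 0 for p in row] for row in image]
```

For a scene of dark background pixels near 10 and light object pixels near 200, the selected T lands between the two modes and cleanly separates object from background.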
9. Image Segmentation
Image segmentation is the process that splits a source image into its constituent parts or objects. The regions of interest are
selected on the basis of several criteria. For example, it may be necessary to find a single part out of a bin. For navigation
purposes, it may be useful to extract only floor lines from an image. In general, by segmentation, objects are extracted from
a scene for subsequent recognition and analysis.
The two basic principles used in segmentation algorithms are as follows:
1. Discontinuity (e.g., edge detection)
2. Similarity (e.g., thresholding, region growing)
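The similarity principle can be illustrated with a minimal region-growing sketch: starting from a seed pixel, the region absorbs 4-connected neighbors whose intensity is within a tolerance of the seed's intensity. The tolerance parameter and 4-connectivity are illustrative assumptions.

```python
def region_grow(image, seed, tol):
    """Grow a region from `seed` (row, col) over a 2-D intensity grid.

    A pixel joins the region if it is 4-connected to it and its
    intensity is within `tol` of the seed pixel's intensity
    (the similarity principle of segmentation).
    """
    h, w = len(image), len(image[0])
    sr, sc = seed
    ref = image[sr][sc]
    region, stack = set(), [seed]
    while stack:
        r, c = stack.pop()
        if (r, c) in region or not (0 <= r < h and 0 <= c < w):
            continue
        if abs(image[r][c] - ref) <= tol:
            region.add((r, c))
            stack.extend([(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)])
    return region
```

On a small grid with a bright L-shaped object on a dark background, growing from a corner of the object extracts exactly the object's pixels.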
10. Image Description
Image description is the process of extracting features from an object for recognition purposes. Descriptors
must be independent of the object's size, location, and orientation, and must provide sufficient discriminatory information.
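One family of such descriptors is built from central moments of the object's pixels, which are translation-invariant by construction. The sketch below computes an elongation-style descriptor from second-order central moments; it is one illustrative choice, not the only descriptor used in practice (Hu moments, compactness, and Fourier descriptors are common alternatives).

```python
def central_moment(points, p, q):
    """Central moment mu_pq of a set of (x, y) object pixels:
    sums of (x - xbar)^p * (y - ybar)^q, so translation drops out."""
    n = len(points)
    xbar = sum(x for x, _ in points) / n
    ybar = sum(y for _, y in points) / n
    return sum((x - xbar) ** p * (y - ybar) ** q for x, y in points)

def elongation(points):
    """Shape descriptor from second-order central moments: the ratio of
    the pixel cloud's principal-axis spreads (large for thin objects,
    near 1 for compact ones). Unaffected by where the object sits."""
    mu20 = central_moment(points, 2, 0)
    mu02 = central_moment(points, 0, 2)
    mu11 = central_moment(points, 1, 1)
    common = ((mu20 - mu02) ** 2 + 4 * mu11 ** 2) ** 0.5
    return (mu20 + mu02 + common) / (mu20 + mu02 - common + 1e-12)
```

Translating the object does not change the descriptor, while an elongated object scores much higher than a square one, giving the discriminatory information description requires.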
Image Recognition
Image recognition is the labeling process applied to a segmented object of a scene. That is, image
recognition presumes that the objects in a scene have been segmented as individual elements (e.g., a bolt, a seal,
a wrench). The typical constraint here is that images are acquired in a known viewing geometry (often
perpendicular to the workspace).
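A minimal labeling scheme, sketched here under the assumption that each segmented object has already been reduced to a feature vector (the template values below are made up for illustration), is nearest-neighbor matching against stored templates:

```python
def recognize(features, templates):
    """Label a segmented object by nearest-neighbor matching.

    `templates` maps a label (e.g., "bolt") to a stored feature vector;
    the object receives the label of the closest template in Euclidean
    distance.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(templates, key=lambda label: dist(features, templates[label]))

# Hypothetical 2-D feature vectors (e.g., elongation and compactness)
templates = {"bolt": [0.9, 0.1], "seal": [0.2, 0.8], "wrench": [0.5, 0.5]}
```

An observed feature vector close to the bolt template is labeled "bolt"; the known viewing geometry is what makes such fixed templates usable.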
11. Image interpretation
Image interpretation is a higher level process which uses combinations of the methods discussed earlier, namely, sensing,
preprocessing, segmentation, description, and recognition. A machine vision system is ranked according to its general
ability to extract useful information from a scene under a wide repertory of viewing conditions, needing minimal knowledge
about the objects at hand. Factors that make image interpretation a difficult task include variations in illumination conditions,
viewing geometry, and occluding bodies.
12. Reference
An ideal pinhole camera model is used to represent an ideal lens, and assumes that rays of light travel in straight lines from
the object through the pinhole to the image (sensor) plane. The pinhole camera is the simplest device that accurately
captures the geometry of perspective projection. The pinhole is an infinitesimally small aperture. The image of the object is formed by
the intersection of the light rays with the image plane (Figure). This mapping from the three dimensions onto two
dimensions is called perspective projection.
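The perspective-projection mapping is compact enough to state in code. For a pinhole camera with focal length f and a point (X, Y, Z) in camera coordinates (Z measured along the optical axis), the image coordinates are (f·X/Z, f·Y/Z); the sketch below ignores the sign flip of a physical inverted image, as is conventional for the frontal pinhole model.

```python
def project(point, f):
    """Perspective projection of a 3-D point (X, Y, Z), in camera
    coordinates, onto the image plane of an ideal pinhole camera with
    focal length f: x = f*X/Z, y = f*Y/Z."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (f * X / Z, f * Y / Z)
```

The characteristic loss of the third dimension is visible immediately: doubling an object's distance halves its image coordinates, so many 3-D points map to the same 2-D image point.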
13. Reference
Lens distortions displace image points from their ideal “pinhole model” locations on the sensor plane. Lens distortions are
distinguished into the following:
● Radial (displacements toward the image center or away from it)
● Tangential (displacements at right angles to the radial direction; usually much smaller than radial
displacements)
● Asymmetric radial or tangential (the error functions vary for different locations on the image plane)
● Random (displacements that cannot be mathematically modeled)
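The radial and tangential categories are commonly expressed with a polynomial model of the Brown-Conrady type. The sketch below applies the forward distortion to an ideal pinhole point in normalized image coordinates; the particular coefficients (k1, k2 radial; p1, p2 tangential) are the standard low-order terms, and the truncation after second-order radial terms is a simplifying assumption.

```python
def distort(x, y, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    """Displace an ideal pinhole image point (x, y) by low-order radial
    (k1, k2) and tangential (p1, p2) lens-distortion terms
    (Brown-Conrady style model, normalized coordinates)."""
    r2 = x * x + y * y
    radial = 1 + k1 * r2 + k2 * r2 * r2        # radial scaling of the point
    xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return xd, yd
```

With all coefficients zero the point is unchanged; a positive k1 pushes points radially outward (away from the image center), matching the radial category above, while p1 and p2 add the smaller displacements at right angles to the radial direction.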