INTRODUCTION TO MACHINE VISION
• Machine vision is the computational process of converting images captured by cameras into meaningful information that can be used for decision-making, measurement, automation, and control. Unlike computer graphics, which synthesizes images, machine vision analyzes them.
Machine vision systems are widely used in:
• Industrial automation
• Robotics and navigation
• Quality inspection
• Medical imaging
• Metrology
• Surveillance
A MACHINE VISION PIPELINE
• Image Acquisition
• Image Enhancement
• Segmentation
• Feature Extraction
• Classification / Decision Making
IMAGE DATA STRUCTURES
• An image is a 2D discrete sampling of a continuous scene.
• Each pixel stores a radiometric value proportional to the incident light at that location.
Pixel and Gray Level Representation
• A pixel stores intensity values that represent:
• Energy falling on the sensor
• Integration over exposure time
• Sensor’s electronic response
• Gray level quantization typically uses:
• 8-bit (0–255) – most common
• 10-bit, 12-bit, 14-bit – industrial cameras
• 16-bit – scientific cameras
• Higher bit depth = higher dynamic range.
IMAGE TYPES IN MACHINE VISION
Gray-Scale Images
• Single channel
• Ideal for segmentation and measurement
• Used in over 80% of industrial applications
RGB Color Images
• 3 channels: R, G, B
• Useful for:
• Color inspection
• Surface inspection
• Sorting tasks
Multispectral Images
• Captured across many narrow wavelength bands.
• Used for:
• Agriculture
• Material recognition
• Medical diagnostics
• These may have 10–200+ spectral channels.
IMAGE AS AN ARRAY OF CHANNELS
• An image can be viewed as a set of k channels, each storing sampled radiometric values.
Examples:
• Gray image → k = 1
• RGB → k = 3
• Multispectral → k = N (N > 10)
• Each channel is stored as a 2D matrix.
REGIONS
Regions represent connected sets of pixels belonging to an object.
Examples:
• A blob found by thresholding
• A detected component in inspection
• A mask indicating an area of interest
• Pixel-level storage of large regions is inefficient.
Thus Run-Length Encoding (RLE) is used.
RUN-LENGTH ENCODING (RLE)
• Each run is a continuous horizontal segment of pixels belonging to the region.
• For example:
• Row 5: pixels from column 20 to 40 form a run
→ stored as (row = 5, start = 20, end = 40)
• Thus a region is simply a list of runs.
ADVANTAGES OF RLE
Memory Efficiency
A rectangle of 1000Ă—1000 = 1,000,000 pixels
RLE stores only 1000 runs.
Fast Membership Tests
Checking if a pixel belongs to a region involves only scanning runs for its row.
Efficient Boolean Operations
Union, intersection, difference can be performed run-by-run.
Shape Computations Become Easy
Area = sum of run lengths
Perimeter = transitions between runs
Moments = computed efficiently
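As a minimal sketch of how such a region behaves in practice, the following Python snippet (with an illustrative three-run region) shows a row-local membership test and the area-as-sum-of-run-lengths computation:

```python
# A minimal sketch of a run-length-encoded region: each run is a tuple
# (row, col_start, col_end); a region is just a list of runs.
runs = [(5, 20, 40), (6, 18, 42), (7, 19, 41)]   # illustrative example region

def contains(runs, r, c):
    """Membership test: only the runs of row r need to be scanned."""
    return any(row == r and start <= c <= end for row, start, end in runs)

def area(runs):
    """Area = sum of run lengths."""
    return sum(end - start + 1 for _, start, end in runs)

print(contains(runs, 6, 30))  # True
print(area(runs))             # 21 + 25 + 23 = 69
```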
SUB-PIXEL PRECISE CONTOURS
High-precision applications need accuracy better than ±1 pixel.
• Examples:
• Measuring hole diameter to ±0.01 mm
• Locating object centers with micron precision
• Industrial metrology
• PCB inspection
• Pixel grid alignment is too coarse.
Thus sub-pixel contours are used.
WHY PIXEL PRECISION IS NOT ENOUGH
• A pixel is a sample of an underlying continuous signal.
Real edges:
• Do not align with pixel locations
• Fall between pixels
• Must be interpolated for accuracy
• If edges are used directly at pixel accuracy, measurement errors accumulate.
SUB-PIXEL EDGE EXTRACTION
• Common technique:
• Compute gradient magnitude
• Fit a parabola or Gaussian to the gradient profile (sketched below)
• Estimate the true edge position between pixels
• Produces accuracy of ±0.1 pixel or better.
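A minimal sketch of the parabola variant, assuming a 1D gradient-magnitude profile sampled across the edge; the standard three-point parabola vertex formula gives the sub-pixel offset:

```python
import numpy as np

def subpixel_peak(profile: np.ndarray) -> float:
    """Estimate the sub-pixel position of the peak of a 1D gradient profile
    by fitting a parabola through the maximum sample and its two neighbors."""
    i = int(np.argmax(profile))
    if i == 0 or i == len(profile) - 1:
        return float(i)  # peak on the border: no neighbors to fit
    g_l, g_c, g_r = profile[i - 1], profile[i], profile[i + 1]
    denom = g_l - 2.0 * g_c + g_r
    if denom == 0.0:
        return float(i)  # flat top: parabola is degenerate
    delta = 0.5 * (g_l - g_r) / denom  # vertex offset, always in (-0.5, 0.5)
    return i + delta
```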
SUB-PIXEL THRESHOLDING
Instead of thresholding individual pixels, construct a continuous surface using bilinear interpolation.
Given pixels:
(I11) (I12)
●──────●
│ │
│ │
●──────●
(I21) (I22)
The intensity at any (x, y) inside this block (with x, y ∈ [0, 1] measured from the top-left corner) is computed as:
I(x, y) = (1 − x)(1 − y)·I11 + x(1 − y)·I12 + (1 − x)y·I21 + xy·I22
Then find the points where this surface intersects the threshold plane:
I(x, y) = T
The intersection curve gives sub-pixel accurate contour points.
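A small sketch of the interpolation and of solving for a threshold crossing along one cell edge (the 2D intersection curve is traced cell by cell from such edge crossings):

```python
import numpy as np

def bilinear(i11, i12, i21, i22, x, y):
    """Intensity at fractional position (x, y) in [0,1]^2 inside a 2x2 pixel
    block; i11/i12 are the top corners, i21/i22 the bottom corners."""
    return ((1 - x) * (1 - y) * i11 + x * (1 - y) * i12
            + (1 - x) * y * i21 + x * y * i22)

def edge_crossing(i_a, i_b, T):
    """Sub-pixel crossing of threshold T between two adjacent pixel values:
    solve (1 - t)*i_a + t*i_b = T; valid when T lies between i_a and i_b."""
    return (T - i_a) / (i_b - i_a)
```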
SUB-PIXEL CONTOUR REPRESENTATION
• A contour is stored as an ordered list of points (x1, y1), …, (xn, yn).
• All coordinates are floating-point, not integers.
• Types of contours:
• Open contours
• Closed contours
• Branching contours (junction nodes)
IMAGE ENHANCEMENT
• Image enhancement improves the visual quality of an image or prepares it for further processing (like segmentation or measurement).
• Enhancement does not increase the actual information, but makes important features easier to analyze.
• Enhancement techniques fall into three major categories:
• Point Operations (Pixel-based)
• Neighborhood Operations (Filtering / Smoothing)
• Frequency Domain Methods (Fourier-based enhancement)
GRAY VALUE TRANSFORMATIONS
(Point-Based Operations)
• These operate independently on each pixel, ignoring neighbors. They modify only the intensity value, keeping the pixel location unchanged.
• General form:
g(x, y) = T(f(x, y))
where
f = input image
g = transformed image
T = transformation function
LINEAR GRAY VALUE TRANSFORMATION
• g = a·f + b
• Meaning of parameters:
• a > 1 → increases contrast
• a < 1 → reduces contrast
• b > 0 → brightens
• b < 0 → darkens
• Applications:
• Correcting under/over exposure
• Standardizing image contrast
• Preparing images for thresholding
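A minimal NumPy sketch of the transform on an 8-bit image; clipping to [0, 255] is an assumption needed to stay in the valid gray range:

```python
import numpy as np

def linear_transform(img: np.ndarray, a: float, b: float) -> np.ndarray:
    """Apply g = a*f + b per pixel, clipping to the 8-bit range."""
    g = a * img.astype(np.float32) + b
    return np.clip(g, 0, 255).astype(np.uint8)

# e.g. boost contrast and brighten slightly:
# out = linear_transform(img, a=1.5, b=20)
```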
HISTOGRAM-BASED ADJUSTMENTS
• Histogram-based adjustments improve contrast by redistributing or expanding the range of gray values in an image.
• Histogram stretching expands the intensity values of an image to occupy a wider range.
• If the original image uses only a small portion of the possible grayscale (e.g., only 50–150), we stretch it so that the darks become darker and the lights become lighter.
• Formula
• Let:
f_min = minimum gray value in the image
f_max = maximum gray value
f = input pixel value, g = output pixel value
g = (f − f_min) · (g_max − g_min) / (f_max − f_min) + g_min
where [g_min, g_max] is the desired output range.
CONTRAST NORMALIZATION
• This is a special case of histogram stretching, but it automatically forces the output range to [0, 255].
• So the smallest pixel value becomes 0, and the largest becomes 255, no matter what they were.
Formula
• Same as stretching, but done automatically:
g = (f − f_min) / (f_max − f_min) · 255
Effect
• Normalizes the entire image contrast
• Makes images comparable under different lighting
Example
• If the image range is 10–150 → normalized to 0–255 automatically.
ROBUST CONTRAST NORMALIZATION
Concept
• Real images often have outliers:
• A few extremely bright pixels (specular highlights)
• A few extremely dark pixels (shadows or noise)
• These outliers can spoil normal contrast normalization, because the min/max become extreme.
• Robust normalization ignores these outliers by using percentiles instead of min/max.
How it works
• Instead of the true minimum and maximum values, we use:
• Lower percentile (e.g., 2%) → p2 = gray value at the 2nd percentile
• Upper percentile (e.g., 98%) → p98 = gray value at the 98th percentile
• Then:
g = clip((f − p2) / (p98 − p2) · 255, 0, 255)
Assume pixel values:
Extreme min = 0 (1 pixel shadow)
Extreme max = 255 (1 shiny pixel)
Most pixels range = 50 to 180
Instead of using 0 and 255, robust
normalization uses:
2% percentile = 60
98% percentile = 170
Thus only the useful gray values are
stretched.
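A hedged NumPy sketch of this percentile-based normalization; the 2%/98% cut-offs are the example values from the slide, and plain contrast normalization is the special case low = 0, high = 100:

```python
import numpy as np

def robust_normalize(img: np.ndarray, low: float = 2.0, high: float = 98.0) -> np.ndarray:
    """Stretch contrast between the low/high percentiles instead of min/max,
    so a few extreme outlier pixels cannot dominate the mapping."""
    p_lo, p_hi = np.percentile(img, [low, high])
    g = (img.astype(np.float32) - p_lo) / max(p_hi - p_lo, 1e-6) * 255.0
    return np.clip(g, 0, 255).astype(np.uint8)
```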
IMAGE SMOOTHING
• Used to reduce noise and small variations.
• Essential before segmentation or edge detection.
Neighborhood-Based Operations
Linear Filters
• Linear filters replace each pixel with a weighted sum of neighboring pixels.
MEAN (AVERAGE) FILTER
• Reduces random noise
• Strongly blurs edges
• Used for:
• Removing sensor noise
• Uniform smoothing
GAUSSIAN FILTER
• A Gaussian kernel:
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
Properties:
• Smooths more naturally than the mean filter
• Preserves edges better
• Reduces high-frequency noise
Applications:
• Preprocessing for edge detection (Canny uses a Gaussian)
• Removal of film/sensor noise
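A minimal smoothing sketch using SciPy; the random array stands in for a real grayscale image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

img = np.random.rand(256, 256).astype(np.float32)  # placeholder grayscale input
smoothed = gaussian_filter(img, sigma=1.5)         # larger sigma = stronger blur
```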
NON-LINEAR FILTERS
• Non-linear filters preserve edges better.
Median Filter
• Replaces the pixel value with the median of its neighbors.
• Excellent for:
• Salt-and-pepper noise
• Impulse noise
• Binary image cleaning
• Does NOT blur edges → best filter for industrial images with impulse noise.
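A corresponding SciPy sketch; again the random array is a stand-in for a real noisy image:

```python
import numpy as np
from scipy.ndimage import median_filter

noisy = np.random.rand(256, 256).astype(np.float32)  # placeholder noisy image
cleaned = median_filter(noisy, size=3)  # each pixel -> median of its 3x3 neighborhood
```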
RANK FILTERS
• Rank filters find:
• Minimum
• Maximum
• Percentile value
• Used in:
• Morphological operations
• Removing isolated bright/dark spots
• Industrial defect masking
FREQUENCY DOMAIN ENHANCEMENT
• Image enhancement in the frequency domain modifies an image by changing its frequency components (low frequencies, high frequencies, periodic patterns).
• To do this, we use the Fourier Transform.
WHY FREQUENCY DOMAIN ENHANCEMENT?
In images:
• Low frequencies = smooth regions
• High frequencies = edges and fine details
• Periodic frequencies = textures, patterns
• By transforming the image into the frequency domain, we can:
• Remove noise
• Enhance edges
• Extract textures
• Smooth or sharpen the image
CONTINUOUS FOURIER TRANSFORM (CTFT)
• For a continuous image f(x, y):
F(u, v) = ∫∫ f(x, y) · e^(−j2π(ux + vy)) dx dy
Where:
• (u, v) = frequency components
• F(u, v) = frequency-domain representation
Inverse CTFT:
f(x, y) = ∫∫ F(u, v) · e^(j2π(ux + vy)) du dv
CTFT allows:
• Theoretical analysis
• Optical imaging modeling
• Ideal filter definitions
DISCRETE FOURIER TRANSFORM (DFT)
• Digital images → discrete samples → use the DFT.
• For an image of size M × N:
F(u, v) = Σ (x = 0 to M−1) Σ (y = 0 to N−1) f(x, y) · e^(−j2π(ux/M + vy/N))
DFT is used for:
• Smoothing
• Sharpening
• Pattern extraction
• Periodic noise removal
2D FREQUENCY-DOMAIN ENHANCEMENT STEPS
• Take DFT of image → get frequency spectrum
• Apply filter in the frequency domain
• Inverse DFT → return to the spatial domain
• Get enhanced image
• Graphically:
Image → DFT → Filter in frequency domain → IDFT → Enhanced Image
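A minimal NumPy sketch of this pipeline with an ideal low-pass mask; the cutoff radius is an illustrative parameter:

```python
import numpy as np

def lowpass_fft(img: np.ndarray, cutoff: float = 30.0) -> np.ndarray:
    """Frequency-domain smoothing: DFT -> ideal low-pass mask -> inverse DFT.
    `cutoff` is the radius (in frequency bins) kept around the spectrum center."""
    F = np.fft.fftshift(np.fft.fft2(img))        # spectrum, DC term at center
    rows, cols = img.shape
    y, x = np.ogrid[:rows, :cols]
    dist = np.hypot(y - rows / 2, x - cols / 2)  # distance from DC
    F[dist > cutoff] = 0                         # zero out high frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))
```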
FREQUENCY-DOMAIN FILTERS
• Low-Pass Filters (LPF)
• Remove high frequencies → smooth the image.
• High-Pass Filters (HPF)
• Remove low frequencies → enhance edges.
• Band-Pass Filters
• Select only a certain range of frequencies.
• Notch Filters
• Remove unwanted periodic noise (e.g., electrical interference).
GEOMETRIC TRANSFORMATIONS
• Geometric transformations change the spatial arrangement of pixels without affecting their intensity.
Used in:
• Registration
• Image stitching
• Rectification
• Rotation/Scaling
• Perspective correction
• Normalization for pattern recognition
PROJECTIVE TRANSFORMATION
Used when:
• Camera observes a planar surface at an angle
• Perspective distortion must be removed
• License plates, documents, labels must be rectified
Key properties
• Straight lines remain straight
• Parallel lines may not remain parallel
POLAR TRANSFORMATION
• Used to convert circular or rotationally symmetric objects into linear form.
• Applications
• Inspection of bottle caps
• Label verification on cans
• Printing inspection on round objects
• Mapping from Cartesian to polar coordinates:
r = √(x² + y²), θ = atan2(y, x)
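A hedged OpenCV sketch of unwrapping a round object into a strip; the file name, center, output size, and radius are all illustrative assumptions:

```python
import cv2

img = cv2.imread("cap.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input image
h, w = img.shape
center = (w / 2, h / 2)                            # assumed center of the round object
# Output: 360 columns (angle) x 200 rows (radius), linearly sampled in r.
unwrapped = cv2.warpPolar(img, (360, 200), center, min(h, w) / 2,
                          cv2.WARP_POLAR_LINEAR)
```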
INTRODUCTION TO SEGMENTATION
• Segmentation divides an image into meaningful units such as:
• Objects
• Background
• Edges
• Contours
• Regions
• Two main classes:
• Region-Based Segmentation
• Contour-Based Segmentation
THRESHOLDING
• Thresholding is one of the simplest and most widely used segmentation techniques.
Global Thresholding
• A constant threshold T is applied:
g(x, y) = 1 if f(x, y) ≥ T, else 0
• Used when:
• Object and background have distinct intensities
• Illumination is uniform
AUTOMATIC THRESHOLD SELECTION
• When illumination varies, a fixed threshold fails.
• Automatic thresholding uses histogram analysis to select the best threshold.
• Common methods (Otsu sketched below):
• Valley detection in the histogram
• Isodata method (iterative)
• Otsu’s method (maximizes inter-class variance)
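A minimal OpenCV sketch of Otsu thresholding; the file name is a placeholder, and OpenCV picks the threshold that maximizes the between-class variance of the histogram:

```python
import cv2

img = cv2.imread("part.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input image
T, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold:", T)  # the automatically selected value
```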
SUB-PIXEL THRESHOLDING
• This is crucial for:
• Precision measurement
• Industrial inspection
• Accurate contour extraction
• Instead of thresholding discrete pixel values, a continuous intensity surface is reconstructed using bilinear interpolation.
BILINEAR INTERPOLATION
Given 4 neighbors, the intensity surface is:
I(x, y) = (1 − x)(1 − y)·I11 + x(1 − y)·I12 + (1 − x)y·I21 + xy·I22
• Intersection of this surface with the threshold plane I = T yields a continuous contour.
SEGMENTATION OF LINES
• Least Squares Line Fitting
• Given contour points (xi, yi), minimize:
E = Σi (yi − a·xi − b)²
• The minimizing (a, b) gives the best-fit line y = a·x + b.
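A minimal NumPy sketch of this fit as a linear least-squares problem:

```python
import numpy as np

def fit_line(x: np.ndarray, y: np.ndarray):
    """Least-squares fit of y = a*x + b to contour points (xi, yi),
    minimizing sum((yi - a*xi - b)^2)."""
    A = np.column_stack([x, np.ones_like(x)])  # design matrix [x, 1]
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b
```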
SEGMENTATION OF CIRCLES
• Algebraic Circle Fit
• Circle equation:
x² + y² + D·x + E·y + F = 0
• Geometric Fit
• Minimizes orthogonal distances:
E = Σi ( √((xi − a)² + (yi − b)²) − r )²
• The algebraic form is solved using linear least squares.
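A sketch of the algebraic (Kasa-style) fit: D, E, F are found by linear least squares, then center and radius are recovered from them:

```python
import numpy as np

def fit_circle(x: np.ndarray, y: np.ndarray):
    """Algebraic circle fit: solve x^2 + y^2 + D*x + E*y + F = 0
    for D, E, F, then recover center (cx, cy) and radius r."""
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x**2 + y**2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r = np.sqrt(cx**2 + cy**2 - F)
    return cx, cy, r
```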
SEGMENTATION OF ELLIPSES
• Ellipse equation (general conic form):
A·x² + B·x·y + C·y² + D·x + E·y + F = 0
Used in:
• Bearing inspection
• Hole inspection
• Coin & washer inspection
Parameters estimated via algebraic or geometric least squares.
NEED FOR CALIBRATION
• What are X, Y, Z?
• These are 3D real-world coordinates.
• They represent a point in the physical world (object location).
• Units are usually meters or millimeters.
• Example:
A point on a table might be at (X = 20 cm, Y = 10 cm, Z = 5 cm) in the real world.
• What are u, v?
• These are 2D image coordinates.
• They represent the pixel location of that 3D point on the camera’s image.
• Units are pixels.
• Example:
The same point may appear at (u = 350, v = 220) on the camera image.
• Calibration finds the mathematical mapping:
Real world (3D) → Camera image (2D)
(X, Y, Z) → (u, v)
• So the camera knows exactly where a real object will appear in the image.
CAMERA PARAMETERS
• Intrinsic Parameters
• Define internal camera geometry:
• Focal length (fx, fy)
• Principal point (cx, cy)
• Pixel size
• Skew
• Lens distortion (k1, k2, p1, p2, k3)
• Extrinsic Parameters
• Define camera position and orientation:
• Rotation matrix (R)
• Translation vector (t)
CALIBRATION PROCESS
• Typical steps (a hedged OpenCV sketch follows this list):
• Capture calibration pattern (chessboard, circles)
• Detect feature points
• Estimate intrinsic and extrinsic parameters
• Refine using nonlinear optimization (Levenberg–Marquardt)
• Compute distortion correction
• Validate accuracy
• Calibration improves with:
• More images
• A variety of pattern orientations
• Good lens quality
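A minimal OpenCV chessboard-calibration sketch; the pattern size, square spacing, and image glob are illustrative assumptions:

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # assumed inner corners per row/column of the chessboard
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)  # unit squares

obj_pts, img_pts = [], []
for path in glob.glob("calib/*.png"):  # hypothetical calibration image set
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_pts.append(objp)
        img_pts.append(corners)

# Nonlinear (Levenberg-Marquardt) refinement happens inside calibrateCamera.
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_pts, img_pts, gray.shape[::-1], None, None)
print("RMS reprojection error:", rms)  # a common accuracy check
```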
STEREO VISION & 3D RECONSTRUCTION
• Stereo uses two cameras to reconstruct 3D information using triangulation.
• Inspired by human binocular vision.
Stereo Geometry
• Two cameras observe the same scene from two viewpoints.
Epipolar Geometry
• Each point in the left image lies on an epipolar line in the right image (horizontal after rectification).
• Reduces a 2D search → 1D search
STEREO PIPELINE
Calibration
• Both cameras must be calibrated.
Rectification
• Transforms the images so that epipolar lines become horizontal.
Correspondence Matching
• Find matching pixels between the left and right images.
Triangulation
• Depth is computed as:
Z = f · B / d
Where:
• f = focal length
• B = baseline distance between cameras
• d = disparity (difference in image coordinates)
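A one-function sketch of the triangulation step, with illustrative numbers in the comment:

```python
def depth_from_disparity(f_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair; f in pixels,
    baseline in meters, disparity in pixels."""
    return f_px * baseline_m / disparity_px

# e.g. f = 700 px, B = 0.12 m, d = 35 px  ->  Z = 2.4 m
```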
APPLICATIONS OF STEREO VISION
3D measurement
Robot navigation
Obstacle detection
Autonomous vehicles
3D scanning
Industrial inspection
VISION SENSORS
• Vision sensors are compact devices combining:
• Imaging optics
• Sensor
• On-board image processing
• Output interfaces
• They function as “smart cameras.”
COMPONENTS OF A VISION SENSOR
Lens
Image sensor (CCD/CMOS)
Processor (DSP/FPGA)
Illumination
Software for detection/measurement
Communication (Ethernet, RS-485, Fieldbus)