Digital Image Processing (DIP): The Fundamentals - A MATLAB-assisted, non-mathematical approach By Abhishek Sharma (EEE 2k7)
Materials Some are taken from slides by a friend from NIT Surathkal, Varun Nagaraja. He worked in the same lab at IISc Bangalore and joined the University of Maryland – College Park as a PhD student a week back. Some material is taken from the online works of Sabih D. Khan, a French-Pakistani researcher. Some I prepared myself!
Objective To take a quasi non-mathematical, MATLAB-assisted approach to DIP. I cut out the equations because… 1) I don’t want to scare you even before we start the climb. 2) I am pathetic at elucidating maths concepts. 3) I don’t have a lot of time. 4) You don’t need maths to start working in DIP. If I am able to make you fall in love with DIP, I hope that, like real-world love, you will be able to accept it with its shortcomings – maths!
SixthSense by Pranav Mistry, MIT Media Lab SixthSense is a wearable gestural interface that augments the physical world around us with digital information and lets us use natural hand gestures to interact with that information.
Image Processing - a definition Image Processing generally involves extraction of useful information from an image. This useful information may be the dimensions of an engineering component, size of diagnosed tumor, or even a 3D view of an unborn baby.
Intro to DIP with MATLAB Images can be conveniently represented as matrices in MATLAB. An image is opened as a matrix with the imread command. Depending on the image type, the matrix may be a simple m x n array, a 3-dimensional array, or an indexed matrix with an accompanying color map. Image processing is then done by matrix calculation or matrix manipulation. The image is displayed with the imshow command, and changes are saved with the imwrite command.
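A minimal sketch of the read, process, display, save cycle described above. It builds its own synthetic image so it does not depend on any file from the workshop folder; the file names demo.png and demo_negative.png are made up for illustration.

```matlab
% Build a small synthetic grayscale image so the example needs no external file.
img = uint8(repmat(0:255, 64, 1));       % 64 x 256 horizontal ramp

% Save it, then read it back as a matrix (demo.png is a hypothetical name).
outFile = fullfile(tempdir, 'demo.png');
imwrite(img, outFile);
A = imread(outFile);

disp(size(A));                           % rows x columns
disp(class(A));                          % uint8 for an 8-bit image

% Processing is plain matrix arithmetic: here, a photographic negative.
negative = 255 - A;

imshow(negative);                        % display the result
imwrite(negative, fullfile(tempdir, 'demo_negative.png'));
```

With a real photograph, only the imread argument changes; the rest of the code works on the matrix in the same way.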
Image types Images come in three broad kinds: black & white, grey scale and colored. MATLAB, however, distinguishes four image types. Black & white images are called binary images, containing 1 for white and 0 for black. Grey scale images are called intensity images, containing numbers in the range 0 to 255 (uint8) or 0 to 1 (double). Colored images may be represented as an RGB image or an indexed image.
Image types An RGB image is made of three intensity planes. The first plane contains the red portion of the image, the second the green and the third the blue. MATLAB stores matrices as rows x columns, so an image that is 640 pixels wide and 480 pixels high becomes a 480 x 640 x 3 array. An alternative representation for colored images is the indexed image. It consists of two matrices, the image matrix and the map matrix. Each color in the image is given an index number, and the image matrix stores that index number at every pixel. The map matrix is the lookup table that records which index number belongs to which color.
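The two color representations can be sketched with tiny hand-made matrices. The array sizes and the three-color map below are arbitrary choices for illustration, not data from the slides.

```matlab
% RGB image: rows x columns x 3 (red, green and blue planes).
rgb = zeros(480, 640, 3, 'uint8');       % 640 pixels wide, 480 pixels high
rgb(:, :, 1) = 255;                      % fill the red plane: a pure red image
disp(size(rgb));

% Indexed image: an index matrix plus a color map (one RGB row per index).
map = [0 0 0; 1 0 0; 0 0 1];             % black, red, blue, values in [0, 1]
X = [1 2; 3 2];                          % double indices are 1-based rows of map
rgbFromIdx = ind2rgb(X, map);            % 2 x 2 x 3 double image

% Pixel (1,2) has index 2, so it picks the second map row (red).
disp(squeeze(rgbFromIdx(1, 2, :))');
```

Note that index matrices of class double count map rows from 1, while uint8 index matrices count from 0.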
MATLAB – image conversions
◦ RGB Image to Intensity Image (rgb2gray)
◦ RGB Image to Indexed Image (rgb2ind)
◦ RGB Image to Binary Image (im2bw)
◦ Indexed Image to RGB Image (ind2rgb)
◦ Indexed Image to Intensity Image (ind2gray)
◦ Indexed Image to Binary Image (im2bw)
◦ Intensity Image to Indexed Image (gray2ind)
◦ Intensity Image to Binary Image (im2bw)
◦ Intensity Image to RGB Image (gray2ind, ind2rgb)
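A short sketch that chains several of these conversions on a made-up 4 x 4 test image (left half red, right half white). The threshold 0.5 and the limit of 8 colors are arbitrary illustrative values. im2bw is used because the slides name it; newer MATLAB releases recommend imbinarize for the same job.

```matlab
% Tiny synthetic RGB image: left half red, right half white.
rgb = zeros(4, 4, 3, 'uint8');
rgb(:, 1:2, 1) = 255;
rgb(:, 3:4, :) = 255;

gray = rgb2gray(rgb);                % RGB -> intensity
bw = im2bw(rgb, 0.5);                % RGB -> binary, threshold at 50% brightness
[X, map] = rgb2ind(rgb, 8);          % RGB -> indexed, at most 8 colors
back = ind2rgb(X, map);              % indexed -> RGB (double, values in [0, 1])
[Xg, gmap] = gray2ind(gray, 256);    % intensity -> indexed with a gray map
grayRgb = ind2rgb(Xg, gmap);         % intensity -> RGB via the gray map

disp(gray);                          % red is darker than white in intensity
disp(bw);                            % red falls below the threshold
```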
Image Histograms There are a number of ways to get statistical information about the data in an image; the image histogram is one of them. An image histogram is a chart that shows the distribution of intensities in an image. Each intensity level is a point on the x-axis, and the y-axis gives the number of pixels that have that level. The histogram may be viewed with the imhist command. Sometimes all the important information in an image lies in a narrow range of intensities, which makes it difficult to extract information from that image. To spread the intensities over the full range and so improve contrast, we carry out an image processing operation termed histogram equalization. Use the MATLAB histeq command.
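A small sketch of imhist and histeq on a deliberately low-contrast synthetic image (gray levels 100 to 150 only). The image is invented for illustration; any grayscale photo can be substituted.

```matlab
% Low-contrast synthetic image: only gray levels 100..150 are used.
I = uint8(repmat(100:150, 51, 1));

counts = imhist(I);                  % 256 bin counts, one per gray level
J = histeq(I);                       % histogram equalization

% Compare the spread of intensities before and after equalization.
rangeBefore = double(max(I(:))) - double(min(I(:)));
rangeAfter = double(max(J(:))) - double(min(J(:)));
fprintf('Intensity range before: %d, after: %d\n', rangeBefore, rangeAfter);

% Show both histograms side by side.
figure;
subplot(1, 2, 1); imhist(I); title('Original');
subplot(1, 2, 2); imhist(J); title('Equalized');
```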
Some histogram experiments Read Bill Clinton’s image in your folder into MATLAB.
Simple character recognition code Detect only particular characters and numbers in an image. Characters are in white and of a fixed size. Background is black in color. The image is in binary format. We will explore various DIP concepts while we do this…. Let’s start
Let’s have some fun… I know my sense of humor sucks! Now read the image ‘same color.jpg’ and display it in a window. Once the image is displayed, select Tools – Data Cursor or use the shortcut on the toolbar. Click on point A on the image, as shown. It displays three values (RGB) since it is a color image. You can try reading pixel values for the previous image; they will be either 0 or 1 since it is a binary image. Hold Alt and click on point B. This creates a new datatip. Now for some fun: what are the RGB values at the two points?
Morphological Operations These are image processing operations done on binary images based on certain morphologies or shapes. The value of each pixel in the output is based on the corresponding input pixel and its neighbors. By choosing an appropriately shaped neighborhood one can construct an operation that is sensitive to a certain shape in the input image.
Skeletonize It creates the skeleton of an object by removing pixels on the boundaries without allowing objects to break apart. It is an extremely important operation in image processing as it removes complexity from an image without losing the essential shape.
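One way to try this is bwmorph with the 'skel' operation, sketched here on an invented thick bar. This is an illustration under that assumption, not necessarily the call used in the workshop code.

```matlab
% A thick horizontal bar as the test object.
BW = false(11, 21);
BW(4:8, 3:19) = true;

% Repeat the thinning until nothing changes (Inf iterations).
skel = bwmorph(BW, 'skel', Inf);

% The skeleton keeps the object in one piece but uses far fewer pixels.
cc = bwconncomp(skel);
fprintf('Object pixels: %d, skeleton pixels: %d, pieces: %d\n', ...
    nnz(BW), nnz(skel), cc.NumObjects);
```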
Erosion and Dilation These are the most fundamental binary morphological operations. In dilation, if any pixel in the input pixel’s neighborhood is on, the output pixel is on; otherwise it is off. In effect, dilation grows the area of the object, and small holes in the object are filled. In erosion, if every pixel in the input pixel’s neighborhood is on, the output pixel is on; otherwise it is off. In effect, erosion shrinks the object’s area, so small isolated regions disappear.
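Both rules can be checked on a small invented image: a 7 x 7 square with a one-pixel hole, plus one isolated pixel. The 3 x 3 neighborhood is an arbitrary choice for this sketch.

```matlab
% Test image: a 7x7 square with a one-pixel hole, plus one isolated pixel.
BW = false(15);
BW(6:12, 6:12) = true;
BW(9, 9) = false;                    % small hole inside the object
BW(2, 2) = true;                     % small isolated region

nhood = ones(3);                     % 3x3 neighborhood

D = imdilate(BW, nhood);             % on if ANY neighbor is on
E = imerode(BW, nhood);              % on only if EVERY neighbor is on

fprintf('Hole filled by dilation: %d\n', D(9, 9));
fprintf('Isolated pixel removed by erosion: %d\n', ~E(2, 2));
fprintf('Area: original %d, dilated %d, eroded %d\n', nnz(BW), nnz(D), nnz(E));
```

Note that erosion also enlarges the hole, which is why the eroded area is much smaller than a plain one-pixel shrink of the square would suggest.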
Dilation… Dilation does not necessarily mean dilation of the holes also. The holes get contracted as shown above. Also try image erosion. Use MATLAB’s help.
Dilation… adds pixels to the boundaries of objects in an image. The number of pixels added to the objects depends on the size and shape of the structuring element. The function strel(…) can be used to generate the SEs.
Structuring Elements In mathematical morphology, a structuring element is a shape used to probe or interact with a given image, with the purpose of drawing conclusions on how this shape fits or misses the shapes in the image. Check out the help on strel for the various shapes and parameters.
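A quick way to see what a structuring element looks like is to dilate a single pixel with it, since the result is a copy of the shape. The three shapes and sizes below are arbitrary picks from the strel help, chosen only for illustration.

```matlab
% A single "on" pixel in the middle of a blank image.
P = false(7);
P(4, 4) = true;

% Three structuring elements built with strel.
seSquare = strel('square', 3);       % 3x3 block
seLine = strel('line', 5, 0);        % horizontal line, 5 pixels long
seDiamond = strel('diamond', 1);     % plus-shaped, radius 1

% Dilating a single pixel stamps the structuring element onto the image.
dSquare = imdilate(P, seSquare);
dLine = imdilate(P, seLine);
dDiamond = imdilate(P, seDiamond);

disp(dSquare);
disp(dLine);
disp(dDiamond);
```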
Continuing with the algo… When the original image of the character is subtracted from its dilated version, we get an outline of the character, something like… Next we create such images for all the characters that we want to recognize. (For all those individual character images in the folder)
Hit or miss? The function bwhitmiss is employed to check if a particular character is present in the given image. bwhitmiss(BW1,SE1,SE2) performs the hit-miss operation defined by the structuring elements SE1 and SE2. The hit-miss operation preserves pixels whose neighborhoods match the shape of SE1 and don’t match the shape of SE2. If the matrix returned by bwhitmiss contains nonzero elements, then the character is found in the image. Also note the use of the functions isempty and nonzeros. You can now use charrec.m to recognize a few characters in a crude way.
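The idea can be sketched on a toy image where the "character" is just a 3 x 3 block. This is not the contents of charrec.m, only an illustration of the same steps: SE1 is the character shape, and SE2 is the outline obtained by subtracting the shape from its dilated version.

```matlab
% Test image: one 3x3 "character" and one larger 6x6 blob that should not match.
BW = false(12, 20);
BW(3:5, 3:5) = true;
BW(6:11, 10:15) = true;

% SE1: the character shape, padded with a one-pixel margin.
SE1 = zeros(5);
SE1(2:4, 2:4) = 1;

% SE2: the outline around the character (dilated shape minus the shape).
SE2 = imdilate(SE1, ones(3)) - SE1;

% Keep pixels where SE1 fits the foreground and SE2 fits the background.
hits = bwhitmiss(BW, SE1, SE2);

% The character is present if the result has any nonzero element.
found = ~isempty(nonzeros(hits));
[r, c] = find(hits);
fprintf('Character found: %d at row %d, column %d\n', found, r, c);
```

Because SE2 demands background all around the shape, the larger blob does not produce a hit even though a 3 x 3 block fits inside it. This is also why the method only works for characters of a fixed size.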
Image Segmentation The goal of image segmentation is to cluster pixels into salient image regions, i.e., regions corresponding to individual surfaces, objects, or natural parts of objects. A segmentation could be used for object recognition, image compression, image editing, or image database look-up.
Image Segmentation - Global Thresholding The disadvantage shows up when there are multiple colors for objects and backgrounds.
Otsu’s Method Based on a very simple idea: Find the threshold that minimizes the weighted within-class variance. This turns out to be the same as maximizing the between-class variance. Operates directly on the gray level histogram [e.g. 256 numbers, P(i)], so it’s fast (once the histogram is computed).
The weighted within-class variance is:
$$\sigma_w^2(t) = q_1(t)\,\sigma_1^2(t) + q_2(t)\,\sigma_2^2(t)$$
Where the class probabilities are estimated as:
$$q_1(t) = \sum_{i=1}^{t} P(i), \qquad q_2(t) = \sum_{i=t+1}^{I} P(i)$$
And the class means are given by:
$$\mu_1(t) = \sum_{i=1}^{t} \frac{i\,P(i)}{q_1(t)}, \qquad \mu_2(t) = \sum_{i=t+1}^{I} \frac{i\,P(i)}{q_2(t)}$$
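A hand-rolled sketch of Otsu's method on an invented two-level image, followed by the toolbox function graythresh for comparison. It uses the class probabilities and class means defined above and maximizes the between-class variance, which the slides state is equivalent to minimizing the within-class variance. Gray levels are numbered 0 to 255 here rather than 1 to I.

```matlab
% Synthetic bimodal image: dark half (level 50) and bright half (level 200).
I = uint8([50 * ones(8, 8), 200 * ones(8, 8)]);

counts = imhist(I);                  % 256-bin gray-level histogram
P = counts / sum(counts);            % P(i): probability of each gray level
levels = (0:255)';                   % gray levels, 0-based in this sketch

q1 = cumsum(P);                      % class probability of levels <= t
q2 = 1 - q1;                         % class probability of levels > t
cumMean = cumsum(levels .* P);
mu1 = cumMean ./ q1;                 % class means (NaN where a class is empty)
mu2 = (cumMean(end) - cumMean) ./ q2;

% Between-class variance for every candidate threshold t.
betweenVar = q1 .* q2 .* (mu1 - mu2) .^ 2;
betweenVar(isnan(betweenVar)) = 0;   % an empty class cannot be the optimum

[~, idx] = max(betweenVar);
t = levels(idx);                     % chosen threshold as a gray level
BW = I > t;                          % segment: bright pixels become 1

level = graythresh(I);               % toolbox version, normalized to [0, 1]
fprintf('Manual threshold: %d, graythresh: %.3f\n', t, level);
```

On this image every threshold between the two levels separates the classes equally well, so the manual version and graythresh may report different numbers while producing the same segmentation.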
This was again a very crude way, since we are depending only on the value of the area, which might not remain constant if the camera changes position. Most of the time the standard features available with regionprops() are not sufficient, and we will have to write our own code to extract features. Also, we used hard thresholds on the areas to classify the CCs (connected components). Again, most of the time this is not what is done in practice; classifiers using Pattern Recognition techniques are employed.
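The crude approach being criticized can be sketched as follows: measure the area of each connected component with regionprops and classify it with a hard threshold. The test image and the threshold of 20 pixels are invented for illustration.

```matlab
% Test image with two connected components of different size.
BW = false(12, 20);
BW(2:4, 2:4) = true;                 % small component, area 9
BW(6:11, 10:15) = true;              % large component, area 36

% Measure the area of every connected component.
stats = regionprops(BW, 'Area');
areas = [stats.Area];

% Crude classification with a hard threshold (hypothetical value).
areaThreshold = 20;
isLarge = areas > areaThreshold;

fprintf('Component areas: %s\n', mat2str(areas));
fprintf('Components classified as large: %d\n', nnz(isLarge));
```

If the camera moves closer, both areas grow and the fixed threshold misclassifies the small component, which is the weakness the slide points out.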
Why edges? Reduce dimensionality of data Preserve content information Useful in applications such as: ◦ object detection ◦ structure from motion ◦ tracking
Why not edges? But, sometimes not that useful, why? Difficulties: 1. Modeling assumptions 2. Parameters 3. Multiple sources of information (brightness, color, texture, …) 4. Real world conditions Is edge detection even well defined?
Canny difficulties
1. Modeling assumptions: step edges, junctions, etc.
2. Parameters: scales, threshold, etc.
3. Multiple sources of information: only handles brightness
4. Real world conditions: Gaussian iid noise? Texture…
Edge Detection Edge detection extracts the edges of objects from an image. There are a number of algorithms for this, but they may be classified as derivative based or gradient based. In derivative based edge detection the algorithm takes the first or second derivative at each pixel in the image. At an edge the intensity changes rapidly, so the first derivative has a large magnitude there, while the second derivative passes through zero, termed a zero crossing. In gradient based edge detection a gradient of consecutive pixels is taken in the x and y directions.
Taking the derivative explicitly at each and every pixel of the image consumes a lot of computer resources and hence is not practical. So usually an operation called a kernel operation is carried out. A kernel is a small matrix of coefficients that slides over the image matrix; at each position the coefficients are multiplied with the corresponding image elements, and the sum of the products is written to the target pixel.
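The kernel operation can be sketched with a horizontal Sobel kernel on a tiny invented step-edge image. The Sobel kernel is one common choice of coefficients, assumed here for illustration; filter2 does the slide, multiply and sum steps.

```matlab
% Tiny image with a vertical step edge: dark on the left, bright on the right.
I = zeros(5, 6);
I(:, 4:6) = 1;

% Horizontal Sobel kernel: approximates the derivative in the x direction.
sobelX = [-1 0 1; -2 0 2; -1 0 1];

% Slide the kernel over the image; 'valid' keeps only positions where the
% kernel fits entirely inside the image.
G = filter2(sobelX, I, 'valid');

% The response is nonzero only where the kernel straddles the step.
disp(G);
```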
Best one is – Canny’s Method Uses both the derivative and the gradient to perform edge detection. The maths is a bit complex; you can go through the PDF in your folder later. Pass ‘canny’ as a parameter to the edge function to perform Canny edge detection.
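A minimal sketch of the call on an invented test image (a bright square on a dark background), leaving the thresholds and smoothing at the defaults chosen by the edge function.

```matlab
% Synthetic test image: a bright square on a dark background.
I = zeros(32);
I(9:24, 9:24) = 1;

% Canny edge detection with default thresholds and smoothing.
E = edge(I, 'canny');

% E is a binary image of the same size, with 1 marking edge pixels.
fprintf('Edge pixels found: %d\n', nnz(E));
imshow(E);
```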