Genetic Programming based Image Segmentation


Genetic Programming based Image Segmentation with Applications to Biomedical Object Detection. Published paper of our research work. Published at Genetic and Evolutionary Computation Conference (GECCO) 2009.


Genetic Programming based Image Segmentation with Applications to Biomedical Object Detection

Tarundeep Singh Dhot, Nawwaf Kharma
Department of Electrical and Computer Engineering, Concordia University, Montreal, QC H3G 1M8

Mohammad Daoud
Department of Electrical and Computer Engineering, University of Western Ontario, London, ON, N6A 3K7

Rabab Ward
Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, V6T 1Z4

ABSTRACT
Image segmentation is an essential process in many image analysis applications and is mainly used for automatic object recognition purposes. In this paper, we define a new genetic programming based image segmentation algorithm (GPIS). It uses a primitive image-operator based approach to produce linear sequences of MATLAB® code for image segmentation. We describe the evolutionary architecture of the approach and present results obtained after testing the algorithm on a biomedical image database for cell segmentation. We also compare our results with another EC-based image segmentation tool called GENIE Pro. We found the results obtained using GPIS were more accurate as compared to GENIE Pro. In addition, our approach is simpler to apply and evolved programs are available to anyone with access to MATLAB®.

Categories and Subject Descriptors
I.4.6 [Image Processing and Computer Vision]: Segmentation – pixel classification.

General Terms
Algorithms, Experimentation.

Keywords
Image Segmentation, Genetic Programming.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
GECCO’09, July 8–12, 2009, Montréal Québec, Canada. Copyright 2009 ACM 978-1-60558-325-9/09/07...$5.00.

1. INTRODUCTION
Image segmentation is the process of extraction of objects of interest from a given image. It allows certain regions in the image to be identified as an object based on some distinguishing criteria, for example, pixel intensity or texture. It is an important part of many image analysis techniques as it is a crucial first step of the imaging process and greatly impacts any subsequent feature extraction or classification. It plays a critical role in automatic object recognition systems for a wide variety of applications like medical image analysis [8, 9, 14, 15], geosciences and remote sensing [2, 3, 4, 5, 10, 11], and target detection [10, 11, 16].

However, image segmentation is an ill-defined problem. Even though numerous approaches have been proposed in the past [7, 12, 13], there is still no general segmentation framework that can perform adequately across a diverse set of images [1]. In addition, most image segmentation techniques exhibit a strong domain or application-type dependency [7, 12, 17]. Automated segmentation algorithms often include a priori information of its subjects [8], making use of well-designed segmentation techniques restricted to a small set of imagery.

In this paper, we propose a new, simple image segmentation algorithm called Genetic Programming based Image Segmentation (GPIS) that uses a primitive image-operator based approach for segmentation and present results. The algorithm does not require any a priori information about objects to be segmented other than a set of training images. In addition, the algorithm is implemented on MATLAB® and uses its standard image-function library. This allows easy access to anyone with MATLAB®.

In the following sections, we provide a brief introduction to relevant work in GP based image segmentation and image analysis, followed by an overview of our approach in Section 1.3. Section 2 describes the methodology of our algorithm and the
experimental setup for compiling results. Finally, Section 3 presents the results of the experiments conducted on a biomedical image database for cell segmentation purposes. We also compare our results with another EC-based image segmentation algorithm called GENIE Pro.

1.1 Related Work
One of the initial works in this field was published by Tackett [16] in 1993. He applied GP to develop a processing tree capable of classifying features extracted from IR images. These evolved features were later used to construct a classifier for target detection. On the same lines, in 1995, Daida et al. [5, 6] used GP to derive spatial classifiers for remote sensing purposes. This was the first time GP was used for image processing applications in geosciences and remote sensing.

In 1996, Poli [14] proposed an interesting approach to image analysis based on evolving optimal filters. The approach viewed image segmentation, image enhancement and feature detection purely as a filtering problem. In addition, he outlined key criteria for building terminal sets, function sets and fitness functions for an image analysis application.

In 1999, Howard et al. [10, 11] presented a series of works using GP for automatic object detection in real world and military image analysis applications. They proposed a staged evolutionary approach for evolution of target detectors or discriminators. This resulted in practical evolution times.

In 1999, another interesting approach was proposed by Brumby et al. [4]. They used a hybrid evolutionary approach to evolve image extraction algorithms for remote sensing applications. These algorithms were evolved using a pool of low level image processing operators. On the same lines, Bhanu et al. [2, 3] used GP to evolve composite operators for object detection. These operators were synthesized from combinations of primitive image processing operations used in object detection. In order to control the code-bloat problem, they also proposed size limits for the composite operators.

In 2003, Roberts and Claridge [15] proposed a GP based image segmentation technique for segmenting skin lesion images. A key feature of their work was the ability of the GP to generalize based on a small set of training images.

Our approach is motivated by the works of Tackett [16], Brumby et al. [4] and Bhanu et al. [2, 3]. They all effectively implemented a primitive image operator based approach for image analysis. This is similar to our approach. In addition, we have used the key criteria outlined by Poli [14] as references while building our algorithm.

1.2 GENIE Pro
GENIE Pro [4, 9] is a general purpose, interactive and adaptive GA-based image segmentation and classification tool. GENIE Pro uses a hybrid GA to assemble image-processing algorithms or pipelines from a collection of low-level image processing operators (for example edge detectors, texture measures, spectral orientations and morphological filters). The role of each evolved pipeline is to classify each pixel as feature or non-feature.

The GA begins with a population of random pipelines, performs fitness evaluation for each pipeline in the population and selects the fitter pipelines to produce offspring pipelines using crossover and mutation. In order to compute the fitness of a pipeline, the segmentation it produces is compared to a set of training images. These training images are produced by the user, who manually labels pixels as True (feature) or False (non-feature) using an in-built mark-up tool called ALLADIN. Finally, when a run of GENIE Pro is concluded, the fittest pipeline in the population is selected and combined using a linear classifier (Fisher Discriminant) to form an evolved solution that can be used to segment new images.

GENIE Pro was developed for analyzing multispectral satellite data. It has also been applied to biomedical feature-extraction problems [9]. We have used it for comparison purposes.

1.3 Overview of Our Work
In this paper, we describe a new genetic programming based image segmentation algorithm, GPIS, that uses a primitive image-operator based approach for segmentation. Each segmentation algorithm can be viewed as a unique combination of image analysis operators that are successfully able to extract desired regions from an image. If we are able to describe a sufficient set of these image analysis operators, it is possible to build multiple segmentation algorithms that segment a wide variety of images. In GPIS, we define a pool of low level image analysis operators. The GP searches the solution space for the combination of these operators that performs the most accurate segmentation. From now on, we refer to these image analysis operators as primitives. Each individual in a population is a combination of these primitives and represents an image segmentation program. Therefore, GPIS typically breeds a population of segmentation programs in order to evolve one accurate image segmentation program.

2. METHODOLOGY
The proposed algorithm GPIS is designed as a general tool for learning based segmentation of images. In this paper, particular attention is given to testing it on biomedical images. Our approach does not require a particular image format or size and works equally well on both color and grayscale images in any MATLAB® compatible format.

For the purpose of learning, a directory with both input images and matching ground truths (GTs) must be provided. From this point onwards, we call this a training set. Every input image must have a corresponding GT of the same size and format. The GT image is a binary image showing the best assessment of the boundaries of the objects of interest; all pixels inside those boundaries are by definition object pixels and all pixels outside the boundaries are by definition non-object pixels. Pixels on the boundary itself are by definition also object pixels.

GPIS has two stages of operation. Stage 1 is a learning phase in which GPIS uses the training set to evolve a MATLAB® program which meets a user-defined threshold of segmentation accuracy relative to the input images of the training set.

In the second stage, this evolved individual is evaluated for its ability to segment unseen images of the same type as the training images. The accuracy achieved here is from here on called validation accuracy.

In a real world situation, due to lack of GTs for unseen images, validation accuracy will take the form of the subjective assessment
of a human user. However, for this paper, the authors evaluate the quality, i.e. the validation accuracy, of the individual evolved by GPIS by comparing its segmentation results to the matching GT images. We report the results of our evaluation in the Results section (Section 3) of this paper.

2.1 Stage 1: Learning phase of GPIS
GPIS operates in a typical evolutionary cycle in which a population of potential program solutions (each meant to segment images) is subjected to repeated selection and diversification until at least one of the individuals meets the termination criteria. The flowchart of the learning stage is presented in Figure 1.

[Figure 1 boxes: START; Initialization; Fitness Evaluation; Termination Criteria met? (Yes: Output (Fittest individual), STOP; No: Elitism, Parent Selection, Genetic Diversification, Injection, Survivor Aggregation (Σ), next generation). Edge labels: parents (copy), elite, offspring.]
Figure 1. Flowchart of GPIS

2.1.1 Representation and Initialization
In our scheme, the genome of an individual encodes a MATLAB® program that processes an image. The input to the program is an image file and the output of executing the MATLAB® program is an image of the same size and format. This output image file is a segmented version of the input image.

The general layout of a gene is shown in Figure 2 (a). As seen in the figure, each gene specifies information about the primitive operator it encodes, the input images to the operator and parameter settings for the operator. This corresponds to a few lines (1-3) of the equivalent MATLAB® program. The gene consists of five parts. The first part contains the name of the primitive operator and the second and third parts contain the possible input images to the operator. Based on the nature of the primitive operator, a gene may have one or two input images. The fourth part contains weights or parameter values for the primitive operator and the fifth part encodes the nature of the Structuring Element or SE (only in case of morphological operations) or a secondary Filter Parameter or FP (only in case of filter operators).

The phenomic representation (chromosome) is a linear combination of the genes, as shown in Figure 2 (b). The chromosome represents a complete MATLAB® segmentation program. There is a one-to-one mapping between the genome and the phenome as shown in Figure 2 (c). It also shows the representation of the knowledge structure used by the genetic learning system.

(a) [Operator Name, Input Plane 1, Input Plane 2, Weights, SE/FP]

(b) [G1] [G2] [G3] [G4] [G5] ......... [Gn]

(c) GENOME (the gene sequence) mapped to PHENOME (the MATLAB® program):

    d1 = input;
    h1 = fspecial('disk',[6 6]);
    io1 = imfilter(d1, h1);
    SE1 = strel('square', 2);
    io2 = imerode(io1, SE1);
    io3 = imclose(io2, SE1);
    io4 = imadd(io2,io3);
    out = im2bw(io4, 0.55);

Figure 2. (a) Typical layout of a gene (b) Typical layout of a chromosome comprising n genes (c) One-to-one mapping of the genome and phenome

We use a pool of 20 primitive operators. Table 1 provides the complete list of all primitive image analysis operators in the gene pool along with the typical number of inputs required for each operator.

Initialization creates a starting population for the GP. The initial population is randomly generated, i.e. chromosomes are formed by a randomly assigned sequence of operators. The genomic initialization is also random, i.e. parameter values of operators are also assigned randomly, based on the operator type. For practical reasons, the size of each chromosome is limited to a maximum length of 15. In addition, at the time of initialization, the size of the population and the values of the crossover and mutation rates are assigned by the user.

2.1.2 Fitness Evaluation
A segmented image consists of positive (object) and negative (non-object) pixels. Ideally the segmentation of an image would result in an output image where positive pixels cover object pixels perfectly and the negative pixels cover non-object pixels perfectly. Based on this idea, we can view segmentation as a pixel-classification problem. The task of the segmentation program now becomes assignment of the right class to every pixel in the image. As such, we can apply a measure of classification accuracy to the problem of image segmentation. Every segmentation program can be expected not only to identify pixels belonging to the objects of interest (True Positives, TPs), but also to miss some object pixels by labeling them as non-object (False Negatives, FNs). Further, in addition to correctly identifying non-object pixels (True Negatives, TNs), a program can label some non-object pixels as object pixels (False Positives, FPs).
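As a minimal sketch of this pixel-classification view, the following Python snippet counts TPs, FNs, TNs and FPs by comparing a segmented binary image with its ground truth, and derives the false positive and false negative rates (FPR, FNR) in the usual way. Python is used purely for illustration, since GPIS itself is implemented in MATLAB®. The function names and the nested-list image representation are assumptions. The `accuracy` function assumes the product form (1 - FPR) × (1 - FNR), mirroring the cell count rate of formula (3).

```python
def confusion_counts(segmented, ground_truth):
    """Count (TP, FN, TN, FP) for two equally sized binary images (rows of 0/1)."""
    tp = fn = tn = fp = 0
    for seg_row, gt_row in zip(segmented, ground_truth):
        for seg, gt in zip(seg_row, gt_row):
            if gt and seg:
                tp += 1      # object pixel labeled object
            elif gt and not seg:
                fn += 1      # object pixel labeled non-object
            elif not gt and not seg:
                tn += 1      # non-object pixel labeled non-object
            else:
                fp += 1      # non-object pixel labeled object
    return tp, fn, tn, fp


def rates(tp, fn, tn, fp):
    """Return (FPR, FNR); an empty class contributes a rate of 0 (an assumption)."""
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


def accuracy(segmented, ground_truth):
    """Assumed accuracy form, mirroring formula (3): (1 - FPR) * (1 - FNR)."""
    fpr, fnr = rates(*confusion_counts(segmented, ground_truth))
    return (1.0 - fpr) * (1.0 - fnr)


# Toy 2 x 4 example: the left half of the ground truth is the object.
gt = [[1, 1, 0, 0],
      [1, 1, 0, 0]]
seg = [[1, 0, 0, 0],
       [1, 1, 1, 0]]
print(confusion_counts(seg, gt))  # (TP, FN, TN, FP)
print(accuracy(seg, gt))
```

In the toy example one object pixel is missed and one background pixel is wrongly marked, so both rates are 0.25. A perfect segmentation gives both rates as 0 and an accuracy of 1.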
Table 1. Primitive image analysis operators in the gene pool

Operator Name   Description              Inputs   Operator Type
ADDP            Add Planes               2        Arithmetic
SUBP            Subtract Planes          2        Arithmetic
MULTP           Multiply Planes          2        Arithmetic
DIFF            Absolute Difference      2        Arithmetic
AVER            Averaging Filter         1        Filter
DISK            Disk Filter              1        Filter
GAUS            Gaussian Filter          1        Filter
LAPL            Laplacian Filter         1        Filter
UNSHARP         Unsharp Filter           1        Filter
LP              Lowpass Filter           1        Filter
HP              Highpass Filter          1        Filter
DIL             Image Dilate             1        Morphological
ERODE           Image Erode              1        Morphological
OPEN            Image Open               1        Morphological
CLOSE           Image Close              1        Morphological
OPCL            Image Open-Close         1        Morphological
CLOP            Image Close-Open         1        Morphological
HISTEQ          Histogram Equalization   1        Enhancement
ADJUST          Image Adjust             1        Enhancement
THRES           Thresholding             1        Post-processing

For an ideal segmentation, the number of FPs and FNs should be zero while the number of TPs and TNs should be exactly equal to the number of object and non-object pixels. If we normalize the value of TPs and TNs by the total number of object and non-object pixels respectively, their individual values would be 1 in the best case scenario and 0 in the worst case scenario. However, for the segmentation problem, achieving this is a challenging task, thus we define two more measures based on TPs, TNs, FPs and FNs called the False Positive Rate (FPR) and False Negative Rate (FNR). FPR is the proportion of non-object pixels that were erroneously reported as being object pixels. FNR is the proportion of object pixels that were erroneously reported as non-object pixels. Therefore, for an ideal segmentation, the values of FPR and FNR should be zero. For finding the accuracy of a segmentation program, we use a pixel-based accuracy formula based on FPR and FNR. This formula reflects the training and validation accuracy for GPIS. It is as follows:

Accuracy = (1 - FPR) × (1 - FNR)    (1)

where FPR represents False Positive Rate and FNR represents False Negative Rate. The above formula for accuracy treats the image segmentation problem as a pixel-classification problem. Therefore, ideally the value of accuracy should be 1 (or 100%) for a perfectly segmented image. We also see that the formula is mono-modal, i.e. if image A is better segmented than image B ⇒ Accuracy (A) > Accuracy (B).

However, we further extend this formula by introducing a term that penalizes longer programs. The fitness function for GPIS is as follows:

[Equation (2), the length-penalized fitness function, is not legible in this copy.]    (2)

where FPR represents False Positive Rate, FNR represents False Negative Rate, len represents the length of the program, and β is a scaling factor for the length of a program, such that β ϵ [0.004, 0.008]. We found this range sufficient for our purpose.

2.1.3 Termination Criteria
Termination of the GP is purely fitness based: the evolutionary cycle continues until there is no major change in fitness over 10 generations. To implement this, we first determine a minimum acceptable fitness value from our trial runs. This value was found to be 95% for the database in use. If the GP is used on any other database, a default value of 90% is set. Until this fitness is reached, the GP keeps running. Once it is reached, a mechanism that calculates cumulative means of the fitness of successive generations is applied. If the absolute difference between the current fitness and the mean fitness over 10 successive generations is less than 5% of the highest fitness achieved, the GP stops. The termination criterion can be defined as follows:

|current fitness - mean fitness(10 gen)| < 0.05 × highest fitness

2.1.4 Parent Selection
Parent selection is done to select chromosomes that undergo diversification operations. In order to do this, we use a tournament selection scheme. It is chosen instead of rank selection as it is computationally more efficient. The size of the tournament window λ is kept at 10% of the size of the population. The number of parents selected is 50% of the size of the population.

2.1.5 Elitism
We use elitism as a means of saving the top 1% of the chromosomes of a population. The best 1% of the chromosomes in the population are copied without change to the next generation.

2.1.6 Diversification
We employ five genetic operators in total: one crossover and four mutation operators. These are selected probabilistically based on their respective rates of crossover and mutation.

Crossover: We use a 1-point crossover for our GP. Two parents are chosen randomly from the parent pool. A random location is chosen in each of the parent chromosomes. The subsequences before and after this location in the parents are exchanged, creating two offspring chromosomes.

Mutation: We use four mutation operators for our GP: three inter-genomic operators (swap, insert and delete) and one intra-genomic operator (alter).
Among the mutation operators (swap, insert, delete and alter), alter typically changes the weight element of the selected gene. The gene to be mutated is randomly chosen from the selected parent chromosome.

2.1.7 Injection
In order to overcome loss of diversity in a population, we use an injection mechanism. We inject a fixed percentage of new randomly initialized programs into the population after every n generations. In the current configuration, we inject 20% new programs every 5 generations.

2.1.8 Survivor Aggregation
The aim of this phase is to collect chromosomes that have qualified to be part of the next generation (parent, offspring, elite, injected) in order to build the population for the next generation.

This phase works in two modes: non-injection and injection mode. In the non-injection mode, copies of all parent chromosomes (50%), offspring chromosomes (49%) and elite chromosomes (1%) form the population of the next generation. In the injection mode, since a fixed-size set of new chromosomes (20%) is inserted into the population, the top 79% of the parent-offspring population is selected along with the elite set (1%) to form the population of the next generation.

2.1.9 Output (Fittest Individual)
Once the termination criterion has been satisfied, the output of the GP is typically the "fittest" chromosome present in the final population. This chromosome is then tested on a set of unseen test images, as explained in Section 2.2. Our aim is to create a pool of such outputs (segmentation programs), which allows us to have multiple segmentation algorithms for the same database. This pool is created by subsequent runs of the GP.

Note: When we apply percentages, the results are rounded to the closest integers. In case of elitism, if 1% < 1, 1 individual is copied.

2.2 Stage 2: Evaluation Methodology
As mentioned in the previous section, the output of Stage 1 gives us one chromosome, which was the fittest chromosome amongst the population of the final generation. The accuracy of the segmentations produced by this chromosome on the training images is known as the training accuracy of the run. The actual challenge for this individual is to produce similar segmentation accuracies on an unseen set of images known as the validation images.

In order to do this, we randomly select a fixed number of new images from outside the training set, along with their corresponding GTs, from the image database. From this point onwards, we call this the validation set. Once the validation set is chosen, the "fittest chromosome" is applied to the entire set of images, one by one, and the segmentation accuracy for each image is calculated based on the accuracy formula (1) given in Section 2.1.2. Once this process ends, the average segmentation accuracy of the set, or validation accuracy of the run, is calculated.

We repeat the above process for various runs and calculate the overall training accuracy (average of the training accuracies of the runs) and validation accuracy (average of the validation accuracies of the runs) for the algorithm. From here on, we refer to the above as training accuracy and validation accuracy respectively.

The output of Stage 2 is a chromosome that performs equally well on both training and validation sets and produces high overall validation accuracy.

2.3 Experimental Setup
In order to test the effectiveness and efficiency of our algorithm, we tested it on a biomedical image database that consisted of HeLa cell images (in culture) of size 512 pixels × 384 pixels. The task of the algorithm was to segment the cells present in the images. The procedure for obtaining results using our algorithm is given below under "Procedure for Obtaining Results using GPIS". We also compare the results of our algorithm with those produced by GENIE Pro. The procedure used for obtaining results using GENIE Pro is given under "Procedure for Obtaining Results using GENIE Pro". The final parameter values used for GPIS are given in Table 2.

Table 2. Parameter settings for GPIS

Population size: µ                  200
Crossover Rate: Pc                  0.45
Swap Mutation Rate: Pms             0.25
Insert Mutation Rate: Pmi           0.25
Delete Mutation Rate: Pmd           0.2
Alter Mutation Rate: Pma            0.7
Scalability factor for length: β    0.005

2.3.1 Procedure for Training and Validation
In order to plan a run of the algorithm, we first decide the size of the training and validation sets. To do so, we define G as the total number of images in a database, T as the number of images in the training set, V as the number of images in the validation set, and R as the number of times optimal individuals are evolved for the same database. The final values used in the present configuration are: G = 1026, T = 30, V = 100 and R = 28.

Procedure for Obtaining Results using GPIS
Step 1. Randomly select T images and other V images from the G images in the database.
Step 2. Perform training on T images to choose the fittest individual for validation.
Step 3. Validate this individual on V images to check the applicability of this individual on unseen images. If the individual produces high validation accuracy, save it in the result set, else discard it.
Step 4. Repeat Steps 1 to 3, R times, producing a set of optimal individuals (result set).
Step 5. Calculate values of average training and validation accuracy of the result set.

Procedure for Obtaining Results using GENIE Pro
Step 1. Select the same T and V images from the G images in the database, used for the corresponding GPIS run.
Step 2. Load each of the T images as a base image and create a training overlay for each image by marking Foreground (object) and Background (non-object) pixels manually.
Step 3. Train on these manually marked training overlays using the in-built Ifrit Pixel Classifier.
Step 4. Apply the learned solution on V images to produce corresponding segmented images.
Step 5. Calculate validation accuracy for these V images using formula (1).
Step 6. Repeat Steps 1 to 5, R times, as for GPIS.
Step 7. Calculate values of average training and validation accuracy of the result set.

3. RESULTS
We have based our results on two criteria: effectiveness of the algorithm in accurately segmenting the given images, and efficiency of the algorithm in doing so.

Effectiveness is based on two measures, pixel accuracy of the evolved solution and the cell count rate (percentage of cell structures correctly identified). In order to calculate the cell count rate, we have categorized cells into two types: Type 1 and Type 2. Type 1 cells are those which can be identified by eye with relative ease. Type 2 cells are those which are relatively difficult to identify by eye. We also provide comparative results for effectiveness for GENIE Pro. This is presented in Section 3.1.

Efficiency reflects the time the algorithm takes to produce one individual of acceptable fitness. This is measured in terms of number of generations. These results are presented in Section 3.2. We also briefly discuss one evolved program and provide the segmented images it produced. This is presented in Section 3.3 and Figures 5 and 6.

3.1 Effectiveness
Table 3 presents results obtained for training and validation accuracies of segmentation achieved for GPIS and GENIE Pro. These values represent each algorithm's ability to correctly classify each pixel in an image as an object or non-object pixel. We found that our algorithm performed better in segmenting the cells in the images as compared to GENIE Pro.

Table 3. Segmentation accuracy: GPIS Vs GENIE Pro

Algorithm    Training Data   Validation Data
GPIS         98.76%          97.01%
GENIE Pro    94.12%          93.12%

The second measure for effectiveness that we used was cell count rate. We extend the concept of TPs, TNs, FPs and FNs to object detection, where a TP denotes an object that is correctly identified by the algorithm as a cell, an FN denotes a cell incorrectly identified as background, an FP denotes a non-object incorrectly identified as a cell, and a TN denotes a non-object correctly identified as background. In order to consider an object as belonging to any of the above four options, a minimum of 70% of object pixels must correspond to that option. Cells identified were manually counted.

Similar to the accuracy formula, based on TPs, TNs, FPs and FNs, we can define the FPR and FNR for cell count. FPR is the proportion of non-cell structures that were erroneously reported as being cell structures. FNR is the proportion of cell structures that were erroneously reported as non-cell structures. The cell count rate formula used is as follows:

Cell Count Rate = (1 - FPR) × (1 - FNR)    (3)

Table 4. Cell count rate: GPIS Vs GENIE Pro

Cell Count Measure   GPIS Training   GPIS Validation   GENIE Pro Training   GENIE Pro Validation
Detected Cells       98.24%          97.98%            97.02%               96.56%
Type 1 Cells         100%            100%              100%                 100%
Type 2 Cells         98.78%          98.22%            97.49%               96.89%
Undetected Cells     1.32%           1.55%             2.12%                2.25%

3.2 Efficiency
Table 5 reflects the efficiency of the process in producing the required results. We measure efficiency based on the number of generations taken by GPIS to produce one individual of minimum acceptable fitness. This acceptable fitness is 95% training accuracy. In our runs, we observed that GPIS never failed to produce an acceptable individual.

Table 5. Performance of GPIS based on number of generations

Statistical Measure   Number Of Generations
MEAN                  122.07
MEDIAN                122
STANDARD DEVIATION    6.85
UPPER BOUND           138
LOWER BOUND           112

The experiments were performed on an Intel Pentium (R) 4 CPU, 3.06 GHz, 2GB RAM computer. To execute 1 generation, GPIS took on average 4.21 minutes. The average time taken for a complete run was approximately 513 minutes. The maximum time taken for a complete run was 580 minutes.

Since GPIS is designed to run as an offline tool and the time it takes to execute an evolved program is between 1 and 3 seconds, the period of evolution of an optimal program is within reasonable real world constraints. Also, the standard deviation for the number of generations is low. This shows that GPIS runs consistently to produce an optimal program within a tight window.

3.3 Evolved Program
Figure 5 shows the chromosomal and genomic structure of an evolved program. The program evolved is a combination of filters
and morphological operators. The first gene is a 6 × 6 Gaussian low pass filter with a sigma value of 0.8435, followed by a 4 × 4 averaging filter. The output image from gene 2 is eroded with a flat, disk-shaped structuring element of radius 2. A 6 × 6 averaging filter is then applied to the eroded image. Its output image undergoes a composite morphological operation of closing and opening with the same structuring element as above. Finally, this image is converted to a binary output image using a threshold of 0.09022. The validation accuracy is calculated for this image.

Figure 6 shows the implementation of this evolved program on two validation images, along with the corresponding results from GENIE Pro.

4. CONCLUSIONS
In this paper, we propose a simple approach to the complex problem of image segmentation. The proposed algorithm, GPIS, uses genetic programming to evolve image segmentation programs from a pool of primitive image analysis operators. The evolved solutions are simple MATLAB® based image segmentation programs. They are easy to read and implement. In addition, the algorithm does not require any a priori information about the objects to be segmented from the images. We have tested our algorithm on a biomedical image database. We also compare the results to another GA-based image segmentation algorithm, GENIE Pro. We found that our algorithm consistently produced better results. Both the segmentation accuracy and the cell count rate were higher than those of GENIE Pro. It also produced an optimal solution within a reasonable time window. In addition, GPIS never failed to produce an optimal solution.

5. ACKNOWLEDGMENTS
We are grateful to Ms Aida Abu-Baker and Ms Janet Laganiere from CHUM Research Centre, Notre-Dame Hospital, Montreal, for providing us with the images for the cell database. We would also like to thank Dr James Lacefield from the University of Western Ontario, London, for his help on this project.

6. REFERENCES
[1] B. Bhanu, S. Lee, and S. Das, "Adaptive image segmentation using genetic and hybrid search methods", IEEE Transactions on Aerospace and Electronic Systems, Vol. 31, Issue 4, Oct 1995, pp. 1268-1291.
[2] B. Bhanu and Y. Lin, "Object Detection in Multi-modal Images using Genetic Programming", Applied Soft Computing, Vol. 4, Issue 2, 2004, pp. 175-201.
[3] B. Bhanu and Y. Lin, "Learning Composite Operators for Object Detection", Proceedings of the Conference on Genetic and Evolutionary Computation, July 2002, pp. 1003-1010.
[4] S. P. Brumby, J. P. Theiler, S. J. Perkins, N. R. Harvey, J. J. Szymanski, and J. J. Bloch, "Investigation of Image Feature Extraction by a Genetic Algorithm", Proceedings of SPIE, Vol. 3812, 1999, pp. 24-31.
[5] J. M. Daida, J. D. Hommes, T. F. Bersano-Begey, S. J. Ross, and J. F. Vesecky, "Algorithm Discovery using the Genetic Programming Paradigm: Extracting Low-contrast Curvilinear Features from SAR Images of Arctic Ice", Advances in Genetic Programming II, P. J. Angeline, K. E. Kinnear (Eds.), Chapter 21, The MIT Press, 1996, pp. 417-442.
[6] B. Bhanu and Y. Lin, "Learning Composite Operators for Object Detection", Proceedings of the Conference on Genetic and Evolutionary Computation, July 2002, pp. 1003-1010.
[7] S. P. Brumby, J. P. Theiler, S. J. Perkins, N. R. Harvey, J. J. Szymanski, and J. J. Bloch, "Investigation of Image Feature Extraction by a Genetic Algorithm", Proceedings of SPIE, Vol. 3812, 1999, pp. 24-31.
[8] B. Bhanu, S. Lee, and S. Das, "Adaptive image segmentation using genetic and hybrid search methods", IEEE Transactions on Aerospace and Electronic Systems, Vol. 31, Issue 4, Oct 1995, pp. 1268-1291.
[9] B. Bhanu and Y. Lin, "Object Detection in Multi-modal Images using Genetic Programming", Applied Soft Computing, Vol. 4, Issue 2, 2004, pp. 175-201.
[10] B. Bhanu, S. Lee, and S. Das, "Adaptive image segmentation using genetic and hybrid search methods", IEEE Transactions on Aerospace and Electronic Systems, Vol. 31, Issue 4, Oct 1995, pp. 1268-1291.
[11] B. Bhanu and Y. Lin, "Object Detection in Multi-modal Images using Genetic Programming", Applied Soft Computing, Vol. 4, Issue 2, 2004, pp. 175-201.
[12] B. Bhanu and Y. Lin, "Learning Composite Operators for Object Detection", Proceedings of the Conference on Genetic and Evolutionary Computation, July 2002, pp. 1003-1010.
[13] S. P. Brumby, J. P. Theiler, S. J. Perkins, N. R. Harvey, J. J. Szymanski, and J. J. Bloch, "Investigation of Image Feature Extraction by a Genetic Algorithm", Proceedings of SPIE, Vol. 3812, 1999, pp. 24-31.
[14] J. M. Daida, J. D. Hommes, T. F. Bersano-Begey, S. J. Ross, and J. F. Vesecky, "Algorithm Discovery using the Genetic Programming Paradigm: Extracting Low-contrast Curvilinear Features from SAR Images of Arctic Ice", Advances in Genetic Programming II, P. J. Angeline, K. E. Kinnear (Eds.), Chapter 21, The MIT Press, 1996, pp. 417-442.
[15] J. M. Daida, J. D. Hommes, S. J. Ross, A. D. Marshall, and J. F. Vesecky, "Extracting Curvilinear Features from SAR Images of Arctic Ice: Algorithm Discovery Using the Genetic Programming Paradigm", Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Italy, IEEE Press, 1995, pp. 673-75.
[16] K. S. Fu and J. K. Mui, "A Survey on Image Segmentation", Pattern Recognition, 13, 1981, pp. 3-16.
P. Ghosh and M. Mitchell, "Segmentation of Medical Images using a Genetic Algorithm", Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation, 2006, pp. 1171-1178.
[17] N. Harvey, R. M. Levenson, and D. L. Rimm, "Investigation of automated feature extraction techniques for applications in cancer detection from multi-spectral histopathology images", Proceedings of SPIE, Vol. 5032, 2003, pp. 557-556.
[18] D. Howard and S. C. Roberts, "A Staged Genetic Programming Strategy for Image Analysis", Proceedings of the Genetic and Evolutionary Computation Conference, 1999, pp. 1047-1052.
[19] D. Howard, S. C. Roberts, and R. Brankin, "Evolution of Ship Detectors for Satellite SAR Imagery", Proceedings of EuroGP'99, Vol. 1598, 1999, pp. 135-148.
[20] N. R. Pal and S. K. Pal, "A Review on Image Segmentation Techniques", Pattern Recognition, 26, 1993, pp. 1277-1294.
[21] D. L. Pham, C. Xu, J. L. Prince, "Survey of Current Methods in Medical Image Segmentation", Annual Review of Biomedical Engineering, 2, 2000, pp. 315-337.
[22] R. Poli, "Genetic Programming for Feature Detection and Image Segmentation", T.C. Forgarty (Ed.), Evolutionary Computation, Springer-Verlag, Berlin, Germany, 1996, pp. 110-125.
[23] M. E. Roberts and E. Claridge, "An Artificially Evolved Vision System for Segmenting Skin Lesion Images", Proceedings of the 6th International Conference on Medical Image Computing and Computer-Assisted Intervention, Vol. 2878, 2003, pp. 655-662.
[24] W. Tackett, "Genetic Programming for Feature Discovery and Image Discrimination", In S. Forrest, editor, Proceedings of 5th International Conference on Genetic Algorithm, 1993, pp. 303-311.
[25] W. Tackett, "Genetic Programming for Feature Discovery and Image Discrimination", In S. Forrest, editor, Proceedings of 5th International Conference on Genetic Algorithm, 1993, pp. 303-311.
[26] Y. J. Zhang, "Influence of Segmentation over Feature Measurement", Pattern Recognition Letters, 16(2), 1992, pp. 201-206.
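Figure 5(b) pairs each gene of the evolved program with the MATLAB® statements it stands for. As a rough illustration of that mapping, the Python sketch below turns the Figure 5 genome into the same statement sequence. The field meanings (operator, input, an unused slot, then two parameters), the `decode` helper, and the lookup table that maps structuring element index 1 to a disk of radius 2 are assumptions read off this single example; they are not GPIS's actual decoder. CLOP is emitted as `imclose` only, as the figure lists it, although the text describes closing followed by opening.

```python
# Hypothetical gene decoder: field meanings are inferred from Figure 5 only.

# Assumed lookup: structuring element index -> MATLAB constructor call.
STRUCTURING_ELEMENTS = {1: "strel('disk', 2)"}


def decode(genome):
    """Return MATLAB statements for genes of the form (op, input, unused, p1, p2)."""
    lines = ["d1 = input;"]
    filters = 0          # running count used to name h1, h2, ...
    defined_se = set()   # structuring elements already written out
    last = len(genome)
    for i, (op, src, _unused, p1, p2) in enumerate(genome, start=1):
        # The final gene writes to 'output'; earlier genes write to io1, io2, ...
        out = "output" if i == last else f"io{i}"
        if op in ("GAUSS", "AVER"):
            filters += 1
            h = f"h{filters}"
            if op == "GAUSS":
                # p1 is the kernel size, p2 the sigma value.
                lines.append(f"{h} = fspecial('gaussian', [{p1} {p1}], {p2});")
            else:
                lines.append(f"{h} = fspecial('average', [{p1} {p1}]);")
            lines.append(f"{out} = imfilter({src}, {h});")
        elif op in ("EROD", "CLOP"):
            # p2 is assumed to index a structuring element; define it on first use.
            se = f"SE{p2}"
            if p2 not in defined_se:
                lines.append(f"{se} = {STRUCTURING_ELEMENTS[p2]};")
                defined_se.add(p2)
            func = "imerode" if op == "EROD" else "imclose"
            lines.append(f"{out} = {func}({src}, {se});")
        elif op == "THRESH":
            # p1 is the binarisation threshold.
            lines.append(f"{out} = im2bw({src}, {p1});")
        else:
            raise ValueError(f"unknown operator: {op}")
    return lines


# Genome copied from Figure 5.
FIGURE_5_GENOME = [
    ("GAUSS", "d1", 0, 6, 0.8435),
    ("AVER", "io1", 0, 4, 0),
    ("EROD", "io2", 0, 0, 1),
    ("AVER", "io3", 0, 6, 0),
    ("CLOP", "io4", 0, 0, 1),
    ("THRESH", "io5", 0, 0.09022, 0),
]

if __name__ == "__main__":
    print("\n".join(decode(FIGURE_5_GENOME)))
```

Keeping the operator-to-statement rules in one function makes it easy to see why the evolved programs are readable: each gene expands to at most two MATLAB statements, and the genes are chained through the intermediate images io1 to io5.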
(a) Chromosomal structure: GAUS AVER EROD AVER CLOP THRES

Genomic structure: [GAUSS, d1, 0, 6, 0.8435] [AVER, io1, 0, 4, 0] [EROD, io2, 0, 0, 1] [AVER, io3, 0, 6, 0] [CLOP, io4, 0, 0, 1] [THRESH, io5, 0, 0.09022, 0]

(b) Genomic structure and MATLAB® implementation:

    d1 = input;
[GAUSS, d1, 0, 6, 0.8435]
    h1 = fspecial('gaussian', [6 6], 0.8435);
    io1 = imfilter(d1, h1);
[AVER, io1, 0, 4, 0]
    h2 = fspecial('average', [4 4]);
    io2 = imfilter(io1, h2);
[EROD, io2, 0, 0, 1]
    SE1 = strel('disk', 2);
    io3 = imerode(io2, SE1);
[AVER, io3, 0, 6, 0]
    h3 = fspecial('average', [6 6]);
    io4 = imfilter(io3, h3);
[CLOP, io4, 0, 0, 1]
    io5 = imclose(io4, SE1);
[THRESH, io5, 0, 0.09022, 0]
    output = im2bw(io5, 0.09022);

Segmentation accuracy on validation set: 99.04%; Number of operators used = 6; Average execution time = 1.252 seconds; Number of generations needed to converge = 114; Number of fitness evaluations = 10,532

Figure 5. An evolved program: (a) Chromosomal and genomic structure for the evolved program, (b) Genomic structure and equivalent MATLAB® implementation of the evolved program with corresponding performance results

Figure 6. (a) Segmentation produced by GPIS using the evolved program shown above on validation image 1 (Validation Accuracy = 99.21%, Cell Count Rate = 100%), (b) Segmentation produced by GENIE Pro on validation image 1 (Validation Accuracy = 95.46%, Cell Count Rate = 97.89%), (c) Segmentation produced by GPIS using the evolved program shown above on validation image 2 (Validation Accuracy = 98.93%, Cell Count Rate = 100%), (d) Segmentation produced by GENIE Pro on validation image 2 (Validation Accuracy = 94.22%, Cell Count Rate = 96.45%)
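The cell count rates quoted in Figure 6 follow Equation (3). A minimal Python sketch of that calculation is given here. It assumes the usual definitions FPR = FP / (FP + TN) and FNR = FN / (FN + TP), which match the verbal definitions in the text. The function name and the counts in the example are hypothetical and are not taken from the cell database.

```python
def cell_count_rate(tp, tn, fp, fn):
    """Cell Count Rate = (1 - FPR) * (1 - FNR), following Equation (3).

    tp: cell structures reported as cells
    tn: non-cell structures reported as non-cells
    fp: non-cell structures erroneously reported as cells
    fn: cell structures erroneously reported as non-cells
    """
    negatives = fp + tn  # all non-cell structures
    positives = fn + tp  # all cell structures
    # Guard against empty classes so the sketch never divides by zero.
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return (1.0 - fpr) * (1.0 - fnr)


if __name__ == "__main__":
    # Hypothetical counts: 50 cells (5 missed), 20 non-cell structures (2 miscounted).
    print(cell_count_rate(tp=45, tn=18, fp=2, fn=5))
    # A segmentation with no false positives or false negatives scores 1.0 (100%).
    print(cell_count_rate(tp=47, tn=20, fp=0, fn=0))
```

Because the two factors are multiplied, the rate reaches 100% only when both error rates are zero, which is the case reported for GPIS on both validation images in Figure 6.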