ISSN: 2277 – 9043    International Journal of Advanced Research in Computer Science and Electronics Engineering (IJARCSEE), Volume 1, Issue 6, August 2012
Brain CT-Scan Images Classification Using PCA, Wavelet Transform and K-NN

Kamaljeet Kaur, Daljit Singh

Abstract: With the rapid development of technology in biomedical image processing, the classification of human body tissues is a challenging task, as it demands highly accurate results without misclassification. Using this technology together with a neural-network-based classifier, a hybrid technique is proposed for the classification of brain CT-Scan images. The technique is not limited to the medical field; it is also applicable to the classification of natural images. The database consists of CT-Scan images and Brodatz textures. The methodology adopted in this paper consists of two stages: first, features are extracted from the given images using the feature extraction algorithms PCA and Wavelet Transform; these features are then fed as input to train a K-NN classifier that distinguishes normal from abnormal images. For brain CT-Scan images, features extracted by PCA give 100% classification accuracy with an execution time of 0.6133 seconds, whereas for Brodatz texture images, features extracted by the Wavelet Transform give 100% classification accuracy with an execution time of 0.1912 seconds. The code was developed in MATLAB 2011a.

Index Terms: CT-Scan, PCA, GLCM, K-NN, feature extraction

I. INTRODUCTION

Computed tomography (CT-Scan) is a medical imaging procedure that uses computer-processed X-rays to produce tomographic images, or slices, of specific areas of the body, including soft tissues. These three-dimensional images of interior body tissue are used for diagnostic and therapeutic purposes in various medical disciplines. Digital geometry processing is used to generate a three-dimensional image of the inside of an object from a large series of two-dimensional X-ray images taken around a single axis of rotation.

CT is primarily used to detect and locate structures inside the body that cannot be located by other forms of radiological investigation. A CT-Scan produces images of tissues and shows the exact location of structures within soft tissue; it is useful for detecting bleeding within the skull or tumors. Traditional X-rays suffer from their inability to give the physician any perception of depth. CT-Scan is better than traditional X-rays in three ways:

- A CT-Scan uses a very narrow beam of X-rays that penetrates the body in a straight line to the detector.
- The X-ray source is rotated around the body, so the X-rays pass through the entire structure in both directions.
- A computer is used to reconstruct the intensity of the X-rays into an image showing the density at every point of the plane through which the X-rays passed.

II. METHODOLOGY

Several feature extraction algorithms are in use. In this paper, the authors use only Principal Component Analysis (PCA) and the Wavelet Transform for feature extraction from images. The extracted features are used as input to train the K-Nearest Neighbor (K-NN) classifier, which labels images as normal or abnormal.

III. DATABASE

The database used in this paper consists of different sets of normal and abnormal CT-Scan images of different patients: 17 images in the training dataset and 7 images in the test dataset. For Brodatz textures, there are 9 texture images in the training dataset and 9 images in the test dataset. Images are defined with their class labels. Fig. 1 and fig. 2 show normal and abnormal brain CT-Scan images respectively, whereas fig. 3 and fig. 4 show Brodatz texture images.

Kamaljeet Kaur, Electronics and Communication Engineering, Ludhiana College of Engineering and Technology, Ludhiana, India.
Daljit Singh, Electronics and Communication Engineering, Ludhiana College of Engineering and Technology, Malerkotla, India, 9465378987.
Figure 1: Normal brain CT-Scan image
Figure 2: Abnormal brain CT-Scan image
Figure 3: Brodatz texture image
Figure 4: Brodatz texture image

IV. FEATURE EXTRACTION

Feature extraction involves simplifying the amount of resources required to describe a large set of data accurately. When performing analysis of complex data, one of the major problems stems from the number of variables involved [3]. Analysis with a large number of variables generally requires a large amount of memory and computational power, or a classification algorithm that overfits the training data. Feature extraction is a general term for extracting only the valuable information from given raw data. The main objective of feature extraction is to represent the raw image in reduced form and to reduce the original dataset by measuring certain properties, making the decision process easier during classification.

A. Principal Component Analysis (PCA)

PCA is a transformation that converts a set of correlated variables into a set of uncorrelated variables. The goal of PCA is to reduce the dimensionality of the data, making it computationally feasible while retaining as much as possible of the variation present in the original dataset. It is used for first-order feature extraction. The features extracted using PCA are mean, variance and standard deviation.

The following steps are performed when using PCA:

Step 1: Obtain a two-dimensional dataset. Arrange the data as a set of N data vectors X1, …, XN, with each Xn representing a single grouped observation of the M variables.
- Write X1, …, XN as column vectors, each of which has M rows.
- Place the column vectors into a single matrix X of dimensions M × N.

Step 2: Calculate the mean. For PCA to work properly, the mean must be subtracted from each data dimension; the subtracted mean is the average across that dimension.
- Calculate the mean along each dimension m = 1, …, M.
- Store the calculated mean values in a vector u of dimensions M × 1:

    u(m) = \frac{1}{N} \sum_{n=1}^{N} X(m, n)

Step 3: Calculate the zero-mean data.
- Subtract the empirical mean vector u from each column of the data matrix X.
- Store the zero-mean data in the M × N matrix B:

    B = X - u h

where h is a 1 × N row vector of ones.

Step 4: Calculate the covariance matrix C.
- Find the covariance matrix C of the matrix B with itself. Since the given data is two-dimensional, the covariance matrix is also two-dimensional. If the off-diagonal elements of the covariance matrix are positive, both variables increase together; if the off-diagonal elements are zero, the variables are independent of each other and uncorrelated.

Step 5: Find the eigenvectors and eigenvalues of the covariance matrix C, which is a square matrix.
- Compute the matrix V of eigenvectors that diagonalizes the covariance matrix C. The eigenvectors provide information about patterns in the data and are perpendicular to each other.

Step 6: Rearrange the eigenvectors and eigenvalues in decreasing order of eigenvalue.

Step 7: Derive a new dataset. The dimensionality is greatly reduced, and the most representative features of the whole dataset are retained within the selected eigenfeatures.

B. Haar Wavelet Transform

A transform of a signal is another representation of the signal; it does not change the information content present in the signal. A signal is transformed to obtain further information that is not apparent in the raw signal. The wavelet transform decomposes the image into a set of sub-images at different resolutions, corresponding to different frequency bands. The basic idea of the DWT is to approximate a signal through a set of given mathematical functions [1]. This yields a multi-resolution decomposition of the signal into sub-bands called approximation and details:

    y_{low}[n] = \sum_{k=-\infty}^{\infty} x[k] \, g[2n - k]

    y_{high}[n] = \sum_{k=-\infty}^{\infty} x[k] \, h[2n - k]

The Haar wavelet is the simplest wavelet: a sequence of rescaled "square-shaped" functions which together form a wavelet family [6]. The Haar mother wavelet function ψ(t) can be described as:

    \psi(t) = \begin{cases} 1 & 0 \le t < 1/2 \\ -1 & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases}

Its scaling function φ(t) can be described as:

    \phi(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{otherwise} \end{cases}

The Haar wavelet decomposes the image into four sub-bands LL, LH, HL and HH, written LLk, HLj, LHj, HHj, where j = 1, 2, …, k. Here k denotes the decomposition scale (level) of the wavelet transform; LLk denotes the kth-level low-frequency sub-image, and HLj, LHj, HHj denote the jth-scale high-frequency sub-images, where HL indicates variation along the X-axis and LH indicates variation along the Y-axis. Most of the signal power is compacted in the LL band. To obtain more image detail, the high-frequency components are further decomposed using the Haar wavelet transform; this yields more detailed information in the sub-images at every level except the low-frequency sub-images:

    {HLj00, HLj01, HLj10, HLj11}    {LHj00, LHj01, LHj10, LHj11}
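The PCA procedure in Steps 1–7 can be sketched in a few lines. The authors' implementation was written in MATLAB 2011a and is not shown in the paper, so the following NumPy version is only an illustrative sketch: the function name `pca_features` and the toy dataset are assumptions, not taken from the paper.

```python
import numpy as np

def pca_features(X, n_components=1):
    """Sketch of Steps 1-7: project the data onto the top principal
    components, then take mean/variance/std of the projection."""
    M, N = X.shape                      # Step 1: M variables, N observations
    u = X.mean(axis=1, keepdims=True)   # Step 2: mean along each dimension
    B = X - u                           # Step 3: zero-mean data (B = X - u h)
    C = (B @ B.T) / (N - 1)             # Step 4: covariance matrix
    w, V = np.linalg.eigh(C)            # Step 5: eigenvalues/eigenvectors
    order = np.argsort(w)[::-1]         # Step 6: decreasing eigenvalue order
    V = V[:, order[:n_components]]
    Y = V.T @ B                         # Step 7: reduced dataset
    return Y.mean(), Y.var(), Y.std()   # features: mean, variance, std dev

# Toy 2-D dataset with strong correlation along one direction
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 100))
X[1] += 2 * X[0]
mean, var, std = pca_features(X)
```

Because the projected data is centered before projection, the mean feature of the principal-component scores is zero by construction; for real images the features would instead be computed as the paper describes, from the reduced image data.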
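The Haar filter-bank equations and sub-band layout above can be illustrated with a single-level 2-D decomposition. The paper's MATLAB code is not shown, so this plain-NumPy sketch is an assumption: it uses the orthonormal Haar filters g = [1, 1]/√2 and h = [1, −1]/√2, and the assignment of row/column filtering to the HL/LH names follows the paper's convention but varies between toolboxes.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar transform: returns the LL (approximation)
    and LH, HL, HH (detail) sub-bands, each half the input size."""
    img = img.astype(float)
    # Low-pass / high-pass filter and downsample along rows
    lo = (img[0::2, :] + img[1::2, :]) / np.sqrt(2)
    hi = (img[0::2, :] - img[1::2, :]) / np.sqrt(2)
    # Repeat along columns to form the four sub-bands
    LL = (lo[:, 0::2] + lo[:, 1::2]) / np.sqrt(2)   # approximation
    LH = (lo[:, 0::2] - lo[:, 1::2]) / np.sqrt(2)   # variation along one axis
    HL = (hi[:, 0::2] + hi[:, 1::2]) / np.sqrt(2)   # variation along the other
    HH = (hi[:, 0::2] - hi[:, 1::2]) / np.sqrt(2)   # diagonal detail
    return LL, LH, HL, HH

# A constant image has all its energy in LL; the detail bands are zero,
# illustrating the claim that power is compacted in the LL band.
img = np.full((4, 4), 10.0)
LL, LH, HL, HH = haar_dwt2(img)
```

Recursing on a detail band with the same function would give the further decomposition of the high-frequency components described above.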
    {HHj00, HHj01, HHj10, HHj11}

where j = 1, 2, …, k and j00, j01, j10, j11 denote the positions of the four sub-images.

Figure 5: Wavelet decomposition

The process shown in fig. 5 depicts how the image is decomposed into different sub-bands.

C. Features extracted from images

There are several features in an image which, if extracted carefully, can represent the whole image. In this work only three features are extracted; they are described below:

Mean: the average value; it measures the general brightness of an image.
Variance: a measure of the spread of the data in a dataset; the covariance of one dimension with itself is the variance.
Standard deviation: the average distance from the mean of the dataset to a point.

V. CLASSIFICATION

Classification refers to the analysis of the properties of an image. Depending on the analysis, the dataset is divided into different classes; input features are categorized as 0 and 1. The classification process is divided into two phases: a training phase and a testing phase. In the training phase, known data is given; in the testing phase, unknown data is given. Classification is done by the classifier after training.

A. K-Nearest Neighbor (K-NN)

K-NN is a method for classifying objects based on the closest training examples in feature space. It is a type of instance-based learning in which the function is only approximated locally, and it is the simplest of all machine learning algorithms: an object is classified by a majority vote of its neighbors, the object being assigned to the class most common among its K nearest neighbors.

K-NN classification divides the given data into a training set and a test set. For each row of the test set, the K nearest training-set objects (in Euclidean distance) are found, and the classification is determined by majority vote, with ties broken at random [7]. If there is a tie for the kth nearest vector, all candidates are included in the vote. The training examples are vectors in a multidimensional feature space, each with a class label. The training phase of the algorithm consists only of storing the feature vectors and class labels of the training samples. In the classification phase, K is a user-defined constant, and any query point is classified by assigning it the label most frequent among the K training samples nearest to that query point.

By default, the Euclidean distance is used to calculate the distance to query points, with K = 1. In the case of text classification, the Hamming distance can be used instead.
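The K = 1, Euclidean-distance classifier described above can be sketched as follows. This is not the authors' MATLAB code; the function name and the toy feature vectors (a [mean, standard deviation] pair per image, with label 0 for normal and 1 for abnormal) are made up for illustration.

```python
import numpy as np

def knn_classify(train_X, train_y, query, k=1):
    """Assign the label most frequent among the k nearest training
    vectors in Euclidean distance; with k=1 this reduces to the label
    of the single closest training example."""
    d = np.linalg.norm(train_X - query, axis=1)   # distances to all train points
    nearest = np.argsort(d)[:k]                   # indices of the k nearest
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]              # majority vote

# Hypothetical training data: [mean, std] per image; 0 = normal, 1 = abnormal
train_X = np.array([[100.0, 18.0], [105.0, 19.0], [90.0, 40.0], [85.0, 45.0]])
train_y = np.array([0, 0, 1, 1])
label = knn_classify(train_X, train_y, np.array([102.0, 18.5]))
```

Note that this sketch omits the random tie-breaking mentioned in [7]; with K = 1 and distinct distances, no tie can occur.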
VI. STEPS FOLLOWED FOR FEATURE EXTRACTION AND CLASSIFICATION

As shown in fig. 6, the proposed methodology is divided into two phases, discussed in this section.

A. Training Phase

1. Obtain an image and convert it into matrix form. Store the resultant image as an array; it can be 2-dimensional or 3-dimensional depending on the image. The matrix can be of class single, double, uint8 or uint16.
2. An image read by MATLAB is in RGB format, but RGB is difficult to manipulate as pixel values can range from 0 to 65535. So, convert the matrix into a gray image, where pixel values range from 0 to 255: 0 represents a black pixel, 255 represents a white pixel, and the values in between represent different gray levels.
3. Apply the PCA-based algorithm to the matrix. It transforms the data into uncorrelated form, removing redundant variables from the matrix; this is done by retaining the variables with the largest variance. Extract the required features from the newly derived dataset.
4. Now apply the wavelet transform to the matrix. The image is decomposed into sub-images, which are further decomposed into the four sub-bands LL, LH, HL and HH, giving the approximation, horizontal, vertical and diagonal details. Of these, the approximation detail has been selected for feature extraction.
5. Save the extracted features with a .mat extension for further processing.
6. Train the K-NN classifier using the extracted features by applying them as input to the classifier. The classifier separates images by calculating the distance between test data and training data, and by varying the value of K. In this work, K = 1 has been used with the Euclidean distance. With a larger number of features in the training data, the classifier is trained better, leading to higher accuracy.

B. Test Phase

7. Next, test the performance of the classifier using the test images. Repeat steps 1 to 5, then apply the test data to the classifier. It differentiates between images by calculating the Euclidean distance between the data points, with K = 1. A test image is assigned to the class of the training image at minimum distance; otherwise, its distance to further training images is calculated.
8. The same process is repeated for the Brodatz texture images.

The flow chart in fig. 6 gives an idea of the steps followed during the training and test phases for feature extraction and classification, for both the medical images and the Brodatz textures.

Figure 6: Proposed methodology (flow chart: Start → brain CT-Scan images as input → feature extraction → features by PCA / features by Wavelet Transform → K-NN classifier → decision: normal / abnormal → Stop)

VII. RESULTS

As discussed above, three features have been extracted from the images using two different feature extraction algorithms. In Brodatz texture classification, 8 of the 9 images are correctly classified, with 1 misclassification, when the features are extracted by PCA. When the features are extracted by the Wavelet Transform, all 9 images are correctly classified without any misclassification.

The tables below give the ranges of variance and standard deviation for normal images; images with feature values outside these ranges are considered abnormal. During the classification process, the standard deviation values have been given more importance. The same results are also shown in graphical form.
TABLE I: Classification results for brain CT-Scan images with feature values by PCA

  FEATURES BY PCA      | MAXIMUM    | MINIMUM
  Standard deviation   | 19.2869    | 17.9485
  Variance             | 3447.8268  | 1456.4312
  Total images in test dataset: 7
  Accurately classified by K-NN: 7

TABLE II: Classification results for brain CT-Scan images with feature values by Wavelet Transform

  FEATURES BY WAVELET TRANSFORM | MAXIMUM     | MINIMUM
  Standard deviation            | 108.000     | 77.000
  Variance                      | 12897.6728  | 6794.5950
  Total images in test dataset: 7
  Accurately classified by K-NN: 6

A. Graphs

Figure 7: Variance by PCA
Figure 8: Standard deviation by PCA
Figure 9: Variance by Wavelet Transform
Figure 10: Standard deviation by Wavelet Transform
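The decision rule implied by Table I — a test image whose PCA standard deviation falls outside the range observed for normal images is treated as abnormal — can be expressed as a small sketch. The range bounds are the ones reported in Table I; the helper name `is_abnormal` is hypothetical, not from the paper.

```python
# Normal range for the PCA standard-deviation feature, from Table I
STD_MIN, STD_MAX = 17.9485, 19.2869

def is_abnormal(std_value, lo=STD_MIN, hi=STD_MAX):
    """Flag an image as abnormal when its feature value lies
    outside the range observed for normal training images."""
    return not (lo <= std_value <= hi)
```

This range check only restates the reported result; the actual decision in the paper is made by the K-NN classifier over all three features.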
The graphs above show the variance and standard deviation values, as bars, for the normal and abnormal brain CT-Scan images, as extracted by PCA and by the Wavelet Transform.

VIII. CONCLUSION

PCA and the Wavelet Transform are very efficient tools for feature extraction and are used very successfully in biomedical image processing. In this paper, a classification technique is developed to automatically detect whether or not an abnormality exists in a CT-Scan. If the relevant features are successfully extracted from brain CT-Scan images, they can help detect abnormalities in the human body at a very early stage, which helps save precious human life. The same features are extracted by both algorithms; after classification, the performance of the classifier with PCA and with the Wavelet Transform, using the same features, is as follows.

For Brodatz texture images, features extracted by PCA give a maximum classification accuracy of 88.88% with an execution time of 0.7480 seconds. Features extracted by the Wavelet Transform give a maximum accuracy of 100% with an execution time of 0.1912 seconds.

For brain CT-Scan images, features extracted by PCA give a maximum classification accuracy of 100% with an execution time of 0.6133 seconds. Features extracted by the Wavelet Transform give a maximum classification accuracy of 85.71% with an execution time of 0.5508 seconds.

The distance between training points and test points was calculated with the Euclidean method. From the above discussion, it is concluded that brain CT-Scan images can be accurately classified when K-NN is used in combination with PCA; similarly, Brodatz textures can be accurately classified when K-NN is used in combination with the Wavelet Transform.

REFERENCES

[1] Amir Rajaei, Lalitha Rangarajan, "Wavelet Based Feature Extraction for Medical Image Classification", An International Journal of Engineering Sciences, ISSN: 2229-6913, Issue Sept 2011, Vol. 4.
[2] EL-Sayed EL-Dahshan, Abdul-Badeeh M. Salem, Tamer H. Yousin, "A Hybrid Technique for Automatic MRI Brain Images Classification", Studia Univ. Babes-Bolyai, Volume LIV, Number 1, 2009.
[3] M. Vasantha, "Medical Image Feature Extraction, Selection and Classification", International Journal of Engineering Science and Technology, Vol. 2(6), 2010, 2071-2076.
[4] Manimegalai P., Revathy P., Dr. K. Thanushkodi, "Micro-calcification Detection in Mammogram Image using Wavelet Transform and Neural Network", International Journal of Advanced Scientific Research and Technology, Issue 2, Volume 1, February 2012, ISSN: 2249-9954.
[5] R. Nithya, B. Santhi, "Comparative Study on Feature Extraction Method for Breast Cancer Classification", Journal of Theoretical and Applied Information Technology, 30 Nov 2011, Vol. 33, No. 2, ISSN: 1992-86.
[6] Ms. Yogita K. Dubey, Milind M. Mushrif, "Extraction of Wavelet Based Features for Classification of T2-Weighted MRI Brain Images", Signal and Image Processing: An International Journal (SIPIJ), Vol. 3, No. 1, February 2012.
[7] N. Suguna, Dr. K. Thanushkodi, "An Improved K-Nearest Neighbor Classification Using Genetic Algorithm", IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 4, July 2010, ISSN: 1694-0784.
[8] Ryszard S. Choras, "Image Feature Extraction Techniques and Their Applications for CBIR and Biometrics Systems", International Journal of Biology and Biomedical Engineering, Issue 1, Vol. 1, 2007.

Kamaljeet Kaur is pursuing her M.Tech (regular) thesis in biomedical image processing. She completed her B.Tech in Electronics and Communication Engineering in 2009. She has attended 2 national conferences on image processing and has published 3 national papers and 1 international journal paper. Her areas of interest are digital image processing and microcontrollers.

Daljit Singh is pursuing his M.Tech (regular) thesis in biomedical image processing. He completed his B.Tech in Electronics and Communication Engineering in 2010. He has published 3 national papers and 1 international journal paper, and has attended several conferences on microcontrollers and image processing. His areas of interest are digital image processing, signal processing and microcontrollers.

All Rights Reserved © 2012 IJARCSEE