Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based Texture Features and Fuzzy Logic based Hybrid Kernel Support Vector Machine Classifier
Cervical cancer is the highest rate of incidence after breast cancer, gastric cancer, colorectal cancer, thyroid cancer among all malignant that occurs to females ; also it is the most prevalent cancer among female genital cancers. Manual cervical cancer diagnosis methods are costly and sometimes result inaccurate diagnosis caused by human error but machine assisted classification system can reduce financial costs and increase screening accuracy. In this research article, we have developed multi class cervical classification system by using Pap Smear Images according to the WHO descriptive Classification of Cervical Histology. Then, this system classifies the cell of the Pap Smear image into anyone of five types of the classes of normal cell, mild dysplasia, moderate dysplasia, severe dysplasia and carcinoma in situ (CIS) by using individual and Combining individual feature extraction method with the classification technique. In this paper three Feature Extraction methods were used: From that three, two were individual feature extraction method namely Enriched Rough Set Texton Co-Occurrence Matrix (ERSTCM) and Enriched Micro Structure Descriptor (EMSD) and the remained one was combining individual feature extraction method namely concatenated feature extraction method (CFE). The CFE method represents all the individual feature extraction methods of ERSTCM & EMSD features are combining together to one feature to assess their joint performance. Then these three feature extraction methods are tested over Fuzzy Logic based Hybrid Kernel Support Vector Machine (FL-HKSVM) Classifier. This Examination was conducted over a set of single cervical cell based pap smear images. The dataset contains five classes of images, with a total of 952 images. The distribution of number of images per class is not uniform. Then the performance was evaluated in both the individual and combining individual feature extraction method with the classification techniques by using the statistical parameters of sensitivity, specificity & accuracy. Hence the resultant values of the statistical parameters described in individual feature extraction method with the classification technique, proposed EMSD+FLHKSVM Classifier had given the better results than the other ERSTCM+FLHKSVM Classifier and combining individual feature extraction method with the classification technique described, proposed CFE+FLHKSVM Classifier had given the better results than other EMSD+FLHKSVM & ERSTCM+FLHKSVM classifiers.
A MODIFIED BINARY PSO BASED FEATURE SELECTION FOR AUTOMATIC LESION DETECTION ...ijcsit
This document summarizes a research paper that proposes a modified binary particle swarm optimization (MBPSO) method for feature selection to improve the classification of lesions in mammograms as benign or malignant. The MBPSO method incorporates iteration best into the velocity update calculation and selects solutions with fewer features when fitness values are equal. A total of 117 shape, texture, and histogram features are extracted from mammogram regions of interest. MBPSO is used to select an optimal feature subset, which is then used to train a neural network classifier. Experimental results show MBPSO obtains better classification accuracy than using the full feature set, and performs comparably to other feature selection techniques.
Feature selection/extraction methods aimed to reduce the Microarray data. Basically in this comparative analysis, we have taken into account different feature selection and extraction strategies used up till now in the field of Biomedical. In the field of pattern recognition and biomedical imaging, dimensionality reduction is the central area of the research. Some mostly used features selection/extraction methods aim to analyze the most efficient data and achieve the stable performance of the algorithms, as well as improve the accuracy and performance of the classifier. This analysis also highlights widely used dimensionality reduction techniques used up till now in the field of biomedical imaging for the purpose to explore their potency, and weak points.
IRJET- A Survey on Categorization of Breast Cancer in Histopathological ImagesIRJET Journal
This document summarizes various methods for categorizing breast cancer in histopathological images. It discusses machine learning and image processing techniques that have been used to build computer-aided diagnosis (CAD) systems to help pathologists diagnose breast cancer more objectively and consistently. The document reviews different classification methods that have been proposed, including those using fuzzy logic, level set methods, convolutional neural networks, texture features and ensemble methods. It concludes that accurately classifying histopathological images remains challenging due to limited publicly available datasets and variability in tissue appearance, but that machine learning and advanced image analysis offer promising approaches to improve cancer detection and diagnosis.
Detecting in situ melanoma using multi parameter extraction and neural classi...IAEME Publication
The document discusses a Multi Parameter Extraction and Classification System (MPECS) for detecting melanoma skin cancer. The MPECS represents dermoscopic images as a set of 21 extracted features and uses a neural network classifier to classify the images. The MPECS uses a six phase approach to extract parameters from the images. A Multilayer Feed Forward Neural Network classifier trained with backpropagation is used to accurately diagnose and classify early signs of melanoma based on the extracted features. Experimental results presented in the paper validate the effectiveness of the MPECS at classifying skin lesions.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IRJET - Histogram Analysis for Melanoma Discrimination in Real Time ImageIRJET Journal
The document presents a novel framework for recognizing melanoma in real-time skin images using k-means clustering and support vector machine algorithms. It discusses the challenges of automated melanoma recognition due to variations in melanoma appearance and similarities to non-melanoma lesions. A two-stage approach is proposed involving lesion segmentation followed by classification using deep learning networks to extract discriminative features for accurate melanoma recognition.
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...IRJET Journal
This document presents a framework for early prediction of bladder cancer using DNA methylation analysis and feature selection methods. It uses two feature selection techniques: Correlation-based Feature Weighting (CFW) which assigns weights to features based on correlation with the class and average correlation between features; and Differential Mean Feature Selection (DMFS) which calculates the mean difference between normal and cancer samples to select features. The selected features are then used in a Naive Bayes classifier to classify samples as normal or cancer. The framework aims to reduce the number of features in DNA methylation datasets to improve early bladder cancer prediction.
This document presents a method for detecting cancer in Pap smear cytological images using bag of texture features. The method involves segmenting the nucleus region from the images, extracting texture features from blocks within the nucleus region, clustering the features to build a visual dictionary, and representing each image as a histogram of visual words present. The histograms are then used to retrieve similar images from a database using histogram intersection as the distance measure. Experiments were conducted using different block sizes and number of clusters, achieving up to 90% accuracy in identifying cancerous versus normal cells.
A MODIFIED BINARY PSO BASED FEATURE SELECTION FOR AUTOMATIC LESION DETECTION ...ijcsit
This document summarizes a research paper that proposes a modified binary particle swarm optimization (MBPSO) method for feature selection to improve the classification of lesions in mammograms as benign or malignant. The MBPSO method incorporates iteration best into the velocity update calculation and selects solutions with fewer features when fitness values are equal. A total of 117 shape, texture, and histogram features are extracted from mammogram regions of interest. MBPSO is used to select an optimal feature subset, which is then used to train a neural network classifier. Experimental results show MBPSO obtains better classification accuracy than using the full feature set, and performs comparably to other feature selection techniques.
Feature selection/extraction methods aimed to reduce the Microarray data. Basically in this comparative analysis, we have taken into account different feature selection and extraction strategies used up till now in the field of Biomedical. In the field of pattern recognition and biomedical imaging, dimensionality reduction is the central area of the research. Some mostly used features selection/extraction methods aim to analyze the most efficient data and achieve the stable performance of the algorithms, as well as improve the accuracy and performance of the classifier. This analysis also highlights widely used dimensionality reduction techniques used up till now in the field of biomedical imaging for the purpose to explore their potency, and weak points.
IRJET- A Survey on Categorization of Breast Cancer in Histopathological ImagesIRJET Journal
This document summarizes various methods for categorizing breast cancer in histopathological images. It discusses machine learning and image processing techniques that have been used to build computer-aided diagnosis (CAD) systems to help pathologists diagnose breast cancer more objectively and consistently. The document reviews different classification methods that have been proposed, including those using fuzzy logic, level set methods, convolutional neural networks, texture features and ensemble methods. It concludes that accurately classifying histopathological images remains challenging due to limited publicly available datasets and variability in tissue appearance, but that machine learning and advanced image analysis offer promising approaches to improve cancer detection and diagnosis.
Detecting in situ melanoma using multi parameter extraction and neural classi...IAEME Publication
The document discusses a Multi Parameter Extraction and Classification System (MPECS) for detecting melanoma skin cancer. The MPECS represents dermoscopic images as a set of 21 extracted features and uses a neural network classifier to classify the images. The MPECS uses a six phase approach to extract parameters from the images. A Multilayer Feed Forward Neural Network classifier trained with backpropagation is used to accurately diagnose and classify early signs of melanoma based on the extracted features. Experimental results presented in the paper validate the effectiveness of the MPECS at classifying skin lesions.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IRJET - Histogram Analysis for Melanoma Discrimination in Real Time ImageIRJET Journal
The document presents a novel framework for recognizing melanoma in real-time skin images using k-means clustering and support vector machine algorithms. It discusses the challenges of automated melanoma recognition due to variations in melanoma appearance and similarities to non-melanoma lesions. A two-stage approach is proposed involving lesion segmentation followed by classification using deep learning networks to extract discriminative features for accurate melanoma recognition.
IRJET- A Feature Selection Framework for DNA Methylation Analysis in Predicti...IRJET Journal
This document presents a framework for early prediction of bladder cancer using DNA methylation analysis and feature selection methods. It uses two feature selection techniques: Correlation-based Feature Weighting (CFW) which assigns weights to features based on correlation with the class and average correlation between features; and Differential Mean Feature Selection (DMFS) which calculates the mean difference between normal and cancer samples to select features. The selected features are then used in a Naive Bayes classifier to classify samples as normal or cancer. The framework aims to reduce the number of features in DNA methylation datasets to improve early bladder cancer prediction.
This document presents a method for detecting cancer in Pap smear cytological images using bag of texture features. The method involves segmenting the nucleus region from the images, extracting texture features from blocks within the nucleus region, clustering the features to build a visual dictionary, and representing each image as a histogram of visual words present. The histograms are then used to retrieve similar images from a database using histogram intersection as the distance measure. Experiments were conducted using different block sizes and number of clusters, achieving up to 90% accuracy in identifying cancerous versus normal cells.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
BREAST CANCER DIAGNOSIS USING MACHINE LEARNING ALGORITHMS –A SURVEYijdpsjournal
Breast cancer has become a common factor now-a-days. Despite the fact, not all general hospitals
have the facilities to diagnose breast cancer through mammograms. Waiting for diagnosing a breast
cancer for a long time may increase the possibility of the cancer spreading. Therefore a computerized
breast cancer diagnosis has been developed to reduce the time taken to diagnose the breast cancer and
reduce the death rate. This paper summarizes the survey on breast cancer diagnosis using various machine
learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey
can also help us to know about number of papers that are implemented to diagnose the breast cancer.
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...CSCJournals
An intelligent computer-aided diagnosis system can be very helpful for radiologist in detecting and diagnosing microcalcifications’ patterns earlier and faster than typical screening programs. In this paper, we present a system based on fuzzy-C Means clustering and feature extraction techniques using texture based segmentation and genetic algorithm for detecting and diagnosing micro calcifications’ patterns in digital mammograms.We have investigated and analyzed a number of feature extraction techniques and found that a combination of three features, such as entropy, standard deviation, and number of pixels, is the best combination to distinguish a benign micro calcification pattern from one that is malignant. A fuzzy C Means technique in conjunction with three features was used to detect a micro calcification pattern and a neural network to classify it into benign/malignant. The system was developed on a Windows platform. It is an easy to use intelligent system that gives the user options to diagnose, detect, enlarge, zoom, and measure distances of areas in digital mammograms. The present study focused on the investigation of the application of artificial intelligence and data mining techniques to the prediction models of breast cancer. The artificial neural network, decision tree,Fuzzy C Means, and genetic algorithm were used for the comparative studies and the accuracy and positive predictive value of each algorithm were used as the evaluation indicators. 699 records acquired from the breast cancer patients at the MIAS database, 9 predictor variables, and 1 outcome variable were incorporated for the data analysis followed by the 10-fold cross-validation. The results revealed that the accuracies of Fuzzy C Means were 0.9534 (sensitivity 0.98716 and specificity 0.9582), the decision tree model 0.9634 (sensitivity 0.98615, specificity 0.9305), the neural network model 0.96502 (sensitivity 0.98628, specificity 0.9473), the genetic algorithm model 0.9878 (sensitivity 1, specificity 0.9802). The accuracy of the genetic algorithm was significantly higher than the average predicted accuracy of 0.9612. The predicted outcome of the Fuzzy C Means model was higher than that of the neural network model but no significant difference was observed. The average predicted accuracy of the decision tree model was 0.9635 which was the lowest of all 4 predictive models. The standard deviation of the 10-fold cross-validation was rather unreliable. The results showed that the genetic algorithm described in the present study was able to produce accurate results in the classification of breast cancer data and the classification rule identified was more acceptable and comprehensible. Keywords: Fuzzy C Means, Decision Tree Induction, Genetic algorithm, data mining, breast cancer, rule discovery.
Iganfis Data Mining Approach for Forecasting Cancer Threatsijsrd.com
This document proposes a new approach called IGANFIS that combines information gain (IG) and adaptive neuro fuzzy inference system (ANFIS) to classify cancer threats from medical data. IG is used to select the most important cancer features from the data to reduce dimensionality before being input to ANFIS. ANFIS then trains on the selected features to build a fuzzy inference system for cancer diagnosis and prediction. The approach is tested on breast cancer datasets and achieves higher classification accuracy compared to other methods.
PSO-SVM hybrid system for melanoma detection from histo-pathological imagesIJECEIAES
This paper introduces an automated system for skin cancer (melanoma) detection from Histo-pathological images sampled from microscopic slides of skin biopsy. The proposed system is a hybrid system based on Particle Swarm Optimization and Support Vector Machine (PSO-SVM). The features used are extracted from the grayscale image histogram, the co-occurrence matrix and the energy of the wavelet coefficients resulting from the wavelet packet decomposition. The PSO-SVM system selects the best feature set and the best values for the SVM parameters (C and γ) that optimize the performance of the SVM classifier. The system performance is tested on a real dataset obtained from the Southern Pathology Laboratory in Wollongong NSW, Australia. Evaluation results show a classification accuracy of 87.13%, a sensitivity of 94.1% and a specificity of 80.22%.The sensitivity and specificity results are comparable to those obtained by dermatologists.
This document discusses a proposed method for detecting breast cancer by identifying circulating tumor cells using association rule mining. The method involves identifying major genes associated with breast cancer by applying the principle of association rule mining to breast cancer datasets. Association rules are generated between genes to determine high priority genes that are responsible for causing cancer. The Baum-Welch algorithm is used to estimate probabilities and identify the high priority gene values associated with different cancer clusters and levels. Obtaining these high priority genes can help identify the major genes responsible for breast cancer.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
inAbstract— Pattern Recognition (PR) plays an important role in field of Bioinformatics. PR is concerned with processing raw measurement data by a computer to arrive at a prediction that can be used to formulate a decision to be taken. The important problem in which pattern recognition are applied have common that they are too complex to model explicitly. Diverse methods of this PR are used to analyze, segment and manage the high dimensional microarray gene data for classification. PR is concerned with the development of systems that learn to solve a given problem using a set of instances, each instances represented by a number of features. The microarray expression technologies are possible to monitor the expression levels of thousands of genes simultaneously. The microarrays generated large amount of data has stimulate the development of various computational methods to different biological processes by gene expression profiling. Microarray Gene Expression Profiling (MGEP) is important in Bioinformatics, it yield various high dimensional data used in various clinical applications like cancer diagnostics and drug designing. In this work a new schema has developed for classification of unknown malignant tumors into known class. According to this work an new classification scheme includes the transformation of very high dimensional microarray data into mahalanobis space before classification. The eligibility of the proposed classification scheme has proved to 10 commonly available cancer gene datasets, this contains both the binary and multiclass data sets. To improve the performance of the classification gene selection method is applied to the datasets as a preprocessing and data extraction step.
Intelligent algorithms for cell tracking and image segmentationijcsit
Sensitive and accurate cell tracking system is important to cell motility studies. Recently, researchers have
developed several methods for detecting and tracking the living cells. To improve the living cells tracking
systems performance and accuracy, we focused on developing a novel technique for image processing. The
algorithm we propose presents novel image segmentation and tracking system technique to incorporate the
advantages of both Topological Alignments and snakes for more accurate tracking approach. The results
demonstrate that the proposed algorithm achieves accurate tracking for detecting and analyzing the
mobility of the living cells. The RMSE between the manual and the computed displacement was less than
12% on average. Where the Active Contour method gave a velocity RMSE of less than 11%, improves to
less than 8% by using the novel Algorithm. We have achieved better tracking and detecting for the cells,
also the ability of the system to improve the low contrast, under and over segmentation which is the most
cell tracking challenge problems and responsible for lacking accuracy in cell tracking techniques.
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...Kiogyf
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND FIREFLY ALGORITHM
ABSTRACT
Cancer is a globally recognized cause of death. A proper cancer analysis demands the classification of several types of tumor. Investigations into microarray gene expressions seem to be a successful platform for revising genetic diseases. Although the standard machine learning (ML) approaches have been efficient in the realization of significant genes and in the classification of new types of cancer cases, their medical and logical application has faced several drawbacks such as DNA microarray data analysis limitation, which includes an incredible number of features and the relatively small size of an instance. To achieve a reasonable and efficient DNA microarray dataset information, there is a need to extend the level of interpretability and forecast approach while maintaining a great level of precision. In this work, a novel way of cancer classification based on based gene expression profiles is presented. This method is a combination of both Firefly algorithm and Mutual Information Method. First, the features are used to select the features before using the Firefly algorithm for feature reduction. Finally, the Support Vector Machine is used to classify cancer into types. The performance of the proposed system was evaluated by using it to classify datasets from colon cancer; the results of the evaluation were compared with some recent approaches.
Keywords: Feature Selection, Firefly Algorithm, Cancer Disease, Mutual Information
Feature Selection Approach based on Firefly Algorithm and Chi-square IJECEIAES
Dimensionality problem is a well-known challenging issue for most classifiers in which datasets have unbalanced number of samples and features. Features may contain unreliable data which may lead the classification process to produce undesirable results. Feature selection approach is considered a solution for this kind of problems. In this paperan enhanced firefly algorithm is proposed to serve as a feature selection solution for reducing dimensionality and picking the most informative features to be used in classification. The main purpose of the proposedmodel is to improve the classification accuracy through using the selected features produced from the model, thus classification errors will decrease. Modeling firefly in this research appears through simulating firefly position by cell chi-square value which is changed after every move, and simulating firefly intensity by calculating a set of different fitness functionsas a weight for each feature. Knearest neighbor and Discriminant analysis are used as classifiers to test the proposed firefly algorithm in selecting features. Experimental results showed that the proposed enhanced algorithmbased on firefly algorithm with chisquare and different fitness functions can provide better results than others. Results showed that reduction of dataset is useful for gaining higher accuracy in classification.
Brain tumor segmentation based on local independent projection based classifi...ieeepondy
This document presents a new method for brain tumor segmentation from MRI images using local independent projection-based classification (LIPC). LIPC treats tumor segmentation as a classification problem and introduces locality and independent projections into the classification model. It also considers the data distribution of different classes. The method is evaluated on 80 brain tumor MRI images and achieves Dice similarity scores of 0.84 for complete tumor segmentation, 0.685 for tumor core segmentation, and 0.585 for contrast-enhancing tumor segmentation, comparable to state-of-the-art methods.
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast Cancer is one of the crucial and burning diseases that has invaded women. Predicting breast cancer manually takes a lot of time and it is difficult for the physician to classification. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potentials of these techniques have found its scope in medical data. Classification techniques tend to simplify the prediction segment.
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
The most common type of cancer in women
worldwide is the Breast Cancer. Breast cancer may be
detected early using Mammograms, probably before it's
spread. Recurrent breast cancer could occur months or years
after initial treatment. The cancer could return within the
same place because the original cancer (local recurrence), or it
may spread to different areas of your body (distant
recurrence). Early stage treatment is done not only to cure
breast cancer however additionally facilitate in preventing its
repetition/recurrence. Data mining algorithms provide
assistance in predicting the early-stage breast cancer that
continually has been difficult analysis drawback. The
projected analysis can establish the most effective algorithm
that predicts the recurrence of the breast cancer and improve
the accuracy the algorithms. Large information like Clump,
Classification, Association Rules, Prediction and Neural
Networks, Decision Trees can be analyzed using data mining
applications and techniques.
A Comparative Study on the Methods Used for the Detection of Breast Cancerrahulmonikasharma
Among women in the world, the death caused by the Breast cancer has become the leading role. At an initial stage, the tumor in the breast is hard to detect. Manual attempt have proven to be time consuming and inefficient in many cases. Hence there is a need for efficient methods that diagnoses the cancerous cell without human involvement with high accuracy. Mammography is a special case of CT scan which adopts X-ray method with high resolution film. so that it can detect well the tumors in the breast. This paper describes the comparative study of the various data mining methods on the detection of the breast cancer by using image processing techniques.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IRJET- Detection of Breast Cancer using Machine Learning TechniquesIRJET Journal
The document discusses using machine learning techniques to detect breast cancer. It begins with an introduction to machine learning and data mining, and how they can be applied to medical diagnosis like cancer detection. The study uses different machine learning classifiers and the Wisconsin Breast Cancer Dataset to classify cancers as benign or malignant. The classifiers tested were J48 and LMT, and the best accuracy obtained was 97.36% without using feature filters on the data.
This document presents a new method for analyzing asymmetry in breast MR images to detect early signs of breast cancer. The method extracts six features from bilateral breast MR images, including directional statistics and texture measurements. Linear discriminant analysis with leave-one-out cross validation was used to classify images as cancer or non-cancer. The best classification results used combinations of two or three features, achieving average accuracies from 65.6% to 78.1%, sensitivities from 53.3% to 73.3%, and specificities from 60% to 94.1%. Preliminary results suggest this asymmetry analysis method shows potential for detecting breast cancer on MR images despite using a small number of features and a simple classifier.
ABSTRACT
This paper presents an effective feature selection method that can be applied to build a computer aided diagnosis system for breast cancer in order to discriminate between healthy, benign and malignant parenchyma. Determining the optimal feature set from a large set of original features is an important preprocessing step which removes irrelevant and redundant features and thus improves computational efficiency, classification accuracy and also simplifies the classifier structure. A modified binary particle swarm optimized feature selection method (MBPSO)has been proposed where k-Nearest Neighbour algorithm with leave-one-out cross validation serves as the fitness function. Digital mammograms obtained from Regional Cancer Centre, Thiruvananthapuram and the mammograms from web accessible mini-MIAS database has been used as the dataset for this experiment. Region of interests from the mammograms are automatically detected and segmented. A total of 117 shape, texture and histogram features are extracted from the ROIs. Significant features are selected using the proposed feature selection method.Classification is performed using feed forward artificial neural networks with back propagation learning. Receiver operating characteristics (ROC) and confusion matrix are used to evaluate the performance. Experimental results show that the modified binary PSO feature selection method not only obtains better classification accuracy but also simplifies the classification process as compared to full set of features. The performance of the modified BPSO is found to be at par with other widely used feature selection techniques.
Exploring the performance of feature selection method using breast cancer dat...nooriasukmaningtyas
Breast cancer is the most common type of cancer occurring mostly in females. In recent years, many researchers have devoted to automate diagnosis of breast cancer by developing different machine learning model. However, the quality and quantity of feature in breast cancer diagnostic dataset have significant effect on the accuracy and efficiency of predictive model. Feature selection is effective method for reducing the dimensionality and improving the accuracy of predictive model. The use of feature selection is to determine feature required for training model and to remove irrelevant and duplicate feature. Duplicate feature is a feature that is highly correlated to another feature. The objective of this study is to conduct experimental research on three different feature selection methods for breast cancer prediction. Sequential, embedded and chi-square feature selection are implemented using breast cancer diagnostic dataset. The study compares the performance of sequential embedded and chi-square feature selection on test set. The experimental result evidently shows that sequential feature selection outperforms as compared to chi-square (X2) statistics and embedded feature selection. Overall, sequential feature selection achieves better accuracy of 98.3% as compared to chi-square (X2) statistics and embedded feature selection.
This document presents a genetic algorithm-based classification method for classifying different types of lung cancer in needle biopsy images. It first segments cell nuclei from biopsy images and extracts color, texture, and shape features from the nuclei. A dictionary learning approach is used to build discriminative subdictionaries for each feature type. In testing, features from an image are classified at the cell level and then fused at the image level via majority voting. The method achieves higher accuracy than using single features or existing classification methods, demonstrating its effectiveness in classifying lung cancer types in biopsy images.
BFO – AIS: A Framework for Medical Image Classification Using Soft Computing ...ijsc
Medical images provide diagnostic evidence/information about anatomical pathology. The growth in database is enormous as medical digital image equipment’s like Magnetic Resonance Images (MRI), Computed Tomography (CT), and Positron Emission Tomography CT (PET-CT) are part of clinical work. CT images distinguish various tissues according to gray levels to help medical diagnosis. Ct is more reliable for early tumours and haemorrhages detection as it provides anatomical information to plan radio therapy. Medical information systems goals are to deliver information to right persons at the right time and place to improve care process quality and efficiency. This paper proposes an Artificial Immune System (AIS) classifier and proposed feature selection based on hybrid Bacterial Foraging Optimization (BFO) with Local Search (LS) for medical image classification.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
BREAST CANCER DIAGNOSIS USING MACHINE LEARNING ALGORITHMS –A SURVEYijdpsjournal
Breast cancer has become a common factor now-a-days. Despite the fact, not all general hospitals
have the facilities to diagnose breast cancer through mammograms. Waiting for diagnosing a breast
cancer for a long time may increase the possibility of the cancer spreading. Therefore a computerized
breast cancer diagnosis has been developed to reduce the time taken to diagnose the breast cancer and
reduce the death rate. This paper summarizes the survey on breast cancer diagnosis using various machine
learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey
can also help us to know about number of papers that are implemented to diagnose the breast cancer.
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...CSCJournals
An intelligent computer-aided diagnosis system can be very helpful for radiologist in detecting and diagnosing microcalcifications’ patterns earlier and faster than typical screening programs. In this paper, we present a system based on fuzzy-C Means clustering and feature extraction techniques using texture based segmentation and genetic algorithm for detecting and diagnosing micro calcifications’ patterns in digital mammograms.We have investigated and analyzed a number of feature extraction techniques and found that a combination of three features, such as entropy, standard deviation, and number of pixels, is the best combination to distinguish a benign micro calcification pattern from one that is malignant. A fuzzy C Means technique in conjunction with three features was used to detect a micro calcification pattern and a neural network to classify it into benign/malignant. The system was developed on a Windows platform. It is an easy to use intelligent system that gives the user options to diagnose, detect, enlarge, zoom, and measure distances of areas in digital mammograms. The present study focused on the investigation of the application of artificial intelligence and data mining techniques to the prediction models of breast cancer. The artificial neural network, decision tree,Fuzzy C Means, and genetic algorithm were used for the comparative studies and the accuracy and positive predictive value of each algorithm were used as the evaluation indicators. 699 records acquired from the breast cancer patients at the MIAS database, 9 predictor variables, and 1 outcome variable were incorporated for the data analysis followed by the 10-fold cross-validation. The results revealed that the accuracies of Fuzzy C Means were 0.9534 (sensitivity 0.98716 and specificity 0.9582), the decision tree model 0.9634 (sensitivity 0.98615, specificity 0.9305), the neural network model 0.96502 (sensitivity 0.98628, specificity 0.9473), the genetic algorithm model 0.9878 (sensitivity 1, specificity 0.9802). The accuracy of the genetic algorithm was significantly higher than the average predicted accuracy of 0.9612. The predicted outcome of the Fuzzy C Means model was higher than that of the neural network model but no significant difference was observed. The average predicted accuracy of the decision tree model was 0.9635 which was the lowest of all 4 predictive models. The standard deviation of the 10-fold cross-validation was rather unreliable. The results showed that the genetic algorithm described in the present study was able to produce accurate results in the classification of breast cancer data and the classification rule identified was more acceptable and comprehensible. Keywords: Fuzzy C Means, Decision Tree Induction, Genetic algorithm, data mining, breast cancer, rule discovery.
Iganfis Data Mining Approach for Forecasting Cancer Threatsijsrd.com
This document proposes a new approach called IGANFIS that combines information gain (IG) and adaptive neuro fuzzy inference system (ANFIS) to classify cancer threats from medical data. IG is used to select the most important cancer features from the data to reduce dimensionality before being input to ANFIS. ANFIS then trains on the selected features to build a fuzzy inference system for cancer diagnosis and prediction. The approach is tested on breast cancer datasets and achieves higher classification accuracy compared to other methods.
PSO-SVM hybrid system for melanoma detection from histo-pathological imagesIJECEIAES
This paper introduces an automated system for skin cancer (melanoma) detection from Histo-pathological images sampled from microscopic slides of skin biopsy. The proposed system is a hybrid system based on Particle Swarm Optimization and Support Vector Machine (PSO-SVM). The features used are extracted from the grayscale image histogram, the co-occurrence matrix and the energy of the wavelet coefficients resulting from the wavelet packet decomposition. The PSO-SVM system selects the best feature set and the best values for the SVM parameters (C and γ) that optimize the performance of the SVM classifier. The system performance is tested on a real dataset obtained from the Southern Pathology Laboratory in Wollongong NSW, Australia. Evaluation results show a classification accuracy of 87.13%, a sensitivity of 94.1% and a specificity of 80.22%.The sensitivity and specificity results are comparable to those obtained by dermatologists.
This document discusses a proposed method for detecting breast cancer by identifying circulating tumor cells using association rule mining. The method involves identifying major genes associated with breast cancer by applying the principle of association rule mining to breast cancer datasets. Association rules are generated between genes to determine high priority genes that are responsible for causing cancer. The Baum-Welch algorithm is used to estimate probabilities and identify the high priority gene values associated with different cancer clusters and levels. Obtaining these high priority genes can help identify the major genes responsible for breast cancer.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
inAbstract— Pattern Recognition (PR) plays an important role in field of Bioinformatics. PR is concerned with processing raw measurement data by a computer to arrive at a prediction that can be used to formulate a decision to be taken. The important problem in which pattern recognition are applied have common that they are too complex to model explicitly. Diverse methods of this PR are used to analyze, segment and manage the high dimensional microarray gene data for classification. PR is concerned with the development of systems that learn to solve a given problem using a set of instances, each instances represented by a number of features. The microarray expression technologies are possible to monitor the expression levels of thousands of genes simultaneously. The microarrays generated large amount of data has stimulate the development of various computational methods to different biological processes by gene expression profiling. Microarray Gene Expression Profiling (MGEP) is important in Bioinformatics, it yield various high dimensional data used in various clinical applications like cancer diagnostics and drug designing. In this work a new schema has developed for classification of unknown malignant tumors into known class. According to this work an new classification scheme includes the transformation of very high dimensional microarray data into mahalanobis space before classification. The eligibility of the proposed classification scheme has proved to 10 commonly available cancer gene datasets, this contains both the binary and multiclass data sets. To improve the performance of the classification gene selection method is applied to the datasets as a preprocessing and data extraction step.
Intelligent algorithms for cell tracking and image segmentationijcsit
Sensitive and accurate cell tracking system is important to cell motility studies. Recently, researchers have
developed several methods for detecting and tracking the living cells. To improve the living cells tracking
systems performance and accuracy, we focused on developing a novel technique for image processing. The
algorithm we propose presents novel image segmentation and tracking system technique to incorporate the
advantages of both Topological Alignments and snakes for more accurate tracking approach. The results
demonstrate that the proposed algorithm achieves accurate tracking for detecting and analyzing the
mobility of the living cells. The RMSE between the manual and the computed displacement was less than
12% on average. Where the Active Contour method gave a velocity RMSE of less than 11%, improves to
less than 8% by using the novel Algorithm. We have achieved better tracking and detecting for the cells,
also the ability of the system to improve the low contrast, under and over segmentation which is the most
cell tracking challenge problems and responsible for lacking accuracy in cell tracking techniques.
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...Kiogyf
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND FIREFLY ALGORITHM
ABSTRACT
Cancer is a globally recognized cause of death. A proper cancer analysis demands the classification of several types of tumor. Investigations into microarray gene expressions seem to be a successful platform for revising genetic diseases. Although the standard machine learning (ML) approaches have been efficient in the realization of significant genes and in the classification of new types of cancer cases, their medical and logical application has faced several drawbacks such as DNA microarray data analysis limitation, which includes an incredible number of features and the relatively small size of an instance. To achieve a reasonable and efficient DNA microarray dataset information, there is a need to extend the level of interpretability and forecast approach while maintaining a great level of precision. In this work, a novel way of cancer classification based on based gene expression profiles is presented. This method is a combination of both Firefly algorithm and Mutual Information Method. First, the features are used to select the features before using the Firefly algorithm for feature reduction. Finally, the Support Vector Machine is used to classify cancer into types. The performance of the proposed system was evaluated by using it to classify datasets from colon cancer; the results of the evaluation were compared with some recent approaches.
Keywords: Feature Selection, Firefly Algorithm, Cancer Disease, Mutual Information
Feature Selection Approach based on Firefly Algorithm and Chi-square IJECEIAES
Dimensionality problem is a well-known challenging issue for most classifiers in which datasets have unbalanced number of samples and features. Features may contain unreliable data which may lead the classification process to produce undesirable results. Feature selection approach is considered a solution for this kind of problems. In this paperan enhanced firefly algorithm is proposed to serve as a feature selection solution for reducing dimensionality and picking the most informative features to be used in classification. The main purpose of the proposedmodel is to improve the classification accuracy through using the selected features produced from the model, thus classification errors will decrease. Modeling firefly in this research appears through simulating firefly position by cell chi-square value which is changed after every move, and simulating firefly intensity by calculating a set of different fitness functionsas a weight for each feature. Knearest neighbor and Discriminant analysis are used as classifiers to test the proposed firefly algorithm in selecting features. Experimental results showed that the proposed enhanced algorithmbased on firefly algorithm with chisquare and different fitness functions can provide better results than others. Results showed that reduction of dataset is useful for gaining higher accuracy in classification.
Brain tumor segmentation based on local independent projection based classifi...ieeepondy
This document presents a new method for brain tumor segmentation from MRI images using local independent projection-based classification (LIPC). LIPC treats tumor segmentation as a classification problem and introduces locality and independent projections into the classification model. It also considers the data distribution of different classes. The method is evaluated on 80 brain tumor MRI images and achieves Dice similarity scores of 0.84 for complete tumor segmentation, 0.685 for tumor core segmentation, and 0.585 for contrast-enhancing tumor segmentation, comparable to state-of-the-art methods.
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast Cancer is one of the crucial and burning diseases that has invaded women. Predicting breast cancer manually takes a lot of time and it is difficult for the physician to classification. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potentials of these techniques have found its scope in medical data. Classification techniques tend to simplify the prediction segment.
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
The most common type of cancer in women
worldwide is the Breast Cancer. Breast cancer may be
detected early using Mammograms, probably before it's
spread. Recurrent breast cancer could occur months or years
after initial treatment. The cancer could return within the
same place because the original cancer (local recurrence), or it
may spread to different areas of your body (distant
recurrence). Early stage treatment is done not only to cure
breast cancer however additionally facilitate in preventing its
repetition/recurrence. Data mining algorithms provide
assistance in predicting the early-stage breast cancer that
continually has been difficult analysis drawback. The
projected analysis can establish the most effective algorithm
that predicts the recurrence of the breast cancer and improve
the accuracy the algorithms. Large information like Clump,
Classification, Association Rules, Prediction and Neural
Networks, Decision Trees can be analyzed using data mining
applications and techniques.
A Comparative Study on the Methods Used for the Detection of Breast Cancerrahulmonikasharma
Among women in the world, the death caused by the Breast cancer has become the leading role. At an initial stage, the tumor in the breast is hard to detect. Manual attempt have proven to be time consuming and inefficient in many cases. Hence there is a need for efficient methods that diagnoses the cancerous cell without human involvement with high accuracy. Mammography is a special case of CT scan which adopts X-ray method with high resolution film. so that it can detect well the tumors in the breast. This paper describes the comparative study of the various data mining methods on the detection of the breast cancer by using image processing techniques.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IRJET- Detection of Breast Cancer using Machine Learning TechniquesIRJET Journal
The document discusses using machine learning techniques to detect breast cancer. It begins with an introduction to machine learning and data mining, and how they can be applied to medical diagnosis like cancer detection. The study uses different machine learning classifiers and the Wisconsin Breast Cancer Dataset to classify cancers as benign or malignant. The classifiers tested were J48 and LMT, and the best accuracy obtained was 97.36% without using feature filters on the data.
This document presents a new method for analyzing asymmetry in breast MR images to detect early signs of breast cancer. The method extracts six features from bilateral breast MR images, including directional statistics and texture measurements. Linear discriminant analysis with leave-one-out cross validation was used to classify images as cancer or non-cancer. The best classification results used combinations of two or three features, achieving average accuracies from 65.6% to 78.1%, sensitivities from 53.3% to 73.3%, and specificities from 60% to 94.1%. Preliminary results suggest this asymmetry analysis method shows potential for detecting breast cancer on MR images despite using a small number of features and a simple classifier.
Similar to Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based Texture Features and Fuzzy Logic based Hybrid Kernel Support Vector Machine Classifier
ABSTRACT
This paper presents an effective feature selection method that can be applied to build a computer aided diagnosis system for breast cancer in order to discriminate between healthy, benign and malignant parenchyma. Determining the optimal feature set from a large set of original features is an important preprocessing step which removes irrelevant and redundant features and thus improves computational efficiency, classification accuracy and also simplifies the classifier structure. A modified binary particle swarm optimized feature selection method (MBPSO)has been proposed where k-Nearest Neighbour algorithm with leave-one-out cross validation serves as the fitness function. Digital mammograms obtained from Regional Cancer Centre, Thiruvananthapuram and the mammograms from web accessible mini-MIAS database has been used as the dataset for this experiment. Region of interests from the mammograms are automatically detected and segmented. A total of 117 shape, texture and histogram features are extracted from the ROIs. Significant features are selected using the proposed feature selection method.Classification is performed using feed forward artificial neural networks with back propagation learning. Receiver operating characteristics (ROC) and confusion matrix are used to evaluate the performance. Experimental results show that the modified binary PSO feature selection method not only obtains better classification accuracy but also simplifies the classification process as compared to full set of features. The performance of the modified BPSO is found to be at par with other widely used feature selection techniques.
Exploring the performance of feature selection method using breast cancer dat...nooriasukmaningtyas
Breast cancer is the most common type of cancer occurring mostly in females. In recent years, many researchers have devoted to automate diagnosis of breast cancer by developing different machine learning model. However, the quality and quantity of feature in breast cancer diagnostic dataset have significant effect on the accuracy and efficiency of predictive model. Feature selection is effective method for reducing the dimensionality and improving the accuracy of predictive model. The use of feature selection is to determine feature required for training model and to remove irrelevant and duplicate feature. Duplicate feature is a feature that is highly correlated to another feature. The objective of this study is to conduct experimental research on three different feature selection methods for breast cancer prediction. Sequential, embedded and chi-square feature selection are implemented using breast cancer diagnostic dataset. The study compares the performance of sequential embedded and chi-square feature selection on test set. The experimental result evidently shows that sequential feature selection outperforms as compared to chi-square (X2) statistics and embedded feature selection. Overall, sequential feature selection achieves better accuracy of 98.3% as compared to chi-square (X2) statistics and embedded feature selection.
This document presents a genetic algorithm-based classification method for classifying different types of lung cancer in needle biopsy images. It first segments cell nuclei from biopsy images and extracts color, texture, and shape features from the nuclei. A dictionary learning approach is used to build discriminative subdictionaries for each feature type. In testing, features from an image are classified at the cell level and then fused at the image level via majority voting. The method achieves higher accuracy than using single features or existing classification methods, demonstrating its effectiveness in classifying lung cancer types in biopsy images.
BFO – AIS: A Framework for Medical Image Classification Using Soft Computing ...ijsc
Medical images provide diagnostic evidence/information about anatomical pathology. The growth in database is enormous as medical digital image equipment’s like Magnetic Resonance Images (MRI), Computed Tomography (CT), and Positron Emission Tomography CT (PET-CT) are part of clinical work. CT images distinguish various tissues according to gray levels to help medical diagnosis. Ct is more reliable for early tumours and haemorrhages detection as it provides anatomical information to plan radio therapy. Medical information systems goals are to deliver information to right persons at the right time and place to improve care process quality and efficiency. This paper proposes an Artificial Immune System (AIS) classifier and proposed feature selection based on hybrid Bacterial Foraging Optimization (BFO) with Local Search (LS) for medical image classification.
BFO – AIS: A FRAME WORK FOR MEDICAL IMAGE CLASSIFICATION USING SOFT COMPUTING...ijsc
Medical images provide diagnostic evidence/information about anatomical pathology. The growth in
database is enormous as medical digital image equipment’s like Magnetic Resonance Images (MRI),
Computed Tomography (CT), and Positron Emission Tomography CT (PET-CT) are part of clinical work.
CT images distinguish various tissues according to gray levels to help medical diagnosis. Ct is more
reliable for early tumours and haemorrhages detection as it provides anatomical information to plan radio
therapy. Medical information systems goals are to deliver information to right persons at the right time and
place to improve care process quality and efficiency. This paper proposes an Artificial Immune System
(AIS) classifier and proposed feature selection based on hybrid Bacterial Foraging Optimization (BFO)
with Local Search (LS) for medical image classification.
Iaetsd classification of lung tumour usingIaetsd Iaetsd
This document describes a study that aims to classify lung tumors using geometric and texture features extracted from chest x-ray images. The study uses 75 chest x-ray images (25 from small-cell lung cancer, 25 from non-small cell lung cancer, and 25 from tuberculosis) to extract geometric features like area, shape, and distance from texture features calculated using gray level co-occurrence matrices. Active shape models are used to segment the lung fields for feature extraction. The extracted features are then analyzed to determine the optimal features for classifying different types of lung abnormalities.
Classification AlgorithmBased Analysis of Breast Cancer DataIIRindia
The classification algorithms are very frequently used algorithms for analyzing various kinds of data available in different repositories which have real world applications. The main objective of this research work is to find the performance of classification algorithms in analyzing Breast Cancer data via analyzing the mammogram images based its characteristics.Different attribute values of cancer affected mammogram images are considered for analysis in this work. The Patients food habits, age of the patients, their life styles, occupation, their problem about the diseases and other information are taken into account for classification. Finally, performance of classification algorithms J48, CART and ADTree are given with its accuracy. The accuracy of taken algorithms is measured by various measures like specificity, sensitivity and kappa statistics (Errors).
https://jst.org.in/index.html
Our journal has dynamic landscape of academia and industry, the pursuit of knowledge extends across multiple domains, creating a tapestry where engineering, management, science, and mathematics converge. Welcome to our international journal, where we embark on a journey through the realms of cutting-edge technologies and innovative marketing strategies.
Detection of Cancer in Pap smear Cytological Images Using Bag of Texture Feat...IOSR Journals
This document presents a method for detecting cancer in Pap smear cytological images using bag of texture features. The method involves segmenting the nucleus region from the images, extracting texture features from blocks of the nucleus region, clustering the features to build a visual dictionary, and representing each image as a histogram of visual words present. The histograms are then used to retrieve similar images from a database using histogram intersection as the distance measure. Experiments were conducted with different block sizes and number of clusters, achieving up to 90% accuracy in identifying cancerous versus normal cells.
Breast cancer is the leading cause of death for women worldwide. Cancer can be discovered early, lowering the rate of death. Machine learning techniques are a hot field of research, and they have been shown to be helpful in cancer prediction and early detection. The primary purpose of this research is to identify which machine learning algorithms are the most successful in predicting and diagnosing breast cancer, according to five criteria: specificity, sensitivity, precision, accuracy, and F1 score. The project is finished in the Anaconda environment, which uses Python's NumPy and SciPy numerical and scientific libraries as well as matplotlib and Pandas. In this study, the Wisconsin diagnostic breast cancer dataset was used to evaluate eleven machine learning classifiers: decision tree, quadratic discriminant analysis, AdaBoost, Bagging meta estimator, Extra randomized trees, Gaussian process classifier, Ridge, Gaussian nave Bayes, k-Nearest neighbors, multilayer perceptron, and support vector classifier. During performance analysis, extremely randomized trees outperformed all other classifiers with an F1-score of 96.77% after data collection and data analysis.
Comparison of Feature selection methods for diagnosis of cervical cancer usin...IJERA Editor
Even though a great attention has been given on the cervical cancer diagnosis, it is a tuff task to observe the
pap smear slide through microscope. Image Processing and Machine learning techniques helps the pathologist
to take proper decision. In this paper, we presented the diagnosis method using cervical cell image which is
obtained by Pap smear test. Image segmentation performed by multi-thresholding method and texture and shape
features are extracted related to cervical cancer. Feature selection is achieved using Mutual Information(MI),
Sequential Forward Search (SFS), Sequential Floating Forward Search (SFFS) and Random Subset Feature
Selection(RSFS) methods
Comparison of Feature selection methods for diagnosis of cervical cancer usin...IJERA Editor
Even though a great attention has been given on the cervical cancer diagnosis, it is a tuff task to observe the
pap smear slide through microscope. Image Processing and Machine learning techniques helps the pathologist
to take proper decision. In this paper, we presented the diagnosis method using cervical cell image which is
obtained by Pap smear test. Image segmentation performed by multi-thresholding method and texture and shape
features are extracted related to cervical cancer. Feature selection is achieved using Mutual Information(MI),
Sequential Forward Search (SFS), Sequential Floating Forward Search (SFFS) and Random Subset Feature
Selection(RSFS) methods.
Cervical cancer diagnosis based on cytology pap smear image classification us...TELKOMNIKA JOURNAL
This document presents a study on classifying cytology images from pap smear tests to diagnose cervical cancer. The study uses discrete cosine transform (DCT) and Haar transform coefficients as features extracted from the images. Five different sized feature vectors are formed using fractional coefficients. Seven machine learning classifiers are tested on the feature vectors, with random forest classifier achieving the highest accuracy of 81.11%. The study aims to assist pathologists in cervical cancer diagnosis by providing an automated second opinion based on pap smear image analysis and classification.
A new model for large dataset dimensionality reduction based on teaching lear...TELKOMNIKA JOURNAL
One of the human diseases with a high rate of mortality each year is breast cancer (BC). Among all the forms of cancer, BC is the commonest cause of death among women globally. Some of the effective ways of data classification are data mining and classification methods. These methods are particularly efficient in the medical field due to the presence of irrelevant and redundant attributes in medical datasets. Such redundant attributes are not needed to obtain an accurate estimation of disease diagnosis. Teaching learning-based optimization (TLBO) is a new metaheuristic that has been successfully applied to several intractable optimization problems in recent years. This paper presents the use of a multi-objective TLBO algorithm for the selection of feature subsets in automatic BC diagnosis. For the classification task in this work, the logistic regression (LR) method was deployed. From the results, the projected method produced better BC dataset classification accuracy (classified into malignant and benign). This result showed that the projected TLBO is an efficient features optimization technique for sustaining data-based decision-making systems.
Computer Aided System for Detection and Classification of Breast CancerIJITCA Journal
Breast cancer is one of the most important causes of death among all type of cancers for grown-up and
older women, mainly in developed countries, and its rate is rising. Since the cause of this disease is not yet
known, early detection is the best way to decrease the breast cancer mortality. At present, early detection of
breast cancer is attained by means of mammography. An intelligent computer-aided diagnosis system can
be very helpful for radiologist in detecting and diagnosing cancerous cell patterns earlier and faster than
typical screening programs. This paper proposes a computer aided system for automatic detection and
classification of breast cancer in mammogram images. Intuitionistic Fuzzy C-Means clustering technique
has been used to identify the suspicious region or the Region of Interest automatically. Then, the feature
data base is designed using histogram features, Gray Level Concurrence wavelet features and wavelet
energy features. Finally, the feature database is submitted to self-adaptive resource allocation network
classifier for classification of mammogram image as normal, benign or malignant. The proposed system is
verified with 322 mammograms from the Mammographic Image Analysis Society Database. The results
show that the proposed system produces better results.
Predictive modeling for breast cancer based on machine learning algorithms an...IJECEIAES
Breast cancer is one of the leading causes of death among women worldwide. However, early prediction of breast cancer plays a crucial role. Therefore, strong needs exist for automatic accurate early prediction of breast cancer. In this paper, machine learning (ML) classifiers combined with features selection methods are used to build an intelligent tool for breast cancer prediction. The Wisconsin diagnostic breast cancer (WDBC) dataset is used to train and test the model. Classification algorithms, including support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), logistic regression (LR), k-nearest neighbors (k-NN), and naïve Bayes, were employed. Performance measures for each of them were obtained, namely: accuracy, precision, recall, F-score, Kappa, Matthews correlation coefficient (MCC), and time. The results indicate that without feature selection, LightGBM achieves the highest accuracy at 95%. With minimum redundancy maximum relevance (mRMR) feature selection (15 features), LightGBM outperforms other classifiers, achieving an accuracy of 98%. For Pearson correlation coefficient feature selection (15 features), LightGBM also excels with a 95% accuracy rate. Lasso feature selection (5 features) produces varied results across classifiers, with logistic regression achieving the highest accuracy at 96%. These findings underscore the importance of feature selection in refining model performance and in improving detection for breast cancer.
This document presents a novel fuzzy k-nearest neighbor equality (FK-NNE) algorithm for classifying masses in mammograms as benign or malignant. The algorithm assigns membership values to different classes based on distances to k-nearest neighbors. It achieved 94.46% sensitivity, 96.81% specificity, and 96.52% accuracy, outperforming k-nearest neighbors, fuzzy k-nearest neighbors, and k-nearest neighbor equality algorithms. The algorithm considers relative importance of neighbors and assigns partial membership to classes, addressing issues with insufficient knowledge faced by other techniques. Experimental results demonstrated FK-NNE had the best performance with an area under the ROC curve of 0.9734, indicating high diagnostic accuracy.
An Ensemble of Filters and Wrappers for Microarray Data Classification mlaij
The development of microarray technology has suppli
ed a large volume of data to many fields. The gene
microarray analysis and classification have demonst
rated an effective way for the effective diagnosis
of
diseases and cancers. In as much as the data achiev
ing from microarray technology is very noisy and al
so
has thousands of features, feature selection plays
an important role in removing irrelevant and redund
ant
features and also reducing computational complexity
. There are two important approaches for gene
selection in microarray data analysis, the filters
and the wrappers. To select a concise subset of inf
ormative
genes, we introduce a hybrid feature selection whic
h combines two approaches. The fact of the matter i
s
that candidate’s features are first selected from t
he original set via several effective filters. The
candidate
feature set is further refined by more accurate wra
ppers. Thus, we can take advantage of both the filt
ers
and wrappers. Experimental results based on 11 micr
oarray datasets show that our mechanism can be
effected with a smaller feature set. Moreover, thes
e feature subsets can be obtained in a reasonable t
ime
An Ensemble of Filters and Wrappers for Microarray Data Classification mlaij
The development of microarray technology has supplied a large volume of data to many fields. The gene microarray analysis and classification have demonstrated an effective way for the effective diagnosis of diseases and cancers. In as much as the data achieving from microarray technology is very noisy and also has thousands of features, feature selection plays an important role in removing irrelevant and redundant features and also reducing computational complexity. There are two important approaches for gene selection in microarray data analysis, the filters and the wrappers. To select a concise subset of informative genes, we introduce a hybrid feature selection which combines two approaches. The fact of the matter is that candidate’s features are first selected from the original set via several effective filters. The candidate feature set is further refined by more accurate wrappers. Thus, we can take advantage of both the filters and wrappers. Experimental results based on 11 microarray datasets show that our mechanism can be effected with a smaller feature set. Moreover, these feature subsets can be obtained in a reasonable time.
Controlling informative features for improved accuracy and faster predictions...Damian R. Mingle, MBA
Identification of suitable biomarkers for accurate prediction of phenotypic outcomes is a goal for personalized medicine. However, current machine learning approaches are either too complex or perform poorly.
For more information:
http://societyofdatascientists.com/controlling-informative-features-for-improved-accuracy-and-faster-predictions-in-omentum-cancer-models/?src=slideshare
Similar to Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based Texture Features and Fuzzy Logic based Hybrid Kernel Support Vector Machine Classifier (20)
Land Cover maps supply information about the physical material at the surface of the Earth (i.e. grass, trees, bare ground, asphalt, water, etc.). Usually they are 2D representations so to present variability of land covers about latitude and longitude or other type of earth coordinates. Possibility to link this variability to the terrain elevation is very useful because it permits to investigate probable correlations between the type of physical material at the surface and the relief. This paper is aimed to describe the approach to be followed to obtain 3D visualizations of land cover maps in GIS (Geographic Information System) environment. Particularly Corine Land Cover vector files concerning Campania Region (Italy) are considered: transformed raster files are overlapped to DEM (Digital Elevation Model) with adequate resolution and 3D visualizations of them are obtained using GIS tool. The resulting models are discussed in terms of their possible use to support scientific studies on Campania Land Cover.
Comparison between Cisco ACI and VMWARE NSXIOSRjournaljce
Software-Defined Networking(SDN) allows you to have a logical image of the components in the data center, also you could arrange the components logically and use them according to the software application needs. This paper gives an overview about the architectural features of Cisco’s Application Centric Infrastructure (ACI) and Vmware’s NSX and also compares both the architectures and their benefit
Student’s Skills Evaluation Techniques using Data Mining.IOSRjournaljce
This document discusses techniques for evaluating students' programming skills using data mining. It proposes using association rule mining to analyze student data and accurately predict their programming abilities. The key steps involve collecting student performance data, preprocessing the data, extracting features, clustering students based on skills, generating rules for skill evaluation, and predicting student skill types based on the rules. The overall goal is to help educational institutions and companies better evaluate programming skills to improve student training and increase placement opportunities.
Data mining is a process to extract information from a huge amount of data and transform it into an
understandable structure. Data mining provides the number of tasks to extract data from large databases such
as Classification, Clustering, Regression, Association rule mining. This paper provides the concept of
Classification. Classification is an important data mining technique based on machine learning which is used to
classify the each item on the bases of features of the item with respect to the predefined set of classes or groups.
This paper summarises various techniques that are implemented for the classification such as k-NN, Decision
Tree, Naïve Bayes, SVM, ANN and RF. The techniques are analyzed and compared on the basis of their
advantages and disadvantages
Analyzing the Difference of Cluster, Grid, Utility & Cloud ComputingIOSRjournaljce
: Virtualization and cloud computing is creating a fundamental change in computer architecture,
software and tools development, in the way we store, distribute and consume information. In the recent era of
autonomic computing it comes the importance and need of basic concepts of having and sharing various
hardware and software and other resources & applications that can manage themself with high level of human
guidance. Virtualization or Autonomic computing is not a new to the world, but it developed rapidly with Cloud
computing. In this paper there give an overview of various types of computing. There will be discussion on
Cluster, Grid computing, Utility & Cloud Computing. Analysis architecture, differences between them,
characteristics , its working, advantages and disadvantages
: Cloud processing is turning into an inexorably mainstream endeavor demonstrate in which figuring assets are made accessible on-request to the client as required. The one of a kind incentivized offer of distributed computing makes new chances to adjust IT and business objectives. Distributed computing utilizes the web advancements for conveyance of IT-Enabled abilities 'as an administration' to any required clients i.e. through distributed computing we can get to anything that we need from anyplace to any PC without agonizing over anything like about their stockpiling, cost, administration etc. In this paper, I give a far-reaching study on the inspiration variables of receiving distributed computing, audit the few cloud sending and administration models. It additionally investigates certain advantages of distributed computing over customary IT benefit environment-including versatility, adaptability, decreased capital and higher asset usage are considered as appropriation explanations behind distributed computing environment. I additionally incorporate security, protection, and web reliance and accessibility as shirking issues. The later incorporates vertical versatility as specialized test in cloud environment.
An Experimental Study of Diabetes Disease Prediction System Using Classificat...IOSRjournaljce
Data mining means to the process of collecting, searching through, and analyzing a large amount of data in a database. Classification in one of the well-known data mining techniques for analyzing the performance of Naive Bayes, Random Forest, and Naïve Bayes tree (NB-Tree) classifier during the classification to improve precision, recall, f-measure, and accuracy. These three algorithms, of Naive Bayes, Random Forest, and NB-Tree are useful and efficient, has been tested in the medical dataset for diabetes disease and solving classification problem in data mining. In this paper, we compare the three different algorithms, and results indicate the Naive Bayes algorithms are able to achieve high accuracy rate along with minimum error rate when compared to other algorithms.
Candidate Ranking and Evaluation System based on Digital FootprintsIOSRjournaljce
Digital resume provides insights about a candidate to the organization. This paper proposes a system where digital resumes of candidates are generated by extracting data from social networking sites like Facebook, Twitter and LinkedIn. Data which is relevant to recruitment is obtained from unstructured data using Data Mining algorithms. Candidates are evaluated based on their digital resumes and ranked accordingly. Ranking is done based on the requirements specified by an organization for a key position. The key aspects of this paper are a) Specification and design of system. b) Generation of digital Resume. c) Ranking of candidates. According to the ranking provided by this system, Recruiters can shortlist candidates for interviews. Thus, it revolutionizes the traditional recruitment process.
The Systematic Methodology for Accurate Test Packet Generation and Fault Loca...IOSRjournaljce
As we probably aware now a days networks are broadly dispersed so administrators relies on upon different devices, for example, pings and follow course to troubleshoot the issue in network. So we proposed a robotized and orderly approach for testing and troubleshooting network called "Automatic Test Packet Generation"(ATPG). ATPG first peruses switch arrangement and produces a gadget free model. The model is utilized to produce least number of test packets to cover each connection in a network and every control in network. ATPG is equipped for researching both useful and execution issues. Test packets are sent at customary interims and separate strategy is utilized to confine flaws. The working of few disconnected devices which automatically create test packets are additionally given, yet ATPG goes past the prior work in static (checking liveness and fault localization).
The Use of K-NN and Bees Algorithm for Big Data Intrusion Detection SystemIOSRjournaljce
Big data problem in intrusion detection system is mainly due to the large volume of the data. The dimension of the original data is 41. Some of the feature of original data are unnecessary. In this process, the volume of data has expanded into hundreds and thousands of gigabytes(GB) of information. The dimension span of data and volume can be reduced and the system is enhanced by using K-NN and BA. The reduction ratio of KDD datasets and processing speed is very slow so the data has been reduced for extracting features by Bees Algorithm (AB) and use K-nearest neighbors as classification (KNN). So, the KDD99 datasets applied in the experiments with significant features. The results have gave higher detection and accuracy rate as well as reduced false positive rate. Keywords: Big Data; Intru
Study and analysis of E-Governance Information Security (InfoSec) in Indian C...IOSRjournaljce
The purpose of the study is to explore and find a research gap in E-Governance Information Security (InfoSec) domain in Indian Context. The study identifies the research gap in E-Governance InfoSec domain and substantiates given research gap with relevant literature review. The study outcomes clearly depict the requirement of research in the field of InfoSec in e-governance domain in a country like India.
This document analyzes the security vulnerabilities of 26 e-governance websites and web applications in Gujarat, India. It finds that the majority (61.88%) used Microsoft technologies while 38.13% used Apache. Both technologies were most vulnerable to low and medium severity issues. Microsoft sites saw more validation vulnerabilities while Apache sites saw more configuration and informational vulnerabilities. High severity flaws were usually configuration-based, medium flaws validation-based, and low/informational flaws informational-based. The study aims to improve e-governance security by analyzing technology risks and vulnerability types.
Exploring 3D-Virtual Learning Environments with Adaptive RepetitionsIOSRjournaljce
: In spatial tasks, the use of cognitive aids reduce mental load and therefore being appealing to trainers and trainees. However, these aids can act as shortcuts and prevents the trainees from active exploration which is necessary to perform the task independently in non-supervised environment. In this paper we used adaptive repetition as control strategy to explore the 3D- Virtual Learning environments. The proposed approach enables the trainee to get the benefits of cognitive support while at the same time he is actively involved in the learning process. Experimental results show the effectiveness of the proposed approach
Human Face Detection Systemin ANew AlgorithmIOSRjournaljce
Trying to detecting a face from any photo is big problem and got these days a focusing because of it importance, in face recognition system,face detection in one of the basic components. A lot of troubles are there to be solved in order to create a successful face detection algorithm. The skin of face has its properties in color domain and also a texture which may help in the algorithm for detecting faces because of its ability to find skins from photo. Here we are going to create a new algorithm for human face detection depending on skin color tone specially YCbCr color tone as an approach to slice the photo into parts. In addition, Gray level has been used to detect the area which contains a skin, after that anotherlevel used to erase the area that does not contain skin. The system which proposed applied on many photos and passed with great accuracy of detecting faces and it has a good efficient especially to separate the area that does not contain skin or face from the area which contain face and skin. It has been agreed and approved that the accuracy of the proposed system is 98% in human face detection.
Value Based Decision Control: Preferences Portfolio Allocation, Winer and Col...IOSRjournaljce
The paper presents an innovative approach to mathematical modeling of complex systems „humandynamical process”. The approach is based on the theory of measurement and utility theory and permits inclusion of human preferences in the objective function. The objective utility function is constructed by recurrent stochastic procedure which represents machine learning based on the human preferences. The approach is demonstrated by two case studies, portfolio allocation with Wiener process and portfolio allocation in the case of financial process with colored noise. The presented formulations could serve as foundation of development of decision support tools for design of management/control. This value-oriented modeling leads to the development of preferences-based decision support in machine learning environment and control/management value based design.
Assessment of the Approaches Used in Indigenous Software Products Development...IOSRjournaljce
This document summarizes a study that assessed approaches used in indigenous software product development in Nigeria. It involved surveying software development firms and educational institutions. The commonly used approaches identified were spiral, agile, prototyping, object-oriented, rotational unified process, incremental, waterfall, and integrated. Spiral, agile and prototyping approaches had the highest ratings. The study found significant differences in how these approaches were used and their impact on domestic software use and industry growth in Nigeria.
Secure Data Sharing and Search in Cloud Based Data Using Authoritywise Dynami...IOSRjournaljce
The Data sharing is an important functionality in cloud storage. We describe new public key crypto systems which produce constant-size cipher texts such that efficient delegation of decryption rights for any set of cipher texts are possible. The novelty is that one can aggregate any set of secret keys and make them as compact as a single key, but encompassing the power of all the keys being aggregated. Ensuring the security of cloud computing is second major factor and dealing with because of service availability failure the single cloud providers demonstrated less famous failure and possibility malicious insiders in the single cloud. A movement towards Multi-Clouds, In other words ”Inter-Clouds” or ”Cloud-Of-Clouds” as emerged recently. This works aim to reduce security risk and better flexibility and efficiency to the user. Multi-cloud environment has ability to reduce the security risks as well as it can ensure the security and reliability.
Panorama Technique for 3D Animation movie, Design and EvaluatingIOSRjournaljce
This paper presents an applied approach for Panorama 3D movies enhanced with visual sound effects. The case study that considered is IIPS@UOIT. Many selected S/W have been used to introduce the 3D Movie. 3D Animation is a modern technology in the field of the world of filmmaking and is considered the core of multimedia, where the vast majority of movies such as Hollywood movies that we see today, it was using 3D technology. Where this technique is used in all the magazines, such as medical experiments, engineering, astronomy, planets and stars, to prove scientific theories, history, geography, etc., where they are building models or scenes or characters simulates reality and the movement of the viewer to the heart of the event. A three-dimensional film was made to (IIPS @ UOITC) to give it a future vision and published in the global sites such as YouTube, Facebook and Google earth. By using many specialized 3d software and cinematic tricks, with a focusing on movement, characters, lighting, cameras and final render.
Density Driven Image Coding for Tumor Detection in mri ImageIOSRjournaljce
The significant of multi spectral band resolution is explored towards selection of feature coefficients based on its energy density. Toward the feature representiaon in transformed domain, multi wavelet transformations were used for finer spectral representation. However, due to a large feature count these features are not optimal under low resource computing system. In the recognition units, running with low resources a new coding approach of feature selection, considering the band spectral density is developed. The effective selection of feature element, based on its spectral density achieve two objective of pattern recognition, the feature coefficient representiaon is minimized, hence leading to lower resource requirement, and dominant feature representation, resulting in higher retrieval performance.
Analysis of the Waveform of the Acoustic Emission Signal via Analogue Modulat...IOSRjournaljce
The acoustic emission (AE) technique is a non-destructive testing technique applied to pressurized rigid pipelines in order to identify metallurgical discontinuities. This study analyses the dynamic behavior of the propagation of discontinuities in cracks via AE, in the following propagation classes: no propagation (NP), stable propagation (SP), and unstable propagation (UP). The methodology involves applying the concept of modulation of analogue signals, which are used in signal transmission in telecommunications, for the development of a neural network in order to determine new parameters for the AE waveform. The classification of AE signals into propagation classes occurs, therefore, through the extraction of information related to the dynamics of AE signals, by means of the parameters of the analogue carriers of the modulations (in amplitude and in angle) that make up the AE signal. This set of parameters enables an efficient classification (average of 90%) through the identification of patterns from the AE signals in the classes for monitoring the state of the discontinuity, through the use of computational intelligence techniques (artificial neural networks and nonlinear classification).
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
ACEP Magazine edition 4th launched on 05.06.2024Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...University of Maribor
Slides from talk presenting:
Aleš Zamuda: Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapter and Networking.
Presentation at IcETRAN 2024 session:
"Inter-Society Networking Panel GRSS/MTT-S/CIS
Panel Session: Promoting Connection and Cooperation"
IEEE Slovenia GRSS
IEEE Serbia and Montenegro MTT-S
IEEE Slovenia CIS
11TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONIC AND COMPUTING ENGINEERING
3-6 June 2024, Niš, Serbia
A review on techniques and modelling methodologies used for checking electrom...nooriasukmaningtyas
The proper function of the integrated circuit (IC) in an inhibiting electromagnetic environment has always been a serious concern throughout the decades of revolution in the world of electronics, from disjunct devices to today’s integrated circuit technology, where billions of transistors are combined on a single chip. The automotive industry and smart vehicles in particular, are confronting design issues such as being prone to electromagnetic interference (EMI). Electronic control devices calculate incorrect outputs because of EMI and sensors give misleading values which can prove fatal in case of automotives. In this paper, the authors have non exhaustively tried to review research work concerned with the investigation of EMI in ICs and prediction of this EMI using various modelling methodologies and measurement setups.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based Texture Features and Fuzzy Logic based Hybrid Kernel Support Vector Machine Classifier
1. IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661,p-ISSN: 2278-8727, Volume 19, Issue 1, Ver. IV (Jan.-Feb. 2017), PP 23-34
www.iosrjournals.org
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 23 | Page
Multi Class Cervical Cancer Classification by using ERSTCM,
EMSD & CFE methods based Texture Features and Fuzzy Logic
based Hybrid Kernel Support Vector Machine Classifier
S.Athinarayanan1
, Dr.M.V.Srinath2
, R.Kavitha3
1
(Research Scholar, Mononmaniam Sundaranar University, India)
2
(Director, Department of MCA, STET Women’s College, Sundarakkottai, India)
3
(Assistant Professor, Department of BCA, The MDT Hindu College, Pettai, India)
Abstract: Cervical cancer is the highest rate of incidence after breast cancer, gastric cancer, colorectal
cancer, thyroid cancer among all malignant that occurs to females ; also it is the most prevalent cancer among
female genital cancers. Manual cervical cancer diagnosis methods are costly and sometimes result inaccurate
diagnosis caused by human error but machine assisted classification system can reduce financial costs and increase
screening accuracy. In this research article, we have developed multi class cervical classification system by
using Pap Smear Images according to the WHO descriptive Classification of Cervical Histology. Then, this
system classifies the cell of the Pap Smear image into anyone of five types of the classes of normal cell, mild
dysplasia, moderate dysplasia, severe dysplasia and carcinoma in situ (CIS) by using individual and Combining
individual feature extraction method with the classification technique. In this paper three Feature Extraction
methods were used: From that three, two were individual feature extraction method namely Enriched Rough Set
Texton Co-Occurrence Matrix (ERSTCM) and Enriched Micro Structure Descriptor (EMSD) and the remained
one was combining individual feature extraction method namely concatenated feature extraction method (CFE).
The CFE method represents all the individual feature extraction methods of ERSTCM & EMSD features are
combining together to one feature to assess their joint performance. Then these three feature extraction methods
are tested over Fuzzy Logic based Hybrid Kernel Support Vector Machine (FL-HKSVM) Classifier. This
Examination was conducted over a set of single cervical cell based pap smear images. The dataset contains five
classes of images, with a total of 952 images. The distribution of number of images per class is not uniform.
Then the performance was evaluated in both the individual and combining individual feature extraction method
with the classification techniques by using the statistical parameters of sensitivity, specificity & accuracy.
Hence the resultant values of the statistical parameters described in individual feature extraction method with
the classification technique, proposed EMSD+FLHKSVM Classifier had given the better results than the other
ERSTCM+FLHKSVM Classifier and combining individual feature extraction method with the classification
technique described, proposed CFE+FLHKSVM Classifier had given the better results than other
EMSD+FLHKSVM & ERSTCM+FLHKSVM classifiers.
Keywords: Cervical Cancer, Pap Smear Images, Feature Extraction, Classification.
I. Introduction
Cervical cancer is a second most common cancers affecting women in worldwide [1]. It can be cured,
if it is detected at early stages and also identify the stage for providing the proper treatment. In the meanwhile,
causes of the death rate of this cancer is high. It is reported that yearly 1 lakh new cases were diagnosed among
these 75 thousand were deaths in India, which is nearly one-third of the global cancer deaths [2]. Pap test is a
effective screening test of cervical cancer, which is said to be the gold standard forever. Due to the subjective
discrepancy of different cytologists, this test results show more of irregularities [3]. The test output shows more
of false positive and false negative results, which make the trustworthiness of the screening process is a question
mark [4]. Also the manual screening process, hundreds of images are analyzed daily; the classification of cells
become difficult and the chance of human errors become high.Many automatic and semiautomatic methods have
been proposed in several times to detect various stages of cervical cancer. Many of these methods were not
supported in achieving the objectives of providing measured variables which could eliminate the interpretation
errors and inter observer discrepancy [5]. Pap smear images are rich in various features like color, shape, and
texture. The process of accurate extraction of unique visual features from these images would very well help in
developing an automated screening device. This can be achieved only through texture feature than others since
all the cellular changes are observed only using these features. Since the texture parameters are the simple
mathematical representations like smooth, rough, and grainy, then the analysis becomes easier [6]. Analyzing all
the above issues, two important challenges are considered. First, selection of unique texture features suitable for
classification. Second, selection of the most efficient and scalable classifier improves the accuracy more.
2. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 24 | Page
II. Related Works
Plissiti et al. [7] have developed the fully automated method to detect the nucleus in Pap smear images.
Using morphological analysis, the nuclei centroids are detected by applying distance dependent rule and
classification algorithms on resulted centroids, the undesirable artifacts were removed from the cell. By
considering nucleus as the most informative region of the cell, Sobrevilla et al. [8] have been presented an
algorithm for automatic nuclei detection of cytology cell. This algorithm combines color, cyto pathologist‟s
knowledge, and fuzzy systems which show high performance and more computational speed.
Bergmeir et al. [9] have developed an algorithm used to detect cell nuclei and cytoplasm. This
algorithm used the combination of voting scheme and prior knowledge to locate the cell nuclei and elastic
segmentation to determine the shape of nucleus. The noise is removed with mean-shift and median filters, and
edges were extracted with canny edge detection algorithm. The automatic classification of Pap smear images
focuses on marking of single cells into any one of binary classes (normal and abnormal) or multiple classes
(based on severity). Multiple classification of Pap smear images became popular when CIN based classification
on cervical cytology images was proposed.
Holmquist et al. [10] developed a binary classification method to distinguish between normal and
abnormal cells. Marinakis et al. [11] proposed a metaheuristic approach to classify the Pap smear cells.
Uniquely described twenty features are extracted from each cell image and classified as normal and abnormal
type. Chen et al. [12] have developed an algorithm for segmenting nucleus and cytoplasm counters. This system
classifies the Pap smear cells into anyone of four different types of classes using SVM. Two experiments were
conducted to validate the classification performance which showed the best performance outputs.
III. multi class cervical cancer classification system
Current manual screening methods are costly and sometimes result in inaccurate diagnosis caused by
human error. The introduction of machine assisted screening will bring significant benefits to the community,
which can reduce financial costs and increase screening accuracy. In this research article, we have developed,
Multi class cervical cancer classification system based on individual and combining individual texture features
and Fuzzy Logic based hybrid kernel support vector machine using pap smear images. Two major contribution
of the proposed system is feature extraction and feature classification. The architecture of the proposed
classification system is depicted in Figure 1. The system involves the following steps: preparation of Pap smear
images and pre-processing, segmentation of nucleus, extraction of texture features, and classification.
Figure.1: Architecture of the Proposed System
3.1 Feature Extraction
The purpose of feature extraction is to reduce the original data set by measuring certain properties or
features, that distinguish one 5input pattern from another pattern. The extracted feature is expected to provide
the characteristics of the input type to the classifier by considering the description of the relevant properties of
the image into a feature space. In this paper, we proposed three novel feature extraction methods. From that
three, two were individual feature extraction methods, they are Enriched Rough Set Texton Co-Occurrence
Matrix (ERSTCM) & Enriched Micro Structure Descriptor (EMSD) and the remained one was combining
3. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 25 | Page
individual feature extraction features method named as Concatenated Feature Extraction (CFE). The CFE
method represents all the individual feature extraction methods of ERSTCM & EMSD features are combining
together to one feature to assess their joint performance. The types of individual and combining individual
feature extraction features method is given below.
Individual Feature Extraction Methods:
Computation of Feature Vector F(V1) using ERSTCM.
Computation of Feature Vector F(V2) using EMSD.
Combining Individual Feature Extraction Features Method:
Computation of Feature Vector F(V3) using CFE.
The detailed Process of the above feature extraction method is given below.
3.1.1.Individual Feature Extraction Methods:
3.1.1.1.Computation of Feature Vector F(V1) using ERSTCM:
As per Julesz [13] description a texton is a pattern which is shared by an image as a common property
all over the image. Textures will be formed only if the adjacent elements lie within the neighbourhood. Texton
Image has the discrimination power of color, texture and shape features. Based on the Julez‟s [13] textons
theory the texton co-occurrence matrices (TCM) algorithm, can describe the spatial correlation of textons for
image retrieval. With the use of limited number of selected pixels the algorithm computes different features.
Extraction of effective texture information will be increased by calculating texton using Rough texture. This
method is represented Enriched Rough Set Texton Co-occurrence Matrices. In this Method. As a texture
descriptor, good retrieval accuracy can be achieved especially for directional textures. The texton can be
inclined with the use of critical distances between texture elements. Texture element size determines the critical
distance. Texture can be resolved into minute units, like orientation, Texton classes of colors, elongated blobs of
specific width, aspect ratios and terminators of elongated blobs. If texture elements are expanded to a large
extent in one orientation discrimination reduces. If the elongated elements are not jittered in orientation texture
gradients increase at boundaries. Hence by using a sub image of size 3 x 3 a texton gradient can be obtained.
Here 20 textons of 3 x 3 grids are proposed as shown in Figure 2. Even the co-occurrence probability of same
valued pixels in 3 x 3 grids is smaller than that of 2 x2 grid, but the textons developed using 2 x 2 grid may not
give complete information regarding direction. The computational complexity for using the overlapped
components of 20 textons is also less to obtain final texton image. The 20 textons of 3 X 3 grid can detect
textons in all directions and also the corners of the textures. If three pixels are highlighted and have same value
then, grid will form a texton as shown in Figure 3.
Texton Detection
The Working Mechanism of Texton detection is illustrated in figure.4. In the final segmented nucleus
image, we move 3 x 3 block from left to right and top to bottom throughout the image to detect textons with 3
pixel as step length. If a texton is detected, the same value of the three pixels in the 3 x 3 grids are kept
unchanged and the remained pixels are zero value. Finally we will obtain a Texton image, denoted by T(x,y).
After the formation of final texton image, the feature vector F(V1) (Seven features such as contrast, inverse
difference moment, correlation, variance, cluster shade, cluster prominence and Homogeneity) is extracted from
it.
4. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 26 | Page
Figure.2 : 20 types of 3 x 3 Textons
Figure.3 : Texton Detection Process
3.1.1.2.Computation of Feature Vector F(V2) using EMSD:
Julesz‟s texton theory mainly focuses on analyzing regular textures, while the micro-structures can be
considered as the extension of Julesz‟s textons or the color version of textons. Since micro-structures involve
color, texture and shape information, they can better present image features for image retrieval. In our method in
order to find the micro-structures, we partition the original image into 3 x 3 block. Then We move the 3x3 block
from left-to-right and top-to-bottom throughout the image to detect micro-structures. In the 3x3 block, if one of
the eight nearest neighbors has the same value as the center pixel, then it is kept unchanged; otherwise it is set to
empty. If all the eight nearest neighboring pixels are empty, then the 3x3 block is not considered as a micro-
structure and all pixels in the 3x3 block will be set to empty. The pattern resulted from this operation is called a
fundamental micro-structure. Figure. 4(a-d) shows an example of the micro-structure detection process.
Figure.4: An example of micro-structure detection: (a) a 3x3 grid of edge orientation map; (b)–(c) show the
micro-structure detection process; and (d) shows the detected fundamental micro-structure.
5. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 27 | Page
The micro structure map extraction process is a four-step process as described below and is shown in Fig. 5a.
i. Divide the original image f (x, y) into 3 × 3 blocks.
ii. Move the 3×3 block horizontally and vertically from left to right and top to bottom throughout the original
image f (x, y) with a step length of three pixels from the origin (0, 0). Then, generate the Micro structure map
M1(x, y), where 0 ≤ x ≤ W −1, 0 ≤ y ≤ N −1.
iii. Repeatedly doing the step (ii) from the origin (0, 1), (1, 0) and (1, 1) and generate the micro structure maps
M2(x, y), where 0 ≤ x ≤ W − 1, 1 ≤ y ≤ N − 1, M3(x, y), where 1 ≤ x ≤ W − 1, 0 ≤ y ≤ N − 1 and M4(x, y), where
1 ≤ x ≤ W − 1, 1 ≤ y ≤ N − 1, respectively.
iv. Generate the final micro structure image M (x, y) using the fusion of micro structure map as per Eq.1.
M(x, y) = Max {M1(x, y), M2(x, y), M3(x, y), M4(x, y)} ----(1).
The formation of final micro structure map M (x, y) using fusion of micro structure map M1(x, y), M2 (x, y), M3
(x, y) and M4 (x, y) is shown in Fig. 5b.
After the final micro-structure map M(x,y) is extracted from the original image f(x,y), The micro structure map
is applied to the same original image. The pixels that do not match the map are set to empty. Based on this
process, we get the final micro structure image. This process was shown in Fig. 5c. After the formation of final
texton image, the feature vector F(V2) (Seven features such as contrast, inverse difference moment, correlation,
variance, cluster shade, cluster prominence and Homogeneity) is extracted from it.
Figure.5: Micro Structure Image Extraction Process: a) Micro structure map M1(x, y) extraction process,
b) formation of final Micro structure map M(x, y) using fusion of texton structure map M1(x, y), M2(x, y),
M3(x, y) and M4(x, y). c) Micro structure image extraction process
Difference between ERSTCM and EEETCM:
The formation of Micro Structure Map based process of EMSD contains rich information than 20 types
of Texton based ERSTCM. This was shown in Figure 6. Then the Computational Complexity of EMSD is also
less to obtain final image than ERSTCM, Because ERSTCM contains 20 textons, so each 3 x 3 block of the
image it needs to apply that 20 textons for getting the final image, but in EMSD 3 x 3 block is similar, but the
process applying direction is less when compared to ERSTCM.
6. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 28 | Page
Figure.6: The Result of Texton detection a) EMSD b) ERSTCM c) Difference between EMSD & ERSTCM.
3.1.2.Combining Individual Feature Extraction Features Method:
3.1.2.1 Computation of Feature Vector F(V3) using CFE:
The name CFE Stands for Concatenated Feature Extraction. The CFE method represents all the
individual feature extraction methods of ERSTCM & EMSD features are Combining together to one feature to
assess their joint performance. Hence, total Feature vector uses F(V3) = F(V1) + F(V2) dimensional vector as
the concluding image features in cervical cell classification.
Table.1. Summary of Texture Features
The three feature extraction methods and its features details are given in the following Table 1.
Feature category Features Number of features
ERSTCM Contrast, inverse difference moment, correlation, variance,
cluster shade, cluster prominence and Homogeneity
7 features
EMSD Contrast, Inverse difference moment, correlation, variance,
cluster shade, cluster prominence and Homogeneity
7 features
CFE The features of ERSTCM & EMSD 14 features
Table.1: Features Summary Table.
3.2.Nucleus feature Classification:
The individual feature extraction methods of ERSTCM & EMSD & Combining individual feature
extraction features method (CFE) features are tested over Fuzzy Logic based Hybrid Kernel Support Vector
Machine (FL-HKSVM) classifier. The result of this classifier represent classifies the cervical cell of the pap
smear image into any one of the classes of normal, mild dysplasia, moderate dysplasia, severe dysplasia and
Carcinoma in Situ (CIS). This Examination was conducted over a set of single cervical cell based pap smear
images. The dataset contains four classes of abnormal cell images, with a total of 952 images. From that 912
images, normal – 440, mild dysplasia - 107, moderate dysplasia - 124, severe dysplasia - 139, & Carcinoma in
Situ (CIS) were 142 images.
3.2.1. Feature Classification using FL-HKSVM:
The proposed classification system classify the Pap smear images into any one of five classes, namely
normal, mild dysplasia, moderate dysplasia, severe dysplasia, and carcinoma in situ. The manual classification
of the entire dataset has been already done by the experts. In the proposed classification system, we used Fuzzy
Logic based Hybrid Kernel SVM algorithm to classify the Pap smear images.
There are two phases in the support vector machine namely.
(1) Training phase and (2) Testing phase.
Fuzzy logic based SVM (FSVM) is an effective supervised classifier and accurate learning technique, which
was first proposed by Lin and Wang [14]. In FSVM is to assign each data point a membership value according
to its relative importance in the class. Since each data point ix has an assigned membership value i , the
training set fs and is given by Eq.2.
1
, ,
n
f i i i i
s x y
(2)
For positive class 1iy , the set of membership values are denoted as i
, and are denoted as i
for
negative class 1iy , they are assigned independently. The main process of fuzzy SVM is to maximize the
margin of separation and minimize the classification error.
The optimal hyperplane problem of FSVM can be defined as the following problem [15] and is given by Eq.3.
7. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 29 | Page
2
,
1
1
min
2
n
i i
w
i
w C f
(3)
Subject to
( . ) 1
0 , 1,..........
i i i
i
y w x b
i n
Where (0 1)i if f is the fuzzy membership function, i if is a error of different weights and C is a constant
The inputs to FSVM algorithm are the feature subset selected via ERSTCM, EMSD & CFE. It follows
the structural risk minimization principle from the statistical learning theory. Its kernel is to control the practical
risk and classification capacity in order to broaden the margin between the classes and reduce the true costs. A
Fuzzy support vector machine searches an optimal separating hyper-plane between members and non-members
of a given class in a high dimension feature space.
The Lagrange multiplier function of FSVM is Eq.4.
2
1 1 1
1
( , , , ) ( ( ) 1 )
2
n n n
i i i i i i i
i i i
L w b w C f y wz b
(4)
Which satisfies the following parameter condition
1
1
0
0
0
n
i i i
i
n
i i
i
i i i
w y z
y
f C
Then the optimization problem can be transferred and defined in Eq.5.
1
( ) ( , )
2
i i j i j i jMax W y y H x x (5)
Subject to:
0
0 , 1,2,...,
i i
i i
y
f C i n
Where the parameter i can be solved by the sequential minimal optimization (SMO) quadratic programming
approach. We have analyzed the kernel equation from the existing work (Admasua et al., 2003) and used them
in the proposed work, namely, RBF and quadratic function.
Radial basis function:
The Support Vector will be the centre of the RBF and will determine the area of influence. This support vector
has the data space, it is given in Eq. 6:
----- (6)
Quadratic Kernel function:
Polynomial kernels are the form . Where d=1, a linear kernel and d=2, a quadratic
kernel are commonly used. Let K1(RBF), K2 (Quadratic) be kernels over and K3 be a kernel over
RP
x RP
. Let function . The two kernels based formulations are represented in Eq. 7 and 8:
is a kernel. ------ (7)
is a kernel. ------- (8)
Substitute Eq. 7 and 8 lagrange multiplier Eq. 5 and get the proposed hybrid kernel. It is exposed in Eq.9:
--------- (9)
8. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 30 | Page
IV. Experimental Results And Discussion
4.1 Experimental image Data set.
For any machine learning algorithm, the database with which it is trained plays an important role. It is
said that a machine can be made to learn and reproduce any human behaviour, provided it is trained with
suitably precise database. The database prepared in this work consists of five classes of normal & abnormal cell
based 952 single cervical cell presented in pap smear images. These Images were collected from Muthamil
Hospital, Tirunelveli District, Tamilnadu State, India in 2013 and these images were taken with 100X lens
magnification using Olympus ch20i Microscope. Each image was examined and diagnosed by pathologists of
that hospital before being used as reference for this study. The Sample Image data sets are shown in the table.2.
The sample Pap smear experimental images of various classes is shown in Figure 7. The data set contains 952
images and these images were histologically diagnosed and graded based on World Health Organization (WHO)
criteria as the following abnormal class distribution:
Normal. 440 images.
Mild dysplasia, 107 images.
Moderate dysplasia, 124 images.
Severe dysplasia, 139 images.
Carcinoma in Situ. 142 images.
(A) (B) (C) (D) (E)
Figure.7: Some of the cells found in cervix: (A-D) normal, mild, moderate and severe dysplasia,(E) Carcinoma
in situ
The Implementation was done in the tool of matlab.
4.2. Experimental Results & Comparative Analysis
This section describes the experimental results of the proposed classification method using Pap smear
images with different types of cervical cancer. In the proposed method, the experimental image data set is
divided into two sets such as training set and testing set. The details of this set was shown in Table 2. The
classifiers are trained with the training images and the classification accuracy is calculated only with the testing
images. In the testing phase, the testing dataset is given to the proposed technique to find the cancers type in
smear images and the obtained results are evaluated through evaluation metrics namely, sensitivity, specificity
and accuracy [16], it is given by (eqn.10-12)
FN)TP/(TPySensitivit -----------(10)
Specificity TN/(TN FP) -----------(11)
FP)FNTPTP)/(TNTNAccuracy ( -------- (12)
For example, Where TP corresponds to True Positive, TN corresponds to True Negative, FP
corresponds to False Positive and FN corresponds to False Negative. These parameters for a specific category,
say, Mild dysplasia are as follows: TP is True Positive (an image of „Mild dysplasia‟ type is
categorized correctly to the same type), TN = True Negative (an image of „Non- Mild dysplasia‟ type is
categorized correctly as „Non- Mild dysplasia‟ type), FP =False Positive (an image of „Non- Mild
dysplasia‟ type is categorized wrongly as „Mild dysplasia‟ type) and FN is False Negative (an image of „Mild
dysplasia‟ type is categorized wrongly as „Non- Mild dysplasia‟ type).„Non- Mild dysplasia‟ actually
corresponds to any of the three categories other than „Mild dysplasia‟. Thus, „TP & TN‟ corresponds to the
correctly classier images and „FP & FN‟ corresponds to the misclassified images.
9. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 31 | Page
Table.2: Experimental image dataset for classification
Cancer Type Training data Testing data Total no of images
Normal 200 240 440
Mild dysplasia 50 57 107
Moderate dysplasia 50 74 124
Severe dysplasia 50 89 139
carcinoma in situ 50 92 142
Total 400 552 952
4.2.1. Individual feature Extraction and Classification Combination Results and Comparison:
Thus, different parameter values are obtained for each class and also for the different classifiers. These
parameters are estimated from the confusion matrix which provides the details about the false and successful
classification of images from all categories for each classifiers. The confusion matrix of the ERSTCM+FL-
HKSVM method is illustrated in Table 3.
Table.3: Confusion matrix of proposed method (ERSTCM+FL-HKSVM)
Class predicted Ground Truth Class (Assigned by Pathologist)
Normal Mild dysplasia Moderate dysplasia Severe dysplasia Carcinoma in situ
Normal 179 16 19 15 11
Mild dysplasia 1 50 1 3 2
Moderate dysplasia 4 4 62 2 2
Severe dysplasia 2 1 2 82 2
Carcinoma in situ 1 2 1 2 86
In the Table 3, the row-wise elements correspond to the four categories and the column-wise elements
correspond to the target class associated with that abnormal category. Hence, the number of images correctly
classified (TP) under each category is determined by the diagonal elements of the matrix. The row-wise
summation of elements for each category other than the diagonal elements corresponds to the „FN‟ of that
category. The column-wise summation of elements for each category other than the diagonal element
corresponds to the „FP‟ of that category. Similarly, „TN‟ of the specific category is determined by summing the
elements of the matrix other than the elements in the corresponding row and column of the specific category.
For example, among the 57 Mild dysplasia testing images, 50 images have been successfully
classified (TP) and the remaining 7 images (first row-wise summation) have been misclassified to any of
the non- Mild dysplasia categories (FN). Similarly 23 images (first column-wise summation) from the other
three categories (non- Mild dysplasia) have been misclassified as Mild dysplasia category (FP). In the Table 4,
the classification accuracy of ERSTCM with FL-HKSVM in class 1 (Normal) is 87.50%, class 2 (Mild
dysplasia) type cancer is 94.57%, class 3(Moderate dysplasia) type cancer is 93.66%, class 4(Severe dysplasia)
type cancer is 94.75% and class 5(carcinoma in situ) type cancer is 95.83%. then the overall sensitivity,
specificity & accuracy of this ERSTCM + FL-HKSVM is 83.15%, 95.79% & 93.26%. The performance of
overall calculation of sensitivity, specificity and accuracy by using this Eq.13-15.
The miss classification rate of class 1(Normal) non cancer type, Class 2 (Mild dysplasia), class 3 (Moderate
dysplasia) & class 4 (Severe dysplasia) of cancer type is high compared to the other class of class 5 (Carcinoma
in Situ).
Table.4: Performance measure of ERSTCM with FL-HKSVM
Class predicted TP TN FP FN Sensitivity (%) Specificity (%) Accuracy (%)
Normal 179 304 8 61 74.58 97.44 87.50
Mild dysplasia 50 472 23 7 87.72 95.35 94.57
Moderate dysplasia 62 455 23 12 83.78 95.19 93.66
Severe dysplasia 82 441 22 7 92.13 95.25 94.75
carcinoma in situ 86 443 17 6 93.48 96.30 95.83
Overall Sensitivity, Specificity & Accuracy Results 83.15 95.79 93.26
------- (13)
Overall classification Specificity = -----------(14)
Overall classification accuracy = ----(15)
Based on the similarly approach of the Table 3, the confusion matrix of the EMSD+FL-HKSVM method is
illustrated in Table 5.
10. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 32 | Page
Table.5: Confusion matrix of proposed method (EMSD+FL-HKSVM)
Class predicted Ground Truth Class (Assigned by Pathologist)
Normal Mild dysplasia Moderate dysplasia Severe dysplasia Carcinoma in situ
Normal 195 12 13 12 8
Mild dysplasia 0 52 2 1 2
Moderate dysplasia 1 3 67 1 2
Severe dysplasia 2 1 2 83 1
Carcinoma in situ 1 1 2 1 87
In the Table 5, the row-wise elements correspond to the four categories and the column-wise elements
correspond to the target class associated with that abnormal category. Hence, the number of images correctly
classified (TP) under each category is determined by the diagonal elements of the matrix. The row-wise
summation of elements for each category other than the diagonal elements corresponds to the „FN‟ of that
category. The column-wise summation of elements for each category other than the diagonal element
corresponds to the „FP‟ of that category. Similarly, „TN‟ of the specific category is determined by summing the
elements of the matrix other than the elements in the corresponding row and column of the specific category.
For example, among the 57 Mild dysplasia testing images, 52 images have been successfully
classified (TP) and the remaining 5 images (first row-wise summation) have been misclassified to any of
the non- Mild dysplasia categories (FN). Similarly 17 images (first column-wise summation) from the other
three categories (non- Mild dysplasia) have been misclassified as Mild dysplasia category (FP). In the Table 6,
the classification accuracy of EMSD with FL-HKSVM in class 1 (Normal) is 91.12%, class 2 (Mild dysplasia)
type cancer is 96.01%, class 3(Moderate dysplasia) type cancer is 95.29%, class 4(Severe dysplasia) type cancer
is 96.20% and class 5(carcinoma in situ) type cancer is 96.74%. then the overall sensitivity, specificity &
accuracy of this ERSTCM + FL-HKSVM is 87.68%, 96.92% & 95.07%. The performance of overall calculation
of sensitivity, specificity and accuracy by using this Eq.13 -15.
The miss classification rate of class 1(Normal) non cancer type, Class 2 (Mild dysplasia), class 3
(Moderate dysplasia) & class 4 (Severe dysplasia) of cancer type is high compared to the other class of class 5
(Carcinoma in Situ).
Table.6: Performance measure of EMSD with FL-HKSVM
Class predicted TP TN FP FN Sensitivity (%) Specificity (%) Accuracy (%)
Normal 195 308 4 45 81.25 98.72 91.12
Mild dysplasia 52 478 17 5 91.23 96.57 96.01
Moderate dysplasia 67 459 19 7 90.54 96.03 95.29
Severe dysplasia 83 448 15 6 93.26 96.76 96.20
Carcinoma in situ 87 447 13 5 94.57 97.17 96.74
Overall Sensitivity, Specificity & Accuracy Results 87.68 96.92 95.07
For comparative analysis, the proposed cervical cancer classification system (EMSD + FL-HKSVM) is
compared to other classification system (ERSTCM + FL-HKSVM). The overall classification performance
measured parameters of sensitivity, specificity & accuracy of the proposed method is 87.68%, 96.92% &
95.07% and this method will be provide the better results than the existing method ERSTCM + FL-HKSVM
results in terms of 83.15%, 95.79% & 93.26%. The overall classification results of sensitivity, specificity and
accuracy of existing and proposed method are shown in Figure 8.
Figure.8: Comparison Results of the Proposed & Existing System in individual feature Extraction methods with
Classification Technique.
11. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 33 | Page
4.2.2. Individual and Combining individual features & Classification combination comparison:
Based on the similarly approach of the Table.3, the confusion matrix of the CFE+FL-HKSVM method is
illustrated in Table 7.
Table.7: Confusion matrix of proposed method (CFE+FL-HKSVM)
Class predicted Ground Truth Class (Assigned by Pathologist)
Normal Mild dysplasia Moderate dysplasia Severe dysplasia Carcinoma in situ
Normal 224 5 4 3 4
Mild dysplasia 0 56 0 1 0
Moderate dysplasia 2 1 70 0 1
Severe dysplasia 0 1 0 87 1
Carcinoma in situ 0 0 1 0 91
In the Table 7, the row-wise elements correspond to the four categories and the column-wise elements
correspond to the target class associated with that abnormal category. Hence, the number of images correctly
classified (TP) under each category is determined by the diagonal elements of the matrix. The row-wise
summation of elements for each category other than the diagonal elements corresponds to the „FN‟ of that
category. The column-wise summation of elements for each category other than the diagonal element
corresponds to the „FP‟ of that category. Similarly, „TN‟ of the specific category is determined by summing the
elements of the matrix other than the elements in the corresponding row and column of the specific category.
For example, among the 57 Mild dysplasia testing images, 56 images have been successfully classified (TP)
and the remaining 1 image (first row-wise summation) have been misclassified to any of the non- Mild
dysplasia categories (FN). Similarly, 7 images (first column-wise summation) from the other three categories
(non- Mild dysplasia) have been misclassified as Mild dysplasia category (FP). In the Table 8, the classification
accuracy of CFE with FL-HKSVM in class 1 (Normal) is 96.74%, class 2 (Mild dysplasia) type cancer is
98.55%, class 3(Moderate dysplasia) type cancer is 98.37%, class 4(Severe dysplasia) type cancer is 98.91%
and class 5(carcinoma in situ) type cancer is 98.73%. then the overall sensitivity, specificity & accuracy of this
CFE + FL-HKSVM is 95.65%, 98.91% & 98.26%. The performance of overall calculation of sensitivity,
specificity and accuracy by using this Eq.13-15. The miss classification rate of class 1(Normal) non cancer type,
Class 2 (Mild dysplasia), class 3 (Moderate dysplasia) & class 5 (Carcinoma in Situ) of cancer type is high
compared to the other class of class 4 (Severe dysplasia).
Table.8: Performance measure of CFE with FL-HKSVM
Class predicted TP TN FP FN Sensitivity (%) Specificity (%) Accuracy (%)
Normal 224 310 2 16 93.33 99.36 96.74
Mild dysplasia 56 488 7 1 98.25 98.59 98.55
Moderate dysplasia 70 473 5 4 94.59 98.95 98.37
Severe dysplasia 87 459 4 2 97.75 99.14 98.91
carcinoma in situ 91 454 6 1 98.91 98.70 98.73
Overall Sensitivity, Specificity & Accuracy Results 95.65 98.91 98.26
For comparative analysis, the proposed cervical cancer classification system (CFE + FL-HKSVM) is
compared to the other methods (ERSTCM + FL-HKSVM & EMSD + FL-HKSVM), the overall classification
performance measured parameters of sensitivity, specificity & accuracy of the proposed CFE + FL-HKSVM
method is 95.65%, 98.91% & 98.26% & and this method will be provide the better results than the existing
methods of ERSTCM + FL-HKSVM & EMSD + FL-HKSVM Results. The overall classification results of
sensitivity, specificity and accuracy of existing and proposed method are shown in Figure 9.
Figure.9. Comparison Results of the Proposed CFE + FL-HKSVM with Existing ERSTCM+FL-HKSVM,
EMSD+FL-HKSVM Classification Technique
12. Multi Class Cervical Cancer Classification by using ERSTCM, EMSD & CFE methods based ..
DOI: 10.9790/0661-1901042334 www.iosrjournals.org 34 | Page
V. Conclusion
In this paper, a novel approach of individual and combining individual feature extraction method with
classification technique developed for identify the class for the cervical cyto cell of the pap smear image. Two
major contributions of this paper are feature extraction and classification. In feature extraction, we have taken
the advantage of combining individual methods such as ERSTCM & EMSD texture features as one feature to
assess the joint performance. In Classification, multiple kernels are combined and developed a Fuzzy Logic
based hybrid kernel SVM classifier for improving the classification process. For comparative analysis, our
proposed approach is compared with existing works in individual feature extraction with the classification
method, the proposed EMSD+FL-HKSVM produce better result than ERSTCM+FL-HKSVM and combining
individual feature extraction with the classification method, the proposed CFE+FL-HKSVM produce better
result than ERSTCM+FL-HKSVM & EMSD+FL-HKSVM methods in terms of the statistical parameters results
sensitivity, specificity & accuracy. Hence, finally our proposed individual and combining individual method are
proved good at detecting anyone of the five types of the classes (normal, mild, moderate, severe & carcinoma in
situ) of cervical cancer in the pap smear image.
References
[1] S.Athinarayanan, Dr.M.V.Srinath & R.Kavitha “ Computer aided diagnosis for detection and stage identification of cervical cancer
by using pap smear screening test images” ictact journal of image and video processing, ISSN: 0976-9102 vol. 6, no.4, May
2016. pp. 1244 – 1251.
[2] S. Aswathy, M. A. Quereshi, B. Kurian, and K. Leelamoni, “Cervical cancer screening: current knowledge & practice among
women in a rural population of Kerala, India,” Indian Journal of Medical Research, vol. 136, no. 2, pp. 205–210, 2012.
[3] S.Athinarayanan, Dr.M.V.Srinath, “Classification of Cervical Cancer Cells in Pap Smea Screenig Test” ictact journal of image and
video processing, ISSN: 0976-9102 vol. 6, no.4, May 2016. pp. 1234 – 1238.
[4] B. L. Craine, E. R. Craine, J. R. Engel, and N. T. Wemple, “A clinical system for digital imaging Colposcopy,” in Medical Imaging
II, vol. 0914 of Proceedings of SPIE, pp. 505–511, Newport Beach, Calif, USA, June 1998.
[5] H. Doornewaard, Y. T. van der Schouw, Y. van der Graaf, A. B. Bos, and J. G. van den Tweel, “Observer variation in cytologic
grading for cervical dysplasia of Papanicolaou smears with the PAPNET testing system,” Cancer, vol. 87, no. 4, pp. 178–183, 1999.
[6] G. Castellano, L. Bonilha, L. M. Li, and F. Cendes, “Texture analysis of medical images,” Clinical Radiology, vol. 59, no. 12, pp.
1061–1069, 2004.
[7] M. E. Plissiti, C. Nikou, and A. Charchanti, “Automated detection of cell nuclei in Pap smear images using morphological
reconstruction and clustering,” IEEE Transactions on Information Technology in Biomedicine, vol. 15, no. 2, pp. 233–241, 2011.
[8] P. Sobrevilla, E. Montseny, F. Vaschetto, and E. Lerma, “Fuzzy based analysis of microscopic color cervical pap smear images:
nuclei detection,” International Journal of Computational Intelligence and Applications, vol. 9, no. 3, pp. 187–206, 2010.
[9] C. Bergmeir, M. Garc´ıa Silvente, and J. M. Ben´ıtez, “Segmentation of cervical cell nuclei in high-resolution microscopic images:
a new algorithmand a web-based software framework,” ComputerMethods and Programs in Biomedicine, vol. 107, no. 3, pp. 497–
512, 2012.
[10] J. Holmquist, E. Bengtsson, O. Eriksson, B. Nordin, and B. Stenkvist, “Computer analysis of cervical cells. Automatic feature
extraction and classification,” Journal of Histochemistry & Cytochemistry, vol. 26, no. 11, pp. 1000–1017, 1978.
[11] Y. Marinakis, G. Dounias, and J. Jantzen, “Pap smear diagnosis using a hybrid intelligent scheme focusing on genetic algorithm
based feature selection and nearest neighbor classification,” Computers in Biology and Medicine, vol. 39, no. 1, pp. 69–78, 2009.
[12] Y.-F. Chen, P.-C. Huang, K.-C. Lin et al., “Semi-automatic segmentation and classification of pap smear cells,” IEEE Journal of
Biomedical and Health Informatics, vol. 18, no. 1, pp. 94–108, 2014.
[13] B. Julesz. Texton gradients: The texton theory revisited. Biological Cybernetics, 54(1986) 245-251.
[14] Wells. W.M. W.E.L. Grimson. R. Kikinis and F.A.Jolez, 1996. Adaptive segmentation of MRI data. IEEE Trans. Med.Imagin, 15:
429-442.
[15] Zijdenbos, A.P., R. Forghani and A.C. Evans, 2002. Automatic pipeline analysis of 3-D MRI data for clininical trials: Application
to multiple sclerosis. IEEE. Trans. Med. Imaging 21: 1280 – 1291.
[16] Chen, L.; Chen, C.L.P.; Lu, M.: A multiple-kernel fuzzy C-means algorithm for image segmentation. IEEE Trans. Syst.Man
Cybern. B Cybern. 41(5), 1263–1274 (2011).