Breast cancer is a leading cause of mortality among women in developed countries, and it is the second most prominent cause of cancer mortality in women worldwide. In recent decades, the prevalence of breast cancer among women has risen dramatically. This paper discusses several data analysis methods used to detect breast cancer early. Breast cancer diagnosis distinguishes benign from malignant breast lumps. We approach this diagnostic task with data mining tools. Data mining is an important step of knowledge discovery in which intelligent methods are used to detect patterns. Several clinical breast cancer studies have been conducted using soft computing and machine learning techniques, and some of their algorithms are simpler or more comprehensive than others. This research focuses on genetic programming and machine learning algorithms to reliably distinguish benign from malignant breast cancer. The study aims to optimise the testing algorithm: we used genetic programming methods to choose the best features and parameter values for the classifiers. We analyse the Wisconsin breast cancer data set available from the UCI Machine Learning Repository. In this experiment, we compare four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (SMO) performs better than the IBK and BF Tree methods, i.e. 97.71%
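The abstract above does not spell out how the genetic search over features works, so the following is only a minimal sketch of genetic-algorithm feature selection under stated assumptions: chromosomes are 0/1 feature masks, and the search uses tournament selection, one-point crossover, bit-flip mutation and elitism. The function name `ga_select`, the parameter defaults and the `toy_fitness` function are hypothetical; in the study itself the fitness would presumably be the performance of a classifier trained on the selected features.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximises fitness(mask)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Pick two random masks and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each bit flips with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

# Hypothetical stand-in for "accuracy of a classifier on the masked features":
# reward three informative columns and charge a small cost per selected feature.
INFORMATIVE = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

mask = ga_select(8, toy_fitness)
print(mask, toy_fitness(mask))
```

Because the best mask is carried forward unchanged, the best fitness never decreases from one generation to the next.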
SVM & GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION (ijscai)
This document summarizes a research paper that proposed a hybrid genetic algorithm and support vector machine (SVM) approach for breast cancer detection and feature selection. The proposed approach uses genetic algorithms to select the optimal features for an SVM classifier. It evaluated this approach on a breast cancer dataset and found that the sequential minimal optimization (SMO) SVM algorithm with genetic feature selection achieved very high accuracy, recall, and F-measure for classifying breast cancer compared to other classification algorithms. The genetic clustering algorithm was also able to accurately cluster the benign and malignant cases in the dataset within 38 seconds. In conclusion, the hybrid genetic algorithm and SVM approach provided an effective and accurate model for breast cancer detection and diagnosis.
A Novel Approach for Breast Cancer Detection using Data Mining Techniques (ahmad abdelhafeez)
The document presents an agenda for a talk on using data mining techniques for breast cancer detection. It begins with background on cancer and breast cancer. It then discusses pattern recognition, data mining tools like Weka, and various classification techniques including Naive Bayes, SVM, C4.5, KNN, BF tree, and IBK that could be used. The talk will cover preprocessing data, feature selection, applying classification algorithms, and evaluating performance. The goal is to introduce a novel approach for breast cancer detection using these data mining methods.
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selection based on Mutual Information (Abeer Alzubaidi, Georgina Cosma, David Brown and Graham Pockley)
Interactive Technologies and Games (ITAG) Conference 2016
Health, Disability and Education. Dates: Wednesday 26 October 2016 - Thursday 27 October 2016. Location: The Council House, NG1 2DT
Cancer is a disease caused by abnormal cell growth and can affect different cell types. This paper focuses on breast cancer, which is classified based on the type of affected cell. There are several risk factors that can increase the likelihood of developing cancer, including gender, age, genetics, family history, weight, alcohol use, and smoking. The dataset used contains information from 569 breast cancer cases, including demographic and cell feature data. Machine learning algorithms like decision trees can be applied to build models for diagnosing breast cancer based on these attributes with up to 96.46% accuracy.
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE (IJCSEIT Journal)
Breast cancer is one of the leading cancers for women in developed countries including India. It is the second most common cause of cancer death in women. The incidence of breast cancer in women has increased significantly in recent years. In this paper we discuss various data mining approaches that have been used for breast cancer diagnosis and prognosis. Breast cancer diagnosis distinguishes benign from malignant breast lumps, and breast cancer prognosis predicts when breast cancer is likely to recur in patients whose cancers have been excised. This paper summarizes various review and technical articles on breast cancer diagnosis and prognosis, and focuses on current research that uses data mining techniques to enhance breast cancer diagnosis and prognosis.
Data Mining Techniques In Computer Aided Cancer Diagnosis (DataminingTools Inc)
Data mining techniques are used in computer aided cancer diagnosis and detection. They help physicians interpret complex diagnoses, combine information from multiple sources, and provide support for differential diagnosis. Specific techniques like neural networks, decision trees, and cluster detection are used in ALL diagnosis. Data mining can also be applied to detect gastric cancer using single nucleotide polymorphism information. It helps organize healthcare claims data to detect cancer patterns and evaluate treatment efficacy. New applications of data mining and neural networks are also helping detect cancers like breast cancer sooner.
Breast cancer diagnosis and recurrence prediction using machine learning tech... (eSAT Journals)
Abstract: Breast cancer has become a common cause of death among women. The long hours invested in manual diagnosis and the scarcity of available diagnostic systems emphasize the need for automated systems for early diagnosis of the disease. Our aim is to classify whether breast cancer is benign or malignant and to predict the recurrence or non-recurrence of malignant cases after a certain period. To achieve this we used machine learning techniques such as Support Vector Machine, Logistic Regression, KNN and Naive Bayes. These techniques are coded in MATLAB using data from the UCI Machine Learning Repository. We compared the accuracies of the different techniques and observed the results. We found SVM most suited for predictive analysis, while KNN performed best for our overall methodology. Keywords: Breast Cancer, SVM, KNN, Naive Bayes, Logistic Regression, Classification.
IRJET- A Survey on Categorization of Breast Cancer in Histopathological Images (IRJET Journal)
This document summarizes various methods for categorizing breast cancer in histopathological images. It discusses machine learning and image processing techniques that have been used to build computer-aided diagnosis (CAD) systems to help pathologists diagnose breast cancer more objectively and consistently. The document reviews different classification methods that have been proposed, including those using fuzzy logic, level set methods, convolutional neural networks, texture features and ensemble methods. It concludes that accurately classifying histopathological images remains challenging due to limited publicly available datasets and variability in tissue appearance, but that machine learning and advanced image analysis offer promising approaches to improve cancer detection and diagnosis.
IRJET- Breast Cancer Prediction using Supervised Machine Learning Algorithms (IRJET Journal)
This document describes a study that used machine learning algorithms to predict breast cancer. The researchers trained decision tree, logistic regression, and random forest models on a breast cancer dataset. They preprocessed the data, which included converting categorical variables to numeric. The random forest model achieved the best prediction accuracy of 98.6%. The study aims to help detect breast cancer at earlier stages to improve treatment outcomes.
GRAPHICAL MODEL AND CLUSTERING-REGRESSION BASED METHODS FOR CAUSAL INTERACTION... (ijaia)
The early detection of breast cancer, a deadly disease that mostly affects women, is extremely complex because it requires various features of the cell type. An efficient approach to diagnosing breast cancer at an early stage is therefore to apply artificial intelligence, in which machines are programmed to simulate human thinking and behaviour. This allows machines to passively learn and find patterns, which can be used later to detect any new changes that may occur. In general, machine learning is quite useful in the medical field, particularly where it depends on complex genomic measurements such as microarray techniques, and can increase the accuracy and precision of results. With this technology, doctors can diagnose patients with cancer quickly and apply the proper treatment in a timely manner. The goal of this paper is to propose a robust breast cancer diagnostic system using complex genomic analysis via microarray technology. The system combines two machine learning methods: K-means clustering and linear regression.
IRJET - Survey on Analysis of Breast Cancer Prediction (IRJET Journal)
This document compares three machine learning techniques - Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB) - for predicting breast cancer using a dataset of 198 patient records. It finds that SVM achieved the highest accuracy of 96.97% for classification, followed by RF at 96.45% and NB at 95.45%. SVM also had the highest recall rate at 0.97, indicating it was best at correctly identifying malignant tumors. While NB had the lowest precision of 0.92, meaning it incorrectly identified some benign cases as malignant, all three techniques showed high performance in predicting breast cancer.
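The accuracy, precision and recall figures quoted in this summary all follow from the four cells of a confusion matrix. As a hedged illustration, the sketch below computes them with malignant taken as the positive class; the function name `classification_metrics` and the counts passed to it are made up for the example and are not taken from the surveyed study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard metrics from confusion-matrix counts.

    tp: malignant predicted malignant   fp: benign predicted malignant
    fn: malignant predicted benign      tn: benign predicted benign
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical counts for 200 test cases.
m = classification_metrics(tp=45, fp=5, fn=3, tn=147)
print(m)
```

Low precision means benign cases are being flagged as malignant (false positives), while low recall means malignant cases are being missed (false negatives), which is usually the costlier error in diagnosis.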
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms... (IRJET Journal)
This document describes a study that uses supervised machine learning algorithms to predict breast cancer. Three algorithms - decision tree, logistic regression, and random forest - are applied to preprocessed breast cancer data. The random forest model achieved the best accuracy at 98.6% for predicting whether a tumor was benign or malignant. The study aims to develop an early prediction system for breast cancer using machine learning techniques.
Cancer prognosis prediction using balanced stratified sampling (ijscai)
High accuracy in cancer prediction is important for improving the quality of treatment and the survival rate of patients. As the data volume in healthcare research increases rapidly, the analytical challenge doubles. An effective sampling technique in classification algorithms generally yields good prediction accuracy. The SEER public-use cancer database provides several prominent class labels for prognosis prediction. The main objective of this paper is to find the effect of sampling techniques on classifying the prognosis variable and to propose an ideal sampling method based on the outcome of the experiments. In the first phase of this work, traditional random sampling and stratified sampling were used. In the next phase, balanced stratified sampling, with variations according to the choice of prognosis class labels, was tested. Much of the initial time was spent pre-processing the SEER data set. The classification models for the experiments were built using the breast cancer, respiratory cancer and mixed cancer data sets with three traditional classifiers, namely Decision Tree, Naïve Bayes and K-Nearest Neighbour. The three prognosis factors survival, stage and metastasis were used as class labels for experimental comparisons. The results show a steady increase in the prediction accuracy of the balanced stratified model as the sample size increases, whereas the traditional approach fluctuates before reaching its optimum results.
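The abstract does not define balanced stratified sampling precisely. One common reading, assumed here, is that the data are split into strata by class label and the same number of records is drawn from each stratum, so that rare outcomes are not swamped by frequent ones. The function name `balanced_stratified_sample` and the toy records below are hypothetical.

```python
import random
from collections import Counter, defaultdict

def balanced_stratified_sample(records, label_of, per_class, seed=0):
    """Draw up to per_class records from each class-label stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[label_of(rec)].append(rec)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # A stratum smaller than per_class contributes all of its records.
        k = min(per_class, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Toy, imbalanced data: 90 "alive" records and 10 "dead" records.
records = ([("alive", i) for i in range(90)] +
           [("dead", i) for i in range(10)])
sample = balanced_stratified_sample(records, lambda r: r[0], per_class=10)
counts = Counter(label for label, _ in sample)
print(counts)
```

Plain stratified sampling would instead keep the 90:10 class proportions of the original data; the balanced variant trades that fidelity for equal representation of each prognosis class.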
Hybrid prediction model with missing value imputation for medical data 2015-g... (Jitender Grover)
The document presents a novel hybrid prediction model called HPM-MI that uses K-means clustering and multilayer perceptron (MLP) to improve predictive classification for medical data with missing values. The model first analyzes 11 different imputation techniques using K-means clustering to select the best one for filling missing values in the data. It then uses K-means clustering again to validate class labels and remove incorrectly classified instances before applying the MLP classifier. The model is tested on three medical datasets from the UCI repository and shows improved accuracy, sensitivity, specificity and other metrics compared to existing methods, particularly when datasets have large numbers of missing values.
Breast cancer diagnosis via data mining performance analysis of seven differe... (cseij)
According to the World Health Organization (WHO), breast cancer is the top cancer in women in both the developed and the developing world. Increased life expectancy, urbanization and the adoption of western lifestyles contribute to the occurrence of breast cancer in the developing world. Most cases are diagnosed in the late phases of the illness, so early detection is crucial for improving breast cancer outcome and survival.
This study is intended to contribute to the early diagnosis of breast cancer. An analysis of breast cancer diagnoses for patients is given. For this purpose, data about patients whose cancers have already been diagnosed are first gathered and arranged; these data are then used to predict whether other patients have breast cancer. The predictions are made with seven different algorithms, and the accuracy of each is reported. The patient data were taken from the UCI Machine Learning Repository, thanks to Dr. William H. Wolberg from the University of Wisconsin Hospitals, Madison. During the prediction process, the RapidMiner 5.0 data mining tool is used to apply the desired algorithms.
IRJET- Breast Cancer Prediction using Support Vector Machine (IRJET Journal)
This document discusses using a support vector machine to predict whether breast cancer is benign or malignant. The SVM model is trained on the Wisconsin Diagnostic Breast Cancer dataset containing 569 samples with 32 features each. The SVM achieves 96.09% accuracy at classifying samples into benign versus malignant categories on the test data, demonstrating its effectiveness at predicting breast cancer diagnosis.
This document proposes using a DenseNet-II neural network model to classify mammogram images as benign or malignant. It first preprocesses mammogram images through normalization and data augmentation. It then improves the original DenseNet model by replacing the first convolutional layer with an Inception structure, creating a new DenseNet-II model. This model, along with other common models, are tested on mammogram data and the DenseNet-II model achieves the highest average accuracy of 94.55% for benign-malignant classification.
Diagnosis of Cancer using Fuzzy Rough Set Theory (IRJET Journal)
This document presents a study that uses fuzzy rough set theory to diagnose cancer using medical data. It has four main modules: 1) feature selection to identify relevant features using fuzzy rough subset evaluation and particle swarm optimization, 2) instance selection to remove missing/noisy data, 3) classification of the data using fuzzy rough nearest neighbor algorithm, and 4) performance analysis using metrics like accuracy, sensitivity and AUC. The study aims to classify different cancer types by reducing noise and selecting optimal features/instances to improve classifier performance. It is found that fuzzy rough set approaches help preserve meaning during reduction and improve classification compared to other methods.
This document summarizes Mansi Chowkkar's MSc research project on detecting breast cancer from histopathological images using deep learning and transfer learning. The research aims to classify images as malignant or benign with high accuracy and efficiency. It implements CNN and DenseNet-121 models, with the latter using transfer learning with pre-trained ImageNet weights. The research achieved 90.9% test accuracy with CNN and 88.03% accuracy with transfer learning. Related work discusses deep learning applications in healthcare, image processing techniques, prior research on DenseNet, and the use of transfer learning for medical image classification with limited data.
Skin Lesion Classification using Supervised Algorithm in Data Mining (ijtsrd)
Skin cancer is one of the major types of cancer, with an increasing incidence over the past decades. Accurately diagnosing skin lesions to discriminate between benign and malignant skin lesions is crucial. This work uses techniques based on the J48 algorithm and the support vector machine (SVM). The proposed system applies data mining techniques to collected skin cancer datasets so that the disease can be diagnosed quickly and accurately. Compared with other algorithms, the proposed algorithm is more accurate. Two algorithms are used, J48 and SVM, and the J48 algorithm produced better accuracy than the SVM algorithm. The accuracy of the proposed system is 90.2381, which means its predictions are very close to the actual values. G. Saranya | Dr. S. M. Uma "Skin Lesion Classification using Supervised Algorithm in Data Mining" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-6 , October 2019, URL: https://www.ijtsrd.com/papers/ijtsrd29346.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/29346/skin-lesion-classification-using-supervised-algorithm-in-data-mining/g-saranya
IRJET- Breast Cancer Detection from Histopathology Images: A Review (IRJET Journal)
This document provides a review of techniques for detecting breast cancer from histopathology images. It discusses how histopathology examines tissue samples under a microscope to study diseases at a microscopic level. Detecting cell nuclei is an important first step, as is identifying mitosis (cell division) and metastasis (cancer spreading). The document reviews several techniques that use convolutional neural networks to automatically analyze histopathology images and detect breast cancer, including techniques for nuclei detection and segmentation. These automatic methods aim to assist pathologists by improving efficiency and reducing human error compared to manual analysis.
ICU Patient Deterioration Prediction: A Data-Mining Approach (csandit)
A huge amount of medical data is generated every day, which presents a challenge in analysing these data. The obvious solution to this challenge is to reduce the amount of data without information loss. Dimension reduction is considered the most popular approach for reducing data size and also to reduce noise and redundancies in data. In this paper, we investigate the effect of feature selection in improving the prediction of patient deterioration in ICUs. We consider lab tests as features. Thus, choosing a subset of features would mean choosing the most important lab tests to perform. If the number of tests can be reduced by identifying the most important tests, then we could also identify the redundant tests. By omitting the redundant tests, observation time could be reduced and early treatment could be provided to avoid the risk. Additionally, unnecessary monetary cost would be avoided. Our approach uses state-of-the-art feature selection for predicting ICU patient deterioration using the medical lab results. We apply our technique on the publicly available MIMIC-II database and show the effectiveness of the feature selection. We also provide a detailed analysis of the best features identified by our approach.
Iganfis Data Mining Approach for Forecasting Cancer Threats (ijsrd.com)
This document proposes a new approach called IGANFIS that combines information gain (IG) and adaptive neuro fuzzy inference system (ANFIS) to classify cancer threats from medical data. IG is used to select the most important cancer features from the data to reduce dimensionality before being input to ANFIS. ANFIS then trains on the selected features to build a fuzzy inference system for cancer diagnosis and prediction. The approach is tested on breast cancer datasets and achieves higher classification accuracy compared to other methods.
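Information gain, the feature-ranking step named in this summary, is the reduction in class-label entropy obtained by splitting the data on a feature. The sketch below shows the standard definition for discrete features; the function names and the toy data are illustrative assumptions, and continuous clinical measurements would need to be discretised first.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data: feature_a separates the classes perfectly, feature_b not at all.
labels = ["M", "M", "B", "B"]
feature_a = ["hi", "hi", "lo", "lo"]
feature_b = ["x", "y", "x", "y"]
gain_a = information_gain(feature_a, labels)
gain_b = information_gain(feature_b, labels)
print(gain_a, gain_b)
```

Ranking features by this score and keeping only the top ones is one way to reduce dimensionality before training a downstream model such as ANFIS.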
A Comparative Study on the Methods Used for the Detection of Breast Cancer (rahulmonikasharma)
Breast cancer has become a leading cause of death among women worldwide. At an initial stage, a tumor in the breast is hard to detect. Manual attempts have proven to be time-consuming and inefficient in many cases. Hence there is a need for efficient methods that diagnose cancerous cells with high accuracy and without human involvement. Mammography is an X-ray imaging method that uses high-resolution film so that tumors in the breast can be detected well. This paper presents a comparative study of various data mining methods for detecting breast cancer using image processing techniques.
Definiens technology provides automated digital pathology image analysis to transform pathology into a quantitative science. It handles challenges like tissue variability and staining intensities to automatically identify regions of interest and quantify objects with accuracy and consistency. Definiens supports various digital image and slide formats as well as staining protocols. It has been deployed in over 1,400 applications and provides detailed quantification to support research and clinical decision making. A study using Definiens image analysis achieved statistically significant survival prediction for esophageal cancer patients compared to manual evaluation.
Classification Algorithm Based Analysis of Breast Cancer Data (IIRindia)
Classification algorithms are frequently used to analyze various kinds of data, available in different repositories, that have real-world applications. The main objective of this research is to evaluate the performance of classification algorithms on breast cancer data by analyzing mammogram images based on their characteristics. Different attribute values of cancer-affected mammogram images are considered for analysis in this work. The patients' food habits, age, lifestyle, occupation, disease-related problems and other information are taken into account for classification. Finally, the performance of the classification algorithms J48, CART and ADTree is reported with their accuracy. The accuracy of these algorithms is assessed with measures such as specificity, sensitivity and the kappa statistic (errors).
Breast cancer is the leading cause of death for women worldwide. Cancer can be discovered early, lowering the rate of death. Machine learning techniques are a hot field of research, and they have been shown to be helpful in cancer prediction and early detection. The primary purpose of this research is to identify which machine learning algorithms are the most successful in predicting and diagnosing breast cancer, according to five criteria: specificity, sensitivity, precision, accuracy, and F1 score. The project is carried out in the Anaconda environment, which uses Python's NumPy and SciPy numerical and scientific libraries as well as matplotlib and Pandas. In this study, the Wisconsin diagnostic breast cancer dataset was used to evaluate eleven machine learning classifiers: decision tree, quadratic discriminant analysis, AdaBoost, Bagging meta estimator, Extra randomized trees, Gaussian process classifier, Ridge, Gaussian naive Bayes, k-Nearest neighbors, multilayer perceptron, and support vector classifier. During performance analysis, extremely randomized trees outperformed all other classifiers with an F1-score of 96.77% after data collection and data analysis.
A MODIFIED BINARY PSO BASED FEATURE SELECTION FOR AUTOMATIC LESION DETECTION ... (ijcsit)
This document summarizes a research paper that proposes a modified binary particle swarm optimization (MBPSO) method for feature selection to improve the classification of lesions in mammograms as benign or malignant. The MBPSO method incorporates iteration best into the velocity update calculation and selects solutions with fewer features when fitness values are equal. A total of 117 shape, texture, and histogram features are extracted from mammogram regions of interest. MBPSO is used to select an optimal feature subset, which is then used to train a neural network classifier. Experimental results show MBPSO obtains better classification accuracy than using the full feature set, and performs comparably to other feature selection techniques.
ABSTRACT
This paper presents an effective feature selection method that can be applied to build a computer aided diagnosis system for breast cancer in order to discriminate between healthy, benign and malignant parenchyma. Determining the optimal feature set from a large set of original features is an important preprocessing step which removes irrelevant and redundant features and thus improves computational efficiency and classification accuracy and also simplifies the classifier structure. A modified binary particle swarm optimized feature selection method (MBPSO) has been proposed where the k-Nearest Neighbour algorithm with leave-one-out cross validation serves as the fitness function. Digital mammograms obtained from Regional Cancer Centre, Thiruvananthapuram and mammograms from the web accessible mini-MIAS database have been used as the dataset for this experiment. Regions of interest from the mammograms are automatically detected and segmented. A total of 117 shape, texture and histogram features are extracted from the ROIs. Significant features are selected using the proposed feature selection method. Classification is performed using feed forward artificial neural networks with back propagation learning. Receiver operating characteristics (ROC) and confusion matrix are used to evaluate the performance. Experimental results show that the modified binary PSO feature selection method not only obtains better classification accuracy but also simplifies the classification process as compared to the full set of features. The performance of the modified BPSO is found to be on par with other widely used feature selection techniques.
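The fitness function named in this abstract, k-Nearest Neighbour accuracy under leave-one-out cross validation, can be sketched as follows. This covers only the fitness evaluation for one candidate feature mask, not the particle swarm search itself; the function name `loo_knn_accuracy`, the choice of Euclidean distance and the toy data are assumptions made for illustration.

```python
from collections import Counter
from math import dist

def loo_knn_accuracy(X, y, mask, k=3):
    """Leave-one-out k-NN accuracy using only the features where mask is 1."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    rows = [[row[j] for j in cols] for row in X]
    correct = 0
    for i, query in enumerate(rows):
        # Hold out sample i and rank every other sample by Euclidean distance.
        neighbours = sorted(
            (dist(query, other), y[j])
            for j, other in enumerate(rows) if j != i
        )[:k]
        vote = Counter(label for _, label in neighbours).most_common(1)[0][0]
        correct += vote == y[i]
    return correct / len(rows)

# Toy data: column 0 separates the classes, column 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0],
     [1.0, 5.1], [1.1, 0.9], [1.2, 9.1]]
y = ["B", "B", "B", "M", "M", "M"]
good = loo_knn_accuracy(X, y, [1, 0], k=1)
bad = loo_knn_accuracy(X, y, [0, 1], k=1)
print(good, bad)
```

A wrapper method such as binary PSO would call a function like this once per candidate mask, which is why the cost of the fitness evaluation dominates the running time of the search.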
Logistic Regression Model for Predicting the Malignancy of Breast CancerIRJET Journal
1. The document describes using a logistic regression machine learning model to predict whether breast cancer is benign or malignant based on characteristics of breast cell samples.
2. The model was trained on a dataset of 569 samples described by 33 characteristics and achieved 94.94% accuracy on the training data and 92.10% accuracy on the test data.
3. The model provides a way to efficiently diagnose breast cancer and determine the appropriate treatment needed based on whether the cancer is predicted as benign or malignant.
IRJET- Breast Cancer Prediction using Supervised Machine Learning AlgorithmsIRJET Journal
This document describes a study that used machine learning algorithms to predict breast cancer. The researchers trained decision tree, logistic regression, and random forest models on a breast cancer dataset. They preprocessed the data, which included converting categorical variables to numeric. The random forest model achieved the best prediction accuracy of 98.6%. The study aims to help detect breast cancer at earlier stages to improve treatment outcomes.
GRAPHICAL MODEL AND CLUSTERING-REGRESSION BASED METHODS FOR CAUSAL INTERACTION...ijaia
The early detection of Breast Cancer, the deadly disease that mostly affects women is extremely complex because it requires various features of the cell type. Therefore, the efficient approach to diagnosing Breast Cancer at the early stage was to apply artificial intelligence where machines are simulated with intelligence and programmed to think and act like a human. This allows machines to passively learn and find a pattern, which can be used later to detect any new changes that may occur. In general, machine learning is quite useful particularly in the medical field, which depends on complex genomic measurements such as microarray technique and would increase the accuracy and precision of results. With this technology, doctors can easily diagnose patients with cancer quickly and apply the proper treatment in a timely manner. Therefore, the goal of this paper is to address and propose a robust Breast Cancer diagnostic system using complex genomic analysis via microarray technology. The system will combine two machine learning methods, K-means cluster, and linear regression.
IRJET - Survey on Analysis of Breast Cancer PredictionIRJET Journal
This document compares three machine learning techniques - Support Vector Machine (SVM), Random Forest (RF), and Naive Bayes (NB) - for predicting breast cancer using a dataset of 198 patient records. It finds that SVM achieved the highest accuracy of 96.97% for classification, followed by RF at 96.45% and NB at 95.45%. SVM also had the highest recall rate at 0.97, indicating it was best at correctly identifying malignant tumors. While NB had the lowest precision of 0.92, meaning it incorrectly identified some benign cases as malignant, all three techniques showed high performance in predicting breast cancer.
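For readers comparing the accuracy, precision, and recall figures quoted above, the following sketch shows how these measures follow from confusion-matrix counts. The counts used are hypothetical and are not taken from the surveyed paper; the function name is likewise an illustrative assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision drops when benign cases are flagged as malignant (false positives)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall drops when malignant cases are missed (false negatives)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts with "malignant" as the positive class
metrics = classification_metrics(tp=45, fp=5, fn=5, tn=145)
```

With malignant as the positive class, a low precision means benign cases were flagged as malignant, while a low recall means malignant cases were missed.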
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...IRJET Journal
This document describes a study that uses supervised machine learning algorithms to predict breast cancer. Three algorithms - decision tree, logistic regression, and random forest - are applied to preprocessed breast cancer data. The random forest model achieved the best accuracy at 98.6% for predicting whether a tumor was benign or malignant. The study aims to develop an early prediction system for breast cancer using machine learning techniques.
Cancer prognosis prediction using balanced stratified samplingijscai
High accuracy in cancer prediction is important to improve the quality of the treatment and to improve the
rate of survivability of patients. As the data volume is increasing rapidly in the healthcare research, the
analytical challenge exists in double. The use of effective sampling technique in classification algorithms
always yields good prediction accuracy. The SEER public use cancer database provides various prominent
class labels for prognosis prediction. The main objective of this paper is to find the effect of sampling
techniques in classifying the prognosis variable and propose an ideal sampling method based on the
outcome of the experimentation. In the first phase of this work the traditional random sampling and
stratified sampling techniques have been used. At the next level the balanced stratified sampling with
variations as per the choice of the prognosis class labels have been tested. Much of the initial time has been
focused on performing the pre-processing of the SEER data set. The classification model for
experimentation has been built using the breast cancer, respiratory cancer and mixed cancer data sets with
three traditional classifiers namely Decision Tree, Naïve Bayes and K-Nearest Neighbour. The three
prognosis factors survival, stage and metastasis have been used as class labels for experimental
comparisons. The results show a steady increase in the prediction accuracy of the balanced stratified model
as the sample size increases, whereas the traditional approach fluctuates before reaching its optimum results.
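One plausible reading of balanced stratified sampling is sketched below: the records are grouped by class label and the same number is drawn from each group. The abstract does not specify the exact variants tested, so the equal-count rule, the function name, and the toy records are assumptions for illustration only.

```python
import random
from collections import defaultdict

def balanced_stratified_sample(records, label_of, per_class, seed=0):
    """Draw the same number of records from each class label."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)
    sample = []
    for label, members in sorted(by_class.items()):
        # Cap at the class size so a rare class never raises an error
        k = min(per_class, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical imbalanced data: 90 records of one class, 10 of the other
records = ([("p%d" % i, "survived") for i in range(90)]
           + [("q%d" % i, "not_survived") for i in range(10)])
sample = balanced_stratified_sample(records, label_of=lambda r: r[1], per_class=10)
```

Plain stratified sampling would instead keep the original 90:10 class ratio in the sample.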
Hybrid prediction model with missing value imputation for medical data 2015-g...Jitender Grover
The document presents a novel hybrid prediction model called HPM-MI that uses K-means clustering and multilayer perceptron (MLP) to improve predictive classification for medical data with missing values. The model first analyzes 11 different imputation techniques using K-means clustering to select the best one for filling missing values in the data. It then uses K-means clustering again to validate class labels and remove incorrectly classified instances before applying the MLP classifier. The model is tested on three medical datasets from the UCI repository and shows improved accuracy, sensitivity, specificity and other metrics compared to existing methods, particularly when datasets have large numbers of missing values.
Breast cancer diagnosis via data mining performance analysis of seven differe...cseij
According to World Health Organization (WHO), breast cancer is the top cancer in women both in the
developed and the developing world. Increased life expectancy, urbanization and adoption of western
lifestyles trigger the occurrence of breast cancer in the developing world. Most cancer events are
diagnosed in the late phases of the illness and so, early detection in order to improve breast cancer
outcome and survival is very crucial.
This study aims to contribute to the early diagnosis of breast cancer by analysing breast
cancer diagnoses. First, data on patients whose cancers have already been diagnosed are gathered
and arranged; these data are then used to predict whether other patients have breast cancer.
The predictions are made with seven different algorithms, and the accuracy of each is
reported. The data about the patients have been taken from UCI Machine Learning Repository thanks to Dr.
William H. Wolberg from the University of Wisconsin Hospitals, Madison. During the prediction process,
RapidMiner 5.0 data mining tool is used to apply data mining with the desired algorithms.
IRJET- Breast Cancer Prediction using Support Vector MachineIRJET Journal
This document discusses using a support vector machine to predict whether breast cancer is benign or malignant. The SVM model is trained on the Wisconsin Diagnostic Breast Cancer dataset containing 569 samples with 32 features each. The SVM achieves 96.09% accuracy at classifying samples into benign versus malignant categories on the test data, demonstrating its effectiveness at predicting breast cancer diagnosis.
This document proposes using a DenseNet-II neural network model to classify mammogram images as benign or malignant. It first preprocesses mammogram images through normalization and data augmentation. It then improves the original DenseNet model by replacing the first convolutional layer with an Inception structure, creating a new DenseNet-II model. This model is tested on mammogram data alongside other common models, and the DenseNet-II model achieves the highest average accuracy of 94.55% for benign-malignant classification.
Diagnosis of Cancer using Fuzzy Rough Set TheoryIRJET Journal
This document presents a study that uses fuzzy rough set theory to diagnose cancer using medical data. It has four main modules: 1) feature selection to identify relevant features using fuzzy rough subset evaluation and particle swarm optimization, 2) instance selection to remove missing/noisy data, 3) classification of the data using fuzzy rough nearest neighbor algorithm, and 4) performance analysis using metrics like accuracy, sensitivity and AUC. The study aims to classify different cancer types by reducing noise and selecting optimal features/instances to improve classifier performance. It is found that fuzzy rough set approaches help preserve meaning during reduction and improve classification compared to other methods.
This document summarizes Mansi Chowkkar's MSc research project on detecting breast cancer from histopathological images using deep learning and transfer learning. The research aims to classify images as malignant or benign with high accuracy and efficiency. It implements CNN and DenseNet-121 models, with the latter using transfer learning with pre-trained ImageNet weights. The research achieved 90.9% test accuracy with CNN and 88.03% accuracy with transfer learning. Related work discusses deep learning applications in healthcare, image processing techniques, prior research on DenseNet, and the use of transfer learning for medical image classification with limited data.
Skin Lesion Classification using Supervised Algorithm in Data Miningijtsrd
Skin cancer is one of the major types of cancer, with an increasing incidence over the past decades. Accurately diagnosing skin lesions to discriminate between benign and malignant lesions is crucial. This work uses two supervised techniques, the J48 algorithm and the support vector machine (SVM). The proposed system applies data mining techniques to skin cancer datasets so that the disease can be diagnosed quickly and accurately. Of the two algorithms, J48 produced better accuracy than SVM. The accuracy of the proposed system is 90.2381, which means its predictions are very close to the actual values. G. Saranya | Dr. S. M. Uma "Skin Lesion Classification using Supervised Algorithm in Data Mining" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-6 , October 2019, URL: https://www.ijtsrd.com/papers/ijtsrd29346.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/29346/skin-lesion-classification-using-supervised-algorithm-in-data-mining/g-saranya
IRJET- Breast Cancer Detection from Histopathology Images: A ReviewIRJET Journal
This document provides a review of techniques for detecting breast cancer from histopathology images. It discusses how histopathology examines tissue samples under a microscope to study diseases at a microscopic level. Detecting cell nuclei is an important first step, as is identifying mitosis (cell division) and metastasis (cancer spreading). The document reviews several techniques that use convolutional neural networks to automatically analyze histopathology images and detect breast cancer, including techniques for nuclei detection and segmentation. These automatic methods aim to assist pathologists by improving efficiency and reducing human error compared to manual analysis.
ICU Patient Deterioration Prediction : A Data-Mining Approachcsandit
A huge amount of medical data is generated every day, which presents a challenge in analysing these data. The obvious solution to this challenge is to reduce the amount of data without information loss. Dimension reduction is considered the most popular approach for reducing data size and also to reduce noise and redundancies in data. In this paper, we investigate the effect of feature selection in improving the prediction of patient deterioration in ICUs. We consider lab tests as features. Thus, choosing a subset of features would mean choosing the most important lab tests to perform. If the number of tests can be reduced by identifying the most important tests, then we could also identify the redundant tests. By omitting the redundant tests, observation time could be reduced and early treatment could be provided to avoid the risk. Additionally, unnecessary monetary cost would be avoided. Our approach uses state-of-the-art feature selection for predicting ICU patient deterioration using the medical lab results. We apply our technique on the publicly available MIMIC-II database and show the effectiveness of the feature selection. We also provide a detailed analysis of the best features identified by our approach.
Iganfis Data Mining Approach for Forecasting Cancer Threatsijsrd.com
This document proposes a new approach called IGANFIS that combines information gain (IG) and adaptive neuro fuzzy inference system (ANFIS) to classify cancer threats from medical data. IG is used to select the most important cancer features from the data to reduce dimensionality before being input to ANFIS. ANFIS then trains on the selected features to build a fuzzy inference system for cancer diagnosis and prediction. The approach is tested on breast cancer datasets and achieves higher classification accuracy compared to other methods.
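The information gain step used for feature ranking can be illustrated with a short sketch. It assumes discrete feature values (continuous measurements would need binning first), and the toy labels and feature columns are hypothetical; the ANFIS stage is not shown.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical toy data: one feature separates the classes, the other does not
labels = ["M", "M", "B", "B"]
informative = ["high", "high", "low", "low"]
uninformative = ["a", "b", "a", "b"]
ig_informative = information_gain(informative, labels)
ig_uninformative = information_gain(uninformative, labels)
```

Features would then be ranked by this score, and only the top-ranked ones passed on to the classifier.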
A Comparative Study on the Methods Used for the Detection of Breast Cancerrahulmonikasharma
Breast cancer has become the leading cause of death among women worldwide. At an initial stage, a tumor in the breast is hard to detect. Manual attempts have proven to be time consuming and inefficient in many cases. Hence there is a need for efficient methods that diagnose cancerous cells with high accuracy and without human involvement. Mammography is an X-ray imaging method that uses high-resolution film so that it can detect tumors in the breast well. This paper presents a comparative study of various data mining methods for detecting breast cancer using image processing techniques.
Definiens technology provides automated digital pathology image analysis to transform pathology into a quantitative science. It handles challenges like tissue variability and staining intensities to automatically identify regions of interest and quantify objects with accuracy and consistency. Definiens supports various digital image and slide formats as well as staining protocols. It has been deployed in over 1,400 applications and provides detailed quantification to support research and clinical decision making. A study using Definiens image analysis achieved statistically significant survival prediction for esophageal cancer patients compared to manual evaluation.
Classification Algorithm Based Analysis of Breast Cancer DataIIRindia
Classification algorithms are frequently used to analyze many kinds of data, available in different repositories, that have real-world applications. The main objective of this research work is to evaluate the performance of classification algorithms in analyzing breast cancer data, based on the characteristics of mammogram images. Different attribute values of cancer-affected mammogram images are considered for analysis in this work. The patients' food habits, age, lifestyle, occupation, their problems related to the disease and other information are taken into account for classification. Finally, the performance of the classification algorithms J48, CART and ADTree is reported with their accuracy. The accuracy of the chosen algorithms is assessed using measures such as specificity, sensitivity and kappa statistics (errors).
Breast cancer is the leading cause of death for women worldwide. Cancer can be discovered early, lowering the rate of death. Machine learning techniques are a hot field of research, and they have been shown to be helpful in cancer prediction and early detection. The primary purpose of this research is to identify which machine learning algorithms are the most successful in predicting and diagnosing breast cancer, according to five criteria: specificity, sensitivity, precision, accuracy, and F1 score. The project is finished in the Anaconda environment, which uses Python's NumPy and SciPy numerical and scientific libraries as well as matplotlib and Pandas. In this study, the Wisconsin diagnostic breast cancer dataset was used to evaluate eleven machine learning classifiers: decision tree, quadratic discriminant analysis, AdaBoost, Bagging meta estimator, Extra randomized trees, Gaussian process classifier, Ridge, Gaussian naïve Bayes, k-Nearest neighbors, multilayer perceptron, and support vector classifier. During performance analysis, extremely randomized trees outperformed all other classifiers with an F1-score of 96.77% after data collection and data analysis.
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast cancer is one of the most serious diseases affecting women. Predicting breast cancer manually takes a lot of time, and it is difficult for the physician to classify cases. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potential of these techniques have found their scope in medical data. Classification techniques tend to simplify the prediction segment.
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...ahmad abdelhafeez
The goal of this paper is to compare different classifiers and multi-classifier fusion with respect to accuracy in detecting breast cancer on four different datasets. We present an implementation of various classification techniques, which represent the best-known algorithms in this field, on four different breast cancer datasets, two for diagnosis and two for prognosis. We present a fusion between classifiers to get the best multi-classifier fusion approach for each dataset individually. Classification accuracy is obtained from the confusion matrix under 10-fold cross validation, and fusion is performed by majority voting (the mode of the classifier outputs). The experimental results show that no classification technique is better than the others across all datasets, since the classification task is affected by the type of dataset. With multi-classifier fusion, accuracy improved in three datasets out of four.
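Majority-voting fusion, described above as taking the mode of the classifier outputs, can be sketched in a few lines. The three prediction lists below are hypothetical stand-ins for the outputs of trained classifiers, and the function name is an illustrative assumption.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-classifier label lists by taking the mode for each sample."""
    fused = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest count;
        # on a tie, the label that appeared first among the classifiers wins
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Hypothetical outputs of three classifiers on four samples (B = benign, M = malignant)
clf_a = ["B", "M", "M", "B"]
clf_b = ["B", "B", "M", "B"]
clf_c = ["M", "M", "M", "B"]
fused = majority_vote([clf_a, clf_b, clf_c])
```

Using an odd number of classifiers on a two-class problem avoids ties altogether.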
The document presents a proposal for developing a hybrid machine learning model for prostate cancer detection. It aims to combine deep convolutional neural networks (DCNN) and fuzzy support vector machines (SVMs) to overcome limitations of individual models. The methodology section outlines the steps as: pre-processing data, extracting features using DCNN, training and testing the fuzzy SVM classifier, and evaluating performance. Key aspects of the DCNN and fuzzy SVM approaches are also summarized, such as convolutional and pooling layers, fully connected layers, and the SVM classification technique. The proposal seeks to improve prostate cancer detection accuracy through this hybrid modeling approach.
Machine Learning Based Approaches for Cancer Classification Using Gene Expres...mlaij
The classification of different types of tumor is of great importance in cancer diagnosis and drug discovery.
Earlier studies on cancer classification have limited diagnostic ability. The recent development of DNA
microarray technology has made it possible to monitor thousands of gene expressions simultaneously. Using this
abundance of gene expression data, researchers are exploring the possibilities of cancer classification.
A number of methods have been proposed with good results, but many issues still need to be addressed. This
paper presents an overview of various cancer classification methods and evaluates them
based on their classification accuracy, computational time and ability to reveal gene information. We have
also introduced and evaluated various proposed gene selection methods. In this paper, several issues
related to cancer classification have also been discussed.
Predictive modeling for breast cancer based on machine learning algorithms an...IJECEIAES
Breast cancer is one of the leading causes of death among women worldwide. However, early prediction of breast cancer plays a crucial role. Therefore, strong needs exist for automatic accurate early prediction of breast cancer. In this paper, machine learning (ML) classifiers combined with features selection methods are used to build an intelligent tool for breast cancer prediction. The Wisconsin diagnostic breast cancer (WDBC) dataset is used to train and test the model. Classification algorithms, including support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), logistic regression (LR), k-nearest neighbors (k-NN), and naïve Bayes, were employed. Performance measures for each of them were obtained, namely: accuracy, precision, recall, F-score, Kappa, Matthews correlation coefficient (MCC), and time. The results indicate that without feature selection, LightGBM achieves the highest accuracy at 95%. With minimum redundancy maximum relevance (mRMR) feature selection (15 features), LightGBM outperforms other classifiers, achieving an accuracy of 98%. For Pearson correlation coefficient feature selection (15 features), LightGBM also excels with a 95% accuracy rate. Lasso feature selection (5 features) produces varied results across classifiers, with logistic regression achieving the highest accuracy at 96%. These findings underscore the importance of feature selection in refining model performance and in improving detection for breast cancer.
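As a rough illustration of correlation-based feature selection, the sketch below ranks features by the absolute Pearson correlation between each feature and the class label and keeps the top k. The abstract does not state how the Pearson coefficient was applied, so ranking against the target is an assumption, and the feature names and values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def top_k_features(columns, target, k):
    """Names of the k features most correlated (in absolute value) with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: 0 = benign, 1 = malignant
target = [0, 0, 1, 1]
columns = {
    "radius": [1.0, 2.0, 3.0, 4.0],
    "noise": [1.0, -1.0, 1.0, -1.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
}
selected = top_k_features(columns, target, k=1)
```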
IRJET - A Conceptual Method for Breast Tumor Classification using SHAP Values ...IRJET Journal
This document proposes a conceptual method for classifying breast tumors as benign or malignant using SHAP values and Adaboost. It involves scoring ultrasound image features using the BI-RADS lexicon with human input. SHAP value mining is used to discover diagnostic rules from the scored features. Weak classifiers are constructed by combining benign and malignant rules. The Adaboost algorithm then integrates the weak classifiers into a strong classifier for tumor classification. Experimental results on 250 patient records show the proposed method achieves high accuracy, specificity, and sensitivity, indicating potential for clinical use.
The current big challenge facing radiologists in healthcare is the automatic detection and classification of masses in breast mammogram images. In the last few years, many researchers have proposed various solutions to this problem. These solutions are effectively dependent and work on annotated breast image data. But these solutions fail when applied to unlabeled and non-annotated breast image data. Therefore, this paper provides the solution to this problem with the help of a neural network that considers any kind of unlabeled data for its procedure. In this solution, the algorithm automatically extracts tumors in images using a segmentation approach, and after that, the features of the tumor are extracted for further processing. This approach used a double thresholding-based segmentation technique to obtain a perfect location of the tumor region, which was not possible in existing techniques in the literature. The experimental results also show that the proposed algorithm provides better accuracy compared to the accuracy of existing algorithms in the literature.
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...IRJET Journal
SVM & GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
International Journal on Soft Computing, Artificial Intelligence and Applications (IJSCAI), Vol.9, No.4,
November 2020
DOI: 10.5121/ijscai.2020.9401
Rashmi Priya (1) and Syed Wajahat Abbas Rizvi (2)
(1) Assistant Professor, GD Goenka University, India
(2) Amity University, Uttar Pradesh, India
ABSTRACT
Breast cancer is a leading cause of mortality among women in developed countries and the second
most prominent cause of cancer mortality among women worldwide. In recent decades, the prevalence
of breast cancer in women has risen dramatically. This paper discusses several data analysis methods
used to detect breast cancer early. Breast cancer diagnosis distinguishes benign from malignant
breast lumps. We approach this diagnostic problem with data mining tools. Data mining is an
important step of knowledge discovery in which intelligent methods are used to detect patterns.
Several clinical breast cancer studies have been conducted using soft computing and machine
learning techniques. Some of these algorithms are simpler or more comprehensive than others. This
research focuses on genetic programming and machine learning algorithms to reliably distinguish
benign from malignant breast cancer. The study aims to optimise the testing algorithm. We use
genetic programming methods to choose the best features and parameter values for the classifiers.
We analyse the Wisconsin breast cancer data set from the U.C.I. machine learning repository. In
this experiment, we compare four Weka clustering strategies with genetic clustering. A comparison
of results shows that sequential minimal optimization (S.M.O.) performs better than the I.B.K. and
B.F. Tree methods, reaching 97.71%.
KEYWORDS
S.M.O., Breast cancer, Machine learning, Feature selection, and WEKA
1. INTRODUCTION
Breast cancer is the most prevalent non-skin cancer in women and the second-largest cause of
cancer mortality in women [1]. On a mammogram, breast cancer usually appears as dense areas and
clusters. A typical benign mass has a round, well-circumscribed border, whereas a malignant mass
usually has a suspicious, rough, and blurred boundary [2],[3].
Demand for machine learning is now growing to the point where it is becoming a routine operation.
Unfortunately, machine learning requires expertise and has a high barrier to entry. Building a
successful machine learning model involves many skills and much experience, including
pre-processing, feature selection, and classification. Computer analysis and machine learning
methods are widely used in medicine, because these techniques can greatly aid medical
professionals in decision-making. A wide variety of breast cancer databases are available; they
support scientific and academic studies and, even more, the integration of earlier computer
analysis with machine learning.
In certain fields and applications, solving such a problem relies on extracting features from
original images acquired in the real world and organising them as vectors. The efficiency of the
processing system depends heavily on the correct choice of these vectors. In many situations,
however, the problem becomes almost impossible to solve because of the excessive dimensionality
of these vectors or anomalies in the data. Reducing the dimensionality of the data set samples to
a more suitable size is therefore sometimes helpful and often necessary, even though this
reduction may lead to minor information loss.
An accurate model built on accurate data will help the doctor diagnose and treat the illness
early, whether the tumour is benign or malignant. This will save doctors time and improve
efficiency. This article focuses on diagnosing whether a breast tumour is benign or malignant,
including at an early stage. The prediction criteria are based on breast cancer symptoms. The
data set used in this paper contains 32 attributes.
A breast cancer diagnosis may be useful in predicting the outcomes of complex diseases or in
recognising the molecular nature of the tumour. Many methods exist for examining and identifying
patterns in breast cancer data. This paper compares empirically the utility of three classical
tree classifiers, chosen specifically to evaluate their effects. Traditional SVM training normally
requires a quadratic programming (Q.P.) solver, and solving the Q.P. optimization problem takes a
long time, particularly for large data sets. SVM training is therefore slow on large data sets.
The SMO-SVM method reduces the data that must be handled, is more reliable, and is easier to
implement [4]. We propose a hybrid SVM-GA (support vector machine with genetic algorithm) approach
to achieve optimal results on the breast cancer data set.
2. LITERATURE SURVEY
Machine learning techniques are widely applied to analysis in the life sciences, and numerous
studies on medical diagnosis have been published. These studies applied different approaches to
the problem and achieved high classification accuracy. Bittern et al. [5] used an artificial
neural network to evaluate breast cancer therapy. They tested their system on a limited set of
data, but the results suggest good agreement with actual survival. Bellaachia and Guven [6]
applied naive Bayes, a decision tree, and a backpropagation neural network to breast cancer
patient data.
Although the accuracy was high (about 90%), the results were not conclusive because the data were
split into only two groups: patients who survived more than five years and patients who died
within five years. This made the findings less meaningful. Ben Brahim and Limam [7] proposed an
approach to evaluating feature selectors. It seeks a small, coherent set of features without
sacrificing predictive accuracy, and uses a ranking algorithm to assign confidence to features.
Huerta [8] proposed a hybrid GA/SVM approach that uses fuzzy logic to reduce the initial problem
size and then identifies a subset of genes, which is evaluated by an SVM. The study in [9]
compared the performance of an artificial neural network (ANN) and a support vector machine (SVM)
for liver cancer classification. On the BUPA Liver Disease Dataset, the two models were compared
and validated in terms of accuracy, robustness, precision, and area under the curve (A.U.C.).
Jabbar et al. [10] combined associative classification with genetic algorithms for heart disease
prediction. Their approach used Gini index statistics in the genetic mutation and crossover
process, together with a consistency-based feature selection technique.
3. CLASSIFICATION
Classification is a data mining technique used to analyse a given data set and assign each
instance to a particular class [11]. The method is designed to minimise classification errors.
Classification is used to extract models that describe the classes in a given data set. Different
learning approaches are used in data mining, including supervised and unsupervised learning [12].
Classification is a first step in evaluating many situations. The overall aim is to improve
understanding of the problem or the quality of predictions. Researchers have proposed many widely
used classification techniques. Designing accurate classifiers is a central task in data mining
and machine learning research. Examples include decision trees, naive Bayes, sequential minimal
optimization (S.M.O.), I.B.K., and B.F. Tree.
4. FEATURE SELECTION
The increasing use of computers in all areas generates large volumes of data. These data are
large and interrelated and must be analysed to identify useful patterns, which makes data mining
a crucial area for data processing, prediction, and other activities. It has become part of a
complex scientific field that addresses real-time problems. Data mining is used in many areas
where data analysis is needed. Data mining techniques are commonly used at various levels, for
example in pattern recognition, and feature selection plays an important role in virtually every
field.
Feature selection attempts to find the smallest possible subset of features. It selects a subset
of the original features by removing redundant and irrelevant ones, which reduces dimensionality
without losing information. This pre-processing step is critical before data mining tasks are
carried out. It improves mining accuracy, computation time, and the comprehensibility of results.
The three feature selection strategies are filters, wrappers, and embedded approaches.
As discussed in [16], the filter approach selects features independently of the classifier used.
Its advantages are that features need to be selected only once and that it is simple and
independent of the classifier. However, it ignores interaction with the classifier, and each
feature is evaluated separately, without regard to feature dependencies. The wrapper approach
depends on the classifier: classifier results are used to assess the quality of a given feature
subset. This method overcomes the drawback of the filter approach because it takes feature
dependencies into account, although it is computationally more expensive than the filter
approach. The embedded approach combines a filter algorithm with a wrapper approach to search for
an optimal feature subset within the construction of the classifier. Its advantage is that it is
less expensive and less prone to overfitting than the wrapper approach. Feature selection has
been applied in many areas over the past two decades:
• Text mining
• Image processing and computer vision
• Industrial applications
• Bioinformatics
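As a rough illustration of the filter strategy described in this section, the sketch below scores each feature independently of any classifier and keeps the top-ranked ones. The scoring rule (absolute difference of class means divided by the overall standard deviation), the function names, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean, pstdev

def filter_scores(rows, labels):
    """Score each feature without consulting any classifier (filter strategy)."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        pos = [v for v, y in zip(column, labels) if y == 1]
        neg = [v for v, y in zip(column, labels) if y == 0]
        spread = pstdev(column) or 1.0  # constant feature: avoid dividing by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores

def select_top_k(rows, labels, k):
    """Return the indices of the k best-scoring features, best first."""
    scores = filter_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]

# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noisy,
# feature 2 is constant and therefore uninformative.
rows = [[1.0, 5.0, 3.0], [1.2, 4.0, 3.0], [3.0, 5.1, 3.0], [3.3, 4.2, 3.0]]
labels = [0, 0, 1, 1]
selected = select_top_k(rows, labels, k=1)
```

Because every feature is scored on its own, the ranking is computed once and can be reused with any classifier. This is also why a filter cannot see dependencies between features.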
5. MATERIAL AND METHODS
Weka is a data mining application that provides a collection of machine learning algorithms.
These algorithms can be applied directly to a data set or called from Java code.
Weka is a platform for [13]:
• Regression
• Clustering
• Association
• Data pre-processing
• Classification
• Visualization
Weka classifiers are models for predicting nominal or numeric quantities. The available learning
schemes include decision trees and lists, support vector machines, instance-based classifiers,
logistic regression, and Bayes networks. Once the data are loaded, all tabs are enabled. The best
algorithm and parameters for a given data representation are found by trial and error.
The Cluster tab is used to identify clusters, or groups of instances, in the data set. Clustering
gives the user information for analysis. The available evaluation modes are use training set,
percentage split, supplied test set, and classes to clusters, and users can ignore selected
attributes of the data set as required. The clustering schemes available in Weka include K-Means,
EM, Cobweb, X-means, and Farthest First.
6. PROPOSED METHOD:
6.1. Sequential Minimal Optimization (S.M.O.)
Sequential minimal optimization is an algorithm for training support vector machines (SVMs).
John Platt introduced sequential minimal optimization (S.M.O.) in 1998 [14] as a simple and fast
approach to SVM training. The underlying idea is to solve the dual quadratic optimization problem
by optimising the smallest possible subset, two variables, at each step.
The advantage of S.M.O. is that it is simple and easy to implement. Training a support vector
machine requires solving a large quadratic programming problem.
S.M.O. breaks this large quadratic programming problem into a series of the smallest possible
subproblems [14]. These small problems are solved analytically, which avoids using a
time-consuming numerical quadratic programming routine as an inner loop. The memory required by
S.M.O. grows linearly with the size of the training set, which allows S.M.O. to handle very large
training sets. Because matrix computation is avoided, S.M.O. scales somewhere between linearly
and quadratically with training set size on various test problems, whereas the standard chunking
SVM algorithm scales somewhere between linearly and cubically. S.M.O. computation time is
dominated by SVM evaluation, so S.M.O. is fastest for linear SVMs and sparse data sets.
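The analytic two-variable step at the heart of S.M.O. can be sketched as follows. This is a minimal illustration of the standard update from Platt's formulation, not the Weka implementation used in this paper. The function name and argument layout are assumptions, and the handling of degenerate pairs is simplified.

```python
def smo_pair_update(a1, a2, y1, y2, e1, e2, k11, k12, k22, c):
    """One analytic S.M.O. step on a pair of Lagrange multipliers.

    a1, a2: current multipliers; y1, y2: labels in {-1, +1};
    e1, e2: prediction errors f(x_i) - y_i; k11, k12, k22: kernel values;
    c: box constraint. Returns the updated pair (a1, a2).
    """
    # Bounds that keep both multipliers inside [0, c] while preserving
    # the equality constraint y1*a1 + y2*a2 = constant.
    if y1 != y2:
        low, high = max(0.0, a2 - a1), min(c, c + a2 - a1)
    else:
        low, high = max(0.0, a1 + a2 - c), min(c, a1 + a2)
    eta = k11 + k22 - 2.0 * k12  # curvature along the constraint line
    if eta <= 0 or low >= high:
        return a1, a2  # simplification: skip degenerate pairs
    a2_new = a2 + y2 * (e1 - e2) / eta    # unconstrained optimum
    a2_new = min(high, max(low, a2_new))  # clip to the feasible segment
    a1_new = a1 + y1 * y2 * (a2 - a2_new)
    return a1_new, a2_new

# Two 1-D points x1 = 1 (label +1) and x2 = -1 (label -1), linear kernel,
# all multipliers and the bias initially zero, so f(x) = 0 for both points.
pair = smo_pair_update(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, -1.0, 1.0, 1.0)
```

No numerical quadratic programming solver appears anywhere in the step: each update is a closed-form expression followed by clipping, which is what keeps the inner loop cheap.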
7. FRAMEWORK
Fig. 3 outlines the SVM and genetic clustering process. The algorithm for the SVM process enhanced
with genetic clustering is given below.
Fig. 3. Proposed Framework of S.M.O. classification with genetic algorithm
The initial features are selected using the SVM filter. The output is evaluated in terms of
fitness, precision, recall, and accuracy, among other measures. In data mining, these metrics
play a crucial role in assessing the results of different classifiers and in guiding the
algorithms, as shown in Table I.
Classification performance is reported in the tables below.
7.1. The Proposed Hybrid Feature Selection Algorithm
In this section we describe the main S.M.O. algorithm with genetic clustering. Several operations
must be defined before a genetic algorithm can be used: the initial population, the fitness
function, selection, and crossover.
STEP 1: Population initialization: the initial population is generated randomly. For each
element, a random function generates a random number in [0, 1]. If the number exceeds a
threshold, g = 1; otherwise, g = 0. For example, the threshold may be 0.5, so that 1 and 0 are
equally likely.
STEP 2: Train the support vector classifier using sequential minimal optimization.
STEP 3: This step globally replaces all missing values and converts nominal attributes into
binary attributes. All attributes are also normalised by default. (This matters when interpreting
the output coefficients, which are based on the normalised data, not the original data.)
STEP 4: The genetic clustering step combines a genetic algorithm with K-means into a novel
clustering method.
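STEP 1 and the genetic operators named in this section can be sketched in a few lines. This is a hedged illustration only. The elitist selection scheme, the mutation operator, the parameter values, and `toy_fitness` are assumptions, since the text does not specify them. In the setting described here the fitness would come from the classifier rather than from a hand-written function.

```python
import random

def init_population(pop_size, n_features, threshold=0.5, rng=None):
    """STEP 1: random binary chromosomes; gene g = 1 means the feature is kept."""
    rng = rng or random.Random()
    return [[1 if rng.random() > threshold else 0 for _ in range(n_features)]
            for _ in range(pop_size)]

def crossover(parent_a, parent_b, rng):
    """Single-point crossover between two chromosomes."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rate, rng):
    """Flip each gene with probability `rate` (an added assumption)."""
    return [1 - g if rng.random() < rate else g for g in chromosome]

def evolve(fitness, n_features, pop_size=20, generations=30, rate=0.05, seed=0):
    """Elitist loop: keep the fitter half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = init_population(pop_size, n_features, rng=rng)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

def toy_fitness(chromosome):
    """Hypothetical fitness: reward features 0 and 3, penalise subset size."""
    return 2 * chromosome[0] + 2 * chromosome[3] - 0.5 * sum(chromosome)

best = evolve(toy_fitness, n_features=6)
```

Keeping the fitter half unchanged in every generation means the best fitness found so far can never decrease, at the cost of somewhat less diversity in the population.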
8. RESULTS AND DISCUSSION
In our view, genetic algorithms are an original approach to feature selection. We propose a
scheme for evaluating individuals based on a combination of SVM [15] classifiers, each trained on
one feature. More specifically, we associate an SVM classifier with each feature and then perform
a selection of classifiers. In the proposed method, the classifier for each feature is trained
once, before the genetic algorithm is run. The fitness function of the genetic algorithm is
computed from a combination of these classifiers. This significantly reduces classifier training
time, and the proposed selection approach therefore significantly reduces execution time. The
data set contains 569 breast cancer records, of which 357 are benign and 212 are malignant. It
has 32 attributes, and the class label indicates whether the tumour is benign or malignant. The
features are computed from a digitised image of a fine needle aspirate (F.N.A.) of a breast mass
and describe characteristics of the cell nuclei in the image.
8.1. S.M.O. Based SVM Algorithm Results
We ran the S.M.O. classifier on the filtered output with 5-fold cross-validation. Several other
algorithms were compared with our system; the results are shown below:
where B = Benign and M = Malignant.
The S.M.O. algorithm performed better than the other algorithms in terms of accuracy, recall,
F-measure, and R.O.C. area.
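For reference, the accuracy, recall, and F-measure named in this comparison can be computed from a binary confusion matrix as sketched below. The function name and the counts are hypothetical and are not the paper's results. Malignant is treated as the positive class by assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure for the positive class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Hypothetical counts for illustration only.
m = binary_metrics(tp=40, fp=10, fn=10, tn=140)
```

With these counts the accuracy is 0.9 while precision, recall, and F-measure are all 0.8, which shows why accuracy alone can flatter a classifier on an imbalanced data set.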
8.2. Genetic Clustering Algorithm Results
The genetic clustering algorithm follows [15]. The standard Manhattan distance is used for
similarity calculations. Simple K-Means handles the missing values. If an operation creates a
chromosome in which all records belong to one cluster, the chromosome is changed until it
contains at least two clusters.
Here, the algorithm took 38.81 seconds to run.
In total, 36 instances were incorrectly clustered, which is 6.32 percent of the entire data set.
The figure below visualises the resulting cluster formation.
In the plot, blue points are malignant cases and red points are benign cases. The plot shows that
the clustering was successful.
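The two ingredients of this evaluation, Manhattan-distance assignment and the share of incorrectly clustered instances, can be sketched as follows. This is an illustrative simplification and not the genetic clustering algorithm of [15]. The majority-label rule for counting errors, the function names, and the toy data are assumptions.

```python
def manhattan(a, b):
    """Manhattan (city-block) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def assign_clusters(points, centroids):
    """Assign each point to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: manhattan(p, centroids[c]))
            for p in points]

def misclustered_rate(assignments, labels):
    """Share of points whose label differs from their cluster's majority label."""
    wrong = 0
    for c in set(assignments):
        members = [y for a, y in zip(assignments, labels) if a == c]
        majority = max(set(members), key=members.count)
        wrong += sum(1 for y in members if y != majority)
    return wrong / len(labels)

# Toy data (hypothetical): one benign point lies close to the malignant centroid.
points = [[1, 1], [2, 1], [8, 9], [9, 8], [8, 8]]
labels = ["B", "B", "M", "M", "B"]
centroids = [[1, 1], [9, 9]]
assignments = assign_clusters(points, centroids)
rate = misclustered_rate(assignments, labels)
```

Applied to the figures reported in this section, 36 incorrectly clustered instances out of 569 correspond to roughly 6.3 percent of the data set.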
9. CONCLUSION
This paper presents a new approach to feature selection. The approach is based on genetic
algorithms combined with SVM classification under 5-fold cross-validation. Our system was
evaluated on the Wisconsin breast cancer data set against several other classifiers. We obtained
very high accuracy in classification and fast results in clustering for breast cancer diagnosis.
The method not only provides a more reliable estimate of model performance, but also helps to
determine the best parameters for the ML algorithms. The results show the robustness of the
proposed method. Our next goal is to apply the G.A. analysis to diverse data sets from other
fields of inquiry, to check that it achieves the required balance of accuracy and time.
REFERENCES
1. E.C. Fear, P.M. Meaney, and M.A. Stuchly, "Microwaves for breast cancer detection," IEEE
Potentials, vol. 22, pp. 12-18, February-March 2003.
2. Homer, M.J. Mammographic Interpretation: A Practical Approach. Boston, MA, second edition, 1997.
3. American College of Radiology, Reston, VA. Illustrated Breast Imaging Reporting and Data System
(BI-RADS™), third edition, 1998.
4. Urmaliya, Ajay, Singhai. (2013). Sequential minimal optimization for support vector machine with
feature selection in breast cancer diagnosis. 2013 IEEE 2nd International Conference on Image
Information Processing, IEEE ICIIP 2013, pp. 481-486. DOI: 10.1109/ICIIP.2013.6707638.
5. Bittern, R., Dolgobrodov, D., Marshall, R., Moore, P., Steele, R. Artificial neural networks in
cancer management. All Hands Conference 19 (2007), pp. 251-263.
6. Bellaachia, A., and Guven, E. Predicting breast cancer survivability using data mining techniques.
7. Afef Ben Brahim, Mohamed Limam, 2017, "Ensemble feature selection for high-dimensional data: a
new approach and comparative analysis," IFCS, vol. 12(4), pp. 937-952.
8. Huerta, E.B., 2006, "A hybrid GA/SVM approach for gene selection and classification of microarray
data," Springer, vol. 3907, pp. 34-44.
9. Sallehuddin, R.N.H.aidillah, S.H., Mustaffa, N.H., 2014, "Liver cancer detection using artificial
neural networks and support vector machines," in: Proceedings of the International Conference on
Progress in Communication Network, and Computation, Elsevier Research, pp. 487-493.
10. Jabbar, M.A., Deekshatulu, B.L., Chandra, 2012, "Heart disease prediction system using associative
classification and genetic algorithm."
11. S. S. Nikam, "A Comparative Study of Classification Techniques in Data Mining Algorithms," Oriental
Journal of Computer Science & Technology, vol. 8, pp. 13-19, April 2015.
12. S. Neelamegam, E. Ramaraj, International Journal of P2P Network Trends and Technology
(IJPTT), vol. 4, pp. 369-374, September 2013.
13. Eibe Frank, Mark A. Hall, and Ian H. Witten (2016). The WEKA Workbench. Online appendix for "Data
Mining: Practical Machine Learning Tools and Techniques," Morgan Kaufmann, Fourth Edition, 2016.
14. Platt, J.C.: Sequential minimal optimization: a fast algorithm for training support vector
machines. MSR-TR-98-14, Microsoft Research, 1998.
15. Islam, M. Z., Estivill-Castro, V., Rahman, M. A. (2018). Combining K-Means and a
Genetic Algorithm into a Novel Method for High Quality Clustering. Expert Systems with
Applications, 91:402-417.
16. Sangaiah, Vincent Antony Kumar, A. (2019). Improving the efficiency of medical diagnosis
using hybrid feature selection by ReliefF and entropy-based genetic search (RF-EGA) approach:
breast cancer prediction. Cluster Computing, 22. DOI: 10.1007/s10586-1702-5.