Breast cancer is the leading cause of death for women worldwide. Cancer can be discovered early, lowering the rate of death. Machine learning techniques are a hot field of research, and they have been shown to be helpful in cancer prediction and early detection. The primary purpose of this research is to identify which machine learning algorithms are the most successful in predicting and diagnosing breast cancer, according to five criteria: specificity, sensitivity, precision, accuracy, and F1 score. The project is finished in the Anaconda environment, which uses Python's NumPy and SciPy numerical and scientific libraries as well as matplotlib and Pandas. In this study, the Wisconsin diagnostic breast cancer dataset was used to evaluate eleven machine learning classifiers: decision tree, quadratic discriminant analysis, AdaBoost, Bagging meta estimator, Extra randomized trees, Gaussian process classifier, Ridge, Gaussian nave Bayes, k-Nearest neighbors, multilayer perceptron, and support vector classifier. During performance analysis, extremely randomized trees outperformed all other classifiers with an F1-score of 96.77% after data collection and data analysis.
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data processing tools, we tackled this disease analysis. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. Several clinical breast cancer studies were conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier, easier, or more comprehensive than others. This research is focused on genetic programming and machine learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise the testing algorithm. We used genetic programming methods to choose classification machines' best features and parameter values. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F. Tree processes, i.e. 97.71%.
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second
most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast
cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast
cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data
processing tools, we tackled this disease analysis. Data mining is an important step of library discovery
where intelligent methods are used to detect patterns. Several clinical breast cancer studies were
conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier,
easier, or more comprehensive than others. This research is focused on genetic programming and machine
learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise
the testing algorithm. We used genetic programming methods to choose classification machines' best
features and parameter values. Data mining is an important step of library discovery where intelligent
methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data
set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A
comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F.
Tree processes, i.e. 97.71%.
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast
cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data processing tools, we tackled this disease analysis. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. Several clinical breast cancer studies were conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier, easier, or more comprehensive than others. This research is focused on genetic programming and machine learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise
the testing algorithm. We used genetic programming methods to choose classification machines' best features and parameter values. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data
set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F. Tree processes, i.e. 97.71%
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data processing tools, we tackled this disease analysis. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. Several clinical breast cancer studies were conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier, easier, or more comprehensive than others. This research is focused on genetic programming and machine
learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise the testing algorithm. We used genetic programming methods to choose classification machines' best features and parameter values. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data
set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F. Tree processes, i.e. 97.71%.
Predictive modeling for breast cancer based on machine learning algorithms an...IJECEIAES
Breast cancer is one of the leading causes of death among women worldwide. However, early prediction of breast cancer plays a crucial role. Therefore, strong needs exist for automatic accurate early prediction of breast cancer. In this paper, machine learning (ML) classifiers combined with features selection methods are used to build an intelligent tool for breast cancer prediction. The Wisconsin diagnostic breast cancer (WDBC) dataset is used to train and test the model. Classification algorithms, including support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), logistic regression (LR), k-nearest neighbors (k-NN), and naïve Bayes, were employed. Performance measures for each of them were obtained, namely: accuracy, precision, recall, F-score, Kappa, Matthews correlation coefficient (MCC), and time. The results indicate that without feature selection, LightGBM achieves the highest accuracy at 95%. With minimum redundancy maximum relevance (mRMR) feature selection (15 features), LightGBM outperforms other classifiers, achieving an accuracy of 98%. For Pearson correlation coefficient feature selection (15 features), LightGBM also excels with a 95% accuracy rate. Lasso feature selection (5 features) produces varied results across classifiers, with logistic regression achieving the highest accuracy at 96%. These findings underscore the importance of feature selection in refining model performance and in improving detection for breast cancer.
Classification AlgorithmBased Analysis of Breast Cancer DataIIRindia
The classification algorithms are very frequently used algorithms for analyzing various kinds of data available in different repositories which have real world applications. The main objective of this research work is to find the performance of classification algorithms in analyzing Breast Cancer data via analyzing the mammogram images based its characteristics.Different attribute values of cancer affected mammogram images are considered for analysis in this work. The Patients food habits, age of the patients, their life styles, occupation, their problem about the diseases and other information are taken into account for classification. Finally, performance of classification algorithms J48, CART and ADTree are given with its accuracy. The accuracy of taken algorithms is measured by various measures like specificity, sensitivity and kappa statistics (Errors).
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
The most common type of cancer in women
worldwide is the Breast Cancer. Breast cancer may be
detected early using Mammograms, probably before it's
spread. Recurrent breast cancer could occur months or years
after initial treatment. The cancer could return within the
same place because the original cancer (local recurrence), or it
may spread to different areas of your body (distant
recurrence). Early stage treatment is done not only to cure
breast cancer however additionally facilitate in preventing its
repetition/recurrence. Data mining algorithms provide
assistance in predicting the early-stage breast cancer that
continually has been difficult analysis drawback. The
projected analysis can establish the most effective algorithm
that predicts the recurrence of the breast cancer and improve
the accuracy the algorithms. Large information like Clump,
Classification, Association Rules, Prediction and Neural
Networks, Decision Trees can be analyzed using data mining
applications and techniques.
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data processing tools, we tackled this disease analysis. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. Several clinical breast cancer studies were conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier, easier, or more comprehensive than others. This research is focused on genetic programming and machine learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise the testing algorithm. We used genetic programming methods to choose classification machines' best features and parameter values. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F. Tree processes, i.e. 97.71%.
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second
most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast
cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast
cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data
processing tools, we tackled this disease analysis. Data mining is an important step of library discovery
where intelligent methods are used to detect patterns. Several clinical breast cancer studies were
conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier,
easier, or more comprehensive than others. This research is focused on genetic programming and machine
learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise
the testing algorithm. We used genetic programming methods to choose classification machines' best
features and parameter values. Data mining is an important step of library discovery where intelligent
methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data
set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A
comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F.
Tree processes, i.e. 97.71%.
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast
cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data processing tools, we tackled this disease analysis. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. Several clinical breast cancer studies were conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier, easier, or more comprehensive than others. This research is focused on genetic programming and machine learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise
the testing algorithm. We used genetic programming methods to choose classification machines' best features and parameter values. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data
set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F. Tree processes, i.e. 97.71%
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
Mortality leading among women in developed countries is breast cancer. Breast cancer is women's second most prominent cause of cancer mortality worldwide. In recent decades, women's high prevalence of breast cancer has risen dramatically. This paper discussed several data analysis methods used to detect breast cancer early. Breast cancer diagnosis distinguishes benign and malignant breast lumps. Using data processing tools, we tackled this disease analysis. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. Several clinical breast cancer studies were conducted using soft computing and machine learning techniques. Sometimes their algorithms are easier, easier, or more comprehensive than others. This research is focused on genetic programming and machine
learning algorithms to reliably identify benign and malignant breast cancer. This study aimed to optimise the testing algorithm. We used genetic programming methods to choose classification machines' best features and parameter values. Data mining is an important step of library discovery where intelligent methods are used to detect patterns. We are analysing data accessible from the U.C.I. deep-learning data
set in Wisconsin. In this experiment, we equate four Weka clustering strategies with genetic clustering. A comparison of results reveals that sequential minimal optimization (S.M.O.) is better than I.B.K. and B.F. Tree processes, i.e. 97.71%.
Predictive modeling for breast cancer based on machine learning algorithms an...IJECEIAES
Breast cancer is one of the leading causes of death among women worldwide. However, early prediction of breast cancer plays a crucial role. Therefore, strong needs exist for automatic accurate early prediction of breast cancer. In this paper, machine learning (ML) classifiers combined with features selection methods are used to build an intelligent tool for breast cancer prediction. The Wisconsin diagnostic breast cancer (WDBC) dataset is used to train and test the model. Classification algorithms, including support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), logistic regression (LR), k-nearest neighbors (k-NN), and naïve Bayes, were employed. Performance measures for each of them were obtained, namely: accuracy, precision, recall, F-score, Kappa, Matthews correlation coefficient (MCC), and time. The results indicate that without feature selection, LightGBM achieves the highest accuracy at 95%. With minimum redundancy maximum relevance (mRMR) feature selection (15 features), LightGBM outperforms other classifiers, achieving an accuracy of 98%. For Pearson correlation coefficient feature selection (15 features), LightGBM also excels with a 95% accuracy rate. Lasso feature selection (5 features) produces varied results across classifiers, with logistic regression achieving the highest accuracy at 96%. These findings underscore the importance of feature selection in refining model performance and in improving detection for breast cancer.
Classification AlgorithmBased Analysis of Breast Cancer DataIIRindia
The classification algorithms are very frequently used algorithms for analyzing various kinds of data available in different repositories which have real world applications. The main objective of this research work is to find the performance of classification algorithms in analyzing Breast Cancer data via analyzing the mammogram images based its characteristics.Different attribute values of cancer affected mammogram images are considered for analysis in this work. The Patients food habits, age of the patients, their life styles, occupation, their problem about the diseases and other information are taken into account for classification. Finally, performance of classification algorithms J48, CART and ADTree are given with its accuracy. The accuracy of taken algorithms is measured by various measures like specificity, sensitivity and kappa statistics (Errors).
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
The most common type of cancer in women
worldwide is the Breast Cancer. Breast cancer may be
detected early using Mammograms, probably before it's
spread. Recurrent breast cancer could occur months or years
after initial treatment. The cancer could return within the
same place because the original cancer (local recurrence), or it
may spread to different areas of your body (distant
recurrence). Early stage treatment is done not only to cure
breast cancer however additionally facilitate in preventing its
repetition/recurrence. Data mining algorithms provide
assistance in predicting the early-stage breast cancer that
continually has been difficult analysis drawback. The
projected analysis can establish the most effective algorithm
that predicts the recurrence of the breast cancer and improve
the accuracy the algorithms. Large information like Clump,
Classification, Association Rules, Prediction and Neural
Networks, Decision Trees can be analyzed using data mining
applications and techniques.
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...Kiogyf
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND FIREFLY ALGORITHM
ABSTRACT
Cancer is a globally recognized cause of death. A proper cancer analysis demands the classification of several types of tumor. Investigations into microarray gene expressions seem to be a successful platform for revising genetic diseases. Although the standard machine learning (ML) approaches have been efficient in the realization of significant genes and in the classification of new types of cancer cases, their medical and logical application has faced several drawbacks such as DNA microarray data analysis limitation, which includes an incredible number of features and the relatively small size of an instance. To achieve a reasonable and efficient DNA microarray dataset information, there is a need to extend the level of interpretability and forecast approach while maintaining a great level of precision. In this work, a novel way of cancer classification based on based gene expression profiles is presented. This method is a combination of both Firefly algorithm and Mutual Information Method. First, the features are used to select the features before using the Firefly algorithm for feature reduction. Finally, the Support Vector Machine is used to classify cancer into types. The performance of the proposed system was evaluated by using it to classify datasets from colon cancer; the results of the evaluation were compared with some recent approaches.
Keywords: Feature Selection, Firefly Algorithm, Cancer Disease, Mutual Information
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast Cancer is one of the crucial and burning diseases that has invaded women. Predicting breast cancer manually takes a lot of time and it is difficult for the physician to classification. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potentials of these techniques have found its scope in medical data. Classification techniques tend to simplify the prediction segment.
BREAST CANCER DIAGNOSIS USING MACHINE LEARNING ALGORITHMS –A SURVEYijdpsjournal
Breast cancer has become a common factor now-a-days. Despite the fact, not all general hospitals
have the facilities to diagnose breast cancer through mammograms. Waiting for diagnosing a breast
cancer for a long time may increase the possibility of the cancer spreading. Therefore a computerized
breast cancer diagnosis has been developed to reduce the time taken to diagnose the breast cancer and
reduce the death rate. This paper summarizes the survey on breast cancer diagnosis using various machine
learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey
can also help us to know about number of papers that are implemented to diagnose the breast cancer.
At the 35th AICC-RCOG Annual Conference in association with FOGSI and MOGS, Dr. Niranjan Chavan, President of MOGS, gave an address on Artificial Intelligence in Gynaecologic Oncology at Taj Lands' End, Bandra, Mumbai on the 6th November 2022
Breast cancer diagnosis via data mining performance analysis of seven differe...cseij
According to World Health Organization (WHO), breast cancer is the top cancer in women both in the
developed and the developing world. Increased life expectancy, urbanization and adoption of western
lifestyles trigger the occurrence of breast cancer in the developing world. Most cancer events are
diagnosed in the late phases of the illness and so, early detection in order to improve breast cancer
outcome and survival is very crucial.
In this study, it is intended to contribute to the early diagnosis of breast cancer. An analysis on breast
cancer diagnoses for the patients is given. For the purpose, first of all, data about the patients whose
cancers’ have already been diagnosed is gathered and they are arranged, and then whether the other
patients are in trouble with breast cancer is tried to be predicted under cover of those data. Predictions of
the other patients are realized through seven different algorithms and the accuracies of those have been
given. The data about the patients have been taken from UCI Machine Learning Repository thanks to Dr.
William H. Wolberg from the University of Wisconsin Hospitals, Madison. During the prediction process,
RapidMiner 5.0 data mining tool is used to apply data mining with the desired algorithms.
The Evolution and Impact of Medical Science Journals in Advancing Healthcaresana473753
Medical science journals have evolved into essential tools for advancing healthcare by disseminating research findings, promoting evidence-based practices, and fostering collaboration. Their historical significance, role in evidence-based medicine, and adaptability to the digital age make them indispensable in the quest for improved healthcare outcomes. As they continue to evolve, medical science journals will play a vital role in shaping the future of medicine and healthcare worldwide.
"journals" refer to academic or professional publications that contain articles and research papers related to various aspects of the medical field. These journals serve as a platform for the dissemination of new medical knowledge, research findings, clinical studies, and expert opinions. They play a crucial role in advancing medical science, sharing best practices, and keeping healthcare professionals, researchers, and students informed about the latest developments in medicine and related disciplines.
An approach of cervical cancer diagnosis using class weighting and oversampli...TELKOMNIKA JOURNAL
Globally, cervical cancer caused 604,127 new cases and 341,831 deaths in 2020, according to the global cancer observatory. In addition, the number of cervical cancer patients who have no symptoms has grown recently. Therefore, giving patients early notice of the possibility of cervical cancer is a useful task since it would enable them to have a clear understanding of their health state. The use of artificial intelligence (AI), particularly in machine learning, in this work is continually uncovering cervical cancer. With the help of a logit model and a new deep learning technique, we hope to identify cervical cancer using patient-provided data. For better outcomes, we employ Keras deep learning and its technique, which includes class weighting and oversampling. In comparison to the actual diagnostic result, the experimental result with model accuracy is 94.18%, and it also demonstrates a successful logit model cervical cancer prediction.
EFFICACY OF NON-NEGATIVE MATRIX FACTORIZATION FOR FEATURE SELECTION IN CANCER...IJDKP
Over the past few years, there has been a considerable spread of microarray technology in many biological patterns, particularly in those pertaining to cancer diseases like leukemia, prostate, colon cancer, etc. The primary bottleneck that one experiences in the proper understanding of such datasets lies in their dimensionality, and thus for an efficient and effective means of studying the same, a reduction in their dimension to a large extent is deemed necessary. This study is a bid to suggesting different algorithms and approaches for the reduction of dimensionality of such microarray datasets.This study exploits the matrix-like structure of such microarray data and uses a popular technique called Non-Negative Matrix Factorization (NMF) to reduce the dimensionality, primarily in the field of biological data. Classification accuracies are then compared for these algorithms.This technique gives an accuracy of 98%.
EFFICACY OF NON-NEGATIVE MATRIX FACTORIZATION FOR FEATURE SELECTION IN CANCER...IJDKP
Over the past few years, there has been a considerable spread of microarray technology in many
biological patterns, particularly in those pertaining to cancer diseases like leukemia, prostate, colon
cancer, etc. The primary bottleneck that one experiences in the proper understanding of such datasets lies
in their dimensionality, and thus for an efficient and effective means of studying the same, a reduction in
their dimension to a large extent is deemed necessary. This study is a bid to suggesting different algorithms
and approaches for the reduction of dimensionality of such microarray datasets.This study exploits the
matrix-like structure of such microarray data and uses a popular technique called Non-Negative Matrix
Factorization (NMF) to reduce the dimensionality, primarily in the field of biological data. Classification
accuracies are then compared for these algorithms.This technique gives an accuracy of 98%
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
inAbstract— Pattern Recognition (PR) plays an important role in field of Bioinformatics. PR is concerned with processing raw measurement data by a computer to arrive at a prediction that can be used to formulate a decision to be taken. The important problem in which pattern recognition are applied have common that they are too complex to model explicitly. Diverse methods of this PR are used to analyze, segment and manage the high dimensional microarray gene data for classification. PR is concerned with the development of systems that learn to solve a given problem using a set of instances, each instances represented by a number of features. The microarray expression technologies are possible to monitor the expression levels of thousands of genes simultaneously. The microarrays generated large amount of data has stimulate the development of various computational methods to different biological processes by gene expression profiling. Microarray Gene Expression Profiling (MGEP) is important in Bioinformatics, it yield various high dimensional data used in various clinical applications like cancer diagnostics and drug designing. In this work a new schema has developed for classification of unknown malignant tumors into known class. According to this work an new classification scheme includes the transformation of very high dimensional microarray data into mahalanobis space before classification. The eligibility of the proposed classification scheme has proved to 10 commonly available cancer gene datasets, this contains both the binary and multiclass data sets. To improve the performance of the classification gene selection method is applied to the datasets as a preprocessing and data extraction step.
Performance and Evaluation of Data Mining Techniques in Cancer DiagnosisIOSR Journals
Abstract: We analyze the breast Cancer data available from the WBC, WDBC from UCI machine learning with
the aim of developing accurate prediction models for breast cancer using data mining techniques. Data mining
has, for good reason, recently attracted a lot of attention, it is a new Technology, tackling new problem, with
great potential for valuable commercial and scientific discoveries. The experiments are conducted in WEKA.
Several data mining classification techniques were used on the proposed data. There are many classification
techniques in data mining such as Decision Tree, Rules NNge, Tree random forest, Random Tree, lazy IBK. The
aim of this paper is to investigate the performance of different classification techniques. The data breast cancer
data with a total 286 rows and 10 columns will be used to test and justify the different between the classification
methods and algorithm.
Keywords - Machine learning, data mining Weka, classification, breast cancer
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEIJCSEIT Journal
Breast cancer is one of the leading cancers for women in developed countries including India. It is the
second most common cause of cancer death in women. The high incidence of breast cancer in women has
increased significantly in the last years. In this paper we have discussed various data mining approaches
that have been utilized for breast cancer diagnosis and prognosis. Breast Cancer Diagnosis is
distinguishing of benign from malignant breast lumps and Breast Cancer Prognosis predicts when Breast
Cancer is to recur in patients that have had their cancers excised. This study paper summarizes various
review and technical articles on breast cancer diagnosis and prognosis also we focus on current research
being carried out using the data mining techniques to enhance the breast cancer diagnosis and prognosis.
Machine Learning Based Approaches for Cancer Classification Using Gene Expres...mlaij
The classification of different types of tumor is of great importance in cancer diagnosis and drug discovery.
Earlier studies on cancer classification have limited diagnostic ability. The recent development of DNA
microarray technology has made monitoring of thousands of gene expression simultaneously. By using this
abundance of gene expression data researchers are exploring the possibilities of cancer classification.
There are number of methods proposed with good results, but lot of issues still need to be addressed. This
paper present an overview of various cancer classification methods and evaluate these proposed methods
based on their classification accuracy, computational time and ability to reveal gene information. We have
also evaluated and introduced various proposed gene selection method. In this paper, several issues
related to cancer classification have also been discussed.
Because of the rapid growth in technology breakthroughs, including
multimedia and cell phones, Telugu character recognition (TCR) has recently
become a popular study area. It is still necessary to construct automated and
intelligent online TCR models, even if many studies have focused on offline
TCR models. The Telugu character dataset construction and validation using
an Inception and ResNet-based model are presented. The collection of 645
letters in the dataset includes 18 Achus, 38 Hallus, 35 Othulu, 34×16
Guninthamulu, and 10 Ankelu. The proposed technique aims to efficiently
recognize and identify distinctive Telugu characters online. This model's main
pre-processing steps to achieve its goals include normalization, smoothing,
and interpolation. Improved recognition performance can be attained by using
stochastic gradient descent (SGD) to optimize the model's hyperparameters.
Scientific workload execution on a distributed computing platform such as a
cloud environment is time-consuming and expensive. The scientific workload
has task dependencies with different service level agreement (SLA)
prerequisites at different levels. Existing workload scheduling (WS) designs
are not efficient in assuring SLA at the task level. Alongside, induces higher
costs as the majority of scheduling mechanisms reduce either time or energy.
In reducing, cost both energy and makespan must be optimized together for
allocating resources. No prior work has considered optimizing energy and
processing time together in meeting task level SLA requirements. This paper
presents task level energy and performance assurance-workload scheduling
(TLEPA-WS) algorithm for the distributed computing environment. The
TLEPA-WS guarantees energy minimization with the performance
requirement of the parallel application under a distributed computational
environment. Experiment results show a significant reduction in using energy
and makespan; thereby reducing the cost of workload execution in comparison
with various standard workload execution models.
More Related Content
Similar to Comparison of breast cancer classification models on Wisconsin dataset
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND F...Kiogyf
A CLASSIFICATION MODEL ON TUMOR CANCER DISEASE BASED MUTUAL INFORMATION AND FIREFLY ALGORITHM
ABSTRACT
Cancer is a globally recognized cause of death. A proper cancer analysis demands the classification of several types of tumor. Investigations into microarray gene expressions seem to be a successful platform for revising genetic diseases. Although the standard machine learning (ML) approaches have been efficient in the realization of significant genes and in the classification of new types of cancer cases, their medical and logical application has faced several drawbacks such as DNA microarray data analysis limitation, which includes an incredible number of features and the relatively small size of an instance. To achieve a reasonable and efficient DNA microarray dataset information, there is a need to extend the level of interpretability and forecast approach while maintaining a great level of precision. In this work, a novel way of cancer classification based on based gene expression profiles is presented. This method is a combination of both Firefly algorithm and Mutual Information Method. First, the features are used to select the features before using the Firefly algorithm for feature reduction. Finally, the Support Vector Machine is used to classify cancer into types. The performance of the proposed system was evaluated by using it to classify datasets from colon cancer; the results of the evaluation were compared with some recent approaches.
Keywords: Feature Selection, Firefly Algorithm, Cancer Disease, Mutual Information
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
Breast Cancer is one of the crucial and burning diseases that has invaded women. Predicting breast cancer manually takes a lot of time and it is difficult for the physician to classification. So, detecting cancer through various automatic diagnostic techniques is very necessary. Data mining is the process of running powerful classification techniques that extract useful information from data. The uses and potentials of these techniques have found its scope in medical data. Classification techniques tend to simplify the prediction segment.
BREAST CANCER DIAGNOSIS USING MACHINE LEARNING ALGORITHMS –A SURVEYijdpsjournal
Breast cancer has become a common factor now-a-days. Despite the fact, not all general hospitals
have the facilities to diagnose breast cancer through mammograms. Waiting for diagnosing a breast
cancer for a long time may increase the possibility of the cancer spreading. Therefore a computerized
breast cancer diagnosis has been developed to reduce the time taken to diagnose the breast cancer and
reduce the death rate. This paper summarizes the survey on breast cancer diagnosis using various machine
learning algorithms and methods, which are used to improve the accuracy of predicting cancer. This survey
can also help us to know about number of papers that are implemented to diagnose the breast cancer.
At the 35th AICC-RCOG Annual Conference in association with FOGSI and MOGS, Dr. Niranjan Chavan, President of MOGS, gave an address on Artificial Intelligence in Gynaecologic Oncology at Taj Lands' End, Bandra, Mumbai on the 6th November 2022
Breast cancer diagnosis via data mining performance analysis of seven differe...cseij
According to World Health Organization (WHO), breast cancer is the top cancer in women both in the
developed and the developing world. Increased life expectancy, urbanization and adoption of western
lifestyles trigger the occurrence of breast cancer in the developing world. Most cancer events are
diagnosed in the late phases of the illness and so, early detection in order to improve breast cancer
outcome and survival is very crucial.
In this study, it is intended to contribute to the early diagnosis of breast cancer. An analysis on breast
cancer diagnoses for the patients is given. For the purpose, first of all, data about the patients whose
cancers’ have already been diagnosed is gathered and they are arranged, and then whether the other
patients are in trouble with breast cancer is tried to be predicted under cover of those data. Predictions of
the other patients are realized through seven different algorithms and the accuracies of those have been
given. The data about the patients have been taken from UCI Machine Learning Repository thanks to Dr.
William H. Wolberg from the University of Wisconsin Hospitals, Madison. During the prediction process,
RapidMiner 5.0 data mining tool is used to apply data mining with the desired algorithms.
The Evolution and Impact of Medical Science Journals in Advancing Healthcaresana473753
Medical science journals have evolved into essential tools for advancing healthcare by disseminating research findings, promoting evidence-based practices, and fostering collaboration. Their historical significance, role in evidence-based medicine, and adaptability to the digital age make them indispensable in the quest for improved healthcare outcomes. As they continue to evolve, medical science journals will play a vital role in shaping the future of medicine and healthcare worldwide.
"journals" refer to academic or professional publications that contain articles and research papers related to various aspects of the medical field. These journals serve as a platform for the dissemination of new medical knowledge, research findings, clinical studies, and expert opinions. They play a crucial role in advancing medical science, sharing best practices, and keeping healthcare professionals, researchers, and students informed about the latest developments in medicine and related disciplines.
An approach of cervical cancer diagnosis using class weighting and oversampli...TELKOMNIKA JOURNAL
Globally, cervical cancer caused 604,127 new cases and 341,831 deaths in 2020, according to the global cancer observatory. In addition, the number of cervical cancer patients who have no symptoms has grown recently. Therefore, giving patients early notice of the possibility of cervical cancer is a useful task since it would enable them to have a clear understanding of their health state. The use of artificial intelligence (AI), particularly in machine learning, in this work is continually uncovering cervical cancer. With the help of a logit model and a new deep learning technique, we hope to identify cervical cancer using patient-provided data. For better outcomes, we employ Keras deep learning and its technique, which includes class weighting and oversampling. In comparison to the actual diagnostic result, the experimental result with model accuracy is 94.18%, and it also demonstrates a successful logit model cervical cancer prediction.
EFFICACY OF NON-NEGATIVE MATRIX FACTORIZATION FOR FEATURE SELECTION IN CANCER...IJDKP
Over the past few years, there has been a considerable spread of microarray technology in many biological patterns, particularly in those pertaining to cancer diseases like leukemia, prostate, colon cancer, etc. The primary bottleneck that one experiences in the proper understanding of such datasets lies in their dimensionality, and thus for an efficient and effective means of studying the same, a reduction in their dimension to a large extent is deemed necessary. This study is a bid to suggesting different algorithms and approaches for the reduction of dimensionality of such microarray datasets.This study exploits the matrix-like structure of such microarray data and uses a popular technique called Non-Negative Matrix Factorization (NMF) to reduce the dimensionality, primarily in the field of biological data. Classification accuracies are then compared for these algorithms.This technique gives an accuracy of 98%.
EFFICACY OF NON-NEGATIVE MATRIX FACTORIZATION FOR FEATURE SELECTION IN CANCER...IJDKP
Over the past few years, there has been a considerable spread of microarray technology in many
biological patterns, particularly in those pertaining to cancer diseases like leukemia, prostate, colon
cancer, etc. The primary bottleneck that one experiences in the proper understanding of such datasets lies
in their dimensionality, and thus for an efficient and effective means of studying the same, a reduction in
their dimension to a large extent is deemed necessary. This study is a bid to suggesting different algorithms
and approaches for the reduction of dimensionality of such microarray datasets.This study exploits the
matrix-like structure of such microarray data and uses a popular technique called Non-Negative Matrix
Factorization (NMF) to reduce the dimensionality, primarily in the field of biological data. Classification
accuracies are then compared for these algorithms.This technique gives an accuracy of 98%
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
inAbstract— Pattern Recognition (PR) plays an important role in field of Bioinformatics. PR is concerned with processing raw measurement data by a computer to arrive at a prediction that can be used to formulate a decision to be taken. The important problem in which pattern recognition are applied have common that they are too complex to model explicitly. Diverse methods of this PR are used to analyze, segment and manage the high dimensional microarray gene data for classification. PR is concerned with the development of systems that learn to solve a given problem using a set of instances, each instances represented by a number of features. The microarray expression technologies are possible to monitor the expression levels of thousands of genes simultaneously. The microarrays generated large amount of data has stimulate the development of various computational methods to different biological processes by gene expression profiling. Microarray Gene Expression Profiling (MGEP) is important in Bioinformatics, it yield various high dimensional data used in various clinical applications like cancer diagnostics and drug designing. In this work a new schema has developed for classification of unknown malignant tumors into known class. According to this work an new classification scheme includes the transformation of very high dimensional microarray data into mahalanobis space before classification. The eligibility of the proposed classification scheme has proved to 10 commonly available cancer gene datasets, this contains both the binary and multiclass data sets. To improve the performance of the classification gene selection method is applied to the datasets as a preprocessing and data extraction step.
Performance and Evaluation of Data Mining Techniques in Cancer DiagnosisIOSR Journals
Abstract: We analyze the breast Cancer data available from the WBC, WDBC from UCI machine learning with
the aim of developing accurate prediction models for breast cancer using data mining techniques. Data mining
has, for good reason, recently attracted a lot of attention, it is a new Technology, tackling new problem, with
great potential for valuable commercial and scientific discoveries. The experiments are conducted in WEKA.
Several data mining classification techniques were used on the proposed data. There are many classification
techniques in data mining such as Decision Tree, Rules NNge, Tree random forest, Random Tree, lazy IBK. The
aim of this paper is to investigate the performance of different classification techniques. The data breast cancer
data with a total 286 rows and 10 columns will be used to test and justify the different between the classification
methods and algorithm.
Keywords - Machine learning, data mining Weka, classification, breast cancer
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEIJCSEIT Journal
Breast cancer is one of the leading cancers for women in developed countries including India. It is the
second most common cause of cancer death in women. The high incidence of breast cancer in women has
increased significantly in the last years. In this paper we have discussed various data mining approaches
that have been utilized for breast cancer diagnosis and prognosis. Breast Cancer Diagnosis is
distinguishing of benign from malignant breast lumps and Breast Cancer Prognosis predicts when Breast
Cancer is to recur in patients that have had their cancers excised. This study paper summarizes various
review and technical articles on breast cancer diagnosis and prognosis also we focus on current research
being carried out using the data mining techniques to enhance the breast cancer diagnosis and prognosis.
Machine Learning Based Approaches for Cancer Classification Using Gene Expres...mlaij
The classification of different types of tumor is of great importance in cancer diagnosis and drug discovery.
Earlier studies on cancer classification have limited diagnostic ability. The recent development of DNA
microarray technology has made monitoring of thousands of gene expression simultaneously. By using this
abundance of gene expression data researchers are exploring the possibilities of cancer classification.
There are number of methods proposed with good results, but lot of issues still need to be addressed. This
paper present an overview of various cancer classification methods and evaluate these proposed methods
based on their classification accuracy, computational time and ability to reveal gene information. We have
also evaluated and introduced various proposed gene selection method. In this paper, several issues
related to cancer classification have also been discussed.
Similar to Comparison of breast cancer classification models on Wisconsin dataset (20)
Because of the rapid growth in technology breakthroughs, including
multimedia and cell phones, Telugu character recognition (TCR) has recently
become a popular study area. It is still necessary to construct automated and
intelligent online TCR models, even if many studies have focused on offline
TCR models. The Telugu character dataset construction and validation using
an Inception and ResNet-based model are presented. The collection of 645
letters in the dataset includes 18 Achus, 38 Hallus, 35 Othulu, 34×16
Guninthamulu, and 10 Ankelu. The proposed technique aims to efficiently
recognize and identify distinctive Telugu characters online. This model's main
pre-processing steps to achieve its goals include normalization, smoothing,
and interpolation. Improved recognition performance can be attained by using
stochastic gradient descent (SGD) to optimize the model's hyperparameters.
Scientific workload execution on a distributed computing platform such as a
cloud environment is time-consuming and expensive. The scientific workload
has task dependencies with different service level agreement (SLA)
prerequisites at different levels. Existing workload scheduling (WS) designs
are not efficient in assuring SLA at the task level. Alongside, induces higher
costs as the majority of scheduling mechanisms reduce either time or energy.
In reducing, cost both energy and makespan must be optimized together for
allocating resources. No prior work has considered optimizing energy and
processing time together in meeting task level SLA requirements. This paper
presents task level energy and performance assurance-workload scheduling
(TLEPA-WS) algorithm for the distributed computing environment. The
TLEPA-WS guarantees energy minimization with the performance
requirement of the parallel application under a distributed computational
environment. Experiment results show a significant reduction in using energy
and makespan; thereby reducing the cost of workload execution in comparison
with various standard workload execution models.
Investigating human subjects is the goal of predicting human emotions in the
real world scenario. A significant number of psychological effects require
(feelings) to be produced, directly releasing human emotions. The
development of effect theory leads one to believe that one must be aware of
one's sentiments and emotions to forecast one's behavior. The proposed line
of inquiry focuses on developing a reliable model incorporating
neurophysiological data into actual feelings. Any change in emotional affect
will directly elicit a response in the body's physiological systems. This
approach is named after the notion of Gaussian mixture models (GMM). The
statistical reaction following data processing, quantitative findings on emotion
labels, and coincidental responses with training samples all directly impact the
outcomes that are accomplished. In terms of statistical parameters such as
population mean and standard deviation, the suggested method is evaluated
compared to a technique considered to be state-of-the-art. The proposed
system determines an individual's emotional state after a minimum of 6
iterative learning using the Gaussian expectation-maximization (GEM)
statistical model, in which the iterations tend to continue to zero error. Perhaps
each of these improves predictions while simultaneously increasing the
amount of value extracted.
Early diagnosis of cancers is a major requirement for patients and a
complicated job for the oncologist. If it is diagnosed early, it could have made
the patient more likely to live. For a few decades, fuzzy logic emerged as an
emphatic technique in the identification of diseases like different types of
cancers. The recognition of cancer diseases mostly operated with inexactness,
inaccuracy, and vagueness. This paper aims to design the fuzzy expert system
(FES) and its implementation for the detection of prostate cancer. Specifically,
prostate-specific antigen (PSA), prostate volume (PV), age, and percentage
free PSA (%FPSA) are used to determine prostate cancer risk (PCR), while
PCR serves as an output parameter. Mamdani fuzzy inference method is used
to calculate a range of PCR. The system provides a scale of risk of prostate
cancer and clears the path for the oncologist to determine whether their
patients need a biopsy. This system is fast as it requires minimum calculation
and hence comparatively less time which reduces mortality and morbidity and
is more reliable than other economic systems and can be frequently used by
doctors.
The biomedical profession has gained importance due to the rapid and accurate diagnosis of clinical patients using computer-aided diagnosis (CAD) tools.
The diagnosis and treatment of Alzheimer’s disease (AD) using complementary multimodalities can improve the quality of life and mental state of patients.
In this study, we integrated a lightweight custom convolutional neural network
(CNN) model and nature-inspired optimization techniques to enhance the performance, robustness, and stability of progress detection in AD. A multi-modal
fusion database approach was implemented, including positron emission tomography (PET) and magnetic resonance imaging (MRI) datasets, to create a fused
database. We compared the performance of custom and pre-trained deep learning models with and without optimization and found that employing natureinspired algorithms like the particle swarm optimization algorithm (PSO) algorithm significantly improved system performance. The proposed methodology,
which includes a fused multimodality database and optimization strategy, improved performance metrics such as training, validation, test accuracy, precision, and recall. Furthermore, PSO was found to improve the performance of
pre-trained models by 3-5% and custom models by up to 22%. Combining different medical imaging modalities improved the overall model performance by
2-5%. In conclusion, a customized lightweight CNN model and nature-inspired
optimization techniques can significantly enhance progress detection, leading to
better biomedical research and patient care.
Class imbalance is a pervasive issue in the field of disease classification from
medical images. It is necessary to balance out the class distribution while training a model. However, in the case of rare medical diseases, images from affected
patients are much harder to come by compared to images from non-affected
patients, resulting in unwanted class imbalance. Various processes of tackling
class imbalance issues have been explored so far, each having its fair share of
drawbacks. In this research, we propose an outlier detection based image classification technique which can handle even the most extreme case of class imbalance. We have utilized a dataset of malaria parasitized and uninfected cells. An
autoencoder model titled AnoMalNet is trained with only the uninfected cell images at the beginning and then used to classify both the affected and non-affected
cell images by thresholding a loss value. We have achieved an accuracy, precision, recall, and F1 score of 98.49%, 97.07%, 100%, and 98.52% respectively,
performing better than large deep learning models and other published works.
As our proposed approach can provide competitive results without needing the
disease-positive samples during training, it should prove to be useful in binary
disease classification on imbalanced datasets.
Recently, plant identification has become an active trend due to encouraging
results achieved in plant species detection and plant classification fields
among numerous available plants using deep learning methods. Therefore,
plant classification analysis is performed in this work to address the problem
of accurate plant species detection in the presence of multiple leaves together,
flowers, and noise. Thus, a convolutional neural network based deep feature
learning and classification (CNN-DFLC) model is designed to analyze
patterns of plant leaves and perform classification using generated finegrained feature weights. The proposed CNN-DFLC model precisely estimates
which the given image belongs to which plant species. Several layers and
blocks are utilized to design the proposed CNN-DFLC model. Fine-grained
feature weights are obtained using convolutional and pooling layers. The
obtained feature maps in training are utilized to predict labels and model
performance is tested on the Vietnam plant image (VPN-200) dataset. This
dataset consists of a total number of 20,000 images and testing results are
achieved in terms of classification accuracy, precision, recall, and other
performance metrics. The mean classification accuracy obtained using the
proposed CNN-DFLC model is 96.42% considering all 200 classes from the
VPN-200 dataset.
Big data as a service (BDaaS) platform is widely used by various
organizations for handling and processing the high volume of data generated
from different internet of things (IoT) devices. Data generated from these IoT
devices are kept in the form of big data with the help of cloud computing
technology. Researchers are putting efforts into providing a more secure and
protected access environment for the data available on the cloud. In order to
create a safe, distributed, and decentralised environment in the cloud,
blockchain technology has emerged as a useful tool. In this research paper, we
have proposed a system that uses blockchain technology as a tool to regulate
data access that is provided by BDaaS platforms. We are securing the access
policy of data by using a modified form of ciphertext policy-attribute based
encryption (CP-ABE) technique with the help of blockchain technology. For
secure data access in BDaaS, algorithms have been created using a mix of CPABE with blockchain technology. Proposed smart contract algorithms are
implemented using Eclipse 7.0 IDE and the cloud environment has been
simulated on CloudSim tool. Results of key generation time, encryption time,
and decryption time has been calculated and compared with access control
mechanism without blockchain technology.
Internet of things (IoT) has become one of the eminent phenomena in human
life along with its collaboration with wireless sensor networks (WSNs), due
to enormous growth in the domain; there has been a demand to address the
various issues regarding it such as energy consumption, redundancy, and
overhead. Data aggregation (DA) is considered as the basic mechanism to
minimize the energy efficiency and communication overhead; however,
security plays an important role where node security is essential due to the
volatile nature of WSN. Thus, we design and develop proximate node aware
secure data aggregation (PNA-SDA). In the PNA-SDA mechanism, additional
data is used to secure the original data, and further information is shared with
the proximate node; moreover, further security is achieved by updating the
state each time. Moreover, the node that does not have updated information is
considered as the compromised node and discarded. PNA-SDA is evaluated
considering the different parameters like average energy consumption, and
average deceased node; also, comparative analysis is carried out with the
existing model in terms of throughput and correct packet identification.
Drones provide an alternative progression in protection submissions since
they are capable of conducting autonomous seismic investigations. Recent
advancement in unmanned aerial vehicle (UAV) communication is an internet
of a drone combined with 5G networks. Because of the quick utilization of
rapidly progressed registering frameworks besides 5G officialdoms, the
information from the user is consistently refreshed and pooled. Thus, safety
or confidentiality is vital among clients, and a proficient substantiation
methodology utilizing a vigorous sanctuary key. Conventional procedures
ensure a few restrictions however taking care of the assault arrangements in
information transmission over the internet of drones (IOD) environmental
frameworks. A unique hyperelliptical curve (HEC) cryptographically based
validation system is proposed to provide protected data facilities among
drones. The proposed method has been compared with the existing methods
in terms of packet loss rate, computational cost, and delay and thereby
provides better insight into efficient and secure communication. Finally, the
simulation results show that our strategy is efficient in both computation and
communication.
Monitoring behavior, numerous actions, or any such information is considered
as surveillance and is done for information gathering, influencing, managing,
or directing purposes. Citizens employ surveillance to safeguard their
communities. Governments do this for the purposes of intelligence collection,
including espionage, crime prevention, the defense of a method, a person, a
group, or an item; or the investigation of criminal activity. Using an internet
of things (IoT) rover, the area will be secured with better secrecy and
efficiency instead of humans, will provide an additional safety step. In this
paper, there is a discussion about an IoT rover for remote surveillance based
around a Raspberry Pi microprocessor which will be able to monitor a
closed/open space. This rover will allow safer survey operations and would
help to reduce the risks involved with it.
In a world where climate change looms large the spotlight often shines on
greenhouse gases, but the shadow of man-made aerosols should not be
underestimated. These tiny particles play a pivotal role in disrupting Earth's
radiative equilibrium, yet many mysteries surround their influence on various
physical aspects of our planet. The root of these mysteries lies in the limited
data we have on aerosol sources, formation processes, conversion dynamics,
and collection methods. Aerosols, composed of particulate matter (PM),
sulfates, and nitrates, hold significant sway across the hemisphere. Accurate
measurement demands the refinement of in-situ, satellite, and ground-based
techniques. As aerosols interact intricately with the environment, their full
impact remains an enigma. Enter a groundbreaking study in Morocco that
dared to compare an internet of thing (IoT) system with satellite-based
atmospheric models, with a focus on fine particles below 10 and 2.5
micrometers in diameter. The initial results, particularly in regions abundant
with extraction pits, shed light on the IoT system's potential to decode
aerosols' role in the grand narrative of climate change. These findings inspire
hope as we confront the formidable global challenge of climate change.
The use of technology has a significant impact to reduce the consequences of
accidents. Sensors, small components that detect interactions experienced by
various components, play a crucial role in this regard. This study focuses on
how the MPU6050 sensor module can be used to detect the movement of
people who are falling, defined as the inability of the lower body, including
the hips and feet, to support the body effectively. An airbag system is
proposed to reduce the impact of a fall. The data processing method in this
study involves the use of a threshold value to identify falling motion. The
results of the study have identified a threshold value for falling motion,
including an acceleration relative (AR) value of less than or equal to 0.38 g,
an angle slope of more than or equal to 40 degrees, and an angular velocity
of more than or equal to 30 °/s. The airbag system is designed to inflate
faster than the time of impact, with a gas flow rate of 0.04876 m3
/s and an
inflating time of 0.05 s. The overall system has a specificity value of 100%,
a sensitivity of 85%, and an accuracy of 94%.
The fundamental principle of the paper is that the soil moisture sensor obtains
the moisture content level of the soil sample. The water pump is automatically
activated if the moisture content is insufficient, which causes water to flow
into the soil. The water pump is immediately turned off when the moisture
content is high enough. Smart home, smart city, smart transportation, and
smart farming are just a few of the new intelligent ideas that internet of things
(IoT) includes. The goal of this method is to increase productivity and
decrease manual labour among farmers. In this paper, we present a system for
monitoring and regulating water flow that employs a soil moisture sensor to
keep track of soil moisture content as well as the land’s water level to keep
track of and regulate the amount of water supplied to the plant. The device
also includes an automated led lighting system.
In order to provide sensing services to low-powered IoT devices, wireless sensor networks (WSNs) organize specialized transducers into networks. Energy usage is one of the most important design concerns in WSN because it is very hard to replace or recharge the batteries in sensor nodes. For an energy-constrained network, the clustering technique is crucial in preserving battery life. By strategically selecting a cluster head (CH), a network's load can be balanced, resulting in decreased energy usage and extended system life. Although clustering has been predominantly used in the literature, the concept of chain-based clustering has not yet been explored. As a result, in this paper, we employ a chain-based clustering architecture for data dissemination in the network. Furthermore, for CH selection, we employ the coati optimisation algorithm, which was recently proposed and has demonstrated significant improvement over other optimization algorithms. In this method, the parameters considered for selecting the CH are energy, node density, distance, and the network’s average energy. The simulation results show tremendous improvement over the competitive cluster-based routing algorithms in the context of network lifetime, stability period (first node dead), transmission rate, and the network's power reserves.
The construction industry is an industry that is always surrounded by
uncertainties and risks. The industry is always associated with a threatindustry which has a complex, tedious layout and techniques characterized by
unpredictable circumstances. It comprises a variety of human talents and the
coordination of different areas and activities associated with it. In this
competitive era of the construction industry, delays and cost overruns of the
project are often common in every project and the causes of that are also
common. One of the problems which we are trying to cater to is the improper
handling of materials at the construction site. In this paper, we propose
developing a system that is capable of tracking construction material on site
that would benefit the contractor and client for better control over inventory
on-site and to minimize loss of material that occurs due to theft and misplacing
of materials.
Today, health monitoring relies heavily on technological advancements. This
study proposes a low-power wide-area network (LPWAN) based, multinodal
health monitoring system to monitor vital physiological data. The suggested
system consists of two nodes, an indoor node, and an outdoor node, and the
nodes communicate via long range (LoRa) transceivers. Outdoor nodes use an
MPU6050 module, heart rate, oxygen pulse, temperature, and skin resistance
sensors and transmit sensed values to the indoor node. We transferred the data
received by the master node to the cloud using the Adafruit cloud service. The
system can operate with a coverage of 4.5 km, where the optimal distance
between outdoor sensor nodes and the indoor master node is 4 km. To further
predict fall detection, various machine learning classification techniques have
been applied. Upon comparing various classifier techniques, the decision tree
method achieved an accuracy of 0.99864 with a training and testing ratio of
70:30. By developing accurate prediction models, we can identify high-risk
individuals and implement preventative measures to reduce the likelihood of
a fall occurring. Remote monitoring of the health and physical status of elderly
people has proven to be the most beneficial application of this technology.
The effectiveness of adaptive filters are mainly dependent on the design
techniques and the algorithm of adaptation. The most common adaptation
technique used is least mean square (LMS) due its computational simplicity.
The application depends on the adaptive filter configuration used and are well
known for system identification and real time applications. In this work, a
modified delayed μ-law proportionate normalized least mean square
(DMPNLMS) algorithm has been proposed. It is the improvised version of the
µ-law proportionate normalized least mean square (MPNLMS) algorithm.
The algorithm is realized using Ladner-Fischer type of parallel prefix
logarithmic adder to reduce the silicon area. The simulation and
implementation of very large-scale integration (VLSI) architecture are done
using MATLAB, Vivado suite and complementary metal–oxide–
semiconductor (CMOS) 90 nm technology node using Cadence RTL and
Genus Compiler respectively. The DMPNLMS method exhibits a reduction
in mean square error, a higher rate of convergence, and more stability. The
synthesis results demonstrate that it is area and delay effective, making it
practical for applications where a faster operating speed is required.
The increasing demand for faster, robust, and efficient device development of enabling technology to mass production of industrial research in circuit design deals with challenges like size, efficiency, power, and scalability. This paper, presents a design and analysis of low power high speed full adder using negative capacitance field effecting transistors. A comprehensive study is performed with adiabatic logic and reversable logic. The performance of full adder is studied with metal oxide field effect transistor (MOSFET) and negative capacitance field effecting (NCFET). The NCFET based full adder offers a low power and high speed compared with conventional MOSFET. The complete design and analysis are performed using cadence virtuoso. The adiabatic logic offering low delay of 0.023 ns and reversable logic is offering low power of 7.19 mw.
The global agriculture system faces significant challenges in meeting the
growing demand for food production, particularly given projections that the
world's population will reach 70% by 2050. Hydroponic farming is an
increasingly popular technique in this field, offering a promising solution to
these challenges. This paper will present the improvement of the current
traditional hydroponic method by providing a system that can be used to
monitor and control the important element in order to help the plant grow up
smoothly. This proposed system is quite efficient and user-friendly that can
be used by anyone. This is a combination of a traditional hydroponic system,
an automatic control system and a smartphone. The primary objective is to
develop a smart system capable of monitoring and controlling potential
hydrogen (pH) levels, a key factor that affects hydroponic plant growth.
Ultimately, this paper offers an alternative approach to address the challenges
of the existing agricultural system and promote the production of clean,
disease-free, and healthy food for a better future.
More from International Journal of Reconfigurable and Embedded Systems (20)
Online aptitude test management system project report.pdfKamal Acharya
The purpose of on-line aptitude test system is to take online test in an efficient manner and no time wasting for checking the paper. The main objective of on-line aptitude test system is to efficiently evaluate the candidate thoroughly through a fully automated system that not only saves lot of time but also gives fast results. For students they give papers according to their convenience and time and there is no need of using extra thing like paper, pen etc. This can be used in educational institutions as well as in corporate world. Can be used anywhere any time as it is a web based application (user Location doesn’t matter). No restriction that examiner has to be present when the candidate takes the test.
Every time when lecturers/professors need to conduct examinations they have to sit down think about the questions and then create a whole new set of questions for each and every exam. In some cases the professor may want to give an open book online exam that is the student can take the exam any time anywhere, but the student might have to answer the questions in a limited time period. The professor may want to change the sequence of questions for every student. The problem that a student has is whenever a date for the exam is declared the student has to take it and there is no way he can take it at some other time. This project will create an interface for the examiner to create and store questions in a repository. It will also create an interface for the student to take examinations at his convenience and the questions and/or exams may be timed. Thereby creating an application which can be used by examiners and examinee’s simultaneously.
Examination System is very useful for Teachers/Professors. As in the teaching profession, you are responsible for writing question papers. In the conventional method, you write the question paper on paper, keep question papers separate from answers and all this information you have to keep in a locker to avoid unauthorized access. Using the Examination System you can create a question paper and everything will be written to a single exam file in encrypted format. You can set the General and Administrator password to avoid unauthorized access to your question paper. Every time you start the examination, the program shuffles all the questions and selects them randomly from the database, which reduces the chances of memorizing the questions.
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...Amil Baba Dawood bangali
Contact with Dawood Bhai Just call on +92322-6382012 and we'll help you. We'll solve all your problems within 12 to 24 hours and with 101% guarantee and with astrology systematic. If you want to take any personal or professional advice then also you can call us on +92322-6382012 , ONLINE LOVE PROBLEM & Other all types of Daily Life Problem's.Then CALL or WHATSAPP us on +92322-6382012 and Get all these problems solutions here by Amil Baba DAWOOD BANGALI
#vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore#blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #blackmagicforlove #blackmagicformarriage #aamilbaba #kalajadu #kalailam #taweez #wazifaexpert #jadumantar #vashikaranspecialist #astrologer #palmistry #amliyaat #taweez #manpasandshadi #horoscope #spiritual #lovelife #lovespell #marriagespell#aamilbabainpakistan #amilbabainkarachi #powerfullblackmagicspell #kalajadumantarspecialist #realamilbaba #AmilbabainPakistan #astrologerincanada #astrologerindubai #lovespellsmaster #kalajaduspecialist #lovespellsthatwork #aamilbabainlahore #Amilbabainuk #amilbabainspain #amilbabaindubai #Amilbabainnorway #amilbabainkrachi #amilbabainlahore #amilbabaingujranwalan #amilbabainislamabad
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Water billing management system project report.pdfKamal Acharya
Our project entitled “Water Billing Management System” aims is to generate Water bill with all the charges and penalty. Manual system that is employed is extremely laborious and quite inadequate. It only makes the process more difficult and hard.
The aim of our project is to develop a system that is meant to partially computerize the work performed in the Water Board like generating monthly Water bill, record of consuming unit of water, store record of the customer and previous unpaid record.
We used HTML/PHP as front end and MYSQL as back end for developing our project. HTML is primarily a visual design environment. We can create a android application by designing the form and that make up the user interface. Adding android application code to the form and the objects such as buttons and text boxes on them and adding any required support code in additional modular.
MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software. It is a stable ,reliable and the powerful solution with the advanced features and advantages which are as follows: Data Security.MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Using recycled concrete aggregates (RCA) for pavements is crucial to achieving sustainability. Implementing RCA for new pavement can minimize carbon footprint, conserve natural resources, reduce harmful emissions, and lower life cycle costs. Compared to natural aggregate (NA), RCA pavement has fewer comprehensive studies and sustainability assessments.
Sachpazis:Terzaghi Bearing Capacity Estimation in simple terms with Calculati...Dr.Costas Sachpazis
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
6th International Conference on Machine Learning & Applications (CMLA 2024)ClaraZara1
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Applications.
Comparison of breast cancer classification models on Wisconsin dataset
1. International Journal of Reconfigurable and Embedded Systems (IJRES)
Vol. 11, No. 2, July 2022, pp. 166~174
ISSN: 2089-4864, DOI: 10.11591/ijres.v11.i2.pp166-174 166
Journal homepage: http://ijres.iaescore.com
Comparison of breast cancer classification models on Wisconsin
dataset
Rania R. Kadhim, Mohammed Y. Kamil
College of Sciences, Mustansiriyah University, Baghdad, Iraq
Article Info ABSTRACT
Article history:
Received Feb 5, 2022
Revised Feb 21, 2022
Accepted Mar 11, 2022
Breast cancer is the leading cause of death for women worldwide. Cancer
can be discovered early, lowering the rate of death. Machine learning
techniques are a hot field of research, and they have been shown to be
helpful in cancer prediction and early detection. The primary purpose of this
research is to identify which machine learning algorithms are the most
successful in predicting and diagnosing breast cancer, according to five
criteria: specificity, sensitivity, precision, accuracy, and F1 score. The
project is finished in the Anaconda environment, which uses Python's
NumPy and SciPy numerical and scientific libraries as well as matplotlib
and Pandas. In this study, the Wisconsin diagnostic breast cancer dataset was
used to evaluate eleven machine learning classifiers: decision tree, quadratic
discriminant analysis, AdaBoost, Bagging meta estimator, Extra randomized
trees, Gaussian process classifier, Ridge, Gaussian nave Bayes, k-Nearest
neighbors, multilayer perceptron, and support vector classifier. During
performance analysis, extremely randomized trees outperformed all other
classifiers with an F1-score of 96.77% after data collection and data
analysis.
Keywords:
Breast cancer
Classification
Computer-aided diagnosis
Machine learning
Wisconsin
This is an open access article under the CC BY-SA license.
Corresponding Author:
Mohammed Y. Kamil
College of Sciences, Mustansiriyah University
St. Palastine, Baghdad, Iraq
Email: m80y98@uomustansiriyah.edu.iq
1. INTRODUCTION
Breast cancer is the second major cause of death in women, coming in second only to lung cancer
[1]. Cancerous breast cells multiply unrestrained, which is one of the most telling signs that something is
wrong with your breasts. Until today, it has been impossible to stop the growth of breast cancer. Breast tissue
tumors should be found as early as possible to maximize the chances of survival, as this disease accounts for
roughly 15% of all cancer-related fatalities [2]. Tissue biopsies can be used to detect cancer in its earliest
stages, increasing the possibility of a positive result. It is possible to identify this disease using various
methods, including a biopsy, ultrasound, thermography, and fine-needle aspiration biopsies [3]. While
mammography has been the primary mode of early detection, it is not always enough for doctors to conclude
whether or not the patient has breast cancer [4], [5]. As a result, the detection rate is just (60-70)% accurate
[6]. The patient will need to take additional testing, which is expensive and time-consuming [7].
A biopsy is the only effective method of determining if a woman has breast cancer [8]. A biopsy
sample is obtained using specialized needle equipment using an imaging test such as X-rays or other imaging
methods. A small metal marker may be inserted into the breast to facilitate future imaging examinations.
Scientists examine the cells in a lab to determine if they are malignant. It is also critical to know what type of
cells are involved in breast cancer, how aggressive it is (grade), and if the cells have hormones or other
2. Int J Reconfigurable & Embedded Syst ISSN: 2089-4864
Comparison of breast cancer classification models on Wisconsin dataset (Rania R. Kadhim)
167
receptors that could modify your therapy options [9]. Breast biopsy results can take a few days to process.
Physicians are experts in the study of blood and bodily tissues and perform their work in a specialized
laboratory after the biopsy is completed [10]. The pathologist's report details the size and consistency of the
tissue samples and their location, whether they are malignant or not. As a result of their differing opinions,
these experts may need more surgery to collect additional tissue for location study [11]. Conventional
diagnostic procedures have limitations researchers have applied machine learning (ML) methodologies to
provide a second opinion to clinicians and reduce the risk of human error, resulting in patient death [12]-[15].
ML approaches help automate the decision-making process and help increase a decision support
system [16]. An increase in the amount of work that doctors must accomplish is made possible by better
precision and speed of response, especially in times of medical staff shortage. Preventive care and individual
mental health therapies can benefit from decision-support and health-monitoring systems that apply ML
techniques to learning [17]. Using ML classifiers and the Wisconsin diagnostic breast cancer (WDBC)
dataset, this research evaluated approaches for detecting breast cancer.
Salama et al. [18] presented a comparison of the decision tree (J48), multi-layer perception (MLP),
sequential minimal optimization (SMO), naive Bayes (NB), and instance-based for K-nearest neighbor
(IBK). The results indicate that classification using SMO alone or in combination with MLP or IBK is
superior to other classifiers. SMO outperforms other classifiers in terms of accuracy 97.7%. Azar et al. [19]
implemented an ANN algorithm that included multilayer perceptron (MLP), probabilistic neural networks
(PNN), and radial basis function (RBF). The testing and training phases determined that PNN was the high
accurate classifier, with accuracy, sensitivity, specificity, precision, and area under the curve (AUC) of
98.66%, 95.65%, 95.82%, 97.77%, 0.99, respectively. Senapati et al. [20] proposed the local linear wavelet
neural networks. The network was trained to detect breast cancer in the recursive least squares (RLS)
technique to parameters. Discrepancies in the local linearity network of wavelet neurons with a traditional
wavelet neural architecture (WNN) refer to the weighted average of the connections between the two
conventional WNN's input, and output layers are replaced using a linear regression model. The high accuracy
value obtained was 97.2%. Dora et al. [21] proposed a unique Gauss-Newton representation-based algorithm
(GNRBA) for classification. It uses sparse representation in conjunction with training sample selection.
Compared to the traditional l1-norm approach, this method evaluates sparsity more computationally
efficiently. The suggested method achieved the greatest classification accuracy of 98.25%, according to the
results. For 50–50 partitions, 60–40 partitions, 70–30 partitions, and 10 – fold cross-validation, the results
were 98.86%, 98.46%, and 98.46%, respectively. Wang et al. [22] used two common classifiers: SVM and
KNN. The suggested feature relevance measurement outperforms previous methods in experiments. It also
exceeds many conventional classification execution speed and accuracy methods, demonstrating its
usefulness in obtaining the best features. The KNN model had a high accuracy of 95.8%. Oyelade et al. [23]
used ML methodology to deal with the select and test limitation of one such reasoning algorithm (ST). An
efficient input method is an initial step in technique, which allows the system to read, filter, and clean
datasets. Knowledge representation frameworks were also built using semantic web languages (ontologies
and rule languages) to help the reasoning algorithm. As a result, the ST reasoning structures were modified to
support this improvement. The ST-ONCODIAG technique had 81.0% sensitivity and 89.0% specificity.
This project aims to construct ML models and apply them to breast cancer diagnosis. By comparing
the results of the eleven classification algorithms and choosing the classification algorithm that achieved the
highest results for this dataset, The remainder of this work is divided into the following sections: Section 2
explains the technique, and Section 3 explains the experimental environment. Section 4 explains the results,
conclusions, and future work, and Section 5 talks about the significance of the results, conclusions, and future
work.
2. RESEARCH METHOD
The major goal of this study is to identify the most accurate and predictive breast cancer detection
algorithm. we used different ML classifiers: decision tree, quadratic discriminant analysis, AdaBoost,
Bagging meta estimator, extra randomized trees, Gaussian process classifier, Ridge, Gaussian nave Bayes, k-
nearest neighbors, multilayer perceptron, and support vector classifier. The first step in our methodology pre-
processing includes attribute selection. ML systems that can detect breast cancer using new measures can be
improved using pre-processed data. To evaluate the algorithms' performance, we use new data that has been
labeled. Our labeled data is often divided into two parts to accomplish this. Splitting the method test train ML
models are trained using 80% of the data, known as the training set. We used about 20% of the data, known
as the testing data or testing set. When the models have been evaluated, compared the results to see which
algorithm is the most accurate and the most probable to diagnose breast cancer.
These algorithms are used for making predictive analysis on ML techniques: A decision tree (DT) is
a non-parametric supervised learning approach to classification. The DT is made up of several nodes that
3. ISSN: 2089-4864
Int J Reconfigurable & Embedded Syst, Vol. 11, No. 2, July 2022: 166-174
168
together form a rooted tree, which is a tree with no incoming edges and a "root" node. Each of the other
nodes is connected by a single incoming edge. An internal or test node has outer edges; all other nodes are
leaves (also known as the decision or terminal nodes. DT is simple to understand and apply. It is possible to
imagine trees in your mind. It can work with both categorical and numerical data [24].
Quadratic discriminant analysis (QDA) is an individual covariance matrix is estimated for each class
of observations. If you know that each class has a different set of covariances, QDA can be very useful. A
problem with QDA is that it can't make things smaller in size. QDA is more flexible than LDA in that it does
not require equal variance and covariance. To put it another way, the covariance matrix for each class in
QDA might be different. When you have a short training set, LDA is preferable to QDA. QDA has suggested
that if the training set is massive and the classifier's variance isn't a key concern, the assumption of a shared
covariance matrix for the K classes is unsustainable [25]. AdaBoost (AB) the basic idea behind AB is to fit
several weak learners (i.e., models that are just slightly better than random guessings, such as tiny DT) to
multiple copies of the data. The final prediction is calculated by averaging the predictions using a weighted
majority vote (or mean). AB is a simple program it continually corrects the faults of a poor classifier
and enhances accuracy by merging poor learners. You may use AB with several base classifiers. AB
does not suffer from overfitting [26].
Bagging Meta-Estimator (BME) is a type of ensemble algorithm that creates many random subsets
of the original training data used to test a black-box estimator and combines their different predictions to
create the last prediction. These techniques are employed to minimize variation. By adding randomness into
the building method of a base estimator (e.g., DT) and then building an ensemble from it. In many situations,
bagging approaches provide a straightforward way to enhance one model without changing the algorithm
used as a basis. Bagging approaches, in contrast to boosting methods, which are the most effective with weak
models, perform best with powerful and complicated models (e.g., fully formed DT) because they give a
mechanism to reduce overfitting (e.g., shallow DT) [27]. Extremely randomized trees (ERT) are extremely
similar to Random Forests but have two main differences: ERT does not resample data when building a tree,
ERT does not use the “best split”. It is usually provided for a small reduction in the model at the cost of a
small bias increase [28].
Gaussian process classifier (GPC) more especially for probabilistic classification. The probabilistic
classification expresses the test predictions as class probabilities. These binary predictor predictions are
combined to provide a multi-class prediction. Gaussian processes have a clear advantage: prediction is
probabilistic (Gaussian), allowing actual probability ranges to be GPC offers three major benefits over other
widely used classifiers. GPC is capable of dealing with high-dimensional and nonlinear problems that arise in
travel mode detection. Instead of determinant classification findings, GPC provides probabilistic outputs,
which account for the model uncertainty inherent in trip mode identification. Because GPC is a non-
parameterized model, it may tune hyperparameters directly from training data. GPC may also utilize evidence
to build a completely automated model selection procedure [29].
Ridge classifier the goal values are transformed to -1, 1 using this classifier. Before treating the
problem as a regression problem (multi-output regression in the multiclass case), after which the problem is
treated as a regression task, with the same goal. The projected class is determined by the regressor's
prediction sign. The problem is addressed as multi-output regression for multiclass classification, and the
output with the highest value matches the expected class. Using a (penalized) Least Squares rather than the
more common logistic or hinge losses to fit a classification model may appear weird, however, any of those
models can provide similar cross-validation scores in terms of or precision/recall accuracy, but the Ridge
Classifier's penalized least squares loss allows for a far wider range of numerical solvers with various
computational performance profiles [30].
Naïve Bayes (NB) classification algorithms rely on strong assumptions about the independence of
variables. NB classifier uses a Gaussian distribution of numeric predictors with mean and standard deviation
calculated from the training dataset and assumes independence between predictor variables depending on
response. NB models are often employed as a substitute for decision trees to solve classification difficulties.
To develop an NB classifier, all rows in the training dataset that include at least one NA will be ignored.
Missing values in the test dataset are removed from the probability calculation for making predictions. It
performs effectively with categorical input variables compared to numeric input parameters (s). A typical
distribution is taken into account (bell curve, which is a powerful inference) [31].
K-nearest neighbors (KNN) in the field of ML and data mining (DM), the KNN classifier is a
commonly used and well-known non-parametric classifier that is used to solve a variety of issues. KNN is a
simple way to build, but as the dataset grows, the efficiency and speed of the process decrease significantly.
KNN performs well with a limited number of input variables, but as the number of input variables increases,
it fails to predict the output of more data points. KNN requires rather homogeneous features: If you want to
design KNN using a popular length, such as Euclidean or Manhattan distances, absolute differences in
4. Int J Reconfigurable & Embedded Syst ISSN: 2089-4864
Comparison of breast cancer classification models on Wisconsin dataset (Rania R. Kadhim)
169
attributes must be weighed the same, i.e., a given distance in feature 1 must signify the same as a given
distance in feature 2. Unbalanced data causes complications, and KNN has difficulty dealing with it. KNN is
Sensitive to outliers Because it chooses neighbors only based on distance. KNN is fundamentally incapable
of dealing with problems requiring partial data. The algorithm is easy to understand and apply. In this case,
developing a model, fine-tuning various parameters, or producing more projections are not required [32].
Multi-layer perceptron (MLP) has a hidden layer or layers (except for one input and one output
layer). In contrast to a single layer perceptron, MLP is capable of learning functions that are not linear. There
are weights connected with all connections, however only three weights (w0, w1, and w2). The input layer is
composed of three nodes. The Bias node has a value of 1. Both X1 and X2 are used as external inputs by the
other two nodes (quantities depending upon the given data). To recap, the input layer does not perform any
computation, therefore nodes in the input layer produce anything other than 1, X1, and the outputs of the
hidden layer. The bias node on the hidden layer has an output of 1, making it part of it. The hidden layer's
other two nodes' outputs are dependent on the input layer's (1, X1, X2) outputs and the connections' weights
(edges) [33].
Support vector machine (SVM) is an ML classifier that can generalize across two different classes if
it is given a training sample of labeled data. The most important task for the SVM is to find a hyperplane that
can distinguish between similar and dissimilar classes of data. Because it works effectively in complex three-
dimensional situations, this model is effective. Even when the number of dimensions exceeds the number of
samples, the approach remains effective [34].
3. EXPERIMENT ENVIRONMENT
Oyelade et al. [23] gathered the WDBC dataset from Madison's University of Wisconsin Hospitals.
WDBC is available in the UCI ML repository. To compute features, data from a fine needle aspirate (FNA)
of a breast lump is employed; cell nuclei are defined using the image's features. The WDBC dataset
contained 569 patients (62.74 % benign, 37.26 % malignant) with WDBC as the patient ID, 30 tumor
features, and one class indicator [35]. In total, 30 parameters were considered: area, texture, radius,
perimeter, compactness, smoothness, concavity, concave points, and symmetry. The ID number was removed
because it has no bearing on the classification process. For all of the experiments on the ML algorithms (DT,
QDA, AB, BME, ERT, GPC, Ridge, NB, KNN, MLP, and SVC) presented in this paper, the Scikit-Learn
library and the Python language were utilized. It was implemented with Python's SciPy, pandas, and
matplotlib libraries.
4. RESULTS AND DISCUSSION
Eleven classification methods were used to divide WDBC into benign and malignant tumors: DT,
QDA, AB, BME, ERT, GPC, Ridge, NB, KNN, MLP, and SVC, to assess each model, we used the set of
criteria: specificity sensitivity precision accuracy, F1-Score. Interestingly, it can note these algorithms, GPC,
Ridge, KNN, and SVC have a maximum specificity of 100%, DT has the lowest specificity with a value of
89.55%. The results of specificity are presented in Figure 1.
Figure 1. The specificity of classification models
5. ISSN: 2089-4864
Int J Reconfigurable & Embedded Syst, Vol. 11, No. 2, July 2022: 166-174
170
Figure 2 illustrated that QDA, AB, BME, ERT, and MLP had the maximum sensitivity of 95.74%,
but GPC, Ridge, Gaussian NB, and KNN had the lowest sensitivity 89.36%. The results indicate the highest
accuracy was 97.36% in ERT, but, 91.22% in DT and Gaussian NB, as shown in Figure 3.
Figure 2. The sensitivity of classification models
Figure 3. The accuracy of classification models
It is apparent that the GPC, Ridge, KNN, and SVC all had a precision of 100%, while DT had a
precision of 86.27% as shown in Figure 4. The ERT classifier scored the highest F1-Score of 96.77%, while
the Gaussian Naive Bayes classifier had the lowest F1-Score of 89.36%, as illustrated in Figure 5.
6. Int J Reconfigurable & Embedded Syst ISSN: 2089-4864
Comparison of breast cancer classification models on Wisconsin dataset (Rania R. Kadhim)
171
Figure 4. The precision of classification models
Figure 5. F1_score for classification models
The most remarkable result from this study is the performance of ML in the prediction and detection
of breast cancer. The mean accuracy of all applied models was 95.05%. We observed that the ERT model
made the highest results with this data, was stable, and produced superior results (F1- score in especially).
Additionally, clinicians are interested in the results of the sensitivity-specificity trade-off. Therefore,
sensitivity is a term that relates to a diagnostic test's capacity to diagnose cancer patients as abnormal.
In comparison, specificity refers to a diagnostic's ability to designate non-cancerous patients as
normal. For ML programmers, the receiver operating characteristic (ROC) curve results are of primary
importance. The receiver operating characteristic ROC curve is constructed by computing and plotting the
true positive rate vs. the false positive rate. Because it demonstrates the model's stability and reliability, it
attained a maximum of 1.00 in ERT, as shown in Figure 6.
7. ISSN: 2089-4864
Int J Reconfigurable & Embedded Syst, Vol. 11, No. 2, July 2022: 166-174
172
Figure 6. ROC curve for ERT
This outcome can be explained because ERT is an ensemble method that learns from previous
predictor mistakes to improve future predictions. The strategy combines the weaker classification model with
a powerful learner, increasing the model's predictability. By sequentially arranging weak learners, weak
learners can learn from previous ones, strengthening prediction models. Ensemble approaches are well suited
for reducing model variability and increasing prediction accuracy. When multiple models are combined into
a single prediction, the variance is reduced. To correct for previous prediction errors, new predictors are fit.
By adding predictors consecutively to the ensemble, ERT improves the model's accuracy. It makes use of the
gradient to identify and rectify mistakes in the predictions made by learners. Table 1 compares the most
significant findings obtained in this study to other outcomes presented in the literature. These results are
consistent with those of previous studies that show ML can accurately predict the presence of breast cancer.
To summarize, the ML models achieved a greater classification accuracy rate, reduced false positives, and
improved performance. As a result of this research, it has been demonstrated that the ML can help a
radiologist make an accurate breast cancer diagnosis and that it can demonstrate significant capability in the
domain of medical decision-making.
Table1. Comparison of performance to previous studies results
AUC
Precision
Specificity
Sensitivity
Accuracy
ML technique
Year
Reference
-
-
-
-
97.71%
MLP
2012
Salama et al. [18]
0.993
97.77%
95.82%
98.65%
97.66%
ANN
2013
Azar and El-Said [
19
]
-
-
-
-
97.20%
Local linear wavelet neural networks
2013
Senapati et al. [
20
]
-
-
-
-
98.25%
GNRBA
2017
Dora et al. [21]
-
-
-
-
95.80%
KNN
2018
Wang and Feng [22]
-
-
89.00%
-
0.81%
ST-ONCODIAG
2018
Oyelade et al. [23]
1.00
97.82%
98.50%
95.74%
97.36%
ERT
Our work
5. CONCLUSION
One of the most critical scientific subjects being pursued is the early detection of breast cancer. It is
a terrible disease that affects women all over the world. In women, it is one of the most frequent cancer types.
One in every eight women in the world is at risk of being diagnosed at a certain point in their lives. Accurate
classification of breast cancer tumors has become a complex problem in the medical world. The study aims to
compare the performance of different classification methods. The performance of ML in breast cancer
prediction and diagnosis has been described as the most surprising result of these results. As evidenced by
this study, the ERT classifier surpasses the WDBC in terms of specificity and precision. We estimate that the
high results will be attributed to the fact that using ensembles rather than a single predictive model improves
predictive modeling performance in this scenario, as this model belongs to the ensemble methods group.
These limitations will require additional research in the future. As a result, future work should evaluate
machine learning models for new medical diagnostic challenges and optimize their performance using high-
performance computing methods.
8. Int J Reconfigurable & Embedded Syst ISSN: 2089-4864
Comparison of breast cancer classification models on Wisconsin dataset (Rania R. Kadhim)
173
ACKNOWLEDGEMENTS
The authors would like to thank Mustansiriya University for their valuable support and providing
the essential facilities for this research.
REFERENCES
[1] M. Y. Kamil, "Computer-aided diagnosis system for breast cancer based on the Gabor filter technique," International Journal of
Electrical Computer Engineering, vol. 10, no. 5, pp. 5235-5242, 2020, doi: 10.11591/ijece.v10i5.pp5235-5242.
[2] P. N. Pandey, N. Saini, N. Sapre, A. Kulkarni, and A. K. Tiwari, "Prioritising breast cancer theranostics: A current medical
longing in oncology," Cancer Treatment and Research Communications, vol. 29, p. 100465, 2021, doi:
10.1016/j.ctarc.2021.100465.
[3] A. Conti, A. Duggento, I. Indovina, M. Guerrisi, and N. Toschi, "Radiomics in breast cancer classification and prediction,"
Seminars in Cancer Biology, vol. 72, pp. 238-250, 2021, doi: 10.1016/j.semcancer.2020.04.002.
[4] S. Khairnar, S. D. Thepade, and S. Gite, "Effect of image binarization thresholds on breast cancer identification in mammography
images using OTSU, Niblack, Burnsen, Thepade's SBTC," Intelligent Systems with Applications, vol. 10-11, p. 200046, 2021, doi:
10.1016/j.iswa.2021.200046.
[5] E. A. Radhi and M. Y. Kamil, "Breast tumor detection via active contour technique," International Journal of Intelligent
Engineering and Systems, vol. 14, no. 4, pp. 561-570, 2021, doi: 10.22266/ijies2021.0831.49.
[6] M. Y. Kamil and E. A. Radhi, "Breast tumor segmentation in mammography image via Chan-Vese technique," Indonesian
Journal of Electrical Engineering and Computer Science, vol. 22, no. 2, pp. 201-209, 2021, doi: 10.11591/ijeecs.v22.i2.pp809-
817.
[7] R. Vijayarajeswari, P. Parthasarathy, S. Vivekanandan, and A. A. Basha, "Classification of mammogram for early detection of
breast cancer using SVM classifier and Hough transform," Measurement, vol. 146, pp. 800-805, 2019, doi:
10.1016/j.measurement.2019.05.083.
[8] S. Halvaei et al., "Exosomes in Cancer liquid biopsy: a focus on breast cancer," Molecular Therapy - Nucleic Acids, vol. 10, pp.
131-141, 2018, doi: 10.1016/j.omtn.2017.11.014.
[9] I. Steinberg, D. M. Huland, O. Vermesh, H. E. Frostig, W. S. Tummers, and S. S. Gambhir, "Photoacoustic clinical imaging,"
Photoacoustics, vol. 14, pp. 77-98, 2019, doi: 10.1016/j.pacs.2019.05.001.
[10] N. I. R. Yassin, S. Omran, E. M. F. El Houby, and H. Allam, "Machine learning techniques for breast cancer computer aided
diagnosis using different image modalities: A systematic review," Computer Methods and Programs in Biomedicine, vol. 156, pp.
25-45, 2018, doi: 10.1016/j.cmpb.2017.12.012.
[11] S. Pandey, A. Sharma, M. K. Siddiqui, D. Singla, and J. Vanderpuye-Orgle, "ai3 prediction of breast cancer using k-nearest
neighbour: a supervised machine learning algorithm," Value in Health, vol. 23, p. S1, 2020, doi: 10.1016/j.jval.2020.04.006.
[12] C. Kaushal, S. Bhat, D. Koundal, and A. Singla, "Recent trends in computer assisted diagnosis (CAD) system for breast cancer
diagnosis using histopathological images," IRBM, vol. 40, no. 4, pp. 211-227, 2019, doi: 10.1016/j.irbm.2019.06.001.
[13] A. H. Farhan and M. Y. Kamil, "Texture analysis of breast cancer via LBP, HOG, and GLCM techniques," in IOP Conference
Series: Materials Science and Engineering, vol. 928, no. 7, p. 072098, 2020, doi: 10.1088/1757-899X/928/7/072098.
[14] A. H. Farhan and M. Y. Kamil, "Texture analysis of mammogram using local binary pattern method," in Journal of Physics:
Conference Series, vol. 1530, no. 1, p. 012091, 2020, doi: 10.1088/1742-6596/1530/1/012091.
[15] M. Y. Kamil and A.-L. A. Jassam, "Analysis of tissue abnormality in mammography images using gray level co-occurrence
matrix method," in Journal of Physics: Conference Series, vol. 1530, no. 1, p. 012101, 2020, doi: 10.1088/1742-
6596/1530/1/012101.
[16] M. Tariq, S. Iqbal, H. Ayesha, I. Abbas, K. T. Ahmad, and M. F. K. Niazi, "Medical image based breast cancer diagnosis: State of
the art and future directions," Expert Systems with Applications, vol. 167, p. 114095, 2021, doi: 10.1016/j.eswa.2020.114095.
[17] S. Saxena and M. Gyanchandani, "Machine learning methods for computer-aided breast cancer diagnosis using histopathology: a
narrative review," Journal of Medical Imaging and Radiation Sciences, vol. 51, no. 1, pp. 182-193, 2020, doi:
10.1016/j.jmir.2019.11.001.
[18] G. I. Salama, M. Abdelhalim, and M. A. Zeid, "Breast cancer diagnosis on three different datasets using multi-classifiers,"
International Journal of Computer and Information Technology, vol. 1, no. 1, p. 2, 2012.
[19] A. T. Azar and S. A. El-Said, "Probabilistic neural network for breast cancer classification," Neural Computing and Applications,
vol. 23, no. 6, pp. 1737-1751, 2013, doi: 10.1007/s00521-012-1134-8.
[20] M. R. Senapati, A. K. Mohanty, S. Dash, and P. K. Dash, "Local linear wavelet neural network for breast cancer recognition,"
Neural Computing and Applications, vol. 22, no. 1, pp. 125-131, 2013, doi: 10.1007/s00521-011-0670-y.
[21] L. Dora, S. Agrawal, R. Panda, and A. Abraham, "Optimal breast cancer classification using Gauss–Newton representation based
algorithm," Expert Systems with Applications, vol. 85, pp. 134-145, 2017, doi: 10.1016/j.eswa.2017.05.035.
[22] Y. Wang and L. Feng, "Hybrid feature selection using component co-occurrence based feature relevance measurement," Expert
Systems with Applications, vol. 102, pp. 83-99, 2018, doi: 10.1016/j.eswa.2018.01.041.
[23] O. N. Oyelade, A. A. Obiniyi, S. B. Junaidu, and S. A. Adewuyi, "ST-ONCODIAG: A semantic rule-base approach to diagnosing
breast cancer base on Wisconsin datasets," Informatics in Medicine Unlocked, vol. 10, pp. 117-125, 2018, doi:
10.1016/j.imu.2017.12.008.
[24] A. T. Azar and S. M. El-Metwally, "Decision tree classifiers for automated medical diagnosis," Neural Computing and
Applications, vol. 23, no. 7, pp. 2387-2403, 2013, doi: 10.1007/s00521-012-1196-7.
[25] U. Raghavendra, U. Rajendra Acharya, H. Fujita, A. Gudigar, J. H. Tan, and S. Chokkadi, "Application of Gabor wavelet and
Locality Sensitive Discriminant Analysis for automated identification of breast cancer using digitized mammogram images,"
Applied Soft Computing, vol. 46, pp. 151-161, 2016, doi: 10.1016/j.asoc.2016.04.036.
[26] B. Bigdeli, P. Pahlavani, and H. A. Amirkolaee, "An ensemble deep learning method as data fusion system for remote sensing
multisensor classification," Applied Soft Computing, vol. 110, p. 107563, 2021, doi: 10.1016/j.asoc.2021.107563.
[27] S. Moral-García, C. J. Mantas, J. G. Castellano, M. D. Benítez, and J. Abellán, "Bagging of credal decision trees for imprecise
classification," Expert Systems with Applications, vol. 141, p. 112944, 2020, doi: 10.1016/j.eswa.2019.112944.
[28] Y. Liang, S. Zhang, H. Qiao, and Y. Yao, "iPromoter-ET: Identifying promoters and their strength by extremely randomized
trees-based feature selection," Analytical Biochemistry, vol. 630, p. 114335, 2021, doi: 10.1016/j.ab.2021.114335.
[29] H. Oyama, M. Yamakita, K. Sata, and A. Ohata, "Identification of static boundary model based on Gaussian Process
classification," IFAC-PapersOnLine, vol. 49, no. 11, pp. 787-792, 2016, doi: 10.1016/j.ifacol.2016.08.115.
9. ISSN: 2089-4864
Int J Reconfigurable & Embedded Syst, Vol. 11, No. 2, July 2022: 166-174
174
[30] X. Zhang, W. Chao, Z. Li, C. Liu, and R. Li, "Multi-modal kernel ridge regression for social image classification," Applied Soft
Computing, vol. 67, pp. 117-125, 2018, doi: 10.1016/j.asoc.2018.02.030.
[31] R. Blanquero, E. Carrizosa, P. Ramírez-Cobo, and M. R. Sillero-Denamiel, "Variable selection for Naïve Bayes classification,"
Computers & Operations Research, vol. 135, p. 105456, 2021, doi: 10.1016/j.cor.2021.105456.
[32] C. Viji, J. Beschi Raja, R. S. Ponmagal, S. T. Suganthi, P. Parthasarathi, and S. Pandiyan, "Efficient fuzzy based K-nearest
neighbour technique for web services classification," Microprocessors and Microsystems, vol. 76, p. 103097, 2020, doi:
10.1016/j.micpro.2020.103097.
[33] S. Radhakrishnan, S. G. Nair, and J. Isaac, "Multilayer perceptron neural network model development for mechanical ventilator
parameters prediction by real time system learning," Biomedical Signal Processing and Control, vol. 71, p. 103170, 2022, doi:
10.1016/j.bspc.2021.103170.
[34] B. A. Akinnuwesi, B. O. Macaulay, and B. S. Aribisala, "Breast cancer risk assessment and early diagnosis using principal
component analysis and support vector machine techniques," Informatics in Medicine Unlocked, vol. 21, p. 100459, 2020, doi:
10.1016/j.imu.2020.100459.
[35] M. H. Alshayeji, H. Ellethy, S. Abed, and R. Gupta, "Computer-aided detection of breast cancer on the Wisconsin dataset: An
artificial neural networks approach," Biomedical Signal Processing and Control, vol. 71, p. 103141, 2022, doi:
10.1016/j.bspc.2021.103141.
BIOGRAPHIES OF AUTHORS
Rania R. Kadhim holds a Master of Science (M.Sc.) in physics from
Mustansiriya University, College of Science, in 2016. She is currently a Ph.D. student in
digital image processing at Mustansiriya University in Iraq. Her research areas of interest
include medical image processing, machine learning, computer vision, and image
classification. She can be contacted at email: r90r61@gmail.com.
Mohammed Y. Kamil holds a Master of Science (M.Sc.) in Optics from
Mustansiriya University, in 2005. and Ph.D. in digital image processing from Mustansiriya
University, Iraq, in 2011, besides several professional certificates and skills. He is currently a
professor with the physics department at Mustansiriya University, Baghdad, Iraq. He is a
member of the Institute of Electrical and Electronics Engineers (IEEE) and an editor in the
Iraqi Journal of Physics. His research areas of interest include medical image processing,
computer vision, and Artificial Intelligence. He can be contacted at email:
m80y98@uomustansiriyah.edu.iq.