A Comparative Analysis of Hybridized Genetic Algorithm in Predictive Models of Breast Cancer Tumors.
Shakas Technologies ( Galaxy of Knowledge)
#11/A 2nd East Main Road,
Gandhi Nagar,
Vellore - 632006.
Mobile : +91-9500218218 / 8220150373| land line- 0416- 3552723
Shakas Training & Development | Shakas Sales & Services | Shakas Educational Trust|IEEE projects | Research & Development | Journal Publication |
Email : info@shakastech.com | shakastech@gmail.com |
website: www.shakastech.com
Facebook: https://www.facebook.com/pages/Shakas-Technologies
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
A Comparative Analysis of Hybridized Genetic Algorithm in Predictive Models of Breast Cancer Tumors.docx
1. Base paper Title: A Comparative Analysis of Hybridized Genetic Algorithm in Predictive
Models of Breast Cancer Tumors
Modified Title: A Comparative Examination of Hybridised Genetic Algorithms in Breast
Cancer Tumour Predictive Models
Abstract
Advancement in computer-aided tools towards accurate breast cancer early prediction
models has proven to be advantageous, which in turn helps to reduce the mortality rate
associated with this cancer. From the literature, random forest predictor has been observed to
have high accuracy in comparison to other machine learning regressors, also genetic algorithm
has been observed to be a good feature selection method in data pre-processing. In a bid to
improve the accuracy of breast cancer predictive models, several studies have developed
hybridized genetic algorithm models for feature selection, however, the order of hybridization
may not have been taken into consideration, as this can have an impact on the hybridized
model’s performance. Therefore, this paper proposes several high-performing predictive
models using hybridized genetic algorithm, based on other learning models, while taking into
consideration the placement order of the feature selection algorithms in the hybridized models.
The Wisconsin Breast Cancer dataset was used as the test bench, while filter, wrapper and
embedded feature selection algorithms were used in the proposed hybridized models. The
performances of proposed hybridized models were compared with those of the individual
learning models, considered in this work. These models include Fisher_Score, Mutual
Information Gain, Correlation Chi-square test, Coefficient, Variance, Genetic Algorithm,
Lasso and Linear Regressors with L1 regularization, Ridge Regressor with L2 regularization,
Tree-based methods. From the performance evaluation results, the proposed hybridized
Genetic Algorithm with Fisher_Score (GA + Fisher_Score) model showed promising results,
as it had an accuracy score of 99.12%, thereby out-performing other proposed hybridized
genetic algorithm models considered.
Existing System
Over the last few decades, Cancer has been known, universally, to be a deadly disease.
In a 2022 USA cancer report by the American Cancer Society, heart disease was ranked as the
most common disease closely followed by cancerous diseases. From the report, the total
2. number of expected new cancer cases recorded was almost 2 million with over 609 thousand
expected deaths [1]. The good news is that within the last 30 years, scientists have done
tremendous work to help reduce the mortality rate associated with cancer disease. According
to the American Cancer Society report [1], a significant decline in cancer related deaths was
observed in the last 3 decades, to around 3.5 million fewer cases than expected. One of the
most common cancers, in the year 2022, is breast cancer which accounts for 12.5% of all new
annual cancer cases, globally. From the American Cancer Society 2022 report, the most often
diagnosed cancer cases recorded, among American women, was Breast Cancer. At least, 1 in
every 8 American women will experience an invasive breast cancer during their life-time. The
estimated number of new cases associated with breast cancer in 2022 was over 300 thousand
women, with almost 289 thousand invasive cases and a little over 51 thousand non-invasive
cases [2]. The mortality rate associated with Breast Cancer has been observed to slowly decline
over the years, to about 43% through 2020, with the help of treatment advancement and earlier
detection. Nonetheless, more work still needs to be done to reduce the mortality rate thereby
achieving the objective of enhancing health and well-being as set forth by the United Nations
Sustainable Development Goals (SDG), as no definite cure or preventive measure has been
discovered yet for breast cancer [3]. Detecting any abnormalities at an early stage before they
develop into cancerous growth is vital, and mammogram screening and self-breast
examinations play a crucial role in achieving this goal. If a lump is discovered during
mammogram screening, it is classified as either a benign or malignant tumor. While benign
tumors can be managed effectively and do not pose a significant threat, malignant tumors are
highly aggressive and can invade nearby tissues rapidly.
Drawback in Existing System
Computational Complexity:
Hybridized genetic algorithms may increase the computational complexity of the
predictive model. Combining genetic algorithms with other optimization techniques or
machine learning algorithms can result in longer training times and higher resource
requirements.
3. Interpretability:
Hybrid models can be complex, making it challenging to interpret the relationships
between input features and the model's predictions. This lack of interpretability can be
a significant drawback, especially in medical applications where understanding the
factors contributing to predictions is crucial.
Data Dependency:
The effectiveness of hybridized genetic algorithms may be influenced by the
characteristics of the dataset. If the dataset is not representative or lacks diversity, the
hybrid model may not perform well on real-world scenarios or with different patient
demographics.
Ethical and Regulatory Considerations:
In the context of medical applications, especially in predicting breast cancer, ethical
and regulatory considerations are paramount. Hybrid models may introduce additional
complexity to the decision-making process, and ensuring the model's compliance with
ethical standards and regulatory requirements becomes crucial.
Proposed System
Introduction:
Background:
Provide an overview of breast cancer, emphasizing the need for accurate predictive
models for early detection and treatment.
Motivation:
Explain why a hybridized genetic algorithm is chosen and how it addresses limitations
in existing predictive models.
Objectives:
Clearly state the objectives of the study, such as improving prediction accuracy,
reducing computational time, or enhancing interpretability.
4. Literature Review:
Discuss existing predictive models for breast cancer tumors, including genetic
algorithms, machine learning techniques, and hybrid models. Highlight gaps and
challenges in current approaches.
Methodology:
Data Collection:
Specify the dataset(s) used for training and testing the predictive model. Explain the
relevance and representativeness of the data.
Preprocessing:
Detail the steps taken to clean, normalize, and preprocess the dataset to prepare it for
modeling.
Genetic Algorithm:
Explain the genetic algorithm used, including the representation of solutions,
selection mechanisms, crossover and mutation operators, and parameter settings.
Hybridization:
Describe how the genetic algorithm is hybridized with other techniques or algorithms.
This could include combining GAs with neural networks, support vector machines, or
other optimization methods.
Model Training:
Outline the process of training the hybridized model, specifying
training/validation/test splits and any cross-validation techniques employed.
Results and Discussion:
Present the results of the comparative analysis, including performance metrics and
comparisons with baseline models. Discuss the strengths and weaknesses of the
proposed hybridized genetic algorithm.
5. Algorithm
Genetic Algorithm (GA):
Initialization:
Define how the initial population of potential solutions is generated.
Representation:
Specify how solutions are encoded and represented in the genetic algorithm.
Selection Mechanism:
Describe the method used for selecting individuals from the population for
reproduction based on their fitness.
Hybridization Techniques:
Algorithm Integration:
Specify how the genetic algorithm is combined with other predictive modeling
techniques.
Justification:
Explain the rationale behind choosing the specific hybridization approach. For
example, hybridizing with neural networks, support vector machines, or other
optimization algorithms.
Parameter Tuning:
Describe any methods used to optimize the parameters of the hybridized model.
Evaluation Metrics:
Performance Metrics:
Define the evaluation metrics used to assess the performance of the predictive models
(e.g., accuracy, sensitivity, specificity, ROC-AUC).
Comparison Criteria:
Explain the criteria used for the comparative analysis between different algorithms.
Advantages
Improved Accuracy:
Enhanced Prediction Performance:
6. Discuss how the hybridized genetic algorithm, when combined with other machine
learning models, leads to improved accuracy in predicting breast cancer tumors
compared to traditional methods.
Model Interpretability:
Enhanced Interpretability:
Discuss how the hybridization with other algorithms might lead to models that are
more interpretable, aiding clinicians and researchers in understanding the factors
influencing predictions.
Enhanced Convergence:
Faster Convergence:
Discuss any observations or empirical evidence indicating that the hybridized genetic
algorithm converges faster during the training process compared to standalone
algorithms.
Comparative Performance Metrics:
Quantitative Performance Metrics:
Present specific quantitative metrics that demonstrate the superiority of the hybrid
model over other models in the comparative analysis.
Software Specification
Processor : I3 core processor
Ram : 4 GB
Hard disk : 500 GB
Software Specification
Operating System : Windows 10 /11
Frond End : Python
Back End : Mysql Server
IDE Tools : Pycharm