Department of Computer Science & Engineering (CSE), G.L. Bajaj
Institute of Technology and Management, Greater Noida
Blood Cancer Detection Using CNN
Guide Name: Dr. Anju Chandna
Team Member(s):
Anshit Tripathi (2101920100060)
Abhay Thakur (2101920100006)
Aditya Nath Tripathi (2101920100026)
Yasharth Singh (2101920100338)
Table of Contents
• Introduction
• Objective(s)
• Literature Survey
• Motivation
• Proposed Methodology and process for Implementation
• Potential Impact on the target audience
• References (APA Format)
Introduction
• Leukemia is classified based on the type of affected white blood cell—lymphoid or
myeloid cells—and can be either acute, which progresses rapidly, or chronic, which
develops slowly over time. The four major forms of leukemia are:
1. Acute Lymphoblastic Leukemia (ALL)
2. Acute Myelogenous Leukemia (AML)
3. Chronic Lymphocytic Leukemia (CLL)
4. Chronic Myelogenous Leukemia (CML)
• Detecting leukemia early is crucial, and Machine Learning (ML) can support it.
ML enables computers to learn patterns from data, making it useful in
automating the analysis of medical images. One advanced ML technique is the
Convolutional Neural Network (CNN), which is specifically designed to process and
classify images.
Objective
• The objective of our project is to develop an automated system that can detect
cancer from blood cell images using a Convolutional Neural Network (CNN).
This system takes microscopic images of blood cells as input and outputs
whether the cells are cancerous or not. Detecting cancer in blood cell images
can be challenging because cancerous cells can look indistinct, their features
overlap with those of other conditions, and they resemble other abnormalities.
These challenges often result in variability in diagnoses among medical personnel.
• By automating the detection process, our system aims to achieve diagnostic
accuracy comparable to that of expert medical professionals. This can
significantly improve clinical outcomes, particularly in regions with limited
access to diagnostic imaging specialists, ensuring timely and reliable detection
of blood cancer.
Literature Survey
• Automation in Blood Cancer Diagnosis: Studies emphasize the importance of
early and accurate diagnosis of blood cancer. Automated systems utilizing
machine learning, particularly deep learning techniques like Convolutional
Neural Networks (CNNs), have shown significant potential in reducing diagnostic
time and cost, while achieving high accuracy rates comparable to expert medical
personnel.
• Segmentation and Feature Extraction: Many papers propose enhanced
algorithms for image segmentation and feature extraction, such as the Ensemble
Method combined with Effective Fuzzy C Means (EFCM) and Iterative
Morphological Process (IMP). These methods improve cancer detection by
isolating critical features in blood cell images, leading to more precise analysis.
• Hybrid Models and Performance: Research highlights the effectiveness of
hybrid models combining machine learning techniques like Random Forests (RF),
Support Vector Machines (SVMs), and CNNs.
• Studies using feature selection methods like minimum redundancy maximum relevance
(mRMR) have demonstrated increased accuracy and reduced computational complexity.
• Challenges and Dataset Limitations: Several papers identify challenges with small
datasets, which can limit the generalizability of results. Data augmentation
and synthetic data generation techniques, such as GANs (Generative Adversarial
Networks), are proposed to improve model training and enhance classification accuracy
for leukemia detection.
• Advancements in Deep Learning Architectures: Various CNN architectures, including
VGG16, ResNet50, and W-Net, have been tested for blood cancer detection, showing
high accuracy in classifying white blood cells and cancerous cells. These models
outperform traditional methods, with some achieving accuracy rates exceeding 97%.
Motivation
• Traditional diagnostic methods often rely on manual examination, which is time-
consuming, expensive, and prone to human error. The variability in medical
expertise and access to advanced diagnostic tools in underdeveloped regions
further complicates early detection.
• By leveraging the power of machine learning, specifically Convolutional Neural
Networks (CNNs), this model aims to automate the detection process, offering a
faster, more consistent, and scalable solution. The potential to provide expert-
level diagnosis using automated systems can greatly enhance healthcare delivery,
making it accessible to populations with limited medical resources. The ultimate
goal is to reduce diagnostic times, minimize costs, and improve the accuracy of
cancer detection, leading to better treatment outcomes and survival rates.
Methodology
Dataset Preparation:
Data Structure: The dataset is split into different folders, e.g., "train", "test", and "validation".
Each folder contains images categorized into subfolders: "CANCER" (for cancerous cells) and
"NORMAL" (for healthy cells).
Labeling: Images in the dataset are labeled using folder names. "CANCER" is labeled as 1
(positive), and "NORMAL" as 0 (negative).
Data Preprocessing: Images are loaded, resized to (150, 150, 3), and converted into NumPy
arrays. Both the image data (X) and labels (y) are then stored for training.
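The folder-based labeling described above can be sketched as follows. This is a minimal illustration, not the project's actual loader: the function name `list_labeled_images` and the set of accepted file extensions are assumptions, and only the CANCER/NORMAL folder names and the 1/0 labels come from the slide. Image decoding and resizing are left out so the sketch needs only the standard library.

```python
from pathlib import Path

# Folder-name-to-label mapping taken from the slide: CANCER -> 1, NORMAL -> 0.
LABELS = {"CANCER": 1, "NORMAL": 0}

# Hypothetical set of accepted image extensions; adjust to the real dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_labeled_images(split_dir):
    """Return (path, label) pairs for one split folder such as 'train'.

    Pairs are grouped by class and sorted by file name within each class.
    """
    pairs = []
    for class_name, label in LABELS.items():
        class_dir = Path(split_dir) / class_name
        if not class_dir.is_dir():
            continue  # tolerate a split that lacks one of the class folders
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((path, label))
    return pairs
```

Calling `list_labeled_images("train")` would give the file list and labels from which X and y are then built.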
Preprocessing and Data Augmentation:
Image Preprocessing:
Use cv2.imread to load images in grayscale.
Resize the images using skimage.transform.resize to a uniform shape (150x150x3).
Labels: Convert the labels into one-hot encoded vectors using Keras’ to_categorical function for binary
classification.
Splitting the Data: Split the data into training and test sets using the pre-labeled directories.
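To make the label step concrete, here is a small pure-Python sketch of one-hot encoding. The helper name `one_hot` is hypothetical; it is meant to produce the same rows that Keras' to_categorical would return for integer labels, without requiring Keras to be installed.

```python
def one_hot(labels, num_classes=2):
    """Turn integer class labels into one-hot rows.

    With the slide's convention, NORMAL (0) becomes [1.0, 0.0]
    and CANCER (1) becomes [0.0, 1.0].
    """
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        rows.append([1.0 if i == label else 0.0 for i in range(num_classes)])
    return rows
```

Each row has one entry per class, which is why the output layer described in the architecture has 2 units.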
Convolutional Neural Network (CNN) Architecture:
Model Structure:
Convolutional Layers: Several convolution layers with ReLU activation and 3x3 filters. These layers extract
features from the images.
MaxPooling Layers: These are used after certain convolution layers to downsample the spatial dimensions
and reduce computational complexity.
Flattening: After the convolutional layers, flatten the output to convert it into a 1D array for input to fully
connected layers.
Fully Connected Layers: A dense layer with 64 units is added for further processing.
Output Layer: The final layer has 2 units with sigmoid activation for binary classification.
Dropout Layer: A dropout of 20% is applied to prevent overfitting.
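The effect of these layers on tensor shapes can be traced with a little arithmetic. The sketch below is an illustration under stated assumptions, not the project's model: three convolution blocks with 32, 64, and 128 filters, no padding, 2x2 pooling after every convolution, and dropout placed between the dense and output layers are all hypothetical choices. The 150x150x3 input, 3x3 filters, 64 dense units, 20% dropout, and 2 output units come from the slide.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool


def trace_cnn(input_size=150, in_channels=3, filters=(32, 64, 128),
              kernel=3, dense_units=64, num_classes=2):
    """Return (rows, total_params); each row is (layer, output_shape, params)."""
    rows = []
    size, channels = input_size, in_channels
    for n_filters in filters:  # filter counts are assumed, not from the slide
        size = conv_output_size(size, kernel)
        params = kernel * kernel * channels * n_filters + n_filters  # weights + biases
        channels = n_filters
        rows.append(("conv3x3+relu", (size, size, channels), params))
        size = pool_output_size(size)
        rows.append(("maxpool2x2", (size, size, channels), 0))
    flat = size * size * channels
    rows.append(("flatten", (flat,), 0))
    rows.append(("dense", (dense_units,), flat * dense_units + dense_units))
    rows.append(("dropout(0.2)", (dense_units,), 0))  # assumed placement
    rows.append(("output", (num_classes,), dense_units * num_classes + num_classes))
    return rows, sum(p for _, _, p in rows)


if __name__ == "__main__":
    rows, total = trace_cnn()
    for name, shape, params in rows:
        print(f"{name:14s} {str(shape):16s} {params:>10,d}")
    print("total trainable parameters:", f"{total:,d}")
```

Under these assumptions most parameters sit in the first dense layer, which shows why pooling before flattening matters for computational cost.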
Model Compilation:
Loss Function: The model uses binary cross-entropy as the loss function since it's a binary
classification problem.
Optimizer: RMSProp with a small learning rate of 5e-5 is used for optimization.
Metrics: The model is evaluated using accuracy.
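For reference, the loss named above can be written out directly. This is a plain-Python sketch of mean binary cross-entropy, assuming the predictions are probabilities in [0, 1]; the function name and the clipping constant `eps` are illustrative choices, not values taken from the project.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired targets and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)
```

A confident correct prediction gives a loss near zero, while a confident wrong one is penalized heavily, which is the behavior the optimizer exploits during training.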
Training the Model:
Batch Size and Epochs: The model is trained with a batch size of 256 for 10 epochs.
Callbacks:
ReduceLROnPlateau: This callback reduces the learning rate when the validation accuracy plateaus,
allowing finer weight updates once progress stalls.
ModelCheckpoint: The best weights (based on validation accuracy) are saved to a file during
training.
Evaluation: The validation accuracy is tracked during training to see how well the model generalizes.
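The plateau rule can be illustrated with a simplified simulation. This is not Keras' implementation: it leaves out the cooldown and minimum-delta options, and the `factor`, `patience`, and `min_lr` values are hypothetical. Only the starting learning rate of 5e-5 comes from the slide.

```python
def simulate_reduce_on_plateau(val_accuracies, initial_lr=5e-5, factor=0.5,
                               patience=2, min_lr=1e-7):
    """Return the learning rate in effect after each epoch.

    The rate is multiplied by `factor` whenever validation accuracy has not
    improved on its best value for `patience` consecutive epochs.
    """
    lr = initial_lr
    best = float("-inf")
    wait = 0
    history = []
    for acc in val_accuracies:
        if acc > best:
            best, wait = acc, 0  # new best: reset the patience counter
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # never drop below min_lr
                wait = 0
        history.append(lr)
    return history
```

With validation accuracies of 0.80, 0.85, 0.85, 0.84, the rate stays at 5e-5 for three epochs and is halved after the fourth, because two epochs in a row failed to beat 0.85.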
Visualization of Model Performance:
Accuracy and Loss Plots:
Using matplotlib, the model's accuracy and loss over epochs are plotted to observe trends such as
overfitting.
Confusion Matrix: A confusion matrix is generated to analyze the model’s performance in terms of
True Positives (TP), False Positives (FP), True Negatives (TN), and False Negatives (FN).
Metrics Calculation:
Precision: The ratio of correctly predicted positive cases to the total predicted positives
(TP/(TP+FP)).
Recall: The ratio of correctly predicted positive cases to all actual positive cases (TP/(TP+FN)).
Accuracy: The overall correctness of the model ((TP+TN)/(TP+TN+FP+FN)).
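The formulas above can be checked on a small example. The sketch below counts the confusion-matrix cells from true and predicted labels and applies the three formulas; the function names and the zero-division fallback of 0.0 are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (TP, FP, TN, FN), treating `positive` as the CANCER class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def precision_recall_accuracy(tp, fp, tn, fn):
    """Apply TP/(TP+FP), TP/(TP+FN), and (TP+TN)/(TP+TN+FP+FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed fallback
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # assumed fallback
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy
```

In a screening setting, recall deserves particular attention, since a false negative means a cancerous sample is reported as normal.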
Potential Impact
Improved Diagnosis Accuracy: CNN systems can enhance the precision of blood cancer
detection, reducing misdiagnosis and improving patient outcomes.
Early Detection: Automated screening facilitates early identification of blood cancer, enabling timely
treatment and increasing the chance of survival.
Reduced Healthcare Costs: By automating diagnosis, CNNs lower the need for
manual intervention, cutting overall healthcare expenses.
Increased Accessibility: The system can be deployed in remote areas, improving access to
quality healthcare for underserved populations.
Future Scope
Real-time Monitoring: Implementing CNNs in portable devices could enable real-time monitoring
of patients' blood samples, allowing for immediate detection of cancer recurrence and assessment
of treatment efficacy.
Telemedicine Applications: With the rise of telemedicine, CNN-based detection systems can
facilitate remote diagnosis, making it easier for patients in underserved areas to access quality
healthcare.
Research and Drug Development: CNNs can analyze vast datasets from clinical trials, identifying
patterns that could inform drug development and help in the discovery of new treatments.
Multi-modal Approaches: Future systems may integrate data from various sources—like imaging,
blood tests, and patient history—using advanced CNN architectures to improve diagnostic
capabilities.
References
• [1] Dharani, N. P., Sujatha, G., & Rani, R. (2023). Blood cancer detection using improved
machine learning algorithm. Proceedings of the International Conference on Power, Control,
Computing and Technology (ICCPCT), 1136–1141.
https://doi.org/10.1109/ICCPCT58313.2023.10245375
• [2] Rupapara, V., Rustam, F., Aljedaani, W., Aslam, W., & Choi, G. S. (2022). Blood cancer
prediction using leukemia microarray gene data and hybrid logistic vector trees model.
Scientific Reports, 12, 1000. https://doi.org/10.1038/s41598-022-04835-6
• [3] Ahmed, I. A., Senan, E. M., Shatnawi, H. S. A., Alkhraisha, Z. M., & Al-Azzam, M. M. A.
(2023). Hybrid techniques for the diagnosis of acute lymphoblastic leukemia based on fusion of
CNN features. Diagnostics, 13(6), 1026. https://doi.org/10.3390/diagnostics13061026
• [4] Alabdulqader, E. A., Alarfaj, A. A., Umer, M., & Choi, G. S. (2024). Improving prediction of
blood cancer using leukemia microarray gene data and Chi2 features with weighted
convolutional neural network. Scientific Reports, 14, 15625.
https://doi.org/10.1038/s41598-024-65315-7
• [5] Claro, M., Pereira, C. R., Batista, G. E. A. P. A., Lima, R. M., & Rocha, L. M. (2020).
Convolution neural network models for acute leukemia diagnosis. In 2020 International
Conference on Systems, Signals and Image Processing (IWSSIP) (pp. 63–68). IEEE.
https://doi.org/10.1109/IWSSIP48289.2020.9145406
• [6] Rasheed, H., & Abdulazeez, A. (2024). Leukemia detection and classification based on
machine learning and CNN: A review. Indonesian Journal of Computer Science, 13(3).
https://doi.org/10.33022/ijcs.v13i3.4044
• [7] Talaat, F. M., & Gamel, S. A. (2024). Machine learning in detection and classification of
leukemia using C-NMC_Leukemia. Multimedia Tools and Applications, 83, 8063–8076.
https://doi.org/10.1007/s11042-023-15923-8
• [8] Ananth, C., Tamilselvi, P., Joshy, A., & Ananth Kumar, T. (2022). Blood cancer detection
with microscopic images using machine learning. In Proceedings of the International
Conference on Intelligent Computing and Communication (pp. 45–56). Springer.
https://doi.org/10.1007/978-981-19-5090-2_4
THANK YOU
