SlideShare a Scribd company logo
1 of 13
DATA MINING
TECHNIQUES FOR
CANCER DISEASE
DIAGNOSIS AND
PROGNOSIS
By TANUJA REDDY
MALIGIREDDY
Data Mining in Healthcare
● Data mining is a multidisciplinary field at the intersection of database
technology, statistics, ML, and pattern recognition that profits from all these
disciplines
● Data mining in healthcare are being used mainly for predicting various
diseases as well as in assisting for diagnosis for the doctors in making their
clinical decision
● Data mining based on clinical big data can produce effective and valuable
knowledge, which is essential for accurate evaluation of treatment
effectiveness and risk assessment
Breast Cancer
● Breast cancer is one of the many
types of cancers that have been
diagnosed in humans. It is majorly
detected in the females as
compared to the males
● The second leading cause of
death among women is breast
cancer, as it comes directly after
lung cancer.
● Breast cancer considered the
most common invasive cancer in
women, with more than one
million cases and nearly 600,000
deaths occurring worldwide
annually.
Classification
Techniques
● Building accurate and efficient
classifiers for large databases is
one of the essential tasks of
data mining and machine
learning research
● The ultimate reason for doing
classification is to increase
understanding of the domain or
to improve predictions
compared to unclassified data.
Vulnerability to developing breast cancer: Decision Trees
● Decision trees are a classification technique that uses a
tree-like structure to analyze decisions, possibilities,
consequences, and measures.
● A case-control research was conducted to investigate
the applicability of decision trees in identifying a group
with a high risk of developing breast cancer.
● Alcohol and smoking data is used which contains 164
controls and 94 patients with 32 SNPs from the
BRCA1, BRCA2, and TP53 genes
● Setting correlation coefficient as 0.8, the data has been
narrowed down to 32 SNPs with selected cases from
each gene category.
● The experiments are carried out with Weka J48 and
C4. 5 decision trees are produced using 10-fold cross
validation
Digital Mammography Classification using Association Rule
Mining
● A mammogram is an x-ray picture of the breast.
Mammograms can be used to check for breast cancer in
women
● Association rule mining is used analyze data for
patterns, or co-occurrences, in a database. It identifies
frequent if(an antecedent)-then(a consequent)
associations, which themselves are the association
rules.
● The apriori algorithm is used to find association rules
between the features with the antecedent being the
features and the consequent being the category.
● The association principles are utilized to create a
classification system that divides mammography into
normal, malignant, and benign with confidence at 0%,
Digital Mammography Classification
● Mammograms can be performed to check for breast
cancer
● While analyzing the mammograms during the
experiments
the ones that are abnormal are further divided into six
categories: microcalcification, circumscribed masses,
spiculated masses, ill-defined masses, architectural
distortion,
and asymmetry.
● Two Image Enhancement techniques: a cropping
operation and an image enhancement have been
performed before feature extraction.
● After feature extraction, the extracted features are
calculated over smaller windows of the original image.
Association rule based Classifier
● The apriori algorithm to find the association rules in the database, with the antecedent being the
features and the consequent being the category
● The association principles are utilized to create a classification system
that divides mammography into normal, malignant, and benign.
● The confidence was set at 0%, and the support at 10%. The association rule classifier had an
average success rate of 69.11%.
Success ratios of association rule-based
classifier using 10 splits
Neural Network based classifier system
● Three layers make up the neural network architecture: an input layer,
a hidden layer, and an output layer. The output layer node provides
the classification for the image.
● By implementing parallelism for the output calculation at each node
in the various levels of the network, Neural Networks
Parallelization
Strategies are employed in this to produce effective results.
● The dataset contains 10 plus class attributes and 669 instances
where each instance has two possible classes benign and malignant.
Taking into consideration various attributes and their domains, it is
proved that the neural networks technique provides satisfactory
results for the classification task.
Prognosis of Breast Cancer Patients’ Survival Rates: Naive Bayes
Classifier
● Based on analysis of three data mining techniques — Naive Bayes, back-propagated neural
network, and C4.5 decision tree algorithms — used to predict the survivability rate of breast
cancer patients in a newer version of the SEER database with two additional fields — Vital
Status Recode (VSR), and the Cause of Death (COD) it is found that C4.5 algorithm has a much
better performance than the other two techniques.
Accuracy of Cancer Dataset
Support Vector Machines and Logistic Regression
● The main goal is to use data mining techniques to try to identify breast cancer patients for whom
treatment increases survival time. Based on the study, the 6-feature space consists of one
pathology feature and five cytological features.
● Based on our analysis of the survival curve, these findings imply that patients in the Good group
shouldn’t take chemotherapy, but those in the Intermediate group should.
● Logistic Regression: The final dataset, which had 202,932 records and 17 variables, was created
after employing these data preparation and cleaning techniques.
● Classification accuracy, sensitivity, and specificity were used to evaluate the models
● The logistic regression model had a classification accuracy of 0.8920, a sensitivity of 0.9017,
and a specificity of 0.8786.
Poor Prognosis and Good Prognosis: Bayesian Networks
● An interface is created for the radiologists to engage with the project’s Bayesian network
learning algorithm in experiments to deploy the Bayesian Belief Network for automated breast
cancer detection support tools and computer-aided detection in mammography.
● The publicly available data on breast cancer patients is afterward divided into two prognostic
groups by combining clinical and microarray data.
● The prognosis for lymph node-negative breast cancer is what is being predicted. A bad
prognosis is one in which the disease returns within 5 years of diagnosis, whereas a good
prognosis is one in which at least 5 years pass without the presence of the disease.
Conclusion and Future Work
● Decision tree is determined to be the best predictor among the various data mining classifiers
and soft computing techniques, with 93.62% Accuracy.
● The predictor can be used to create a web application in the future that will accept the predictor
variables and automated method.
● The Bayesian network has proven to be a successful method for medical prediction,
particularly for the diagnosis and prognosis of Breast cancer. In the future, this framework can
be used to build web-based apps.

More Related Content

Similar to Datamining in BreastCancer.pptx

USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEUSING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEIJCSEIT Journal
 
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...CSCJournals
 
A Review on Data Mining Techniques for Prediction of Breast Cancer Recurrence
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceA Review on Data Mining Techniques for Prediction of Breast Cancer Recurrence
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceDr. Amarjeet Singh
 
On Predicting and Analyzing Breast Cancer using Data Mining Approach
On Predicting and Analyzing Breast Cancer using Data Mining ApproachOn Predicting and Analyzing Breast Cancer using Data Mining Approach
On Predicting and Analyzing Breast Cancer using Data Mining ApproachMasud Rana Basunia
 
Cervical Cancer Analysis
Cervical Cancer AnalysisCervical Cancer Analysis
Cervical Cancer AnalysisIRJET Journal
 
Breast Cancer Prediction using Machine Learning
Breast Cancer Prediction using Machine LearningBreast Cancer Prediction using Machine Learning
Breast Cancer Prediction using Machine LearningIRJET Journal
 
coad_machine_learning
coad_machine_learningcoad_machine_learning
coad_machine_learningFord Sleeman
 
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...IJTET Journal
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
 
Classification AlgorithmBased Analysis of Breast Cancer Data
Classification AlgorithmBased Analysis of Breast Cancer DataClassification AlgorithmBased Analysis of Breast Cancer Data
Classification AlgorithmBased Analysis of Breast Cancer DataIIRindia
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONijscai
 
Breast Cancer Detection Using Machine Learning
Breast Cancer Detection Using Machine LearningBreast Cancer Detection Using Machine Learning
Breast Cancer Detection Using Machine LearningIRJET Journal
 
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...IRJET Journal
 
Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​
Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​
Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​Pritam Sarkar
 

Similar to Datamining in BreastCancer.pptx (20)

USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASEUSING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
USING DATA MINING TECHNIQUES FOR DIAGNOSIS AND PROGNOSIS OF CANCER DISEASE
 
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...
A Novel Approach for Cancer Detection in MRI Mammogram Using Decision Tree In...
 
A Review on Data Mining Techniques for Prediction of Breast Cancer Recurrence
A Review on Data Mining Techniques for Prediction of Breast Cancer RecurrenceA Review on Data Mining Techniques for Prediction of Breast Cancer Recurrence
A Review on Data Mining Techniques for Prediction of Breast Cancer Recurrence
 
On Predicting and Analyzing Breast Cancer using Data Mining Approach
On Predicting and Analyzing Breast Cancer using Data Mining ApproachOn Predicting and Analyzing Breast Cancer using Data Mining Approach
On Predicting and Analyzing Breast Cancer using Data Mining Approach
 
Cervical Cancer Analysis
Cervical Cancer AnalysisCervical Cancer Analysis
Cervical Cancer Analysis
 
Breast Cancer Prediction using Machine Learning
Breast Cancer Prediction using Machine LearningBreast Cancer Prediction using Machine Learning
Breast Cancer Prediction using Machine Learning
 
coad_machine_learning
coad_machine_learningcoad_machine_learning
coad_machine_learning
 
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
A Classification of Cancer Diagnostics based on Microarray Gene Expression Pr...
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
SET PROJECT PPT.pptx
SET PROJECT PPT.pptxSET PROJECT PPT.pptx
SET PROJECT PPT.pptx
 
Classification AlgorithmBased Analysis of Breast Cancer Data
Classification AlgorithmBased Analysis of Breast Cancer DataClassification AlgorithmBased Analysis of Breast Cancer Data
Classification AlgorithmBased Analysis of Breast Cancer Data
 
Machine Learning Ppt.pptx
Machine Learning Ppt.pptxMachine Learning Ppt.pptx
Machine Learning Ppt.pptx
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTIONSVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
SVM &GA-CLUSTERING BASED FEATURE SELECTION APPROACH FOR BREAST CANCER DETECTION
 
Updated proposal powerpoint.pptx
Updated proposal powerpoint.pptxUpdated proposal powerpoint.pptx
Updated proposal powerpoint.pptx
 
Breast Cancer Detection Using Machine Learning
Breast Cancer Detection Using Machine LearningBreast Cancer Detection Using Machine Learning
Breast Cancer Detection Using Machine Learning
 
16
1616
16
 
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...
Cervical Cancer Detection: An Enhanced Approach through Transfer Learning and...
 
Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​
Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​
Computer-Aided Diagnosis using Class-Weighted Deep Neural Network​
 

Recently uploaded

Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 

Recently uploaded (20)

Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 

Datamining in BreastCancer.pptx

  • 1. DATA MINING TECHNIQUES FOR CANCER DISEASE DIAGNOSIS AND PROGNOSIS By TANUJA REDDY MALIGIREDDY
  • 2. Data Mining in Healthcare ● Data mining is a multidisciplinary field at the intersection of database technology, statistics, ML, and pattern recognition that profits from all these disciplines ● Data mining in healthcare are being used mainly for predicting various diseases as well as in assisting for diagnosis for the doctors in making their clinical decision ● Data mining based on clinical big data can produce effective and valuable knowledge, which is essential for accurate evaluation of treatment effectiveness and risk assessment
  • 3. Breast Cancer ● Breast cancer is one of the many types of cancers that have been diagnosed in humans. It is majorly detected in the females as compared to the males ● The second leading cause of death among women is breast cancer, as it comes directly after lung cancer. ● Breast cancer considered the most common invasive cancer in women, with more than one million cases and nearly 600,000 deaths occurring worldwide annually.
  • 4. Classification Techniques ● Building accurate and efficient classifiers for large databases is one of the essential tasks of data mining and machine learning research ● The ultimate reason for doing classification is to increase understanding of the domain or to improve predictions compared to unclassified data.
  • 5. Vulnerability to developing breast cancer: Decision Trees ● Decision trees are a classification technique that uses a tree-like structure to analyze decisions, possibilities, consequences, and measures. ● A case-control research was conducted to investigate the applicability of decision trees in identifying a group with a high risk of developing breast cancer. ● Alcohol and smoking data is used which contains 164 controls and 94 patients with 32 SNPs from the BRCA1, BRCA2, and TP53 genes ● Setting correlation coefficient as 0.8, the data has been narrowed down to 32 SNPs with selected cases from each gene category. ● The experiments are carried out with Weka J48 and C4. 5 decision trees are produced using 10-fold cross validation
  • 6. Digital Mammography Classification using Association Rule Mining ● A mammogram is an x-ray picture of the breast. Mammograms can be used to check for breast cancer in women ● Association rule mining is used analyze data for patterns, or co-occurrences, in a database. It identifies frequent if(an antecedent)-then(a consequent) associations, which themselves are the association rules. ● The apriori algorithm is used to find association rules between the features with the antecedent being the features and the consequent being the category. ● The association principles are utilized to create a classification system that divides mammography into normal, malignant, and benign with confidence at 0%,
  • 7. Digital Mammography Classification ● Mammograms can be performed to check for breast cancer ● While analyzing the mammograms during the experiments the ones that are abnormal are further divided into six categories: microcalcification, circumscribed masses, spiculated masses, ill-defined masses, architectural distortion, and asymmetry. ● Two Image Enhancement techniques: a cropping operation and an image enhancement have been performed before feature extraction. ● After feature extraction, the extracted features are calculated over smaller windows of the original image.
  • 8. Association rule based Classifier ● The apriori algorithm to find the association rules in the database, with the antecedent being the features and the consequent being the category ● The association principles are utilized to create a classification system that divides mammography into normal, malignant, and benign. ● The confidence was set at 0%, and the support at 10%. The association rule classifier had an average success rate of 69.11%. Success ratios of association rule-based classifier using 10 splits
  • 9. Neural Network based classifier system ● Three layers make up the neural network architecture: an input layer, a hidden layer, and an output layer. The output layer node provides the classification for the image. ● By implementing parallelism for the output calculation at each node in the various levels of the network, Neural Networks Parallelization Strategies are employed in this to produce effective results. ● The dataset contains 10 plus class attributes and 669 instances where each instance has two possible classes benign and malignant. Taking into consideration various attributes and their domains, it is proved that the neural networks technique provides satisfactory results for the classification task.
  • 10. Prognosis of Breast Cancer Patients’ Survival Rates: Naive Bayes Classifier ● Based on analysis of three data mining techniques — Naive Bayes, back-propagated neural network, and C4.5 decision tree algorithms — used to predict the survivability rate of breast cancer patients in a newer version of the SEER database with two additional fields — Vital Status Recode (VSR), and the Cause of Death (COD) it is found that C4.5 algorithm has a much better performance than the other two techniques. Accuracy of Cancer Dataset
  • 11. Support Vector Machines and Logistic Regression ● The main goal is to use data mining techniques to try to identify breast cancer patients for whom treatment increases survival time. Based on the study, the 6-feature space consists of one pathology feature and five cytological features. ● Based on our analysis of the survival curve, these findings imply that patients in the Good group shouldn’t take chemotherapy, but those in the Intermediate group should. ● Logistic Regression: The final dataset, which had 202,932 records and 17 variables, was created after employing these data preparation and cleaning techniques. ● Classification accuracy, sensitivity, and specificity were used to evaluate the models ● The logistic regression model had a classification accuracy of 0.8920, a sensitivity of 0.9017, and a specificity of 0.8786.
  • 12. Poor Prognosis and Good Prognosis: Bayesian Networks ● An interface is created for the radiologists to engage with the project’s Bayesian network learning algorithm in experiments to deploy the Bayesian Belief Network for automated breast cancer detection support tools and computer-aided detection in mammography. ● The publicly available data on breast cancer patients is afterward divided into two prognostic groups by combining clinical and microarray data. ● The prognosis for lymph node-negative breast cancer is what is being predicted. A bad prognosis is one in which the disease returns within 5 years of diagnosis, whereas a good prognosis is one in which at least 5 years pass without the presence of the disease.
  • 13. Conclusion and Future Work ● Decision tree is determined to be the best predictor among the various data mining classifiers and soft computing techniques, with 93.62% Accuracy. ● The predictor can be used to create a web application in the future that will accept the predictor variables and automated method. ● The Bayesian network has proven to be a successful method for medical prediction, particularly for the diagnosis and prognosis of Breast cancer. In the future, this framework can be used to build web-based apps.