Classification of Breast Cancer Dataset using Decision Tree Induction
Sunil Nair, Abel Gebreyesus
Masters of Health Informatics, Dalhousie University
HINF6210 Project Presentation – November 25, 2008
Agenda
Introduction
Objective
Breast Cancer Dataset
Wisconsin Breast Cancer Database (1991), University of Wisconsin Hospitals, Dr. William H. Wolberg
Attributes
#    Attribute                      Domain
1    Sample code number             id number
2    Clump Thickness                1-10
3    Uniformity of Cell Size        1-10
4    Uniformity of Cell Shape       1-10
5    Marginal Adhesion              1-10
6    Single Epithelial Cell Size    1-10
7    Bare Nuclei                    1-10
8    Bland Chromatin                1-10
9    Normal Nucleoli                1-10
10   Mitoses                        1-10
11   Class                          Benign (2), Malignant (4)
Attributes / class - distribution
Our Approach
Data Pre-processing
Data Pre-processing
Comparison chart – Handle Missing Value
PERFORMANCE EVALUATION
DATASET             # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
Complete            14        8%    94%              87%
Missing Removed     11        5%    96%              90%
Missing Replaced    14        7%    95%              89%

Confusion Matrix
Class   B     M     Total
B       160   7     167
M       3     63    66
Total   163   70    233

Total Correctly Classified Instances (Test split) = 223
Accuracy Rate: 95.78%
How many predictions by chance? Expected Accuracy Rate = Kappa Statistic: used to measure the agreement between predicted and actual categorization of data while correcting for prediction that occurs by chance.
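
To make the accuracy and kappa figures concrete, here is a minimal sketch that recomputes both from the confusion matrix on this slide. It assumes rows are actual classes and columns are predicted classes; the function name `accuracy_and_kappa` is illustrative and is not part of Weka.

```python
# Confusion matrix from the slide, class order (Benign, Malignant).
# Assumption: rows = actual class, columns = predicted class.
matrix = [
    [160, 7],   # Benign row
    [3, 63],    # Malignant row
]

def accuracy_and_kappa(m):
    """Return (observed accuracy, Cohen's kappa) for a square confusion matrix."""
    size = len(m)
    total = sum(sum(row) for row in m)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(m[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column totals, summed over classes.
    row_totals = [sum(row) for row in m]
    col_totals = [sum(m[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

acc, kappa = accuracy_and_kappa(matrix)
print(f"accuracy = {acc:.4f}, kappa = {kappa:.4f}")
```

With these counts the observed accuracy is 223/233 (about 0.957) and kappa comes out close to 0.90, which lines up with the "Missing Removed" row of the table.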
Data Pre-processing
Missing Value Removed - Mean-Mode
Agenda
Classification Methods Comparison
PERFORMANCE EVALUATION (Test Set)
CLASSIFIER               # Total Inst.   MAE   Act. Acc. Rate   Exp. Acc. Rate
Naïve Bayes              233             4%    96%              90%
Neural Network           233             10%   91%              79%
DT-J48                   233             4%    97%              92%
Support Vector Machine   233             3%    97%              94%
Classification using Decision Tree
Attributes Selected – most IG
weka.filters.supervised.attribute.AttributeSelection -E weka.attributeSelection.InfoGainAttributeEval -S weka.attributeSelection.Ranker

Rank   Attribute                      Information Gain
1      Uniformity of Cell Size        0.675
2      Uniformity of Cell Shape       0.66
3      Bare Nuclei                    0.564
4      Bland Chromatin                0.543
5      Single Epithelial Cell Size    0.505
6      Normal Nucleoli                0.466
7      Clump Thickness                0.459
8      Marginal Adhesion              0.443
9      Mitoses                        0.198

PERFORMANCE EVALUATION
DATASET               # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
Attributes Selected   11        4%    97%              92%
Missing Removed       11        5%    96%              90%
Missing Replaced      14        7%    95%              89%
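
The ranking above orders attributes by information gain. As a rough illustration of that measure (not a reimplementation of Weka's `InfoGainAttributeEval`, which handles numeric attributes in its own way), the sketch below treats every distinct attribute value as a category and ranks two attributes on a small made-up sample. The records and the names `entropy` and `information_gain` are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy records, not taken from the Wisconsin data.
labels = ["B", "B", "B", "M", "M", "M"]
cell_size = [1, 1, 2, 8, 9, 9]   # separates the two classes completely
mitoses = [1, 1, 1, 1, 1, 1]     # constant, so it carries no information

# Rank attributes from most to least informative, as the Ranker search does.
ranking = sorted(
    [("cell_size", information_gain(cell_size, labels)),
     ("mitoses", information_gain(mitoses, labels))],
    key=lambda item: item[1],
    reverse=True,
)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

An attribute that splits the classes cleanly gets a gain equal to the class entropy, while a constant attribute gets zero, which is why low-gain attributes such as Mitoses land at the bottom of the ranking.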
The DT – IG/Attribute selection Visualization
Decision Tree - Problems
Confusion Matrix – Performance Evaluation
                Predicted Class
Act. Class      B (2)   M (4)
B (2)           TP      FN
M (4)           FP      TN
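
The four cells of this matrix yield the usual per-class rates. The sketch below is one way to compute them, plugging in the test-split counts from the earlier comparison slide under the assumption that its rows are actual classes, with Benign (2) as the positive class as in the layout here. The helper name `binary_metrics` is illustrative.

```python
def binary_metrics(tp, fn, fp, tn):
    """Common rates derived from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),   # share of actual positives found
        "specificity": tn / (tn + fp),   # share of actual negatives found
        "precision": tp / (tp + fp),     # share of positive predictions that are right
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Assumed mapping of the earlier test-split counts onto TP/FN/FP/TN,
# with Benign (2) taken as the positive class.
metrics = binary_metrics(tp=160, fn=7, fp=3, tn=63)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity next to accuracy matters for an unbalanced dataset, because a high overall accuracy can hide weak performance on the smaller class.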
Unbalanced dataset problem
Stratified Sampling Method
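
One way stratified sampling could be carried out is sketched below. This is an illustration only, not the authors' Weka procedure: the function `stratified_split`, the 25% test fraction, and the toy records are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(records, label_of, test_fraction=0.25, seed=0):
    """Split records into (train, test) so each class keeps roughly the same share."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec in records:
        by_class[label_of(rec)].append(rec)
    train, test = [], []
    for members in by_class.values():
        members = members[:]          # copy so the caller's list is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Hypothetical imbalanced data: 20 benign (2) and 12 malignant (4) records.
records = [(i, 2) for i in range(20)] + [(i, 4) for i in range(20, 32)]
train, test = stratified_split(records, label_of=lambda r: r[1])
print(len(train), len(test))
```

Because the sampling is done within each class, the benign-to-malignant ratio in the test set matches the ratio in the full data instead of drifting as it can with a plain random split.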
Performance Evaluation
PERFORMANCE EVALUATION (Test Set)
Dataset        # Instances   # Rules   MAE   Act. Acc. Rate   Exp. Acc. Rate
Training set   476           13        2%    99%              97%
Testing set    412           13        3%    96%              92%
Tree Visualization
Unbalanced dataset Problem
Future direction
ROC Curve - Visualization: For Benign class, For Malignant class
Questions / Comments
Thank You!

Report Back from SGO 2024: What’s the Latest in Cervical Cancer?
bkling
 
How to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for DoctorsHow to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for Doctors
LanceCatedral
 
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptxANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
Swetaba Besh
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
Anurag Sharma
 
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
VarunMahajani
 
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdfARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
Anujkumaranit
 
KDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologistsKDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologists
د.محمود نجيب
 
Couples presenting to the infertility clinic- Do they really have infertility...
Couples presenting to the infertility clinic- Do they really have infertility...Couples presenting to the infertility clinic- Do they really have infertility...
Couples presenting to the infertility clinic- Do they really have infertility...
Sujoy Dasgupta
 

Recently uploaded (20)

New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
New Directions in Targeted Therapeutic Approaches for Older Adults With Mantl...
 
Ocular injury ppt Upendra pal optometrist upums saifai etawah
Ocular injury  ppt  Upendra pal  optometrist upums saifai etawahOcular injury  ppt  Upendra pal  optometrist upums saifai etawah
Ocular injury ppt Upendra pal optometrist upums saifai etawah
 
Ozempic: Preoperative Management of Patients on GLP-1 Receptor Agonists
Ozempic: Preoperative Management of Patients on GLP-1 Receptor Agonists  Ozempic: Preoperative Management of Patients on GLP-1 Receptor Agonists
Ozempic: Preoperative Management of Patients on GLP-1 Receptor Agonists
 
Tom Selleck Health: A Comprehensive Look at the Iconic Actor’s Wellness Journey
Tom Selleck Health: A Comprehensive Look at the Iconic Actor’s Wellness JourneyTom Selleck Health: A Comprehensive Look at the Iconic Actor’s Wellness Journey
Tom Selleck Health: A Comprehensive Look at the Iconic Actor’s Wellness Journey
 
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdfMANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
MANAGEMENT OF ATRIOVENTRICULAR CONDUCTION BLOCK.pdf
 
HOT NEW PRODUCT! BIG SALES FAST SHIPPING NOW FROM CHINA!! EU KU DB BK substit...
HOT NEW PRODUCT! BIG SALES FAST SHIPPING NOW FROM CHINA!! EU KU DB BK substit...HOT NEW PRODUCT! BIG SALES FAST SHIPPING NOW FROM CHINA!! EU KU DB BK substit...
HOT NEW PRODUCT! BIG SALES FAST SHIPPING NOW FROM CHINA!! EU KU DB BK substit...
 
Charaka Samhita Sutra sthana Chapter 15 Upakalpaniyaadhyaya
Charaka Samhita Sutra sthana Chapter 15 UpakalpaniyaadhyayaCharaka Samhita Sutra sthana Chapter 15 Upakalpaniyaadhyaya
Charaka Samhita Sutra sthana Chapter 15 Upakalpaniyaadhyaya
 
Physiology of Special Chemical Sensation of Taste
Physiology of Special Chemical Sensation of TastePhysiology of Special Chemical Sensation of Taste
Physiology of Special Chemical Sensation of Taste
 
24 Upakrama.pptx class ppt useful in all
24 Upakrama.pptx class ppt useful in all24 Upakrama.pptx class ppt useful in all
24 Upakrama.pptx class ppt useful in all
 
The hemodynamic and autonomic determinants of elevated blood pressure in obes...
The hemodynamic and autonomic determinants of elevated blood pressure in obes...The hemodynamic and autonomic determinants of elevated blood pressure in obes...
The hemodynamic and autonomic determinants of elevated blood pressure in obes...
 
heat stroke and heat exhaustion in children
heat stroke and heat exhaustion in childrenheat stroke and heat exhaustion in children
heat stroke and heat exhaustion in children
 
basicmodesofventilation2022-220313203758.pdf
basicmodesofventilation2022-220313203758.pdfbasicmodesofventilation2022-220313203758.pdf
basicmodesofventilation2022-220313203758.pdf
 
Report Back from SGO 2024: What’s the Latest in Cervical Cancer?
Report Back from SGO 2024: What’s the Latest in Cervical Cancer?Report Back from SGO 2024: What’s the Latest in Cervical Cancer?
Report Back from SGO 2024: What’s the Latest in Cervical Cancer?
 
How to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for DoctorsHow to Give Better Lectures: Some Tips for Doctors
How to Give Better Lectures: Some Tips for Doctors
 
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptxANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
ANATOMY AND PHYSIOLOGY OF URINARY SYSTEM.pptx
 
micro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdfmicro teaching on communication m.sc nursing.pdf
micro teaching on communication m.sc nursing.pdf
 
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
Pulmonary Thromboembolism - etilogy, types, medical- Surgical and nursing man...
 
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdfARTIFICIAL INTELLIGENCE IN  HEALTHCARE.pdf
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdf
 
KDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologistsKDIGO 2024 guidelines for diabetologists
KDIGO 2024 guidelines for diabetologists
 
Couples presenting to the infertility clinic- Do they really have infertility...
Couples presenting to the infertility clinic- Do they really have infertility...Couples presenting to the infertility clinic- Do they really have infertility...
Couples presenting to the infertility clinic- Do they really have infertility...
 

Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Induction - Sunil Nair Health Informatics Dalhousie University

  • 1. Classification of Breast Cancer dataset using Decision Tree Induction Sunil Nair Abel Gebreyesus Masters of Health Informatics Dalhousie University HINF6210 Project Presentation – November 25, 2008
  • 11. Comparison chart – Handle Missing Value
        PERFORMANCE EVALUATION
        DATASET            # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Complete           14        8%    94%              87%
        Missing Removed    11        5%    96%              90%
        Missing Replaced   14        7%    95%              89%

        Confusion Matrix
        Class   B     M     Total
        B       160   7     167
        M       3     63    66
        Total   163   70    233

        Total Correctly Classified Instances (test split) = 223. Accuracy Rate: 95.78%.
        How many predictions by chance? Expected Accuracy Rate = Kappa Statistic, which is used to measure the agreement between predicted and actual categorization of data while correcting for prediction that occurs by chance.
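The kappa statistic described on slide 11 can be worked through on the confusion matrix shown there. The following is a minimal sketch, not the authors' procedure: it assumes rows are actual classes and columns are predicted classes (the slide does not say which), and the function name `kappa_from_confusion` is hypothetical.

```python
# Sketch: accuracy, chance agreement, and Cohen's kappa from a confusion matrix.
# Assumption: rows = actual class, columns = predicted class (B, then M).

def kappa_from_confusion(matrix):
    """Return (observed agreement, chance agreement, kappa)."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa

# Counts taken from the confusion matrix on slide 11.
confusion = [[160, 7],
             [3, 63]]

observed, expected, kappa = kappa_from_confusion(confusion)
print(f"observed agreement: {observed:.4f}")
print(f"chance agreement:   {expected:.4f}")
print(f"kappa:              {kappa:.4f}")
```

With these counts, 223 of 233 instances fall on the diagonal (about 95.7%), and kappa comes out at roughly 0.90. That is close to the 90% listed under "Exp. Acc. Rate" for the Missing Removed dataset, which may indicate that the column reports the kappa statistic.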
  • 14. Classification Methods Comparison
        PERFORMANCE EVALUATION (Test Set)
        CLASSIFIER         # Total Inst.   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Naïve Bayes        233             4%    96%              90%
        Neural Network     233             10%   91%              79%
        DT-J48             233             4%    97%              92%
        Support Vector M   233             3%    97%              94%
  • 16. Attributes Selected – most IG
        weka.filters.supervised.attribute.AttributeSelection -E weka.attributeSelection.InfoGainAttributeEval -S weka.attributeSelection.Ranker

        Rank   Attribute                     Information Gain
        1      Uniformity of Cell Size       0.675
        2      Uniformity of Cell Shape      0.66
        3      Bare Nuclei                   0.564
        4      Bland Chromatin               0.543
        5      Single Epithelial Cell Size   0.505
        6      Normal Nucleoli               0.466
        7      Clump Thickness               0.459
        8      Marginal Adhesion             0.443
        9      Mitoses                       0.198

        PERFORMANCE EVALUATION
        DATASET               # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Attributes Selected   11        4%    97%              92%
        Missing Removed       11        5%    96%              90%
        Missing Replaced      14        7%    95%              89%
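The information-gain ranking on slide 16 can be illustrated with a small sketch. This is not Weka's implementation: it treats every attribute value as its own category, whereas Weka's evaluator may handle numeric attributes differently (for instance by discretizing them first). The six records below are made up for illustration and are not rows from the Wisconsin dataset; the function names are hypothetical.

```python
# Sketch: rank attributes by information gain, in the spirit of an
# InfoGain evaluator combined with a ranker. Toy data only.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical records: class 2 = benign, 4 = malignant (coding from the slides).
labels = [2, 2, 2, 4, 4, 4]
attributes = {
    "Uniformity of Cell Size": [1, 1, 2, 8, 9, 10],  # separates the classes
    "Mitoses":                 [1, 1, 1, 1, 1, 2],   # mostly uninformative
}

gains = {name: information_gain(vals, labels) for name, vals in attributes.items()}
ranking = sorted(gains.items(), key=lambda item: item[1], reverse=True)
for rank, (name, gain) in enumerate(ranking, start=1):
    print(rank, name, round(gain, 3))
```

An attribute whose values split the records into pure groups earns the full label entropy as its gain, while one that barely changes the class mix scores near zero, which is the pattern the slide's ranking reflects with Uniformity of Cell Size at the top and Mitoses at the bottom.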
  • 17. The DT – IG/Attribute selection Visualization
  • 22. Performance Evaluation
        PERFORMANCE EVALUATION (Test Set)
        Dataset        # Instances   # Rules   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Training set   476           13        2%    99%              97%
        Testing set    412           13        3%    96%              92%
  • 27. Questions / Comments. Thank You!