Classification of Breast Cancer dataset using Decision Tree Induction
Sunil Nair, Abel Gebreyesus
Masters of Health Informatics, Dalhousie University
HINF6210 Project Presentation – November 25, 2008
Agenda
Introduction
Objective
Breast Cancer Dataset
Wisconsin Breast Cancer Database (1991), University of Wisconsin Hospitals, Dr. William H. Wolberg
Attributes
#    Attribute                      Domain
1    Sample code number             id number
2    Clump Thickness                1-10
3    Uniformity of Cell Size        1-10
4    Uniformity of Cell Shape       1-10
5    Marginal Adhesion              1-10
6    Single Epithelial Cell Size    1-10
7    Bare Nuclei                    1-10
8    Bland Chromatin                1-10
9    Normal Nucleoli                1-10
10   Mitoses                        1-10
11   Class                          Benign (2), Malignant (4)
Attributes / class - distribution
Our Approach
Data Pre-processing
Data Pre-processing
Comparison chart – Handle Missing Value

PERFORMANCE EVALUATION
DATASET            # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
Complete           14        8%    94%              87%
Missing Removed    11        5%    96%              90%
Missing Replaced   14        7%    95%              89%

Confusion Matrix
Class   B     M     Total
B       160   7     167
M       3     63    66
Total   163   70    233

Total Correctly Classified Instances, Test split = 223
Accuracy Rate: 95.78%
How many predictions by chance? Expected Accuracy Rate =
Kappa Statistic: used to measure the agreement between predicted and actual categorization of data while correcting for prediction that occurs by chance.
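The kappa statistic described on this slide can be sketched from the confusion matrix counts. This is a minimal illustration, not the Weka implementation the authors ran. The function names are hypothetical, and the row/column orientation of the matrix is an assumption (kappa itself does not change if the matrix is transposed). Computed this way, the observed accuracy is 223/233, and kappa comes out close to the 90% listed for the Missing Removed dataset.

```python
# Counts taken from the confusion matrix on the slide (233 test instances).
# Assumption: rows are actual classes, columns are predicted classes.
matrix = [
    [160, 7],   # benign row:    B column, M column
    [3, 63],    # malignant row: B column, M column
]


def observed_accuracy(m):
    """Fraction of instances on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in m)
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / total


def expected_accuracy(m):
    """Agreement expected by chance, from the row and column totals."""
    total = sum(sum(row) for row in m)
    row_totals = [sum(row) for row in m]
    col_totals = [sum(col) for col in zip(*m)]
    return sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2


def kappa(m):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    p_o = observed_accuracy(m)
    p_e = expected_accuracy(m)
    return (p_o - p_e) / (1 - p_e)


print(f"observed accuracy: {observed_accuracy(matrix):.4f}")
print(f"expected accuracy: {expected_accuracy(matrix):.4f}")
print(f"kappa:             {kappa(matrix):.4f}")
```

A kappa near 1 means the classifier agrees with the true labels far more often than the class totals alone would produce by chance.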
Data Pre-processing
Missing Value Removed - Mean-Mode
Agenda
Classification Methods Comparison

PERFORMANCE EVALUATION (Test Set)
CLASSIFIER         # Total Inst.   MAE   Act. Acc. Rate   Exp. Acc. Rate
Naïve Bayes        233             4%    96%              90%
Neural Network     233             10%   91%              79%
DT-J48             233             4%    97%              92%
Support Vector M   233             3%    97%              94%
Classification using Decision Tree
Attributes Selected – most IG
weka.filters.supervised.attribute.AttributeSelection -E weka.attributeSelection.InfoGainAttributeEval -S weka.attributeSelection.Ranker

Rank   Attribute                      Information Gain
1      Uniformity of Cell Size        0.675
2      Uniformity of Cell Shape       0.66
3      Bare Nuclei                    0.564
4      Bland Chromatin                0.543
5      Single Epithelial Cell Size    0.505
6      Normal Nucleoli                0.466
7      Clump Thickness                0.459
8      Marginal Adhesion              0.443
9      Mitoses                        0.198

PERFORMANCE EVALUATION
DATASET               # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
Attributes Selected   11        4%    97%              92%
Missing Removed       11        5%    96%              90%
Missing Replaced      14        7%    95%              89%
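The idea behind ranking attributes by information gain can be sketched in a few lines. This is an illustrative sketch only: the toy labels and the two binned attributes below are hypothetical, not values from the Wisconsin dataset, and Weka's InfoGainAttributeEval handles numeric attributes with its own discretization, which is not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: 4 benign (B) and 4 malignant (M) samples,
# with two attributes already binned into "low" / "high".
labels = ["B", "B", "B", "B", "M", "M", "M", "M"]
attributes = {
    "cell_size": ["low", "low", "low", "low", "high", "high", "high", "high"],
    "mitoses":   ["low", "low", "low", "high", "low", "low", "high", "high"],
}

# Rank attributes from most to least informative, as a Ranker-style search would.
ranking = sorted(attributes, key=lambda a: information_gain(attributes[a], labels), reverse=True)
for name in ranking:
    print(name, round(information_gain(attributes[name], labels), 3))
```

In this toy data the first attribute separates the classes perfectly and gets the maximum gain, while the second barely reduces the entropy, which mirrors how Mitoses ends up at the bottom of the ranking on the slide.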
The DT – IG/Attribute selection Visualization
Decision Tree - Problems
Confusion Matrix – Performance Evaluation
              Predicted Class
Act. Class    B (2)    M (4)
B (2)         TP       FN
M (4)         FP       TN
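The four cells of this layout give the usual per-class rates. The sketch below is an illustration under stated assumptions: benign (2) is treated as the positive class, as the TP placement in the layout suggests, and the counts are borrowed from the 233-instance confusion matrix shown earlier in the deck, assuming its rows are actual classes. The function name `rates` is hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Standard rates derived from a 2x2 confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


# Assumed mapping of the earlier matrix onto this layout (benign = positive):
# actual B row: 160 predicted B (TP), 7 predicted M (FN)
# actual M row: 3 predicted B (FP), 63 predicted M (TN)
metrics = rates(tp=160, fn=7, fp=3, tn=63)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Swapping which class counts as positive exchanges sensitivity and specificity, so the choice should follow whichever error is costlier in the clinical setting.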
Unbalanced dataset problem
Stratified Sampling Method
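One common way to do stratified sampling is to split each class separately so the train and test sets keep the same class proportions. The sketch below is a generic illustration, not the procedure the authors ran in Weka. The function `stratified_split`, the 65/35 class mix and the 20% test fraction are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_split(records, label_of, test_fraction, seed=0):
    """Split records into train/test so each class keeps its share in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)

    train, test = [], []
    for group in by_class.values():
        rng.shuffle(group)                       # randomize within the class
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical unbalanced data: 65 benign and 35 malignant records.
records = [("B", i) for i in range(65)] + [("M", i) for i in range(35)]
train, test = stratified_split(records, label_of=lambda r: r[0], test_fraction=0.2)

print("train:", Counter(label for label, _ in train))
print("test: ", Counter(label for label, _ in test))
```

Because the split is done per class, the minority class cannot be accidentally under-represented in the test set, which a plain random split does not guarantee.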
Performance Evaluation

PERFORMANCE EVALUATION (Test Set)
Dataset        # Instances   # Rules   MAE   Act. Acc. Rate   Exp. Acc. Rate
Training set   476           13        2%    99%              97%
Testing set    412           13        3%    96%              92%
Tree Visualization
Unbalanced dataset Problem
Future direction
ROC Curve - Visualization (panels: For Benign class; For Malignant class)
Questions / Comments
Thank You!

More Related Content

What's hot

Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptSubrata Kumer Paul
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsHariteja Bodepudi
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)Sharayu Patil
 
Feature selection
Feature selectionFeature selection
Feature selectionDong Guo
 
Types of clustering and different types of clustering algorithms
Types of clustering and different types of clustering algorithmsTypes of clustering and different types of clustering algorithms
Types of clustering and different types of clustering algorithmsPrashanth Guntal
 
CART – Classification & Regression Trees
CART – Classification & Regression TreesCART – Classification & Regression Trees
CART – Classification & Regression TreesHemant Chetwani
 
k medoid clustering.pptx
k medoid clustering.pptxk medoid clustering.pptx
k medoid clustering.pptxRoshan86572
 
Lect8 Classification & prediction
Lect8 Classification & predictionLect8 Classification & prediction
Lect8 Classification & predictionhktripathy
 
5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patternsKrish_ver2
 
Decision tree lecture 3
Decision tree lecture 3Decision tree lecture 3
Decision tree lecture 3Laila Fatehy
 
Breast cancer diagnosis and recurrence prediction using machine learning tech...
Breast cancer diagnosis and recurrence prediction using machine learning tech...Breast cancer diagnosis and recurrence prediction using machine learning tech...
Breast cancer diagnosis and recurrence prediction using machine learning tech...eSAT Journals
 
Logistic regression : Use Case | Background | Advantages | Disadvantages
Logistic regression : Use Case | Background | Advantages | DisadvantagesLogistic regression : Use Case | Background | Advantages | Disadvantages
Logistic regression : Use Case | Background | Advantages | DisadvantagesRajat Sharma
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic conceptsKrish_ver2
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Usama Fayyaz
 
Linear models for classification
Linear models for classificationLinear models for classification
Linear models for classificationSung Yub Kim
 
Breast cancer classification
Breast cancer classificationBreast cancer classification
Breast cancer classificationAshwan Abdulmunem
 
Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...
Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...
Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...TEJVEER SINGH
 
Principal Component Analysis(PCA) understanding document
Principal Component Analysis(PCA) understanding documentPrincipal Component Analysis(PCA) understanding document
Principal Component Analysis(PCA) understanding documentNaveen Kumar
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data MiningR A Akerkar
 

What's hot (20)

Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.pptChapter 10. Cluster Analysis Basic Concepts and Methods.ppt
Chapter 10. Cluster Analysis Basic Concepts and Methods.ppt
 
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning AlgorithmsCredit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
Credit Card Fraud Detection Using Unsupervised Machine Learning Algorithms
 
Support vector machines (svm)
Support vector machines (svm)Support vector machines (svm)
Support vector machines (svm)
 
Feature selection
Feature selectionFeature selection
Feature selection
 
Types of clustering and different types of clustering algorithms
Types of clustering and different types of clustering algorithmsTypes of clustering and different types of clustering algorithms
Types of clustering and different types of clustering algorithms
 
CART – Classification & Regression Trees
CART – Classification & Regression TreesCART – Classification & Regression Trees
CART – Classification & Regression Trees
 
k medoid clustering.pptx
k medoid clustering.pptxk medoid clustering.pptx
k medoid clustering.pptx
 
Lect8 Classification & prediction
Lect8 Classification & predictionLect8 Classification & prediction
Lect8 Classification & prediction
 
5.3 mining sequential patterns
5.3 mining sequential patterns5.3 mining sequential patterns
5.3 mining sequential patterns
 
Decision tree lecture 3
Decision tree lecture 3Decision tree lecture 3
Decision tree lecture 3
 
Breast cancer diagnosis and recurrence prediction using machine learning tech...
Breast cancer diagnosis and recurrence prediction using machine learning tech...Breast cancer diagnosis and recurrence prediction using machine learning tech...
Breast cancer diagnosis and recurrence prediction using machine learning tech...
 
Logistic regression : Use Case | Background | Advantages | Disadvantages
Logistic regression : Use Case | Background | Advantages | DisadvantagesLogistic regression : Use Case | Background | Advantages | Disadvantages
Logistic regression : Use Case | Background | Advantages | Disadvantages
 
2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts2.1 Data Mining-classification Basic concepts
2.1 Data Mining-classification Basic concepts
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning Supervised learning and Unsupervised learning
Supervised learning and Unsupervised learning
 
Linear models for classification
Linear models for classificationLinear models for classification
Linear models for classification
 
Breast cancer classification
Breast cancer classificationBreast cancer classification
Breast cancer classification
 
Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...
Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...
Design principle of pattern recognition system and STATISTICAL PATTERN RECOGN...
 
Principal Component Analysis(PCA) understanding document
Principal Component Analysis(PCA) understanding documentPrincipal Component Analysis(PCA) understanding document
Principal Component Analysis(PCA) understanding document
 
Statistics and Data Mining
Statistics and  Data MiningStatistics and  Data Mining
Statistics and Data Mining
 

Viewers also liked

2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision treeKrish_ver2
 
a novel approach for breast cancer detection using data mining tool weka
a novel approach for breast cancer detection using data mining tool wekaa novel approach for breast cancer detection using data mining tool weka
a novel approach for breast cancer detection using data mining tool wekaahmad abdelhafeez
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining ConceptsDung Nguyen
 
Data mining slides
Data mining slidesData mining slides
Data mining slidessmj
 
Distributed Decision Tree Induction
Distributed Decision Tree InductionDistributed Decision Tree Induction
Distributed Decision Tree Inductiongregoryg
 
Decision Tree and entropy
Decision Tree and entropyDecision Tree and entropy
Decision Tree and entropySaeed Siddik
 
Thomas Goetz on Decision Trees for Ignite Bay Area
Thomas Goetz on Decision Trees for Ignite Bay AreaThomas Goetz on Decision Trees for Ignite Bay Area
Thomas Goetz on Decision Trees for Ignite Bay AreaIgnite Bay Area
 
Lit Final Presentation
Lit Final PresentationLit Final Presentation
Lit Final Presentationcpost7
 
DTI brain networks analysis
DTI brain networks analysisDTI brain networks analysis
DTI brain networks analysisemapesce
 
Data Science 101
Data Science 101Data Science 101
Data Science 101odsc
 
Lung Cancer Screening
Lung Cancer ScreeningLung Cancer Screening
Lung Cancer ScreeningAllina Health
 

Viewers also liked (20)

Decision theory
Decision theoryDecision theory
Decision theory
 
Decision tree
Decision treeDecision tree
Decision tree
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
Breast Cancer Diagnosis using a Hybrid Genetic Algorithm for Feature Selectio...
 
2.2 decision tree
2.2 decision tree2.2 decision tree
2.2 decision tree
 
Decision tree
Decision treeDecision tree
Decision tree
 
a novel approach for breast cancer detection using data mining tool weka
a novel approach for breast cancer detection using data mining tool wekaa novel approach for breast cancer detection using data mining tool weka
a novel approach for breast cancer detection using data mining tool weka
 
Decision tree
Decision treeDecision tree
Decision tree
 
Decision trees
Decision treesDecision trees
Decision trees
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data mining
Data miningData mining
Data mining
 
Cancer de mama usando Weka e MLP/KNN
Cancer de mama usando Weka e MLP/KNNCancer de mama usando Weka e MLP/KNN
Cancer de mama usando Weka e MLP/KNN
 
Distributed Decision Tree Induction
Distributed Decision Tree InductionDistributed Decision Tree Induction
Distributed Decision Tree Induction
 
Decision Tree and entropy
Decision Tree and entropyDecision Tree and entropy
Decision Tree and entropy
 
Thomas Goetz on Decision Trees for Ignite Bay Area
Thomas Goetz on Decision Trees for Ignite Bay AreaThomas Goetz on Decision Trees for Ignite Bay Area
Thomas Goetz on Decision Trees for Ignite Bay Area
 
Lit Final Presentation
Lit Final PresentationLit Final Presentation
Lit Final Presentation
 
DTI brain networks analysis
DTI brain networks analysisDTI brain networks analysis
DTI brain networks analysis
 
Data Science 101
Data Science 101Data Science 101
Data Science 101
 
Lung Cancer Screening
Lung Cancer ScreeningLung Cancer Screening
Lung Cancer Screening
 

Similar to Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Induction - Sunil Nair Health Informatics Dalhousie University

research paper
research paperresearch paper
research paperKalyan Ram
 
Enhancing the performance of Naive Bayesian Classifier using Information Gain...
Enhancing the performance of Naive Bayesian Classifier using Information Gain...Enhancing the performance of Naive Bayesian Classifier using Information Gain...
Enhancing the performance of Naive Bayesian Classifier using Information Gain...Rafiul Sabbir
 
Thesis presentation: Applications of machine learning in predicting supply risks
Thesis presentation: Applications of machine learning in predicting supply risksThesis presentation: Applications of machine learning in predicting supply risks
Thesis presentation: Applications of machine learning in predicting supply risksTuanNguyen1697
 
Heart Disease Identification Method Using Machine Learnin in E-healthcare.
Heart Disease Identification Method Using Machine Learnin in E-healthcare.Heart Disease Identification Method Using Machine Learnin in E-healthcare.
Heart Disease Identification Method Using Machine Learnin in E-healthcare.SUJIT SHIBAPRASAD MAITY
 
Data mining techniques unit iv
Data mining techniques unit ivData mining techniques unit iv
Data mining techniques unit ivmalathieswaran29
 
Predicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningPredicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningLeo Salemann
 
Predicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningPredicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningKarunakar Kotha
 
Predicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningPredicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningWenfan Xu
 
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...ahmad abdelhafeez
 
Classification of Breast Cancer Diseases using Data Mining Techniques
Classification of Breast Cancer Diseases using Data Mining TechniquesClassification of Breast Cancer Diseases using Data Mining Techniques
Classification of Breast Cancer Diseases using Data Mining Techniquesinventionjournals
 
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...csandit
 
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO... ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...cscpconf
 
Leveraging Feature Selection Within TreeNet
Leveraging Feature Selection Within TreeNetLeveraging Feature Selection Within TreeNet
Leveraging Feature Selection Within TreeNetagdavis
 
Predictive Analytics of Cell Types Using Single Cell Gene Expression Profiles
Predictive Analytics of Cell Types Using Single Cell Gene Expression ProfilesPredictive Analytics of Cell Types Using Single Cell Gene Expression Profiles
Predictive Analytics of Cell Types Using Single Cell Gene Expression ProfilesAli Al Hamadani
 
Design of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer DiseasesDesign of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer DiseasesMohamed Loey
 
CSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning ProjectCSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning Projectbutest
 
How predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinarHow predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinarAnn-Marie Roche
 
Presentation
PresentationPresentation
Presentationbutest
 

Similar to Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Induction - Sunil Nair Health Informatics Dalhousie University (20)

research paper
research paperresearch paper
research paper
 
Enhancing the performance of Naive Bayesian Classifier using Information Gain...
Enhancing the performance of Naive Bayesian Classifier using Information Gain...Enhancing the performance of Naive Bayesian Classifier using Information Gain...
Enhancing the performance of Naive Bayesian Classifier using Information Gain...
 
Vanderbilt b
Vanderbilt bVanderbilt b
Vanderbilt b
 
Thesis presentation: Applications of machine learning in predicting supply risks
Thesis presentation: Applications of machine learning in predicting supply risksThesis presentation: Applications of machine learning in predicting supply risks
Thesis presentation: Applications of machine learning in predicting supply risks
 
Heart Disease Identification Method Using Machine Learnin in E-healthcare.
Heart Disease Identification Method Using Machine Learnin in E-healthcare.Heart Disease Identification Method Using Machine Learnin in E-healthcare.
Heart Disease Identification Method Using Machine Learnin in E-healthcare.
 
Data mining techniques unit iv
Data mining techniques unit ivData mining techniques unit iv
Data mining techniques unit iv
 
Predicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningPredicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine Learning
 
Predicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningPredicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine Learning
 
Predicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine LearningPredicting Moscow Real Estate Prices with Azure Machine Learning
Predicting Moscow Real Estate Prices with Azure Machine Learning
 
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
Robust Breast Cancer Diagnosis on Four Different Datasets Using Multi-Classif...
 
Classification of Breast Cancer Diseases using Data Mining Techniques
Classification of Breast Cancer Diseases using Data Mining TechniquesClassification of Breast Cancer Diseases using Data Mining Techniques
Classification of Breast Cancer Diseases using Data Mining Techniques
 
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...
ATTRIBUTE REDUCTION-BASED ENSEMBLE RULE CLASSIFIERS METHOD FOR DATASET CLASSI...
 
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO... ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
ENHANCED BREAST CANCER RECOGNITION BASED ON ROTATION FOREST FEATURE SELECTIO...
 
OTTO-Report
OTTO-ReportOTTO-Report
OTTO-Report
 
Leveraging Feature Selection Within TreeNet
Leveraging Feature Selection Within TreeNetLeveraging Feature Selection Within TreeNet
Leveraging Feature Selection Within TreeNet
 
Predictive Analytics of Cell Types Using Single Cell Gene Expression Profiles
Predictive Analytics of Cell Types Using Single Cell Gene Expression ProfilesPredictive Analytics of Cell Types Using Single Cell Gene Expression Profiles
Predictive Analytics of Cell Types Using Single Cell Gene Expression Profiles
 
Design of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer DiseasesDesign of an Intelligent System for Improving Classification of Cancer Diseases
Design of an Intelligent System for Improving Classification of Cancer Diseases
 
CSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning ProjectCSCI 6505 Machine Learning Project
CSCI 6505 Machine Learning Project
 
How predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinarHow predictive models help Medicinal Chemists design better drugs_webinar
How predictive models help Medicinal Chemists design better drugs_webinar
 
Presentation
PresentationPresentation
Presentation
 

More from Sunil Nair

Change Management-Management Skills Development Project Health Informatics Su...
Change Management-Management Skills Development Project Health Informatics Su...Change Management-Management Skills Development Project Health Informatics Su...
Change Management-Management Skills Development Project Health Informatics Su...Sunil Nair
 
Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...
Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...
Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...Sunil Nair
 
Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...
Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...
Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...Sunil Nair
 
Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...
Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...
Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...Sunil Nair
 
The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...
The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...
The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...Sunil Nair
 
Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...
Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...
Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...Sunil Nair
 
Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...
Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...
Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...Sunil Nair
 
Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...
Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...
Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...Sunil Nair
 
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...Sunil Nair
 

More from Sunil Nair (9)

Change Management-Management Skills Development Project Health Informatics Su...
Change Management-Management Skills Development Project Health Informatics Su...Change Management-Management Skills Development Project Health Informatics Su...
Change Management-Management Skills Development Project Health Informatics Su...
 
Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...
Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...
Meditech - Healthcare Information System - Sunil Nair Health Informatics Dalh...
 
Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...
Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...
Effects of exposure to mercury on health of dentists - Sunil Nair Health Info...
 
Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...
Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...
Effect Of Type Of Delivery On Birth Trauma And Length Of Stay - Sunil Nair He...
 
The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...
The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...
The Effect Race and Income on HIV AIDS infection in African-Americans - Sunil...
 
Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...
Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...
Personalized Disease Management - Thyroid Cancer - Knowledge Management - Sun...
 
Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...
Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...
Healthcare Technology Assessment Gideon Presentation - Sunil Nair Health Info...
 
Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...
Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...
Pandemic Flu Health Information and Work Flow Project - Sunil Nair Health Inf...
 
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
Clinical Decision Support Systems - Sunil Nair Health Informatics Dalhousie U...
 

Data Mining - Classification Of Breast Cancer Dataset using Decision Tree Induction - Sunil Nair Health Informatics Dalhousie University

  • 1. Classification of Breast Cancer dataset using Decision Tree Induction Sunil Nair Abel Gebreyesus Masters of Health Informatics Dalhousie University HINF6210 Project Presentation – November 25, 2008
  • 11. Comparison chart – Handle Missing Value

        PERFORMANCE EVALUATION
        DATASET            # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Complete           14        8%    94%              87%
        Missing Removed    11        5%    96%              90%
        Missing Replaced   14        7%    95%              89%

        Confusion Matrix
        Class   B     M     Total
        B       160   7     167
        M       3     63    66
        Total   163   70    233

        Total Correctly Classified Instances (test split) = 223
        Accuracy Rate: 95.78%

        How many predictions by chance? Expected Accuracy Rate = Kappa Statistic, which is used to measure the agreement between predicted and actual categorization of data while correcting for prediction that occurs by chance.
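
The accuracy and the chance-corrected agreement described on slide 11 can be recomputed from the four cell counts of its confusion matrix. What follows is a minimal sketch in plain Python, not the authors' Weka workflow. The function name `kappa_from_confusion` is hypothetical, and the sketch assumes that rows hold one labelling (B, M) and columns the other, as laid out on the slide.

```python
def kappa_from_confusion(matrix):
    """Return (observed accuracy, chance agreement, Cohen's kappa)
    for a square confusion matrix given as a list of rows."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    # Kappa rescales the observed agreement so that chance level maps to 0.
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Cell counts copied from the slide 11 confusion matrix (B row, then M row).
confusion = [[160, 7], [3, 63]]
observed, expected, kappa = kappa_from_confusion(confusion)
print(f"accuracy={observed:.4f} chance={expected:.4f} kappa={kappa:.4f}")
```

With these counts the observed accuracy is 223/233, about 95.7%, and kappa comes out close to 0.90, which is in line with the 90% shown in the "Exp. Acc. Rate" column for the Missing Removed row.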
  • 14. Classification Methods Comparison

        PERFORMANCE EVALUATION (Test Set)
        CLASSIFIER         # Total Inst.   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Naïve Bayes        233             4%    96%              90%
        Neural Network     233             10%   91%              79%
        DT-J48             233             4%    97%              92%
        Support Vector M   233             3%    97%              94%
  • 16. Attributes Selected – most IG

        weka.filters.supervised.attribute.AttributeSelection -E weka.attributeSelection.InfoGainAttributeEval -S weka.attributeSelection.Ranker

        Rank   Attribute                     Information Gain
        1      Uniformity of Cell Size       0.675
        2      Uniformity of Cell Shape      0.66
        3      Bare Nuclei                   0.564
        4      Bland Chromatin               0.543
        5      Single Epithelial Cell Size   0.505
        6      Normal Nucleoli               0.466
        7      Clump Thickness               0.459
        8      Marginal Adhesion             0.443
        9      Mitoses                       0.198

        PERFORMANCE EVALUATION
        DATASET               # RULES   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Attributes Selected   11        4%    97%              92%
        Missing Removed       11        5%    96%              90%
        Missing Replaced      14        7%    95%              89%
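
The information gain ranking on slide 16 scores each attribute by how much knowing its value reduces the entropy of the class label. The sketch below illustrates that calculation in plain Python. It is an illustration under stated assumptions, not the Weka implementation: the function names are hypothetical, the eight-row sample is invented toy data rather than the Wisconsin dataset, and every distinct attribute value is treated as a category, which may differ from how Weka handles numeric attributes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after
    splitting on each distinct attribute value."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy sample (NOT the real data): 4 benign, 4 malignant.
labels = ["B", "B", "B", "B", "M", "M", "M", "M"]
cell_size = [1, 1, 2, 2, 8, 9, 8, 9]   # every value belongs to one class only
mitoses = [1, 1, 1, 1, 1, 1, 1, 2]     # almost constant, so weakly informative

gain_size = information_gain(cell_size, labels)
gain_mitoses = information_gain(mitoses, labels)

# Rank attributes from most to least informative.
ranking = sorted(
    [("cell_size", gain_size), ("mitoses", gain_mitoses)],
    key=lambda item: item[1],
    reverse=True,
)
print(ranking)
```

In the toy sample the attribute whose values separate the classes perfectly gets the maximum gain of 1 bit, while the nearly constant attribute scores far lower. This mirrors the pattern on the slide, where Mitoses ranks last.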
  • 17. The DT – IG/Attribute selection Visualization
  • 22. Performance Evaluation

        PERFORMANCE EVALUATION (Test Set)
        Dataset        # Instances   # Rules   MAE   Act. Acc. Rate   Exp. Acc. Rate
        Training set   476           13        2%    99%              97%
        Testing set    412           13        3%    96%              92%
  • 27. Questions / Comments. Thank you!