Department of Geomatics, National Cheng Kung University
[106-2] Data Mining, Homework 5, Instructor: Hsueh-Chan Lu
Muhammad Irsyadi Firdaus P66067055
Based on your collected dataset, select the most relevant attribute as the target attribute. Use R
to analyze the following items:
1. Using C5.0 classification, randomly sample 70% of the data to build the classification model and use the
other 30% for testing. Output the tree model, confusion matrix, and prediction accuracy.
2. Using naiveBayes classification, randomly sample 70% of the data to build the classification model
and use the other 30% for testing. Output the confusion matrix and prediction accuracy.
3. Write a short report summarizing what you found from the classification analysis.
Hint: Observe the decision tree and try to explain why these attributes are important to the target attribute.
Compare the decision tree model and the naïve Bayes model in terms of prediction accuracy.
Answers
1. In HW1, I collected a dataset of about 68 records. In this dataset, the target attribute is Interest in
Vacation, which takes the values Yes and No, and the other six attributes are the evaluated attributes. The
Gender attribute consists of Male and Female, the Age attribute consists of Young and Medium, the Marriage
Status attribute consists of Student and Not Student, the Intensity of a Vacation attribute consists of Low
and High, and the Vacation Time attribute consists of Weekend, School holidays, and National holiday.
Table 1. Training Data (classification model)
The decision tree method uses a tree structure to build the classification model. It divides the dataset into
successively smaller subsets. Each internal node represents a feature of the instance to be classified, each
branch represents a value of that feature, and each leaf node represents a decision. Classification of an
instance starts at the root node and follows the branches that match its feature values until a leaf is
reached. Decision trees can handle both categorical and numerical data.
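The root-to-leaf procedure described above can be sketched in a few lines of Python. This is only an illustration: the tree below is invented for the example and is not the tree shown in Figure 2, and the names `TREE` and `classify` are hypothetical.

```python
# Hypothetical tree: an internal node is (attribute, {value: subtree}); a leaf is a class label.
# This structure is invented for illustration and is NOT the C5.0 tree from Figure 2.
TREE = ("Vacation Time", {
    "Weekend": "No",
    "School holidays": ("Age", {"Young": "Yes", "Medium": "No"}),
    "National holiday": "Yes",
})


def classify(tree, instance):
    """Walk from the root to a leaf, following the branch that matches each attribute value."""
    node = tree
    while not isinstance(node, str):      # a string means we have reached a leaf
        attribute, branches = node
        node = branches[instance[attribute]]
    return node


example = {"Vacation Time": "School holidays", "Age": "Young"}
print(classify(TREE, example))  # prints "Yes"
```

Each loop iteration consumes one attribute test, so the number of steps equals the depth of the leaf that is reached.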
For classification, the dataset is divided into two parts: 70% of the data are randomly sampled to build
the classification model and the other 30% are used for testing. In this case, 48 records are used to build
the model and 20 records are used for testing.
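As a minimal sketch of such a random 70/30 split, the Python snippet below shuffles the row indices of a 68-record dataset and cuts them at 70%. The function name `split_indices` and the seed value are assumptions made for the example; the report itself performs the split in R.

```python
import random


def split_indices(n_records, train_fraction=0.7, seed=42):
    """Return (train, test) lists of row indices drawn without replacement."""
    rng = random.Random(seed)             # fixed seed (arbitrary) so the split is repeatable
    indices = list(range(n_records))
    rng.shuffle(indices)
    n_train = round(n_records * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_indices(68)
print(len(train_idx), len(test_idx))  # prints "48 20"
```

With 68 records, 70% is 47.6, which rounds to the 48 training records and 20 testing records reported above.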
The 20 testing records were selected at random. The results can be seen below.
Table 2. Testing Data
We first need some terminology. Recall that we can talk in terms of positive tuples (tuples of the main
class of interest) and negative tuples (all other tuples). Given two classes, for example, the positive
tuples may be Interest in Vacation = Yes while the negative tuples are Interest in Vacation = No. Suppose
we use our classifier on a test set of labeled tuples. According to the confusion matrix in Figure 1, the
decision tree has an accuracy of about 0.45 and a sensitivity of about 0.7 on the test set. The accuracy of
a classifier on a given test set is the fraction of test set tuples that are correctly classified by the
classifier. That is,
$$\text{accuracy} = \frac{TP + TN}{P + N}$$
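Here TP and TN are the numbers of true positives and true negatives, and P and N are the numbers of positive
and negative tuples in the test set. Accuracy alone does not show how well each class is recognized; the
sensitivity and specificity measures can be used for this purpose. Sensitivity is also referred to as the
true positive (recognition) rate (i.e., the proportion of positive tuples that are correctly identified),
while specificity is the true negative rate (i.e., the proportion of negative tuples that are correctly
identified). These measures are defined as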
$$\text{sensitivity} = \frac{TP}{P}$$
$$\text{specificity} = \frac{TN}{N}$$
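The three measures can be computed directly from the four cells of a confusion matrix. The sketch below uses made-up counts for a 20-record test set; they are not the counts behind Figure 1, and the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    p = tp + fn   # P: all actual positive tuples
    n = tn + fp   # N: all actual negative tuples
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": tp / p,
        "specificity": tn / n,
    }


# Illustrative counts only (assumed, not taken from the report's figures).
metrics = confusion_metrics(tp=7, fn=3, tn=8, fp=2)
print(metrics)  # accuracy 0.75, sensitivity 0.7, specificity 0.8
```

Note that a classifier can have high sensitivity and still have low accuracy when it recognizes few of the negative tuples.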
Figure 1. Confusion Matrix and statistics using decision tree
Figure 2. Decision Tree
Classification trees are used for data mining problems that are concerned with predicting a class label.
2. Bayesian classification can predict class membership probabilities. The Naïve Bayes algorithm assumes
that the effect of an attribute value on a given class is independent of the values of the other attributes.
The algorithm scales linearly with the number of predictors and rows, and it builds models rapidly. Naïve
Bayes derives the probability of a prediction from Bayes' theorem: the probability of event X occurring
given that event Y has occurred, P(X|Y), is proportional to the probability of event Y occurring given
that event X has occurred multiplied by the probability of event X occurring, P(Y|X)P(X).
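To make the proportionality concrete, here is a minimal counting-based sketch for categorical attributes in Python. It is not the implementation used by the R naiveBayes function; the training rows are invented for the example, no smoothing is applied, and the names `train_naive_bayes` and `posterior` are hypothetical.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute, class) -> Counter of values
    for row, label in zip(rows, labels):
        for attribute, value in row.items():
            value_counts[(attribute, label)][value] += 1
    return class_counts, value_counts


def posterior(row, class_counts, value_counts):
    """Return normalized P(class | row) under the attribute-independence assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total                                         # prior P(class)
        for attribute, value in row.items():
            score *= value_counts[(attribute, label)][value] / count  # P(value | class)
        scores[label] = score
    norm = sum(scores.values())
    if norm == 0:        # every class scored zero (possible because no smoothing is used)
        return scores
    return {label: score / norm for label, score in scores.items()}


# Invented toy data, not the HW1 dataset.
rows = [
    {"Gender": "Male", "Age": "Young"},
    {"Gender": "Male", "Age": "Medium"},
    {"Gender": "Female", "Age": "Young"},
    {"Gender": "Female", "Age": "Medium"},
    {"Gender": "Male", "Age": "Young"},
    {"Gender": "Female", "Age": "Young"},
]
labels = ["Yes", "No", "Yes", "No", "Yes", "No"]

class_counts, value_counts = train_naive_bayes(rows, labels)
post = posterior({"Gender": "Male", "Age": "Young"}, class_counts, value_counts)
print(post)  # P(Yes | Male, Young) = 6/7 for this toy data
```

For the query above, the unnormalized scores are 1/2 × 2/3 × 1 = 1/3 for Yes and 1/2 × 1/3 × 1/3 = 1/18 for No, which normalize to 6/7 and 1/7.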
Using the Bayesian classification method, the confusion matrix and statistics are shown below.
Figure 3. Confusion Matrix and statistics using Bayesian classification
From the confusion matrix above, the Bayesian classifier has an accuracy of about 0.6 and a sensitivity of
about 0.8750. We wish to predict the class label of “Interest in Vacation” using naïve Bayesian
classification, given the same training data as in Table 1 for decision tree induction. The results of the
prediction model can be seen below.
3. Once we have the results of the decision tree model and the Bayesian classification model, we can compare
the accuracy of the two models. On the test set, the Bayesian classification model (accuracy about 0.6) is
more accurate than the decision tree model (accuracy about 0.45).
Table 3. Comparison between Bayesian Classification and Decision Tree
      Interest.in.Vacation   Interest.in.Vacation   Interest.in.Vacation
      by Ground              by B. C                by D. T
21    No                     No                     No
33    No                     No                     No
39    No                     No                     No
43    Yes                    Yes                    Yes
49    Yes                    Yes                    Yes
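A comparison table of this kind can be summarized by counting how often each model agrees with the ground truth. The Python sketch below uses only the five rows listed in Table 3; treating the first column as a record identifier is an assumption, and the function name `agreement` is hypothetical.

```python
# Rows copied from Table 3: (record, ground truth, Bayesian classification, decision tree).
# Interpreting the first column as a record identifier is an assumption.
rows = [
    (21, "No", "No", "No"),
    (33, "No", "No", "No"),
    (39, "No", "No", "No"),
    (43, "Yes", "Yes", "Yes"),
    (49, "Yes", "Yes", "Yes"),
]


def agreement(rows, column):
    """Fraction of rows where the prediction in `column` equals the ground truth (column 1)."""
    hits = sum(1 for row in rows if row[column] == row[1])
    return hits / len(rows)


bayes_agreement = agreement(rows, 2)
tree_agreement = agreement(rows, 3)
print(bayes_agreement, tree_agreement)  # prints "1.0 1.0"
```

Both models agree with the ground truth on all five listed rows, so the difference in accuracy reported above must come from other records in the 20-record test set.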
Appendix
Algorithms Classification Analysis in RStudio

More Related Content

What's hot

Missing Data and data imputation techniques
Missing Data and data imputation techniquesMissing Data and data imputation techniques
Missing Data and data imputation techniques
Omar F. Althuwaynee
 
03 Data Mining Techniques
03 Data Mining Techniques03 Data Mining Techniques
03 Data Mining Techniques
Valerii Klymchuk
 
Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1
NBER
 
QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...
QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...
QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...
NewUOPCourse
 
A Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of DiseasesA Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of Diseases
ijsrd.com
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
緯鈞 沈
 
Repurposing predictive tools for causal research
Repurposing predictive tools for causal researchRepurposing predictive tools for causal research
Repurposing predictive tools for causal research
Galit Shmueli
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
Derek Kane
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)
Abhimanyu Dwivedi
 
To Explain, To Predict, or To Describe?
To Explain, To Predict, or To Describe?To Explain, To Predict, or To Describe?
To Explain, To Predict, or To Describe?
Galit Shmueli
 
Illustration of Mental Health Clustering Calculator ajmitchell
Illustration of Mental Health Clustering Calculator ajmitchellIllustration of Mental Health Clustering Calculator ajmitchell
Illustration of Mental Health Clustering Calculator ajmitchell
Alex J Mitchell
 
Repurposing Classification & Regression Trees for Causal Research with High-D...
Repurposing Classification & Regression Trees for Causal Research with High-D...Repurposing Classification & Regression Trees for Causal Research with High-D...
Repurposing Classification & Regression Trees for Causal Research with High-D...
Galit Shmueli
 

What's hot (12)

Missing Data and data imputation techniques
Missing Data and data imputation techniquesMissing Data and data imputation techniques
Missing Data and data imputation techniques
 
03 Data Mining Techniques
03 Data Mining Techniques03 Data Mining Techniques
03 Data Mining Techniques
 
Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1
 
QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...
QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...
QNT 275 Week 5 Apply Connect Week 5 Case Qnt 275 qnt275 https://uopcourses.co...
 
A Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of DiseasesA Decision Tree Based Classifier for Classification & Prediction of Diseases
A Decision Tree Based Classifier for Classification & Prediction of Diseases
 
Cluster analysis
Cluster analysisCluster analysis
Cluster analysis
 
Repurposing predictive tools for causal research
Repurposing predictive tools for causal researchRepurposing predictive tools for causal research
Repurposing predictive tools for causal research
 
Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests Data Science - Part V - Decision Trees & Random Forests
Data Science - Part V - Decision Trees & Random Forests
 
Machine learning session6(decision trees random forrest)
Machine learning   session6(decision trees random forrest)Machine learning   session6(decision trees random forrest)
Machine learning session6(decision trees random forrest)
 
To Explain, To Predict, or To Describe?
To Explain, To Predict, or To Describe?To Explain, To Predict, or To Describe?
To Explain, To Predict, or To Describe?
 
Illustration of Mental Health Clustering Calculator ajmitchell
Illustration of Mental Health Clustering Calculator ajmitchellIllustration of Mental Health Clustering Calculator ajmitchell
Illustration of Mental Health Clustering Calculator ajmitchell
 
Repurposing Classification & Regression Trees for Causal Research with High-D...
Repurposing Classification & Regression Trees for Causal Research with High-D...Repurposing Classification & Regression Trees for Causal Research with High-D...
Repurposing Classification & Regression Trees for Causal Research with High-D...
 

Similar to Building classification model, tree model, confusion matrix and prediction accuracy

Nbe rcausalpredictionv111 lecture2
Nbe rcausalpredictionv111 lecture2Nbe rcausalpredictionv111 lecture2
Nbe rcausalpredictionv111 lecture2
NBER
 
Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx
Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docxWeek 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx
Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx
cockekeshia
 
2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - FinalBrian Lin
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
rhettwhitee
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
kingrani623
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
Noahliamwilliam
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
Faarooqkhaann
 
E1802023741
E1802023741E1802023741
E1802023741
IOSR Journals
 
Classification modelling review
Classification modelling reviewClassification modelling review
Classification modelling review
Jaideep Adusumelli
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
eyavagal
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
uopassignment
 
Assessment 3 – Hypothesis, Effect Size, Power, and t Tests.docx
Assessment 3 – Hypothesis, Effect Size, Power, and t Tests.docxAssessment 3 – Hypothesis, Effect Size, Power, and t Tests.docx
Assessment 3 – Hypothesis, Effect Size, Power, and t Tests.docx
cargillfilberto
 
Week 5 Lecture 14 The Chi Square Test Quite often, pat.docx
Week 5 Lecture 14 The Chi Square Test Quite often, pat.docxWeek 5 Lecture 14 The Chi Square Test Quite often, pat.docx
Week 5 Lecture 14 The Chi Square Test Quite often, pat.docx
cockekeshia
 
Histograms and Descriptive Statistics Scoring GuideCRITERIANON.docx
Histograms and Descriptive Statistics Scoring GuideCRITERIANON.docxHistograms and Descriptive Statistics Scoring GuideCRITERIANON.docx
Histograms and Descriptive Statistics Scoring GuideCRITERIANON.docx
pooleavelina
 
Basic statistics
Basic statisticsBasic statistics
Data science notes for ASDS calicut 2.pptx
Data science notes for ASDS calicut 2.pptxData science notes for ASDS calicut 2.pptx
Data science notes for ASDS calicut 2.pptx
swapnaraghav
 
IJCSI-10-6-1-288-292
IJCSI-10-6-1-288-292IJCSI-10-6-1-288-292
IJCSI-10-6-1-288-292HARDIK SINGH
 
Heart disease classification
Heart disease classificationHeart disease classification
Heart disease classification
SnehaDey21
 
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROBOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
Anthony Kilili
 
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxExcel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
SANSKAR20
 

Similar to Building classification model, tree model, confusion matrix and prediction accuracy (20)

Nbe rcausalpredictionv111 lecture2
Nbe rcausalpredictionv111 lecture2Nbe rcausalpredictionv111 lecture2
Nbe rcausalpredictionv111 lecture2
 
Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx
Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docxWeek 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx
Week 5 Lecture 14 The Chi Square TestQuite often, patterns of .docx
 
2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final2016 Symposium Poster - statistics - Final
2016 Symposium Poster - statistics - Final
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
 
E1802023741
E1802023741E1802023741
E1802023741
 
Classification modelling review
Classification modelling reviewClassification modelling review
Classification modelling review
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
 
Ash bus 308 week 2 problem set new
Ash bus 308 week 2 problem set newAsh bus 308 week 2 problem set new
Ash bus 308 week 2 problem set new
 
Assessment 3 – Hypothesis, Effect Size, Power, and t Tests.docx
Assessment 3 – Hypothesis, Effect Size, Power, and t Tests.docxAssessment 3 – Hypothesis, Effect Size, Power, and t Tests.docx
Assessment 3 – Hypothesis, Effect Size, Power, and t Tests.docx
 
Week 5 Lecture 14 The Chi Square Test Quite often, pat.docx
Week 5 Lecture 14 The Chi Square Test Quite often, pat.docxWeek 5 Lecture 14 The Chi Square Test Quite often, pat.docx
Week 5 Lecture 14 The Chi Square Test Quite often, pat.docx
 
Histograms and Descriptive Statistics Scoring GuideCRITERIANON.docx
Histograms and Descriptive Statistics Scoring GuideCRITERIANON.docxHistograms and Descriptive Statistics Scoring GuideCRITERIANON.docx
Histograms and Descriptive Statistics Scoring GuideCRITERIANON.docx
 
Basic statistics
Basic statisticsBasic statistics
Basic statistics
 
Data science notes for ASDS calicut 2.pptx
Data science notes for ASDS calicut 2.pptxData science notes for ASDS calicut 2.pptx
Data science notes for ASDS calicut 2.pptx
 
IJCSI-10-6-1-288-292
IJCSI-10-6-1-288-292IJCSI-10-6-1-288-292
IJCSI-10-6-1-288-292
 
Heart disease classification
Heart disease classificationHeart disease classification
Heart disease classification
 
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACROBOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
BOOTSTRAPPING TO EVALUATE RESPONSE MODELS: A SAS® MACRO
 
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docxExcel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
Excel Files AssingmentsCopy of Student_Assignment_File.11.01..docx
 

More from National Cheng Kung University

Accuracy assessment and 3D Mapping by Consumer Grade Spherical Camera
Accuracy assessment and 3D Mapping by Consumer Grade Spherical CameraAccuracy assessment and 3D Mapping by Consumer Grade Spherical Camera
Accuracy assessment and 3D Mapping by Consumer Grade Spherical Camera
National Cheng Kung University
 
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
National Cheng Kung University
 
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
National Cheng Kung University
 
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
National Cheng Kung University
 
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
National Cheng Kung University
 
Handbook PPI Tainan Taiwan 2018
Handbook PPI Tainan Taiwan 2018Handbook PPI Tainan Taiwan 2018
Handbook PPI Tainan Taiwan 2018
National Cheng Kung University
 
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest NeighborSatellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
National Cheng Kung University
 
Optimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU Data
Optimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU DataOptimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU Data
Optimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU Data
National Cheng Kung University
 
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest NeighborSatellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
National Cheng Kung University
 
EKF and RTS smoother toolbox
EKF and RTS smoother toolboxEKF and RTS smoother toolbox
EKF and RTS smoother toolbox
National Cheng Kung University
 
Kalman Filter Basic
Kalman Filter BasicKalman Filter Basic
Kalman Filter Basic
National Cheng Kung University
 
A Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of InterestA Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of Interest
National Cheng Kung University
 
DSM Extraction from Pleiades Images Using RSP
DSM Extraction from Pleiades Images Using RSPDSM Extraction from Pleiades Images Using RSP
DSM Extraction from Pleiades Images Using RSP
National Cheng Kung University
 
Calibration of Inertial Sensor within Smartphone
Calibration of Inertial Sensor within SmartphoneCalibration of Inertial Sensor within Smartphone
Calibration of Inertial Sensor within Smartphone
National Cheng Kung University
 
Pengukuran GPS Menggunakan Trimble Secara Manual
Pengukuran GPS Menggunakan Trimble Secara ManualPengukuran GPS Menggunakan Trimble Secara Manual
Pengukuran GPS Menggunakan Trimble Secara Manual
National Cheng Kung University
 
Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...
Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...
Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...
National Cheng Kung University
 
Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...
Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...
Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...
National Cheng Kung University
 
The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...
The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...
The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...
National Cheng Kung University
 
SIFT/SURF can achieve scale, rotation and illumination invariant during image...
SIFT/SURF can achieve scale, rotation and illumination invariant during image...SIFT/SURF can achieve scale, rotation and illumination invariant during image...
SIFT/SURF can achieve scale, rotation and illumination invariant during image...
National Cheng Kung University
 
3D reconstruction by photogrammetry and 4D deformation measurement
3D reconstruction by photogrammetry and 4D deformation measurement3D reconstruction by photogrammetry and 4D deformation measurement
3D reconstruction by photogrammetry and 4D deformation measurement
National Cheng Kung University
 

More from National Cheng Kung University (20)

Accuracy assessment and 3D Mapping by Consumer Grade Spherical Camera
Accuracy assessment and 3D Mapping by Consumer Grade Spherical CameraAccuracy assessment and 3D Mapping by Consumer Grade Spherical Camera
Accuracy assessment and 3D Mapping by Consumer Grade Spherical Camera
 
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
 
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
3D Rekonstruksi Bangunan Menggunakan Gambar Panorama Sebagai Upaya Untuk Miti...
 
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
 
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
3D Indoor and Outdoor Mapping from Point Cloud Generated by Spherical Camera
 
Handbook PPI Tainan Taiwan 2018
Handbook PPI Tainan Taiwan 2018Handbook PPI Tainan Taiwan 2018
Handbook PPI Tainan Taiwan 2018
 
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest NeighborSatellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
 
Optimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU Data
Optimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU DataOptimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU Data
Optimal Filtering with Kalman Filters and Smoothers Using AndroSensor IMU Data
 
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest NeighborSatellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
Satellite Image Classification using Decision Tree, SVM and k-Nearest Neighbor
 
EKF and RTS smoother toolbox
EKF and RTS smoother toolboxEKF and RTS smoother toolbox
EKF and RTS smoother toolbox
 
Kalman Filter Basic
Kalman Filter BasicKalman Filter Basic
Kalman Filter Basic
 
A Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of InterestA Method of Mining Association Rules for Geographical Points of Interest
A Method of Mining Association Rules for Geographical Points of Interest
 
DSM Extraction from Pleiades Images Using RSP
DSM Extraction from Pleiades Images Using RSPDSM Extraction from Pleiades Images Using RSP
DSM Extraction from Pleiades Images Using RSP
 
Calibration of Inertial Sensor within Smartphone
Calibration of Inertial Sensor within SmartphoneCalibration of Inertial Sensor within Smartphone
Calibration of Inertial Sensor within Smartphone
 
Pengukuran GPS Menggunakan Trimble Secara Manual
Pengukuran GPS Menggunakan Trimble Secara ManualPengukuran GPS Menggunakan Trimble Secara Manual
Pengukuran GPS Menggunakan Trimble Secara Manual
 
Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...
Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...
Accuracy Analysis of Three-Dimensional Model Reconstructed by Spherical Video...
 
Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...
Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...
Association Rule (Data Mining) - Frequent Itemset Generation, Closed Frequent...
 
The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...
The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...
The rotation matrix (DCM) and quaternion in Inertial Survey and Navigation Sy...
 
SIFT/SURF can achieve scale, rotation and illumination invariant during image...
SIFT/SURF can achieve scale, rotation and illumination invariant during image...SIFT/SURF can achieve scale, rotation and illumination invariant during image...
SIFT/SURF can achieve scale, rotation and illumination invariant during image...
 
3D reconstruction by photogrammetry and 4D deformation measurement
3D reconstruction by photogrammetry and 4D deformation measurement3D reconstruction by photogrammetry and 4D deformation measurement
3D reconstruction by photogrammetry and 4D deformation measurement
 

Recently uploaded

Planning Of Procurement o different goods and services
Planning Of Procurement o different goods and servicesPlanning Of Procurement o different goods and services
Planning Of Procurement o different goods and services
JoytuBarua2
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
Cosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdfCosmetic shop management system project report.pdf
Cosmetic shop management system project report.pdf
Kamal Acharya
 
DESIGN A COTTON SEED SEPARATION MACHINE.docx
DESIGN A COTTON SEED SEPARATION MACHINE.docxDESIGN A COTTON SEED SEPARATION MACHINE.docx
Building classification model, tree model, confusion matrix and prediction accuracy

  • 1. Department of Geomatics, National Cheng Kung University. [106-2] Data Mining, Homework 5. Instructor: Hsueh-Chan Lu. Muhammad Irsyadi Firdaus, P66067055.

       Based on your collected dataset, please select the most relevant attribute as the target attribute. Using the R tool, analyze the following items:
       1. Based on C5.0 classification, 70% of the data are randomly sampled to build the classification model and the other 30% are used for testing. Output the tree model, confusion matrix and prediction accuracy.
       2. Based on naiveBayes classification, 70% of the data are randomly sampled to build the classification model and the other 30% are used for testing. Output the confusion matrix and prediction accuracy.
       3. Write a short report summarizing what you find after the classification analysis.
       Hint: Observe the decision tree and try to explain why these attributes are important to the target attribute. Compare the decision tree model and the naïve Bayes model in terms of prediction accuracy.

       Answers

       1. In HW1, I collected a dataset of about 68 records. The target attribute is Interest in Vacation, which takes the values Yes and No; the other six attributes are the evaluated attributes. The Gender attribute consists of Male and Female, the Age attribute consists of Young and Medium, the Marriage Status attribute consists of Student and Not Student, the Intensity of a Vacation attribute consists of Low and High, and the Vacation Time attribute consists of Weekend, School holidays, and National holiday.

       Table 1. Training Data (classification model)

       This method uses a tree structure to build the classification model. It divides a dataset into successively smaller subsets, and each leaf node represents a decision. Using their feature values, a decision tree classifies the
  • 2. instances. Each internal node represents a feature of the instance to be classified, and each branch represents a value of that feature. Classification starts at the root node, and instances are sorted down the tree according to their feature values. Decision trees can handle both categorical and numerical data.

       For classification, the dataset is divided into two parts: 70% of the records are randomly sampled to build the classification model and the remaining 30% are used for testing. Here, 48 records were used to build the model and 20 records, drawn at random, were used for testing. The results are shown below.

       Table 2. Testing Data

       Some terminology is needed first. We can talk in terms of positive tuples (tuples of the main class of interest) and negative tuples (all other tuples). Given two classes, for example, the positive tuples may be Interest in Vacation = Yes and the negative tuples Interest in Vacation = No. Suppose we use our classifier on a test set of labeled tuples. According to the confusion matrix in Figure 1, the decision tree has an accuracy of about 0.45 and a sensitivity of about 0.7.

       The accuracy of a classifier on a given test set is the percentage of test set tuples that are correctly classified by the classifier. That is,

       $$\text{accuracy} = \frac{TP + TN}{P + N}$$

       where TP and TN are the numbers of true positives and true negatives, and P and N are the numbers of positive and negative tuples in the test set.

       Sensitivity and specificity measure how well the classifier recognizes the positive and the negative tuples, respectively. Sensitivity is also referred to as the true positive (recognition) rate (i.e., the proportion of positive tuples that are correctly identified), while specificity is the true negative rate (i.e., the proportion of negative tuples that are correctly identified). These measures are defined as

       $$\text{sensitivity} = \frac{TP}{P}, \qquad \text{specificity} = \frac{TN}{N}$$
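The split and the three measures above can be sketched in a few lines. The report itself used R; the following is only a language-neutral illustration in Python. The function names `split_70_30` and `evaluate` are hypothetical, and the counts passed to `evaluate` are one possible set that reproduces an accuracy of 0.45 and a sensitivity of 0.7 on 20 test tuples. Figure 1 is not reproduced here, so the actual confusion matrix may differ.

```python
import random


def split_70_30(records, seed=0):
    """Randomly assign about 70% of the records to training, the rest to testing."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = round(0.7 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def evaluate(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    p = tp + fn                        # all positive tuples in the test set
    n = tn + fp                        # all negative tuples in the test set
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": tp / p,         # true positive rate
        "specificity": tn / n,         # true negative rate
    }


# 68 record indices stand in for the 68 collected records.
train, test = split_70_30(range(68))
print(len(train), len(test))           # 48 20

# Hypothetical counts, NOT taken from Figure 1.
metrics = evaluate(tp=7, fn=3, tn=2, fp=8)
print(metrics)
```

With counts like these, a high sensitivity can coexist with a low overall accuracy when most negative tuples are misclassified, which is why it helps to report specificity alongside the other two measures.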
  • 3. Figure 1. Confusion matrix and statistics using the decision tree

       Figure 2. Decision tree

       Classification trees are used for data mining problems that are concerned with prediction.

       2. Bayesian classification can predict class membership probabilities. The naïve Bayes algorithm assumes that the effect of an attribute value on a given class is independent of the values of the other attributes. It scales linearly with the number of predictors and rows, and it builds models rapidly. The naïve Bayes algorithm derives the probability of a prediction: the probability of event X occurring given that event Y has occurred, $P(X \mid Y)$, is proportional to the probability of event Y occurring given
  • 4. that event X has occurred multiplied by the probability of event X occurring, $P(Y \mid X)\,P(X)$. With the Bayesian classification method, the confusion matrix and statistics are shown below.

       Figure 3. Confusion matrix and statistics using Bayesian classification

       From the confusion matrix above, the naïve Bayes classifier has an accuracy of about 0.6 and a sensitivity of about 0.8750. We wish to predict the class label of "Interest in Vacation" using naïve Bayesian classification, given the same training data as in Table 1 for decision tree induction. The results of the prediction model are shown below.

       3. Once we have the results of the decision tree model and the Bayesian classification model, we can compare their accuracy. On this test set, the Bayesian classifier is more accurate (about 0.6) than the decision tree (about 0.45).

       Table 3. Comparison between Bayesian Classification and Decision Tree
       Interest.in.Vacation by ground truth (Ground), Bayesian classification (B. C) and decision tree (D. T):

       Record   Ground   B. C   D. T
       21       No       No     No
       33       No       No     No
       39       No       No     No
       43       Yes      Yes    Yes
       49       Yes      Yes    Yes