SlideShare a Scribd company logo
1 of 5
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1382
Analysis of Chi-Square Independence Test for Naïve Bayes
Feature Selection
Pratik Kadam
Graduate Student, Major in Operations Research, Northeastern University, Boston, Massachusetts
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - Naïve Bayes based upon the Bayes theorem and
Conditional Probability is one of the most popular classifiers
used for classifying data. However, the accuracy of this
algorithm highly depends on the features of the datasets used.
Present paper thus focuses on using Chi-Square Independence
Test for feature selection at various confidence intervals.
Tableau was used for data visualization, Minitab as a
statistical tool and RStudio was used for developing the Naïve
Bayes Model.
Key Words: Naïve Bayes Classifier, Chi-Square
Independence Test, Feature Selection, Data Science,
Student Performance
1. INTRODUCTION
Naïve Bayes Classifier has been used extensively for spam
filtering & other such predictive modelling applications. [1]
Naïve Bayes Classifier is a probabilistic classifier based on
Bayes’ theorem. It assumes independence between the
predictors or features of the dataset. This assumption
doesn’t always necessarily hold true. However, despite the
features being dependent, the Naïve Bayes Classifier offers
surprisingly accurate predictions. Hence it is termed as
‘Naïve’ Bayes Classifier.
However, the accuracy of this classifier is highly dependent
on the number of features in the dataset. It is quite unclear
what is the effect on the accuracy of the model due to factors
such as the quantity & quality of features.
The present paper thus attempts at selecting features that
are prominent in determining the outcome using the Chi-
Square Test of Independence. The featurestobeselectedare
determined using the p-values of the features. The accuracy
of the classifier would be then judged for different sets of
these features based on the mean & standard deviation of a
total fraction of correct predictions made.
1.1 CHI-SQUARE TEST OF INDEPENDENCY
To study the dependence of various factors on the failure
proportions of the students, Chi-Square test for
independence was chosen as the statistical test to be
performed assuming the distribution for the sum of squared
normal deviates to be a Chi-Square Distribution. This test is
applied when you have two categorical variables from a
single population. It is used to determine whether there is a
significant association between the two variables [2].
Following are the prerequisites to be considered before
performing the test:
A. The sampling method is a simple random sampling.
B. The variables under study are each categorical.
C. If sample data are displayed in a contingency table, the
expected frequency count for each cell of the table is at least
5.
If the above perquisites are satisfied, the following are the
steps to be performed to check the dependency of the
features.
Step 1: State the Hypothesis
H0: The considered factor and the failure proportion forthat
factor are independent
H1: The considered factor and the failure proportion for
that factor are not independent
Thus, the alternative hypothesisstatesthatknowinglevelsof
any one factor can help us determine the chances of failure
for the student.
Step 2: Calculating the Test Statistic
The Chi-Square value is given by the equation below:
𝛘2 = Σ [ (Or,c - Er,c)2 / Er,c]
where,
O[r,c] = Observed frequency counts of pass/fail studentsfor
every level of factor.
E[r,c] = (Nr * Nc) / N = Expected frequency counts of a class
of students for every level of factor.
Nr = the Total number of pass/fail students for a factor.
Nc = Total number of students one for a level of the factor.
N = Total sample size.
Step 3: Defining the decision rule.
Deciding the significance level for p-values.
Step 4: Making a Decision.
If the p-value is less than the significance level, we reject H0
and accept H1. i.e. The variables are dependent on each
other.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1383
2. DATASET DESCRIPTION
The dataset chosen for this paper, included student’s
performances in two different Portuguese schools for Math
class [3]. The data attributes include student’s grades for 3
different periods, demographic, school, personal and social
related features. The data was collected was using mark
reports and questionnaires. Following are the attributes of
the dataset:
1. school - student's school (binary: 'GP' - Gabriel Pereira
or 'MS' - Mousinho da Silveira)
2. sex - student's sex (binary: 'F' - female or 'M' - male)
3. age - student's age (numeric: from 15 to 22)
4. address - student's home address type (binary: 'U' -
urban or 'R' - rural)
5. famsize - family size (binary: 'LE3' - less or equal to3or
'GT3' - greater than 3)
6. Pstatus - parent's cohabitationstatus(binary:'T' -living
together or 'A' - apart)
7. Medu - mother's education (numeric: 0 - none, 1 -
primary education (4th grade), 2 - 5th to 9th grade, 3 -
secondary education or 4 - higher education)
8. Fedu - father's education (numeric: 0 -none,1-primary
education (4th grade), 2 -5th to 9th grade,3- secondary
education or 4 - higher education)
9. Mjob - mother's job (nominal: 'teacher', 'health' care
related, civil 'services' (e.g. administrative or police),
'at_home' or 'other')
10. Fjob - father's job (nominal: 'teacher', 'health' care
related, civil 'services' (e.g. administrative or police),
'at_home' or 'other')
11. reason - reason to choose this school (nominal: close to
'home', school 'reputation', 'course' preference or
'other')
12. guardian - student's guardian (nominal: 'mother',
'father' or 'other')
13. traveltime - home to school travel time (numeric: 1 -
<15 min., 2 - 15 to 30 min., 3 - 30 min. to 1 hour, or 4 -
>1 hour)
14. studytime - weekly study time (numeric: 1 -<2hours,2
- 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours)
15. failures - number of past class failures (numeric: n if
1<=n<3, else 4)
16. schoolsup - extra educational support (binary: yes or
no)
17. famsup - family educational support (binary: yes orno)
18. paid - extra paid classes within the course subject
(binary: yes or no)
19. activities - extra-curricular activities(binary:yesor no)
20. nursery - attended nursery school (binary: yes or no)
21. higher - wants to take higher education (binary: yes or
no)
22. internet - Internet access at home (binary: yes or no)
23. romantic - with a romantic relationship (binary: yes or
no)
24. famrel - quality of family relationships(numeric:from1
- very bad to 5 - excellent)
25. freetime - free time after school (numeric: from1 -very
low to 5 - very high)
26. goout - going out with friends (numeric: from 1 - very
low to 5 - very high)
27. Dalc - workday alcohol consumption (numeric: from 1 -
very low to 5 - very high)
28. Walc - weekend alcohol consumption(numeric:from1 -
very low to 5 - very high)
29. health - current health status (numeric: from 1 - very
bad to 5 - very good)
30. absences - number of school absences(numeric:from0
to 93)
31. G1 - first period grade (numeric: from 0 to 20)
32. G2 - second period grade (numeric: from 0 to 20)
33. G3 - final grade (numeric: from 0 to 20, output target)
[3].
An extra attribute named ‘success' was further added to the
existing dataset. This attribute gave informationregarding if
the student has passed the class or not. This attribute is our
categorization attribute. We checked the accuracy of the
Naïve Bayes Classifier for two different types of
classification. The two types of classification are as follows
[4]:
A. 2-class Classification
Pass: If G3 > 10, else Fail
B. 5-class Classification
A: G3 = 16 to 20;
B: G3 = 14 to 15;
C: G3 = 12 to 13
D: G3 = 10 to 11;
F: G3 = 0 to 9.
3. DATA VISUALIZATION AND CHI SQUARE TEST
RESULTS
Data was at first visualized using Tableau Server to look for
any anomalies within the data. Bar charts were generated to
visualize the proportions of different classes of success for
different levels of every variable.
Following are some of the visualizations obtained from
Tableau Server:
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1384
Chart 1: Tableau Server Visualization for % of Total
Number of Records for Age and proportion of pass/fail
students.
Chart 2: Tableau Server Visualization for % of Total
Number of Records for Age and proportion of students
with A, B, C, D, F grade.
The data summary obtained was then further used for Chi-
Square Independency test. The Chi-Square Independency
test was performed using Minitab 18. Following were the
results obtained:
Table 1: P-values for Chi-Square Independency test for 2-
class Classification.
Features P-Values
Prev. Class Failures 0
Class Attendance 0
Planning for Higher Education 0.001
Hangouts with Friends 0.002
Age 0.027
School Support 0.038
Relationship Status 0.061
Extra Courses 0.099
Student's Guardian 0.139
Mother's Education 0.143
Family Support 0.146
Sex 0.155
Workday Alcoholic Consumption 0.178
Father's Education 0.193
Reason for Choosing 0.21
Internet Access at Home 0.298
Amount of Free Time 0.334
Mother's Job 0.334
Amount of Study Time 0.387
Parent's Cohabitation Status 0.42
Family Size 0.485
Location 0.521
Quality of Health 0.542
Quality of Family Relationships 0.583
Father's Job 0.674
Travel Time 0.694
Professor 0.714
Attended Nursery 0.74
Weekday Alcoholic Consumption 0.776
Extra Activities 0.807
Table 2: P-values for Chi-Square Independency test for 5-
class Classification.
Feature P-value
Previous Class Failures 0
Class Attendance 0.001
Mother's Education 0.003
School Support 0.005
Planning for Higher Education 0.011
Hangouts with Friends 0.03
Workday Alcoholic Consumption 0.036
Extra Courses 0.049
Relationship Status 0.056
Mother's Job 0.071
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1385
Location 0.11
Father's Job 0.158
Sex 0.175
Amount of Study Time 0.202
Father's Education 0.225
Weekday Alcoholic Consumption 0.227
Age 0.237
Family Size 0.296
Internet Access at Home 0.298
Reason for Choosing 0.306
Amount of Free Time 0.339
Student’s Guardian 0.354
Travel Time 0.447
Attended Nursery 0.494
Parent's Cohabitation Status 0.569
Quality of Health 0.571
Professor 0.584
Quality of Family Relationships 0.664
Extra Activities 0.693
Family Support 0.722
4. NAÏVE BAYES CLASSIFIER MODELING
Confidence Intervals for the Chi-Squared tests. Confidence
Intervals of 75%, 85%, 95% & 99% were decided to be
utilized. These datasets were then dividedinto80-20%ratio
of training and testing datasets respectively using R's
random uniform distribution. Modeling was repeated 100
times to mitigate the effect of the random distribution of
training and testing datasets.
Furthermore, datasets were constructed considering the
effect of the grade attributes on the accuracy.Threeseparate
datasets were built, having no grade attribute, first-period
grade, and first two period grades.
The e1071 R package was used for building the Naïve Bayes
Model for the dataset. After building the model, it was tested
on the test dataset and proportions of correct predictions
were obtained. Standard Deviation is the measure of the
variance between the repetitions. Following are the results
obtained for all types of the datasets:
Table 3: Summary of Correct Predictions for 2-Class
Classification.
Data type
Data feature
type
Accuracy Std. Dev.
COMPLETE
DATASET
W/O Grades 0.7 0.0435
75% W/O Grades 0.716 0.0432
85% W/O Grades 0.714 0.04
95% W/O Grades 0.734 0.044
99% W/O Grades 0.724 0.0445
COMPLETE
DATASET
G1 0.795 0.0402
75% G1 0.832 0.0455
85% G1 0.827 0.0435
95% G1 0.834 0.0353
99% G1 0.826 0.04
COMPLETE
DATASET
G1 & G2 0.8 0.0422
75% G1 & G2 0.88 0.0325
85% G1 & G2 0.884 0.03
95% G1 & G2 0.885 0.031
99% G1 & G2 0.885 0.034
Table 4: Summary of Correct Predictions for 5-Class
Classification.
Data type
Data feature
type
Accuracy Std. Dev.
COMPLETE
DATASET
W/O Grades 0.282 0.0481
75% W/O Grades 0.305 0.0483
85% W/O Grades 0.309 0.0483
95% W/O Grades 0.285 0.0552
99% W/O Grades 0.272 0.0479
COMPLETE
DATASET
G1 0.468 0.0544
75% G1 0.508 0.0599
85% G1 0.537 0.0575
95% G1 0.529 0.0643
99% G1 0.536 0.0631
COMPLETE
DATASET
G1 & G2 0.683 0.0522
75% G1 & G2 0.692 0.0572
85% G1 & G2 0.703 0.0573
95% G1 & G2 0.703 0.0582
99% G1 & G2 0.695 0.0605
The resultshada meanstandarddeviationof0.0474
which was acceptable tolerance.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1386
5. CONCLUSIONS
A. For 2-Class Classification
As was expected, highest accuracy was achieved when G1 &
G2 both were included in the dataset as G3 was final grade
which was heavily dependent on G1 & G2. 88.5% was the
maximum accuracy achieved. It was observed that using a
Chi-Squared test did help in eliminating unnecessary
attributes and increasing the accuracy of the model.
However, setting a high confidence interval resultedinover-
elimination of variables and decreased the accuracy of the
model. Following are the increases in accuracy of the model
achieved using the Chi-Squared test for elimination.
Table 5: Summary of Increase in Accuracy for 2-Class
Classification.
Data feature
type
Data Type
(Maximum Accuracy
obtained at)
Increase in
Accuracy
Without
grades
95% 4.86%
G1 95% 4.9%
G1 & G2 95% 10.6%
As we can observe, maximum accuracy was achievedat95%
Confidence Interval with as much as 10.6% increase in
accuracy observed.
B. For 5-class Classification
Similarly, for 5-class Classification, highest accuracy was
obtained for dataset including G1 & G2. However, unlike 2-
class Classification, a maximum accuracy of only 70% was
achieved which could possibly be a case of a small dataset
and increase in levels of classification. Irrespective of that,
Chi-Squared test did improve the accuracy of the model.
Maximum accuracy was achieved at 85% Confidence
Interval, possibly indicating that model requires a wider
dataset for high levels of classification. Following are the
increases in the accuracy of the model obtained:
Table 6: Summary of Increase in Accuracy for 5-Class
Classification.
Data feature
type
Data Type
(Maximum Accuracy
obtained at)
Increase in
Accuracy
Without
grades
85% 9.57%
G1 85% 14.74%
G1 & G2 85% 2.93%
We can thus conclude that Chi-Squared is an effective tool
in reducing features for building a Naïve Bayes.
REFERENCES
[1] https://en.wikipedia.org/wiki/Naive_Bayes_classifier
[2] https://stattrek.com/chi-square-
test/independence.aspx
[3] https://archive.ics.uci.edu/ml/datasets/student+
performance
[4] Paulo Cortez & Alice Silva. “Using Data Mining toPredict
Secondary School Student Performance”, EUROSIS
(2008).
BIOGRAPHY
A Graduate Student at
Northeastern University, Boston,
Massachusetts with majors in
Operations Research.

More Related Content

What's hot

Hybridisation Techniques for Cold-Starting Context-Aware Recommender Systems
Hybridisation Techniques for Cold-Starting Context-Aware Recommender SystemsHybridisation Techniques for Cold-Starting Context-Aware Recommender Systems
Hybridisation Techniques for Cold-Starting Context-Aware Recommender SystemsMatthias Braunhofer
 
IRJET- Using Data Mining to Predict Students Performance
IRJET-  	  Using Data Mining to Predict Students PerformanceIRJET-  	  Using Data Mining to Predict Students Performance
IRJET- Using Data Mining to Predict Students PerformanceIRJET Journal
 
Business statistics-i-part1-aarhus-bss
Business statistics-i-part1-aarhus-bssBusiness statistics-i-part1-aarhus-bss
Business statistics-i-part1-aarhus-bssAntonio Rivero Ostoic
 
Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...
Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...
Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...William Kritsonis
 
Probability Theory and Mathematical Statistics
Probability Theory and Mathematical StatisticsProbability Theory and Mathematical Statistics
Probability Theory and Mathematical Statisticsmetamath
 
Probability Theory and Mathematical Statistics in Tver State University
Probability Theory and Mathematical Statistics in Tver State UniversityProbability Theory and Mathematical Statistics in Tver State University
Probability Theory and Mathematical Statistics in Tver State Universitymetamath
 
Analysis of Textual Data Classification with a Reddit Comments Dataset
Analysis of Textual Data Classification with a Reddit Comments DatasetAnalysis of Textual Data Classification with a Reddit Comments Dataset
Analysis of Textual Data Classification with a Reddit Comments DatasetAdamBab
 
Multivariate analyses
Multivariate analysesMultivariate analyses
Multivariate analysesNaveen Deswal
 
Recommender system
Recommender systemRecommender system
Recommender systemBhumi Patel
 

What's hot (10)

Hybridisation Techniques for Cold-Starting Context-Aware Recommender Systems
Hybridisation Techniques for Cold-Starting Context-Aware Recommender SystemsHybridisation Techniques for Cold-Starting Context-Aware Recommender Systems
Hybridisation Techniques for Cold-Starting Context-Aware Recommender Systems
 
IRJET- Using Data Mining to Predict Students Performance
IRJET-  	  Using Data Mining to Predict Students PerformanceIRJET-  	  Using Data Mining to Predict Students Performance
IRJET- Using Data Mining to Predict Students Performance
 
Business statistics-i-part1-aarhus-bss
Business statistics-i-part1-aarhus-bssBusiness statistics-i-part1-aarhus-bss
Business statistics-i-part1-aarhus-bss
 
Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...
Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...
Clarence Johnson, Dissertation PPT, Dr. William Allan Kritsonis, Dissertation...
 
Probability Theory and Mathematical Statistics
Probability Theory and Mathematical StatisticsProbability Theory and Mathematical Statistics
Probability Theory and Mathematical Statistics
 
Probability Theory and Mathematical Statistics in Tver State University
Probability Theory and Mathematical Statistics in Tver State UniversityProbability Theory and Mathematical Statistics in Tver State University
Probability Theory and Mathematical Statistics in Tver State University
 
Analysis of Textual Data Classification with a Reddit Comments Dataset
Analysis of Textual Data Classification with a Reddit Comments DatasetAnalysis of Textual Data Classification with a Reddit Comments Dataset
Analysis of Textual Data Classification with a Reddit Comments Dataset
 
Lecture 1
Lecture 1Lecture 1
Lecture 1
 
Multivariate analyses
Multivariate analysesMultivariate analyses
Multivariate analyses
 
Recommender system
Recommender systemRecommender system
Recommender system
 

Similar to IRJET- Analysis of Chi-Square Independence Test for Naïve Bayes Feature Selection

IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET Journal
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for StudentsIRJET Journal
 
Capstone eLearning Deck
Capstone eLearning DeckCapstone eLearning Deck
Capstone eLearning DeckZacharyCote2
 
IJSRED-V3I6P13
IJSRED-V3I6P13IJSRED-V3I6P13
IJSRED-V3I6P13IJSRED
 
Top schools in noida
Top schools in noidaTop schools in noida
Top schools in noidaEdhole.com
 
IRJET- Student Performance Analysis System for Higher Secondary Education
IRJET- Student Performance Analysis System for Higher Secondary EducationIRJET- Student Performance Analysis System for Higher Secondary Education
IRJET- Student Performance Analysis System for Higher Secondary EducationIRJET Journal
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptxVishalLabde
 
Graduate Admission Predictor
Graduate Admission PredictorGraduate Admission Predictor
Graduate Admission PredictorIRJET Journal
 
Classification Algorithms with Attribute Selection: an evaluation study using...
Classification Algorithms with Attribute Selection: an evaluation study using...Classification Algorithms with Attribute Selection: an evaluation study using...
Classification Algorithms with Attribute Selection: an evaluation study using...Eswar Publications
 
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...IRJET Journal
 
IRJET- Effectiveness of Constructivist Instructional Approach on Achievem...
IRJET-  	  Effectiveness of Constructivist Instructional Approach on Achievem...IRJET-  	  Effectiveness of Constructivist Instructional Approach on Achievem...
IRJET- Effectiveness of Constructivist Instructional Approach on Achievem...IRJET Journal
 
t-Test Project Instructions and Rubric Project Overvi.docx
t-Test Project Instructions and Rubric  Project Overvi.docxt-Test Project Instructions and Rubric  Project Overvi.docx
t-Test Project Instructions and Rubric Project Overvi.docxmattinsonjanel
 
The Evaluation Model of Garbage Classification System Based on AHP
The Evaluation Model of Garbage Classification System Based on AHPThe Evaluation Model of Garbage Classification System Based on AHP
The Evaluation Model of Garbage Classification System Based on AHPDr. Amarjeet Singh
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET Journal
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET Journal
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET Journal
 
IRJET- Attribute Based Adaptive Evaluation System
IRJET-  	  Attribute Based Adaptive Evaluation SystemIRJET-  	  Attribute Based Adaptive Evaluation System
IRJET- Attribute Based Adaptive Evaluation SystemIRJET Journal
 

Similar to IRJET- Analysis of Chi-Square Independence Test for Naïve Bayes Feature Selection (20)

IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
IRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms ComparisonIRJET- Supervised Learning Classification Algorithms Comparison
IRJET- Supervised Learning Classification Algorithms Comparison
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for Students
 
RESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATION
RESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATIONRESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATION
RESULT MINING: ANALYSIS OF DATA MINING TECHNIQUES IN EDUCATION
 
Capstone eLearning Deck
Capstone eLearning DeckCapstone eLearning Deck
Capstone eLearning Deck
 
IJSRED-V3I6P13
IJSRED-V3I6P13IJSRED-V3I6P13
IJSRED-V3I6P13
 
Top schools in noida
Top schools in noidaTop schools in noida
Top schools in noida
 
IRJET- Student Performance Analysis System for Higher Secondary Education
IRJET- Student Performance Analysis System for Higher Secondary EducationIRJET- Student Performance Analysis System for Higher Secondary Education
IRJET- Student Performance Analysis System for Higher Secondary Education
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
 
Graduate Admission Predictor
Graduate Admission PredictorGraduate Admission Predictor
Graduate Admission Predictor
 
Classification Algorithms with Attribute Selection: an evaluation study using...
Classification Algorithms with Attribute Selection: an evaluation study using...Classification Algorithms with Attribute Selection: an evaluation study using...
Classification Algorithms with Attribute Selection: an evaluation study using...
 
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
M-Learners Performance Using Intelligence and Adaptive E-Learning Classify th...
 
IRJET- Effectiveness of Constructivist Instructional Approach on Achievem...
IRJET-  	  Effectiveness of Constructivist Instructional Approach on Achievem...IRJET-  	  Effectiveness of Constructivist Instructional Approach on Achievem...
IRJET- Effectiveness of Constructivist Instructional Approach on Achievem...
 
t-Test Project Instructions and Rubric Project Overvi.docx
t-Test Project Instructions and Rubric  Project Overvi.docxt-Test Project Instructions and Rubric  Project Overvi.docx
t-Test Project Instructions and Rubric Project Overvi.docx
 
Sampling.pptx
Sampling.pptxSampling.pptx
Sampling.pptx
 
The Evaluation Model of Garbage Classification System Based on AHP
The Evaluation Model of Garbage Classification System Based on AHPThe Evaluation Model of Garbage Classification System Based on AHP
The Evaluation Model of Garbage Classification System Based on AHP
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification Algorithms
 
IRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification AlgorithmsIRJET- Performance Evaluation of Various Classification Algorithms
IRJET- Performance Evaluation of Various Classification Algorithms
 
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
IRJET- Expert Independent Bayesian Data Fusion and Decision Making Model for ...
 
IRJET- Attribute Based Adaptive Evaluation System
IRJET-  	  Attribute Based Adaptive Evaluation SystemIRJET-  	  Attribute Based Adaptive Evaluation System
IRJET- Attribute Based Adaptive Evaluation System
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...ranjana rawat
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 

Recently uploaded (20)

(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
The Most Attractive Pune Call Girls Manchar 8250192130 Will You Miss This Cha...
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 

IRJET- Analysis of Chi-Square Independence Test for Naïve Bayes Feature Selection

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1382 Analysis of Chi-Square Independence Test for Naïve Bayes Feature Selection Pratik Kadam Graduate Student, Major in Operations Research, Northeastern University, Boston, Massachusetts ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - Naïve Bayes based upon the Bayes theorem and Conditional Probability is one of the most popular classifiers used for classifying data. However, the accuracy of this algorithm highly depends on the features of the datasets used. Present paper thus focuses on using Chi-Square Independence Test for feature selection at various confidence intervals. Tableau was used for data visualization, Minitab as a statistical tool and RStudio was used for developing the Naïve Bayes Model. Key Words: Naïve Bayes Classifier, Chi-Square Independence Test, Feature Selection, Data Science, Student Performance 1. INTRODUCTION Naïve Bayes Classifier has been used extensively for spam filtering & other such predictive modelling applications. [1] Naïve Bayes Classifier is a probabilistic classifier based on Bayes’ theorem. It assumes independence between the predictors or features of the dataset. This assumption doesn’t always necessarily hold true. However, despite the features being dependent, the Naïve Bayes Classifier offers surprisingly accurate predictions. Hence it is termed as ‘Naïve’ Bayes Classifier. However, the accuracy of this classifier is highly dependent on the number of features in the dataset. It is quite unclear what is the effect on the accuracy of the model due to factors such as the quantity & quality of features. The present paper thus attempts at selecting features that are prominent in determining the outcome using the Chi- Square Test of Independence. The featurestobeselectedare determined using the p-values of the features. The accuracy of the classifier would be then judged for different sets of these features based on the mean & standard deviation of a total fraction of correct predictions made. 1.1 CHI-SQUARE TEST OF INDEPENDENCY To study the dependence of various factors on the failure proportions of the students, Chi-Square test for independence was chosen as the statistical test to be performed assuming the distribution for the sum of squared normal deviates to be a Chi-Square Distribution. This test is applied when you have two categorical variables from a single population. It is used to determine whether there is a significant association between the two variables [2]. Following are the prerequisites to be considered before performing the test: A. The sampling method is a simple random sampling. B. The variables under study are each categorical. C. If sample data are displayed in a contingency table, the expected frequency count for each cell of the table is at least 5. If the above perquisites are satisfied, the following are the steps to be performed to check the dependency of the features. Step 1: State the Hypothesis H0: The considered factor and the failure proportion forthat factor are independent H1: The considered factor and the failure proportion for that factor are not independent Thus, the alternative hypothesisstatesthatknowinglevelsof any one factor can help us determine the chances of failure for the student. Step 2: Calculating the Test Statistic The Chi-Square value is given by the equation below: 𝛘2 = Σ [ (Or,c - Er,c)2 / Er,c] where, O[r,c] = Observed frequency counts of pass/fail studentsfor every level of factor. E[r,c] = (Nr * Nc) / N = Expected frequency counts of a class of students for every level of factor. Nr = the Total number of pass/fail students for a factor. Nc = Total number of students one for a level of the factor. N = Total sample size. Step 3: Defining the decision rule. Deciding the significance level for p-values. Step 4: Making a Decision. If the p-value is less than the significance level, we reject H0 and accept H1. i.e. The variables are dependent on each other.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1383 2. DATASET DESCRIPTION The dataset chosen for this paper, included student’s performances in two different Portuguese schools for Math class [3]. The data attributes include student’s grades for 3 different periods, demographic, school, personal and social related features. The data was collected was using mark reports and questionnaires. Following are the attributes of the dataset: 1. school - student's school (binary: 'GP' - Gabriel Pereira or 'MS' - Mousinho da Silveira) 2. sex - student's sex (binary: 'F' - female or 'M' - male) 3. age - student's age (numeric: from 15 to 22) 4. address - student's home address type (binary: 'U' - urban or 'R' - rural) 5. famsize - family size (binary: 'LE3' - less or equal to3or 'GT3' - greater than 3) 6. Pstatus - parent's cohabitationstatus(binary:'T' -living together or 'A' - apart) 7. Medu - mother's education (numeric: 0 - none, 1 - primary education (4th grade), 2 - 5th to 9th grade, 3 - secondary education or 4 - higher education) 8. Fedu - father's education (numeric: 0 -none,1-primary education (4th grade), 2 -5th to 9th grade,3- secondary education or 4 - higher education) 9. Mjob - mother's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other') 10. Fjob - father's job (nominal: 'teacher', 'health' care related, civil 'services' (e.g. administrative or police), 'at_home' or 'other') 11. reason - reason to choose this school (nominal: close to 'home', school 'reputation', 'course' preference or 'other') 12. guardian - student's guardian (nominal: 'mother', 'father' or 'other') 13. traveltime - home to school travel time (numeric: 1 - <15 min., 2 - 15 to 30 min., 3 - 30 min. to 1 hour, or 4 - >1 hour) 14. studytime - weekly study time (numeric: 1 -<2hours,2 - 2 to 5 hours, 3 - 5 to 10 hours, or 4 - >10 hours) 15. failures - number of past class failures (numeric: n if 1<=n<3, else 4) 16. schoolsup - extra educational support (binary: yes or no) 17. famsup - family educational support (binary: yes orno) 18. paid - extra paid classes within the course subject (binary: yes or no) 19. activities - extra-curricular activities(binary:yesor no) 20. nursery - attended nursery school (binary: yes or no) 21. higher - wants to take higher education (binary: yes or no) 22. internet - Internet access at home (binary: yes or no) 23. romantic - with a romantic relationship (binary: yes or no) 24. famrel - quality of family relationships(numeric:from1 - very bad to 5 - excellent) 25. freetime - free time after school (numeric: from1 -very low to 5 - very high) 26. goout - going out with friends (numeric: from 1 - very low to 5 - very high) 27. Dalc - workday alcohol consumption (numeric: from 1 - very low to 5 - very high) 28. Walc - weekend alcohol consumption(numeric:from1 - very low to 5 - very high) 29. health - current health status (numeric: from 1 - very bad to 5 - very good) 30. absences - number of school absences(numeric:from0 to 93) 31. G1 - first period grade (numeric: from 0 to 20) 32. G2 - second period grade (numeric: from 0 to 20) 33. G3 - final grade (numeric: from 0 to 20, output target) [3]. An extra attribute named ‘success' was further added to the existing dataset. This attribute gave informationregarding if the student has passed the class or not. This attribute is our categorization attribute. We checked the accuracy of the Naïve Bayes Classifier for two different types of classification. The two types of classification are as follows [4]: A. 2-class Classification Pass: If G3 > 10, else Fail B. 5-class Classification A: G3 = 16 to 20; B: G3 = 14 to 15; C: G3 = 12 to 13 D: G3 = 10 to 11; F: G3 = 0 to 9. 3. DATA VISUALIZATION AND CHI SQUARE TEST RESULTS Data was at first visualized using Tableau Server to look for any anomalies within the data. Bar charts were generated to visualize the proportions of different classes of success for different levels of every variable. Following are some of the visualizations obtained from Tableau Server:
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1384 Chart 1: Tableau Server Visualization for % of Total Number of Records for Age and proportion of pass/fail students. Chart 2: Tableau Server Visualization for % of Total Number of Records for Age and proportion of students with A, B, C, D, F grade. The data summary obtained was then further used for Chi- Square Independency test. The Chi-Square Independency test was performed using Minitab 18. Following were the results obtained: Table 1: P-values for Chi-Square Independency test for 2- class Classification. Features P-Values Prev. Class Failures 0 Class Attendance 0 Planning for Higher Education 0.001 Hangouts with Friends 0.002 Age 0.027 School Support 0.038 Relationship Status 0.061 Extra Courses 0.099 Student's Guardian 0.139 Mother's Education 0.143 Family Support 0.146 Sex 0.155 Workday Alcoholic Consumption 0.178 Father's Education 0.193 Reason for Choosing 0.21 Internet Access at Home 0.298 Amount of Free Time 0.334 Mother's Job 0.334 Amount of Study Time 0.387 Parent's Cohabitation Status 0.42 Family Size 0.485 Location 0.521 Quality of Health 0.542 Quality of Family Relationships 0.583 Father's Job 0.674 Travel Time 0.694 Professor 0.714 Attended Nursery 0.74 Weekday Alcoholic Consumption 0.776 Extra Activities 0.807 Table 2: P-values for Chi-Square Independency test for 5- class Classification. Feature P-value Previous Class Failures 0 Class Attendance 0.001 Mother's Education 0.003 School Support 0.005 Planning for Higher Education 0.011 Hangouts with Friends 0.03 Workday Alcoholic Consumption 0.036 Extra Courses 0.049 Relationship Status 0.056 Mother's Job 0.071
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1385 Location 0.11 Father's Job 0.158 Sex 0.175 Amount of Study Time 0.202 Father's Education 0.225 Weekday Alcoholic Consumption 0.227 Age 0.237 Family Size 0.296 Internet Access at Home 0.298 Reason for Choosing 0.306 Amount of Free Time 0.339 Student’s Guardian 0.354 Travel Time 0.447 Attended Nursery 0.494 Parent's Cohabitation Status 0.569 Quality of Health 0.571 Professor 0.584 Quality of Family Relationships 0.664 Extra Activities 0.693 Family Support 0.722 4. NAÏVE BAYES CLASSIFIER MODELING Confidence Intervals for the Chi-Squared tests. Confidence Intervals of 75%, 85%, 95% & 99% were decided to be utilized. These datasets were then dividedinto80-20%ratio of training and testing datasets respectively using R's random uniform distribution. Modeling was repeated 100 times to mitigate the effect of the random distribution of training and testing datasets. Furthermore, datasets were constructed considering the effect of the grade attributes on the accuracy.Threeseparate datasets were built, having no grade attribute, first-period grade, and first two period grades. The e1071 R package was used for building the Naïve Bayes Model for the dataset. After building the model, it was tested on the test dataset and proportions of correct predictions were obtained. Standard Deviation is the measure of the variance between the repetitions. Following are the results obtained for all types of the datasets: Table 3: Summary of Correct Predictions for 2-Class Classification. Data type Data feature type Accuracy Std. Dev. COMPLETE DATASET W/O Grades 0.7 0.0435 75% W/O Grades 0.716 0.0432 85% W/O Grades 0.714 0.04 95% W/O Grades 0.734 0.044 99% W/O Grades 0.724 0.0445 COMPLETE DATASET G1 0.795 0.0402 75% G1 0.832 0.0455 85% G1 0.827 0.0435 95% G1 0.834 0.0353 99% G1 0.826 0.04 COMPLETE DATASET G1 & G2 0.8 0.0422 75% G1 & G2 0.88 0.0325 85% G1 & G2 0.884 0.03 95% G1 & G2 0.885 0.031 99% G1 & G2 0.885 0.034 Table 4: Summary of Correct Predictions for 5-Class Classification. Data type Data feature type Accuracy Std. Dev. COMPLETE DATASET W/O Grades 0.282 0.0481 75% W/O Grades 0.305 0.0483 85% W/O Grades 0.309 0.0483 95% W/O Grades 0.285 0.0552 99% W/O Grades 0.272 0.0479 COMPLETE DATASET G1 0.468 0.0544 75% G1 0.508 0.0599 85% G1 0.537 0.0575 95% G1 0.529 0.0643 99% G1 0.536 0.0631 COMPLETE DATASET G1 & G2 0.683 0.0522 75% G1 & G2 0.692 0.0572 85% G1 & G2 0.703 0.0573 95% G1 & G2 0.703 0.0582 99% G1 & G2 0.695 0.0605 The resultshada meanstandarddeviationof0.0474 which was acceptable tolerance.
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1386 5. CONCLUSIONS A. For 2-Class Classification As was expected, highest accuracy was achieved when G1 & G2 both were included in the dataset as G3 was final grade which was heavily dependent on G1 & G2. 88.5% was the maximum accuracy achieved. It was observed that using a Chi-Squared test did help in eliminating unnecessary attributes and increasing the accuracy of the model. However, setting a high confidence interval resultedinover- elimination of variables and decreased the accuracy of the model. Following are the increases in accuracy of the model achieved using the Chi-Squared test for elimination. Table 5: Summary of Increase in Accuracy for 2-Class Classification. Data feature type Data Type (Maximum Accuracy obtained at) Increase in Accuracy Without grades 95% 4.86% G1 95% 4.9% G1 & G2 95% 10.6% As we can observe, maximum accuracy was achievedat95% Confidence Interval with as much as 10.6% increase in accuracy observed. B. For 5-class Classification Similarly, for 5-class Classification, highest accuracy was obtained for dataset including G1 & G2. However, unlike 2- class Classification, a maximum accuracy of only 70% was achieved which could possibly be a case of a small dataset and increase in levels of classification. Irrespective of that, Chi-Squared test did improve the accuracy of the model. Maximum accuracy was achieved at 85% Confidence Interval, possibly indicating that model requires a wider dataset for high levels of classification. Following are the increases in the accuracy of the model obtained: Table 6: Summary of Increase in Accuracy for 5-Class Classification. Data feature type Data Type (Maximum Accuracy obtained at) Increase in Accuracy Without grades 85% 9.57% G1 85% 14.74% G1 & G2 85% 2.93% We can thus conclude that Chi-Squared is an effective tool in reducing features for building a Naïve Bayes. REFERENCES [1] https://en.wikipedia.org/wiki/Naive_Bayes_classifier [2] https://stattrek.com/chi-square- test/independence.aspx [3] https://archive.ics.uci.edu/ml/datasets/student+ performance [4] Paulo Cortez & Alice Silva. “Using Data Mining toPredict Secondary School Student Performance”, EUROSIS (2008). BIOGRAPHY A Graduate Student at Northeastern University, Boston, Massachusetts with majors in Operations Research.