Analysis of Results of Final Semester of B.E. (Civil Engineering),
2014 of Purbanchal University using Weka Tool: A mini Research
Raj Kumar Thakur
Associate Professor (Computer)
Purbanchal University School of Engineering and Technology
Biratnagar, Nepal
Abstract - Data mining has become an important tool in many areas of research, industry
and business. This paper applies a data mining tool in the educational domain to
analyze the results of a final semester examination and to express the factors that
determine those results as precise rules. Once these rules are known, management
control measures can be developed and implemented to improve examination results.
Keywords - Data Mining, Business Intelligence, WEKA, Data Visualization, Classification
1. Introduction
With the opening of more universities in Nepal and in the neighboring countries India and China,
public universities and educational institutions are likely to face an admission crisis in the
near future. Nevertheless, the number of admissions to B.E. degrees, especially
Civil Engineering, has been rising consistently over the past few years. A university wants
not only that all colleges run under it provide quality education but also that the pass
percentage is as high as possible, especially for final semester
students. As a great number of students are being admitted to the B.E. Civil Engineering program run
in different colleges of Purbanchal University all over Nepal, it is essential that
students, especially those in the final semester, study hard and, ideally, all pass their
final semester examination. However, the final examination results of the B.E. programs are not
encouraging.
Data mining is a powerful tool for extracting hidden predictive information
from large databases and has great potential to help educational institutes focus on the most
important information in the data they have generated. Here, data mining techniques are applied
to determine precise rules that govern the final examination results.
With data mining techniques such as classification, it is possible to discover the key
decision rules from the final examination results of students and to use those rules to identify
the key course(s) that determine whether a student passes or fails. This paper presents
classification based on the J48 algorithm as a simple and efficient tool to analyze the final
examination results of B.E. (Civil Engineering) of Purbanchal University, Nepal.
2. Methodology
The study followed the steps suggested by Fayyad, Piatetsky-Shapiro, and Smyth (1996)
for the knowledge discovery process: data selection, data pre-processing and cleanup, data
transformation, data mining, data interpretation, and the evaluation of results. Among the
available data mining techniques, the decision tree was chosen. The J48 algorithm, an
implementation of C4.5, which in turn extends the ID3 algorithm, creates a small tree whose
rules give valuable insight into the classification and prediction of data.
The algorithm used by ID3 is as follows.
Algorithm
function ID3
Input: (R: a set of non-target attributes,
        C: the target attribute,
        S: a training set) returns a decision tree;
begin
  If S is empty, return a single node with value failure;
  If S consists of records all with the same
    value for the target attribute,
    return a single leaf node with that value;
  If R is empty, then return a single node with
    the value of the most frequent of the values of the
    target attribute that are found in records of S [in that case
    there may be errors, i.e. examples that will be improperly classified];
  Let A be the attribute with largest Gain(A,S) among attributes in R;
  Let {aj | j=1,2, .., m} be the values of attribute A;
  Let {Sj | j=1,2, .., m} be the subsets of S consisting
    respectively of records with value aj for A;
  Return a tree with root labeled A and arcs
    labeled a1, a2, .., am going respectively
    to the trees ID3(R-{A}, C, S1), ID3(R-{A}, C, S2),
    ....., ID3(R-{A}, C, Sm);
  [ID3 is thus applied recursively to the subsets {Sj | j=1,2, .., m}
    until one of the stopping conditions above is met]
end.
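The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used by WEKA: the record layout (a list of dicts), the function names, and the toy records are assumptions made for this example, and the sketch branches only on attribute values that actually occur in the data, so the empty-set case arises only for an empty input.

```python
import math
from collections import Counter


def entropy(records, target):
    """Shannon entropy (in bits) of the target attribute over the records."""
    counts = Counter(r[target] for r in records)
    total = len(records)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gain(records, attribute, target):
    """Information gain obtained by splitting the records on one attribute."""
    total = len(records)
    remainder = 0.0
    for value in {r[attribute] for r in records}:
        subset = [r for r in records if r[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(records, target) - remainder


def id3(attributes, target, records):
    """Return a leaf label or a nested dict {attribute: {value: subtree}}."""
    if not records:                      # S is empty
        return "failure"
    labels = [r[target] for r in records]
    if len(set(labels)) == 1:            # all records share one target value
        return labels[0]
    if not attributes:                   # R is empty: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(records, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({r[best] for r in records}):
        subset = [r for r in records if r[best] == value]
        tree[best][value] = id3(remaining, target, subset)
    return tree


# Hypothetical toy records, not taken from the university data.
toy = [
    {"hydropower": "B", "construction": "A", "result": "Passed"},
    {"hydropower": "B", "construction": "C", "result": "Passed"},
    {"hydropower": "F", "construction": "A", "result": "Failed"},
    {"hydropower": "F", "construction": "B", "result": "Failed"},
]
tree = id3(["hydropower", "construction"], "result", toy)
print(tree)
```

In the toy data the hydropower grade separates the two classes completely, so it has the largest gain and becomes the root, with one leaf per grade.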
2.1 Building the model
A. Data Collection
MS Excel sheets containing the final examination results of the Eighth semester of B.E.
(Civil Engineering) for the year 2014 from seven colleges under Purbanchal University
were obtained from the Examination Office of Purbanchal University. The Excel
result sheets contained a total of 474 students. For every student, all subjects taught,
the internal assessment marks in theory and practical, the final examination marks and the
grade obtained in each course, the total marks obtained, the SGPA and the final result were
taken into account.
B. Tools Used
To apply the classification algorithm, we used the WEKA toolkit, a widely used data mining
software package developed at the University of Waikato in New Zealand. This
toolkit provides a wide range of data mining algorithms implemented in Java.
It has been widely used in educational data mining research and for teaching purposes.
C. Data Preparation and Pre-Processing
During this phase, the collected data were pre-processed to prepare
them for the mining techniques. First, irrelevant attributes such as S. No. and Exam Roll
No. were eliminated. The Excel sheets were then converted into ARFF files so that they
could be analyzed with the WEKA tool.
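The conversion step can be illustrated with a short Python sketch. The paper does not say how the ARFF files were produced, so the helper `to_arff`, the relation name, the column `S_NO` and the sample values below are all hypothetical; only the attribute names, the grade values and the ARFF keywords (`@relation`, `@attribute`, `@data`, `?` for a missing value) follow the paper and the ARFF format.

```python
def to_arff(relation, rows, nominal, drop=()):
    """Build ARFF text from a list of dict rows.

    nominal maps a column name to its list of allowed values; every other
    column is declared numeric. Columns named in drop are left out.
    Column names are assumed to contain no spaces (otherwise ARFF needs quotes).
    """
    columns = [c for c in rows[0] if c not in drop]
    lines = [f"@relation {relation}", ""]
    for col in columns:
        if col in nominal:
            lines.append(f"@attribute {col} {{{','.join(nominal[col])}}}")
        else:
            lines.append(f"@attribute {col} numeric")
    lines += ["", "@data"]
    for row in rows:
        # None is written as "?", the ARFF marker for a missing value.
        lines.append(",".join("?" if row[c] is None else str(row[c]) for c in columns))
    return "\n".join(lines) + "\n"


# Hypothetical sample rows; the marks are invented for illustration.
rows = [
    {"S_NO": 1, "HYDROPOWER-ENGINEERING-FINAL-TH": 45,
     "HYDROPOWER-ENGINEERING-GRADE": "B", "RESULT": "Passed"},
    {"S_NO": 2, "HYDROPOWER-ENGINEERING-FINAL-TH": 20,
     "HYDROPOWER-ENGINEERING-GRADE": "F", "RESULT": None},
]
arff_text = to_arff(
    "first_dataset",
    rows,
    nominal={
        "HYDROPOWER-ENGINEERING-GRADE": ["A", "B", "C", "D", "F", "I"],
        "RESULT": ["Passed", "Failed"],
    },
    drop=("S_NO",),
)
print(arff_text)
```

Dropping the serial number before export mirrors the attribute elimination described above, and writing an unknown result as `?` is how an unlabeled instance is represented in ARFF.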
E. Classification
In this research, the aim was to find the causes and the decision rules that
determined whether a student passed or failed. Classification was
used because its objective in educational data mining is to
identify the important factors that contribute to categorizing students’ results.
Decision trees are among the most popular classification techniques in data mining. They
represent a group of classification rules in tree form, and they have several
advantages over other techniques, as stated in [1]:
The simplicity of their presentation makes them easy to understand.
They can work with different types of attributes, nominal or numerical.
They can classify new examples quickly.
One of the best-known decision tree algorithms is C4.5, developed by Ross Quinlan
[2]. Its basic idea is to build trees from a set of training data using the
concept of information entropy [3]. J48 is an open source Java implementation of the
C4.5 algorithm in WEKA. We chose this algorithm because it has been reported to
handle educational datasets well and to give highly accurate results [4]-[6].
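For reference, the entropy-based quantities behind these algorithms can be written out as follows. The notation reuses S, A, S_j and m from the ID3 pseudocode; p_c, the fraction of records in S that belong to class c, is a symbol introduced here. ID3 selects the attribute with the largest gain, whereas C4.5 normalizes the gain by the split information to obtain the gain ratio.

```latex
H(S) = -\sum_{c} p_c \log_2 p_c

\mathrm{Gain}(A, S) = H(S) - \sum_{j=1}^{m} \frac{|S_j|}{|S|}\, H(S_j)

\mathrm{SplitInfo}(A, S) = -\sum_{j=1}^{m} \frac{|S_j|}{|S|} \log_2 \frac{|S_j|}{|S|}

\mathrm{GainRatio}(A, S) = \frac{\mathrm{Gain}(A, S)}{\mathrm{SplitInfo}(A, S)}
```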
F. Results and Discussion
Classification using the J48 tree algorithm was applied to the
examination results of one of the colleges, referred to here as the first dataset. Its MS Excel sheet
is shown in Fig. 1, its ARFF file in Fig. 2, and the results of the analysis in
Fig. 3. The examination results of the 76 students of this college were
initially used as the training data.
Fig. 1: First dataset corresponding to a college
Fig. 2: ARFF of the first dataset
Fig. 3: Analysis output of the first dataset
The classification result shows that, out of a total of 76 data instances (each instance holding
the marks obtained in all subjects by a particular student, the grades obtained and
the final result), the J48 algorithm predicted the result correctly for 75 instances.
That is, only one data instance was classified incorrectly. The tree produced by the
WEKA Classifier Tree Visualizer is shown in Fig. 4. Of the 13 students who got grade B
in the course Hydropower Engineering, 12 passed their examination and one failed (the single
misclassified instance), while all 63 students who got
grade F in the same course failed the examination.
So, for this college, where the results of 76 students were analyzed, the course that determined
the results is Hydropower Engineering. The decision rules for the attribute RESULT are
given below.
=== Classifier model (full training set) ===
J48 pruned tree
------------------
HYDROPOWER-ENGINEERING-GRADE = A: Failed (0.0)
HYDROPOWER-ENGINEERING-GRADE = B: Passed (13.0/1.0)
HYDROPOWER-ENGINEERING-GRADE = C: Failed (0.0)
HYDROPOWER-ENGINEERING-GRADE = D: Failed (0.0)
HYDROPOWER-ENGINEERING-GRADE = F: Failed (63.0)
HYDROPOWER-ENGINEERING-GRADE = I: Failed (0.0)
Fig. 4: J48 Tree visualized for the first dataset
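The leaf counts in this output are enough to recompute the reported accuracy and the information gain of the split on the Hydropower Engineering grade. The following Python sketch does so; the per-leaf (Passed, Failed) counts are read from the tree above, under the assumption that the one misclassified instance in leaf B is a student who failed, and the variable names are chosen for this example.

```python
import math


def entropy(counts):
    """Shannon entropy (in bits) of a tuple of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


# (Passed, Failed) counts per non-empty leaf, read from the J48 output.
leaves = {"B": (12, 1), "F": (0, 63)}

total = sum(sum(c) for c in leaves.values())
passed = sum(c[0] for c in leaves.values())
failed = sum(c[1] for c in leaves.values())

# Each leaf predicts its majority class, so its minority count is its error.
errors = sum(min(c) for c in leaves.values())
accuracy = (total - errors) / total

parent_entropy = entropy((passed, failed))
child_entropy = sum(sum(c) / total * entropy(c) for c in leaves.values())
info_gain = parent_entropy - child_entropy

print(f"instances={total}, errors={errors}, accuracy={accuracy:.4f}")
print(f"entropy before split={parent_entropy:.4f}, gain={info_gain:.4f}")
```

Because the F leaf is pure, only the B leaf contributes to the entropy that remains after the split, which is why a single attribute accounts for nearly all of the class information in this dataset.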
When the model was applied to sample data that were identical to the training data except that the
result values were left blank, represented in the ARFF file as ?, the same algorithm
reproduced the same examination results.
The predicted result is shown in Fig. 5.
Fig. 5: Predicted result of first dataset in ARFF
Classification using the same J48 algorithm on the datasets of five other colleges also
showed that the course Hydropower Engineering determined the results of students.
However, for one of the colleges, classification using the same J48 tree algorithm gave the
results shown in Fig. 7.
Fig. 7: Analysis output of the dataset corresponding to the seventh college.
In the classification of the Eighth semester final examination results of the seventh college,
the results of all 75 students were correctly classified, that is, the classification accuracy
was 100%. The visualized tree for the seventh college is shown in Fig. 8.
Fig. 8: J48 Tree visualized for the dataset corresponding to the seventh college
The rules that controlled classification for the seventh college are:
=== Classifier model (full training set) ===
J48 pruned tree
------------------
HYDROPOWER-ENGINEERING-FINAL-TH <= 27: Failed (16.0)
HYDROPOWER-ENGINEERING-FINAL-TH > 27
| CONSTRUCTION-MANAGEMENT-TOTAL <= 42: Failed (3.0)
| CONSTRUCTION-MANAGEMENT-TOTAL > 42: Passed (56.0)
This decision tree shows that students who scored 27 marks or fewer in the final
(theoretical) examination of the course Hydropower Engineering failed; there were 16 of
them. Students who scored more than 27 marks in the final (theoretical) examination of the
course Hydropower Engineering but scored 42 marks or fewer in total for the course
Construction Management also failed; there were three of them. Finally,
students who scored more than 27 marks in the final (theoretical) examination of the course
Hydropower Engineering and more than 42 marks in total for the course Construction
Management passed the Eighth semester examination; there were 56
of them in this class.
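The rules of this tree translate directly into a small function. The Python sketch below is only an illustration of how the rules are applied: the function and parameter names are invented for this example, while the thresholds 27 and 42 and the leaf counts come from the J48 output above.

```python
def predict_result(hydropower_final_th, construction_management_total):
    """Apply the seventh college's J48 rules to one student's marks."""
    if hydropower_final_th <= 27:
        return "Failed"
    if construction_management_total <= 42:
        return "Failed"
    return "Passed"


# Leaf sizes reported by J48 for the seventh college.
leaf_sizes = {"hydropower <= 27": 16, "construction <= 42": 3, "otherwise": 56}
total_students = sum(leaf_sizes.values())

print(predict_result(30, 50), total_students)
```

The three leaves together cover all 75 students of the seventh college, which matches the number of instances reported for that dataset.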
4. Conclusions
In this paper, a mini research study was conducted using a data mining technique in order
to find out what causes the majority of students to fail their Eighth semester
examination. The analysis showed that mainly one course, Hydropower
Engineering, determined the results of students. This shows the potential of data
mining in higher education. Here it was used specifically to identify the courses that cause
students to fail. Once this is known, the college management can take appropriate
measures to guide students and help them improve in the course
Hydropower Engineering.
5. References
[1] W. Hämäläinen and M. Vinni, "Classifiers for educational data mining," Handbook
of Educational Data Mining, 2010.
[2] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan
[3] D. Kabakchieva, "Predicting student performance by using data mining methods for
classification," Cybernetics and Information Technologies, vol. 13, 2013.
[4] Q. A. Al-Radaideh, A. A. Ananbeh, and E. M. Al-Shawakfa, "A classification model
for predicting the suitable study track for school students," International Journal of
Research and Reviews in Applied Sciences, vol. 8, 2001.
[5] A. Nandeshwar and S. Chaudhari. (2009). Enrollment Prediction
Models Using Data Mining. [Online]. Available:
http://nandeshwar.info/wp-content/uploads/2008/11/DMWVU_Project.pdf
[6] D. García-Saiz and M. Zorrilla, "Comparing classification methods for predicting
distance students’ performance," The Journal of Machine Learning Research, 2011.

More Related Content

Similar to A Mini Research

Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
IIRindia
 
Discussion QuestionAre Americans becoming ruder in their inter.docx
Discussion QuestionAre Americans becoming ruder in their inter.docxDiscussion QuestionAre Americans becoming ruder in their inter.docx
Discussion QuestionAre Americans becoming ruder in their inter.docx
elinoraudley582231
 

Similar to A Mini Research (20)

IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine Learning
 
Cs268
Cs268Cs268
Cs268
 
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...
 
Correlation based feature selection (cfs) technique to predict student perfro...
Correlation based feature selection (cfs) technique to predict student perfro...Correlation based feature selection (cfs) technique to predict student perfro...
Correlation based feature selection (cfs) technique to predict student perfro...
 
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...
CORRELATION BASED FEATURE SELECTION (CFS) TECHNIQUE TO PREDICT STUDENT PERFRO...
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for Students
 
ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURI...
ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURI...ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURI...
ASSOCIATION RULE DISCOVERY FOR STUDENT PERFORMANCE PREDICTION USING METAHEURI...
 
Association rule discovery for student performance prediction using metaheuri...
Association rule discovery for student performance prediction using metaheuri...Association rule discovery for student performance prediction using metaheuri...
Association rule discovery for student performance prediction using metaheuri...
 
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODELADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
ADABOOST ENSEMBLE WITH SIMPLE GENETIC ALGORITHM FOR STUDENT PREDICTION MODEL
 
University Recommendation Support System using ML Algorithms
University Recommendation Support System using ML AlgorithmsUniversity Recommendation Support System using ML Algorithms
University Recommendation Support System using ML Algorithms
 
IRJET - Student Pass Percentage Dedection using Ensemble Learninng
IRJET  - Student Pass Percentage Dedection using Ensemble LearninngIRJET  - Student Pass Percentage Dedection using Ensemble Learninng
IRJET - Student Pass Percentage Dedection using Ensemble Learninng
 
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and PredictionUsing ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction
Using ID3 Decision Tree Algorithm to the Student Grade Analysis and Prediction
 
IRJET- Performance for Student Higher Education using Decision Tree to Predic...
IRJET- Performance for Student Higher Education using Decision Tree to Predic...IRJET- Performance for Student Higher Education using Decision Tree to Predic...
IRJET- Performance for Student Higher Education using Decision Tree to Predic...
 
Ijciet 10 02_007
Ijciet 10 02_007Ijciet 10 02_007
Ijciet 10 02_007
 
Data Mining Techniques for School Failure and Dropout System
Data Mining Techniques for School Failure and Dropout SystemData Mining Techniques for School Failure and Dropout System
Data Mining Techniques for School Failure and Dropout System
 
DATA MINING METHODOLOGIES TO STUDY STUDENT'S ACADEMIC PERFORMANCE USING THE...
DATA MINING METHODOLOGIES TO  STUDY STUDENT'S ACADEMIC  PERFORMANCE USING THE...DATA MINING METHODOLOGIES TO  STUDY STUDENT'S ACADEMIC  PERFORMANCE USING THE...
DATA MINING METHODOLOGIES TO STUDY STUDENT'S ACADEMIC PERFORMANCE USING THE...
 
IRJET- Student Performance Analysis System for Higher Secondary Education
IRJET- Student Performance Analysis System for Higher Secondary EducationIRJET- Student Performance Analysis System for Higher Secondary Education
IRJET- Student Performance Analysis System for Higher Secondary Education
 
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
Performance Evaluation of Feature Selection Algorithms in Educational Data Mi...
 
Discussion QuestionAre Americans becoming ruder in their inter.docx
Discussion QuestionAre Americans becoming ruder in their inter.docxDiscussion QuestionAre Americans becoming ruder in their inter.docx
Discussion QuestionAre Americans becoming ruder in their inter.docx
 

More from Amy Cernava

More from Amy Cernava (20)

What Should I Write My College Essay About 15
What Should I Write My College Essay About 15What Should I Write My College Essay About 15
What Should I Write My College Essay About 15
 
A New Breakdown Of. Online assignment writing service.
A New Breakdown Of. Online assignment writing service.A New Breakdown Of. Online assignment writing service.
A New Breakdown Of. Online assignment writing service.
 
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
Evaluative Writing. 6 Ways To Evaluate. Online assignment writing service.
 
General Water. Online assignment writing service.
General Water. Online assignment writing service.General Water. Online assignment writing service.
General Water. Online assignment writing service.
 
Essay Websites Sample Parent Essays For Private Hi
Essay Websites Sample Parent Essays For Private HiEssay Websites Sample Parent Essays For Private Hi
Essay Websites Sample Parent Essays For Private Hi
 
How To Write About Myself Examples - Coverletterpedia
How To Write About Myself Examples - CoverletterpediaHow To Write About Myself Examples - Coverletterpedia
How To Write About Myself Examples - Coverletterpedia
 
Punctuating Titles MLA Printable Classroom Posters Quotations
Punctuating Titles MLA Printable Classroom Posters QuotationsPunctuating Titles MLA Printable Classroom Posters Quotations
Punctuating Titles MLA Printable Classroom Posters Quotations
 
Essay Introductions For Kids. Online assignment writing service.
Essay Introductions For Kids. Online assignment writing service.Essay Introductions For Kids. Online assignment writing service.
Essay Introductions For Kids. Online assignment writing service.
 
Writing Creative Essays - College Homework Help A
Writing Creative Essays - College Homework Help AWriting Creative Essays - College Homework Help A
Writing Creative Essays - College Homework Help A
 
Free Printable Primary Paper Te. Online assignment writing service.
Free Printable Primary Paper Te. Online assignment writing service.Free Printable Primary Paper Te. Online assignment writing service.
Free Printable Primary Paper Te. Online assignment writing service.
 
Diversity Essay Sample Graduate School Which Ca
Diversity Essay Sample Graduate School Which CaDiversity Essay Sample Graduate School Which Ca
Diversity Essay Sample Graduate School Which Ca
 
Large Notepad - Heart Border Writing Paper Print
Large Notepad - Heart Border  Writing Paper PrintLarge Notepad - Heart Border  Writing Paper Print
Large Notepad - Heart Border Writing Paper Print
 
Personal Challenges Essay. Online assignment writing service.
Personal Challenges Essay. Online assignment writing service.Personal Challenges Essay. Online assignment writing service.
Personal Challenges Essay. Online assignment writing service.
 
Buy College Application Essays Dos And Dont
Buy College Application Essays Dos And DontBuy College Application Essays Dos And Dont
Buy College Application Essays Dos And Dont
 
8 Printable Outline Template - SampleTemplatess - Sa
8 Printable Outline Template - SampleTemplatess - Sa8 Printable Outline Template - SampleTemplatess - Sa
8 Printable Outline Template - SampleTemplatess - Sa
 
Analytical Essay Intro Example. Online assignment writing service.
Analytical Essay Intro Example. Online assignment writing service.Analytical Essay Intro Example. Online assignment writing service.
Analytical Essay Intro Example. Online assignment writing service.
 
Types Of Essay And Examples. 4 Major Types O
Types Of Essay And Examples. 4 Major Types OTypes Of Essay And Examples. 4 Major Types O
Types Of Essay And Examples. 4 Major Types O
 
026 Describe Yourself Essay Example Introduce Myself
026 Describe Yourself Essay Example Introduce Myself026 Describe Yourself Essay Example Introduce Myself
026 Describe Yourself Essay Example Introduce Myself
 
Term Paper Introduction Help - How To Write An Intr
Term Paper Introduction Help - How To Write An IntrTerm Paper Introduction Help - How To Write An Intr
Term Paper Introduction Help - How To Write An Intr
 
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...
Analysis Of Students  Critical Thinking Skill Of Middle School Through STEM E...Analysis Of Students  Critical Thinking Skill Of Middle School Through STEM E...
Analysis Of Students Critical Thinking Skill Of Middle School Through STEM E...
 

Recently uploaded

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 

Recently uploaded (20)

Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdfUnit 3 Emotional Intelligence and Spiritual Intelligence.pdf
Unit 3 Emotional Intelligence and Spiritual Intelligence.pdf
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
Sensory_Experience_and_Emotional_Resonance_in_Gabriel_Okaras_The_Piano_and_Th...
 
Wellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptxWellbeing inclusion and digital dystopias.pptx
Wellbeing inclusion and digital dystopias.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
Basic Intentional Injuries Health Education
Basic Intentional Injuries Health EducationBasic Intentional Injuries Health Education
Basic Intentional Injuries Health Education
 
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
80 ĐỀ THI THỬ TUYỂN SINH TIẾNG ANH VÀO 10 SỞ GD – ĐT THÀNH PHỐ HỒ CHÍ MINH NĂ...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Single or Multiple melodic lines structure
Single or Multiple melodic lines structureSingle or Multiple melodic lines structure
Single or Multiple melodic lines structure
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
OSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & SystemsOSCM Unit 2_Operations Processes & Systems
OSCM Unit 2_Operations Processes & Systems
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 

A Mini Research

  • 1. Analysis of Results of Final Semester of B.E. (Civil Engineering), 2014 of Purbanchal University using Weka Tool: A mini Research Raj Kumar Thakur Associate Professor( Computer) Purbanchal University School of Engineering and Technology Biratnagar, Nepal Abstract - Data mining has been used as a very important tool in many areas of research, industry and business. This paper focuses on the application of data mining tool in educational domain to analyze the results of final semester examination to find out causes in the form of precise rules that technically controls final examination results. Once these rules have been found out, management control measures can be developed and implemented to improve examination results. Keywords- Data Mining, Business Intelligence, WEKA, Data Visualization, Classification 1. Introduction With the opening of more universities in Nepal and neighboring countries India and China, admissions in any public universities and educational institutions are likely to face imminent admission in crisis in near future. Nevertheless, number of admissions in B. E. degrees, especially Civil Engineering has been rising consistently over the past few years. A University would always like to see to it that not only quality education is being provided by all colleges run under it but also that passing percentage is also as high as possible specially in case of final semester students. As a great number of students is being admitted in B. E. Civil Engineering Program run in different colleges of Purbanchal University all over Nepal, it is extremely essential that students, especially of the final semester study hard and most hopefully all of them pass their final semester examination. However, final examination results of B.E. programs are not encouraging. 
Data mining is a very powerful tool used for the extraction of hidden predictive information from large databases and has a great potential to help educational institutes focus on the most important information in the data they have generated. Data mining techniques need to be applied to determine precise rules controlling the final examination results. With the help of data mining techniques, such as classification it is possible to discover the key decision rules from the final examination results of students and possibly use those rules to figure out the key course(s) that controls whether the student passes or fails. This paper presents classification based on J48 algorithm as a simple and efficient tool to analyze the final Examination results of B. E. (Civil Engineering) of Purbanchal University, Nepal. 2. METHODOLOGY The study followed the steps suggested by Fayyad, Piatetsky-Shapiro, and Smyth (1996) for the knowledge discovery process: data selection, data pre-processing and cleanup, data transformation, data mining, data interpretation, and the evaluation of results. Among the available data mining techniques, decision tree, J8 algorithm which is an extension of ID3 algorithm creates a small tree representing rules that provides extremely valuable insight as regards the classification and prediction of data. Algorithm used by ID3 is as follows
  • 2. Algorithm function ID3 Input: (R: a set of non-target attributes, C: the target attribute, S: a training set) returns a decision tree; begin If S is empty, return a single node with value failure; If S consists of records all with the same value for the target attribute, return a single leaf node with that value; If R is empty, then return a single node with the value of the most frequent of the values of the target attribute that are found in records of S; [in that case there may be errors, examples that will be improperly classified]; Let A be the attribute with largest Gain(A,S) among attributes in R; Let {Aj | j=1,2, .., m} be the values of attribute A; Let {Sj | j=1,2, .., m} be the subsets of S consisting respectively of records with value aj for A; Return a tree with root labeled A and arcs labeled a1, a2, .., am going respectively to the trees (ID3(R-{A}, C, S1), ID3(R-{A}, C, S2), .....,ID3(R-{A}, C, Sm); Recursively apply ID3 to subsets {Sj | j=1,2, .., m} until they are empty end. 2.1Building the model A.Data Collection MS Excel sheets containing Final Examination results of the Eighth semester of B.E. ( Civil Engineering) of the year2014 of seven colleges under Purbanchal University were obtained from the Examination Office of Purbanchal University. Total number of students contained in the Excel examination result sheet were 474. All subjects taught, internal assessment marks in theory, practical, and final examination and grade obtained in each course, total marks obtained, SGPA and the final result of every student were taken into account. B. Tools Used To apply the classification algorithm, we used WEKA toolkit , a widely used software for data mining that was developed at the University of Waikato in New Zealand. This
  • 3. toolkit provides a wide range of data mining algorithms implemented in Java. It has been widely used in educational data mining research and for teaching purposes.

C. Data Preparation and Pre-Processing

During this phase, the collected data was pre-processed to prepare it for mining. First, irrelevant attributes such as S. No. and Exam Roll No. were eliminated. The Excel sheets were then converted into ARFF files so that they could be analyzed with the WEKA tool.

E. Classification

In this research, the aim was to find the causes and the decision rules that determined whether a student passed or failed. Classification was used because the objective of classification techniques in educational data mining is to identify the important factors that contribute to categorizing students' results. Decision trees are the most popular classification technique in data mining. They represent a group of classification rules in tree form, and they have several advantages over other techniques, as stated in [1]: their simple presentation makes them easy to understand; they can work with different types of attributes, nominal or numerical; and they can classify new examples fast. One of the most widely used decision tree algorithms is C4.5, developed by Ross Quinlan [2]. The basic idea of this algorithm is to build trees from a set of training data using the concept of information entropy [3]. J48 is an open source Java implementation of the C4.5 algorithm in WEKA. We chose this algorithm because of its demonstrated ability to handle educational datasets and to provide high accuracy, as reported in [4]-[6].

F. Results and Discussion
Classification using the J48 tree algorithm was applied to the examination results of one of the colleges, referred to here as the first dataset. The dataset is shown as an MS Excel sheet in Fig. 1 and as an ARFF file in Fig. 2, and the results of the analysis are shown in Fig. 3. Here, the examination results of the 76 students of this college were initially used as the training data.
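The conversion of a result sheet into ARFF described above can be sketched in a few lines of Python. This is only an illustration of the ARFF layout (a relation name, attribute declarations, then comma-separated data rows), not the procedure actually used in the study. The helper `to_arff`, the relation name, and the three sample rows are hypothetical; the attribute names and grade values are taken from the WEKA output reported in this paper.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes: list of (name, spec) pairs, where spec is either the string
    "numeric" or a list of nominal values.
    rows: list of lists; a value of None is written as ARFF's missing marker "?".
    """
    lines = [f"@relation {relation}", ""]
    for name, spec in attributes:
        if spec == "numeric":
            lines.append(f"@attribute {name} numeric")
        else:
            lines.append(f"@attribute {name} {{{','.join(spec)}}}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if value is None else str(value) for value in row))
    return "\n".join(lines) + "\n"


# Attribute names follow the WEKA output in the paper; the rows are made up.
ATTRIBUTES = [
    ("HYDROPOWER-ENGINEERING-GRADE", ["A", "B", "C", "D", "F", "I"]),
    ("HYDROPOWER-ENGINEERING-FINAL-TH", "numeric"),
    ("RESULT", ["Passed", "Failed"]),
]
SAMPLE_ROWS = [
    ["B", 45, "Passed"],
    ["F", 18, "Failed"],
    ["F", 20, None],  # result left blank, to be predicted later
]

example_arff = to_arff("first_dataset", ATTRIBUTES, SAMPLE_ROWS)
print(example_arff)
```

Irrelevant columns such as S. No. and Exam Roll No. would simply be left out of the attribute list before the file is written.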
  • 4. Fig. 1: First dataset corresponding to a college
  • 5. Fig. 2: ARFF of the first dataset
  • 6. Fig. 3: Analysis output of the first dataset
  • 7. The classification result shows that, out of a total of 76 data instances (each instance holding the marks obtained in all subjects by one student, along with the grades obtained and the final result), the J48 algorithm classified 75 instances correctly. That is, only one data instance was incorrectly classified. The tree produced by the WEKA Classifier Tree Visualizer is shown in Fig. 4. A total of 12 students who obtained grade B in the course Hydropower Engineering passed the examination, and a total of 63 students who obtained grade F in the same course failed. The leaf annotation (13.0/1.0) in the output below means that 13 instances reached the grade B leaf and one of them, a student who failed, was misclassified as Passed. So, for this college, where the results of 76 students were analyzed, the course that controlled the results is Hydropower Engineering. The decision rules that controlled the attribute RESULT are given below.

=== Classifier model (full training set) ===

J48 pruned tree
------------------

HYDROPOWER-ENGINEERING-GRADE = A: Failed (0.0)
HYDROPOWER-ENGINEERING-GRADE = B: Passed (13.0/1.0)
HYDROPOWER-ENGINEERING-GRADE = C: Failed (0.0)
HYDROPOWER-ENGINEERING-GRADE = D: Failed (0.0)
HYDROPOWER-ENGINEERING-GRADE = F: Failed (63.0)
HYDROPOWER-ENGINEERING-GRADE = I: Failed (0.0)
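To see why a single attribute can dominate the tree, the entropy-based criterion behind ID3 can be worked through on the counts reported above. The following is a minimal sketch, assuming the class counts implied by the tree output: 12 passed and 64 failed overall, split into a grade B branch (12 passed, 1 failed) and a grade F branch (0 passed, 63 failed). It computes plain information gain as used by ID3; C4.5 and J48 normalize this quantity into a gain ratio, which is not shown here. The function names are illustrative.

```python
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class distribution given as counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    total = sum(parent_counts)
    remainder = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - remainder


# (passed, failed) counts derived from the J48 output for the first dataset.
# Branches with no instances (grades A, C, D, I) contribute nothing and are omitted.
parent = (12, 64)
branches = [(12, 1), (0, 63)]

gain = information_gain(parent, branches)
accuracy = 75 / 76  # correctly classified instances / all instances

print(f"entropy before split: {entropy(parent):.3f} bits")
print(f"information gain:     {gain:.3f} bits")
print(f"training accuracy:    {accuracy:.2%}")
```

Splitting on the Hydropower Engineering grade removes most of the uncertainty about the result, which is consistent with the tree consisting of this one attribute.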
  • 8. Fig. 4: J48 tree visualized for the first dataset

The model was then applied to sample data that was identical to the training data, except that the result values were left blank and represented in the ARFF file as ?. The same algorithm reproduced exactly the same examination results. The predicted result is shown in Fig. 5.
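The prediction step described above amounts to applying the learned rule to records whose RESULT is unknown. As a rough sketch, the rule read off the J48 tree for the first dataset (grade B in Hydropower Engineering leads to Passed, every other grade to Failed) can be written directly in Python. The function names and the sample records are hypothetical; in the study itself the prediction was carried out by WEKA.

```python
def predict_result(hydropower_grade):
    """Rule read off the J48 tree for the first dataset: only grade B maps to Passed."""
    return "Passed" if hydropower_grade == "B" else "Failed"


def fill_missing_results(records):
    """Return copies of the records with RESULT '?' replaced by the rule's prediction."""
    filled = []
    for record in records:
        record = dict(record)  # copy, so the input records are left untouched
        if record["RESULT"] == "?":
            record["RESULT"] = predict_result(record["HYDROPOWER-ENGINEERING-GRADE"])
        filled.append(record)
    return filled


# Made-up records with the result left blank, as in the sample ARFF file.
unlabelled = [
    {"HYDROPOWER-ENGINEERING-GRADE": "B", "RESULT": "?"},
    {"HYDROPOWER-ENGINEERING-GRADE": "F", "RESULT": "?"},
]
predicted = fill_missing_results(unlabelled)
print([record["RESULT"] for record in predicted])
```

Because the rule misclassifies one grade B student in the training data, predictions made this way would match the true results for 75 of the 76 students.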
  • 9. Fig. 5: Predicted result of the first dataset in ARFF

Classification using the same J48 algorithm on the datasets of five other colleges also showed that the course Hydropower Engineering controlled the results of students. However, for one of the colleges, classification using the same J48 tree algorithm gave the results shown in Fig. 7.
  • 10. Fig. 7: Analysis output of the dataset corresponding to the seventh college

In the classification of the Eighth semester final examination results of the seventh college, the results of all 75 students were correctly classified; that is, the classification accuracy was 100%. The visualized tree for the seventh college is shown in Fig. 8.
  • 11. Fig. 8: J48 tree visualized for the dataset corresponding to the seventh college

The rules that controlled classification for the seventh college are:

=== Classifier model (full training set) ===

J48 pruned tree
------------------

HYDROPOWER-ENGINEERING-FINAL-TH <= 27: Failed (16.0)
HYDROPOWER-ENGINEERING-FINAL-TH > 27
|   CONSTRUCTION-MANAGEMENT-TOTAL <= 42: Failed (3.0)
|   CONSTRUCTION-MANAGEMENT-TOTAL > 42: Passed (56.0)

This decision tree shows that students who scored 27 marks or fewer in the final (theory) examination of the course Hydropower Engineering failed; there were 16 of them. Students who scored more than 27 marks in the final (theory) examination of Hydropower Engineering but 42 marks or fewer in total for the course Construction Management also failed; there were three of them in this class. And
  • 12. those students who scored more than 27 marks in the final (theory) examination of the course Hydropower Engineering and more than 42 marks in total for the course Construction Management passed the Eighth semester examination; there were 56 of them in this class.

4. Conclusions

In this paper, a mini research study was conducted using a data mining technique to investigate what causes the majority of students to fail their Eighth semester examination. The analysis showed that mainly one course, Hydropower Engineering, controlled the results of students. This shows the potential of data mining in higher education, where it was used here to identify the courses that cause students to fail. Once this is known, the college management can take appropriate measures to guide students and help them improve in the course Hydropower Engineering.

5. References

[1] W. Hämäläinen and M. Vinni, "Classifiers for educational data mining," Handbook of Educational Data Mining, 2010.
[2] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan
[3] D. Kabakchieva, "Predicting student performance by using data mining methods for classification," Cybernetics and Information Technologies, vol. 13, 2013.
[4] Q. A. Al-Radaideh, A. A. Ananbeh, and E. M. Al-Shawakfa, "A classification model for predicting the suitable study track for school students," International Journal of Research and Reviews in Applied Sciences, vol. 8, 2001.
[5] A. Nandeshwar and S. Chaudhari. (2009). Enrollment Prediction Models Using Data Mining. [Online]. Available: http://nandeshwar.info/wp-content/uploads/2008/11/DMWVU_Project.pdf
[6] D. García-Saiz and M. Zorrilla, "Comparing classification methods for predicting distance students' performance," The Journal of Machine Learning Research, 2011.