Educational Data
Mining & Students’
Performance Prediction
By Amjad Abu Saa, Information Technology Department, Ajman University of Science and
Technology, Ajman, United Arab Emirates
Paper Review by: Karishma Kuria
What is Educational Data Mining
 EDM is an emerging discipline of data mining that deals with finding useful patterns and
discovering insights in the wide variety of data held by educational information systems.
 The data is accumulated from various educational sources such as admissions systems, registration
systems, and course management systems.
 It focuses on developing new tools and techniques for discovering data insights.
 EDM applies techniques from statistics, data mining algorithms, machine learning and AI to analyze
the data accumulated during teaching and learning.
Goals of EDM
 To predict students' performance in future semesters and help educational institutes
and students improve that performance.
 To help management improve their teaching practices and financial activities to yield good results.
 To identify students' choices and help them understand which areas they need to focus on to
improve their grades. This saves valuable time for both students and authorities.
 To understand the factors that influence student performance and hence determine
the pedagogical support that learning software can provide.
Classification
 One of the best-known, simplest, and most widely used data mining techniques.
 It is a supervised learning technique: target values are provided with the input data.
 In this technique we predict the class for each data element from the set of predefined class labels.
 Various classification techniques are used in data mining, for instance Neural Networks, Naïve
Bayes, K-Nearest Neighbour, and Decision Trees.
 In this paper, four decision tree algorithms and the Naïve Bayes algorithm are used to design prediction models.
Data Mining Process
Dataset:
 A survey was conducted anonymously and without bias on a group of students from Ajman University of Science and
Technology (AUST), Ajman, United Arab Emirates.
 The questionnaire consisted of several personal, academic, and social questions about factors that may influence
student performance.
 The data was then preprocessed and transformed to make it suitable for data mining techniques.
 Grade Point Average (GPA) is used here to measure student performance with 4.0 being the
maximum.
 Dataset consists of several attributes such as Student’s gender, Nationality, First Language, High
School percentage, Student Discount, Transportation etc.
Visualization of Data
Data Mining Implementation
& Results
What is Decision Tree?
A decision tree is a supervised data
mining technique used to build
classification and predictive models. As
the name suggests, it creates a top-down,
tree-structured model from the
dataset attributes: a root node,
interior nodes that each test an attribute,
and leaf nodes that assign a class,
connected by edges.
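The structure described above can be sketched in a few lines of Python. This is only an illustration: the class names `Leaf` and `Interior`, the `predict` helper, and the hand-built tree with its attribute values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """Leaf node: holds the predicted class label."""
    label: str


@dataclass
class Interior:
    """Interior node: tests one attribute, with one outgoing edge per value."""
    attribute: str
    children: Dict[str, Union["Interior", Leaf]] = field(default_factory=dict)


def predict(node, record):
    """Walk from the root to a leaf, following the edge that matches the record."""
    while isinstance(node, Interior):
        node = node.children[record[node.attribute]]
    return node.label


# Hypothetical hand-built tree; attributes, values and labels are illustrative only.
root = Interior("discount", {
    "yes": Leaf("Very Good"),
    "no": Interior("transportation", {
        "car": Leaf("Good"),
        "bus": Leaf("Pass"),
    }),
})

print(predict(root, {"discount": "no", "transportation": "bus"}))
```

A real algorithm such as C4.5 or CART learns this structure from the data instead of having it written by hand.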
Data Mining Implementation & Results
In this paper, the following four decision tree algorithms are used to design
prediction models.
 C4.5 Decision Tree
 ID3 Decision Tree
 CART Decision Tree
 CHAID Decision Tree
C4.5 Decision
Tree
Settings used:
Ø Splitting criterion = information gain ratio
Ø Minimal size of split = 4
Ø Minimal leaf size = 1
Ø Minimal gain = 0.1
Ø Maximal depth = 20
Ø Confidence = 0.5
From the confusion matrix above, we can see that the algorithm correctly predicted the class of 95 of the 270 objects,
giving an accuracy of 35.19%.
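The splitting criterion listed above, information gain ratio, can be illustrated with a minimal Python sketch. The function names and the toy attribute values are assumptions made for illustration; the paper's own tool and data are not reproduced here.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting on an attribute, divided by its split information."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): one attribute separates the classes perfectly, the other not at all.
labels = ["Good", "Good", "Pass", "Pass"]
informative = ["yes", "yes", "no", "no"]
uninformative = ["a", "b", "a", "b"]

print(gain_ratio(informative, labels))
print(gain_ratio(uninformative, labels))
```

A setting such as "Minimal gain = 0.1" would plausibly act as a threshold on a quantity like this before a split is accepted; that reading is an assumption, not something the slides state.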
ID3 Decision
Tree
Settings used:
Ø Splitting criterion = information gain ratio
Ø Minimal size of split = 4
Ø Minimal leaf size = 1
Ø Minimal gain = 0.1
From the confusion matrix above, we can see that the algorithm correctly predicted the class of 90 of the 270 objects,
giving an accuracy of 33.33%.
Classification
and Regression
Tree
Settings used:
Ø Minimal leaf size = 1
Ø Number of folds used in minimal cost-complexity pruning = 5
From the confusion matrix above, we can see that the algorithm correctly predicted the class of 108 of the 270 objects,
giving an accuracy of 40%.
CHi-squared Automatic
Interaction Detection Tree
Settings used:
Ø Minimal size of split = 4
Ø Minimal leaf size = 2
Ø Minimal gain = 0.1
Ø Maximal depth = 20
Ø Confidence = 0.5
From the confusion matrix above, we can see that the algorithm correctly predicted the class of 92 of the 270 objects,
giving an accuracy of 34.07%.
Analysis and Summary
Ø CART outperformed all the other
algorithms with an accuracy of 40%,
which is significantly higher than the
expected accuracy.
Ø C4.5 and CHAID were next, with 35.19%
and 34.07% respectively.
Ø The least accurate was ID3 with
33.33%.
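The accuracies above follow directly from the counts reported on the result slides, reading "predicted class of N objects" as N correct predictions out of 270. A short Python check of that arithmetic (the variable names are illustrative):

```python
TOTAL = 270  # number of objects reported on each result slide

# Correctly classified objects per algorithm, as reported on the slides.
correct = {"C4.5": 95, "ID3": 90, "CART": 108, "CHAID": 92}

# Accuracy = correct predictions / total objects, as a percentage.
accuracy = {name: 100 * n / TOTAL for name, n in correct.items()}

for name, acc in sorted(accuracy.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.2f}%")
```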
Analysis and Summary
Ø Most of the algorithms struggled to distinguish objects of similar classes.
Ø For instance, in the class Pass of the CHAID matrix, 25 of the 61 objects were
classified as Very Good, which falls in the upper two nearest grade classes,
and 23 were classified as Pass, which falls in the lower grade
class.
Ø This suggests that the discretization of the class attribute did not separate the classes well
enough to capture the differences in the other attributes.
Ø This confuses the model and degrades its accuracy and performance.
Naïve Bayes
Classification
Ø It is a probabilistic machine
learning model used in
classification tasks.
Ø It assumes that the attributes
in the dataset are independent
of one another given the class.
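The independence assumption can be made concrete with a small categorical Naïve Bayes sketch in Python. Everything here is illustrative: the function names, the add-one (Laplace) smoothing, and the four toy records are assumptions, not the paper's implementation or data.

```python
from collections import Counter, defaultdict


def train(records, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute) -> Counter of values
    domains = defaultdict(set)           # attribute -> set of values seen
    for record, label in zip(records, labels):
        for attr, value in record.items():
            value_counts[(label, attr)][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict(model, record):
    """Return the class with the highest unnormalised posterior score."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for attr, value in record.items():
            # Independence assumption: multiply the per-attribute likelihoods.
            # Add-one smoothing (an assumption here) avoids zero probabilities.
            count = value_counts[(label, attr)][value]
            score *= (count + 1) / (n + len(domains[attr]))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy records; values and labels are made up for illustration.
records = [
    {"discount": "yes", "mocs": "service"},
    {"discount": "yes", "mocs": "housewife"},
    {"discount": "no", "mocs": "housewife"},
    {"discount": "no", "mocs": "retired"},
]
labels = ["Very Good", "Very Good", "Pass", "Pass"]

model = train(records, labels)
print(predict(model, {"discount": "yes", "mocs": "service"}))
```

The per-attribute ratios computed inside `predict` are the kind of conditional probabilities that a Naïve Bayes model exposes for inspection.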
Interesting Probabilities
Ø MOCS = Service: Interestingly, when the mother's occupation status is Service, students appear
to get higher grades.
Ø DISCOUNT: Students with higher grades tend to receive discounts from the university more often than
students with lower grades.
CHi-squared Automatic
Interaction
Detection Tree
From the above matrix we can see that, out of 270 objects, this algorithm predicted the class of 95 objects and
reported an accuracy of 36.40%.
Conclusion
Following the data mining pipeline, these steps were performed:
 Data collection: survey of students
 Data preprocessing and transformation
 Data mining: several decision tree algorithms and the Naïve Bayes algorithm were applied
 Insights were drawn and performance was predicted
 The study suggests that student performance does not depend entirely
on previous grades; other social and personal factors also influence
performance.
Thank You

Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 

Educational Data Mining Predicts Student Performance

  • 1. Educational Data Mining & Students’ Performance Prediction. By Amjad Abu Saa, Information Technology Department, Ajman University of Science and Technology, Ajman, United Arab Emirates. Paper review by Karishma Kuria.
  • 2. What is Educational Data Mining? EDM is an emerging discipline of data mining that finds useful patterns and insights in data from a wide variety of educational information systems. The data is accumulated from educational sources such as admissions systems, registration systems and course management systems. EDM focuses on developing new tools and techniques for discovering insights in this data. It applies techniques from statistics, data mining, machine learning and AI to analyze the data accumulated during teaching and learning.
  • 3. Goals of EDM: To predict students’ performance in future semesters and help educational institutes and students improve that performance. To help management improve teaching practices and financial activities. To identify students’ choices and show them which areas to focus on to improve their grades, which saves time for both students and authorities. To understand the factors that influence student performance and so determine the pedagogical support that learning software can provide.
  • 4. Classification: One of the best-known, simplest and most widely used data mining techniques. It is a supervised learning technique, in which target values are provided with the input data. The class of each data element is predicted from a set of predefined class labels. Classification techniques used in data mining include Neural Networks, Naïve Bayes, K-Nearest Neighbour and Decision Trees. In this paper, four decision tree algorithms and the Naïve Bayes algorithm are used to build prediction models.
  • 5. Data Mining Process. Dataset: An anonymous survey was conducted on a group of students from Ajman University of Science and Technology (AUST), Ajman, United Arab Emirates. The questionnaire consisted of personal, academic and social questions about factors that influence student performance. The data was then preprocessed and transformed into a form suitable for data mining techniques. Grade Point Average (GPA), with a maximum of 4.0, is used to measure student performance. The dataset includes attributes such as the student’s gender, nationality, first language, high school percentage, student discount and transportation.
  • 8. Data Mining Implementation & Results. What is a Decision Tree? A decision tree is a supervised data mining technique used to build classification and predictive models. As the name suggests, it builds a top-down, tree-structured model from the attributes of the incoming dataset: a root node, interior nodes and leaf nodes, connected by incoming and outgoing edges.
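The tree structure described on this slide can be sketched in a few lines of Python. This is only an illustration, not the model built in the paper: the `Leaf` and `Node` classes, the `predict` helper, the attribute values ("yes", "no", "car", "bus") and the shape of the toy tree are all assumptions made for the example. Only the attribute names (student discount, transportation) and the grade labels come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str  # class predicted when a record reaches this leaf


@dataclass
class Node:
    attribute: str  # attribute tested at this root or interior node
    # Each outgoing edge is keyed by an attribute value and leads to a subtree.
    children: Dict[str, "Tree"] = field(default_factory=dict)


Tree = Union[Leaf, Node]


def predict(tree: Tree, record: Dict[str, str]) -> str:
    """Walk from the root to a leaf, following the edge that matches the record."""
    while isinstance(tree, Node):
        tree = tree.children[record[tree.attribute]]
    return tree.label


# Hypothetical toy tree; the splits are invented for illustration only.
toy_tree = Node("discount", {
    "yes": Leaf("Very Good"),
    "no": Node("transportation", {
        "car": Leaf("Very Good"),
        "bus": Leaf("Pass"),
    }),
})

print(predict(toy_tree, {"discount": "no", "transportation": "bus"}))
```

Prediction follows exactly one path from the root, which is why a tree like this is easy to read as a set of if-then rules.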
  • 9. Data Mining Implementation & Results. In this paper, the following four decision tree algorithms are used to build prediction models: C4.5 Decision Tree, ID3 Decision Tree, CART Decision Tree and CHAID Decision Tree.
  • 10. C4.5 Decision Tree. Settings used: splitting criterion = information gain ratio; minimal size of split = 4; minimal leaf size = 1; minimal gain = 0.1; maximal depth = 20; confidence = 0.5. The confusion matrix on the slide shows that, out of 270 objects, the algorithm correctly predicted the class of 95, an accuracy of 35.19%.
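The splitting criterion named in these settings, information gain ratio, can be sketched as follows. This is a minimal textbook-style version (information gain divided by the split information of the attribute), not the tool implementation used in the paper; the function names and the toy inputs are assumptions.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of splitting `labels` by `values`, divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy attribute that separates two classes perfectly.
print(gain_ratio(["yes", "yes", "no", "no"], ["A", "A", "B", "B"]))
```

A setting such as minimal gain = 0.1 would then act as a threshold: a candidate split whose score falls below it is not made.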
  • 11. ID3 Decision Tree. Settings used: splitting criterion = information gain ratio; minimal size of split = 4; minimal leaf size = 1; minimal gain = 0.1. The confusion matrix on the slide shows that, out of 270 objects, the algorithm correctly predicted the class of 90, an accuracy of 33.33%.
  • 12. Classification and Regression Tree (CART). Settings used: minimal leaf size = 1; number of folds used in minimal cost-complexity pruning = 5. The confusion matrix on the slide shows that, out of 270 objects, the algorithm correctly predicted the class of 108, an accuracy of 40%.
  • 13. Chi-squared Automatic Interaction Detection (CHAID) Tree. Settings used: minimal size of split = 4; minimal leaf size = 2; minimal gain = 0.1; maximal depth = 20; confidence = 0.5. The confusion matrix on the slide shows that, out of 270 objects, the algorithm correctly predicted the class of 92, an accuracy of 34.07%.
  • 14. Analysis and Summary. CART outperformed all the other algorithms with an accuracy of 40%, which is significantly higher than the expected accuracy. C4.5 and CHAID were next with 35.19% and 34.07% respectively. The least accurate was ID3 with 33.33%.
  • 15. Analysis and Summary. Most of the algorithms struggled to distinguish objects of similar classes. For instance, in the Pass class of the CHAID matrix, 25 out of 61 objects are classified as Very Good, which is among the two nearest upper grade classes, and 23 are classified as Pass, which is a lower grade class. This suggests that the discretization of the class attribute was not distinct enough to capture the differences in the other attributes. This confuses the model and lowers its accuracy and performance.
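The accuracy figures quoted for the four decision trees follow directly from the counts on the slides: correctly predicted objects divided by the 270 objects in total. A short check, where the variable and function names are assumptions made for the example:

```python
TOTAL_OBJECTS = 270

# Correctly predicted objects per algorithm, as reported on the slides.
correct = {"C4.5": 95, "ID3": 90, "CART": 108, "CHAID": 92}


def accuracy_percent(n_correct, n_total):
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * n_correct / n_total, 2)


accuracies = {name: accuracy_percent(n, TOTAL_OBJECTS) for name, n in correct.items()}
best = max(accuracies, key=accuracies.get)

for name, acc in sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc}%")
print("Most accurate:", best)
```

Sorting by accuracy reproduces the ranking in the summary: CART first, then C4.5, CHAID and ID3.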
  • 16. Naïve Bayes Classification. It is a probabilistic machine learning model used in classification tasks. It assumes that there are no dependencies between the attributes in the dataset. MOCS = Service: interestingly, when the mother’s occupation status is Service, students appear to get higher grades. DISCOUNT: students with higher grades tend to receive discounts from the university more often than students with low grades.
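The independence assumption above is what makes Naïve Bayes simple to compute: the score of a class is its prior multiplied by one conditional probability per attribute. The sketch below is a minimal categorical version, not the paper's implementation. The function names, the toy records and the add-one (Laplace) smoothing are assumptions; only the attribute names MOCS and DISCOUNT and the grade labels come from the slides.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> set of observed values
    for record, label in zip(records, labels):
        for attr, value in record.items():
            value_counts[(attr, label)][value] += 1
            domains[attr].add(value)
    return class_counts, value_counts, domains


def predict_naive_bayes(model, record):
    """Return the class with the highest log-posterior under attribute independence."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)  # log prior
        for attr, value in record.items():
            count = value_counts[(attr, label)][value]
            # Add-one smoothing (an assumption) avoids zero probabilities.
            score += math.log((count + 1) / (n_label + len(domains[attr])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for illustration only.
toy_records = [
    {"MOCS": "Service", "DISCOUNT": "yes"},
    {"MOCS": "Service", "DISCOUNT": "yes"},
    {"MOCS": "Other", "DISCOUNT": "no"},
    {"MOCS": "Other", "DISCOUNT": "no"},
]
toy_labels = ["Very Good", "Very Good", "Pass", "Pass"]

model = train_naive_bayes(toy_records, toy_labels)
print(predict_naive_bayes(model, {"MOCS": "Service", "DISCOUNT": "yes"}))
```

Working in log space keeps the product of many small probabilities from underflowing, which matters once the number of attributes grows.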
  • 18. Chi-squared Automatic Interaction Detection Tree. The slide reports that, out of 270 objects, the algorithm predicted the class of 95 objects, with an accuracy value of 36.40%.
  • 19. Conclusion. Following the data mining pipeline, these steps were performed. Data collection: a survey of students. Data preprocessing and transformation. Data mining tasks: several decision tree algorithms and the Naïve Bayes algorithm were applied. Insights were drawn and performance was predicted. The study suggests that student performance does not depend entirely on previous grades; other social and personal factors also influence it.