Using the Students Performance in Exams dataset, we try to understand what affects exam scores. The data is limited, but it lends itself to visualizations that reveal relationships between the variables. We first explore the data and then apply the Naive Bayes classification technique for evaluation.
Student Performance Data Mining Project Report
PROJECT REPORT
STUDENT PERFORMANCE
(DATA MINING)
BS (SE) 2017
GROUP MEMBER(S):
NAME: HAFSA HABIB 2017/COMP/BS(SE)-21597
NAME: MUNIBA JAVIAD 2017/COMP/BS(SE)-21621
SUPERVISOR:
MISS SADIA JAVED
29TH APRIL, 2019
DEPARTMENT OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY
JINNAH UNIVERSITY FOR WOMEN
5-C NAZIMABAD, KARACHI 74600
Table of Contents
1. Introduction
2. Description of the problem and problem domain
3. Description of implemented data mining techniques/methods
3.1. Naïve Bayes Classifier
4. Data Set
4.1. Exploring the Data Set
4.1.1. General Distribution of Exam Scores
4.1.2. Exam scores based on the gender
4.1.3. Exam scores based on the Parent Level of Education
4.1.4. Exam scores based on the Lunch Type
4.1.5. Exam scores based on the Test Preparation Course
5. Implementation
5.1. Operators
6. Results and evaluation/discussion of the results
7. Future directions/ideas how to extend and enhance the technique
8. Conclusion
9. References
1. Introduction
Using the Students Performance in Exams dataset, we try to understand what affects
exam scores. The data is limited, but it lends itself to visualizations that reveal relationships
between the variables. We first explore the data and then apply the Naive Bayes classification
technique for evaluation.
2. Description of the problem and problem domain
To understand the influence of the parents’ background, test preparation, and other factors on
students’ performance.
Objectives
Check the dataset and tidy the data if needed.
Visualize the data to understand the effects of different factors on student performance.
Check the effectiveness of the test preparation course.
Identify the major factors influencing the test scores.
3. Description of implemented data mining techniques/methods
3.1. Naïve Bayes Classifier
Bayesian classifiers are statistical classifiers that predict class membership probabilities, such
as the probability that a given sample belongs to a particular class. Naive Bayes algorithms
assume that the effect of an attribute on a given class is independent of the values of the other
attributes. In practice, however, dependencies often exist among attributes; Bayesian
networks, which are graphical models that can describe joint conditional probability
distributions, are designed for that case.
Bayesian classifiers are popular classification algorithms because of their simplicity, their
computational efficiency, and their good performance on many real-world problems. The models are
fast to train and to evaluate, and they reach high accuracy in many domains.
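The independence assumption described above can be sketched in a few lines of Python. This is a minimal illustration written with the standard library only, not the implementation used in this project; the function names, the add-one (Laplace) smoothing, and the toy rows are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute, label)][value] -> number of training rows
    value_counts = defaultdict(Counter)
    vocab = defaultdict(set)  # distinct values seen for each attribute
    for row, label in zip(rows, labels):
        for attr, value in row.items():
            value_counts[(attr, label)][value] += 1
            vocab[attr].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the class with the highest posterior, computed in log space."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(class)
        for attr, value in row.items():
            # Naive assumption: each attribute contributes independently.
            # Add-one smoothing keeps unseen combinations from scoring zero.
            numerator = value_counts[(attr, label)][value] + 1
            denominator = count + len(vocab[attr])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy rows, not taken from the real dataset.
train_rows = [
    {"lunch": "standard", "prep": "completed"},
    {"lunch": "standard", "prep": "completed"},
    {"lunch": "standard", "prep": "none"},
    {"lunch": "free/reduced", "prep": "completed"},
    {"lunch": "free/reduced", "prep": "none"},
    {"lunch": "free/reduced", "prep": "none"},
]
train_labels = ["Good", "Good", "Good", "Good", "Bad", "Bad"]

model = train_naive_bayes(train_rows, train_labels)
print(predict(model, {"lunch": "standard", "prep": "completed"}))
print(predict(model, {"lunch": "free/reduced", "prep": "none"}))
```

Working in log space avoids numerical underflow when many attribute probabilities are multiplied together.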
4. Data Set
Gender: Gender of the student (i.e. Male, Female)
Ethnicity: Ethnicity to which the student belongs (i.e. group A, B, C, D, E)
Parent level of education: Education level of the student’s parents/guardian (i.e. high school, bachelor’s degree, master’s degree, some college, associate’s degree)
Lunch: Standard of the lunch provided to the student in school (i.e. standard, free/reduced)
Test preparation course: Whether the student took the preparation course (i.e. none, completed)
Math score: Mathematics score of the student (from 0 to 100)
Reading score: Reading score of the student (from 0 to 100)
Writing score: Writing score of the student (from 0 to 100)
Student Performance: Overall performance of the student (i.e. Good, Average, Bad, Worst)
4.1. Exploring the Data Set
First, we import the dataset and display its first few rows.
4.1.1. General Distribution of Exam Scores
There are 5 features which might affect the scores of each exam. The first thing to analyze is
how the scores are distributed within each exam (Math, Reading, and Writing). We
plot histograms to see whether there are any differences in the score distributions.
The scores in each exam are approximately Gaussian. It is hard to draw any conclusion from the
histograms: they all look very similar, and we do not have enough data for the plots to look
smoother.
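As a rough illustration of how such a histogram is built, the sketch below bins scores into ranges of 10 points and prints a text histogram. The function name `histogram` and the sample scores are made up for this example and are not taken from the dataset.

```python
from collections import Counter

def histogram(scores, bin_width=10, max_score=100):
    """Count scores per bin; the maximum score falls into the last bin."""
    n_bins = max_score // bin_width
    counts = Counter(min(s // bin_width, n_bins - 1) for s in scores)
    return [counts.get(b, 0) for b in range(n_bins)]

# Made-up scores for illustration only; not taken from the dataset.
sample_math = [35, 47, 52, 58, 61, 64, 66, 69, 71, 73, 75, 78, 82, 88, 100]

bins = histogram(sample_math)
for b, count in enumerate(bins):
    low = b * 10
    high = low + 9 if b < len(bins) - 1 else 100
    print(f"{low:3d}-{high:3d} | {'#' * count}")
```

Running the same function on the Math, Reading and Writing columns would give three histograms that can be compared side by side.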
4.1.2. Exam scores based on the gender
Graphical representation of the exam scores by gender (i.e. Male, Female).
4.1.3. Exam scores based on the Parent Level of Education
We display the mean values as a table or a heat map.
It seems that a lower parental level of education has a negative impact on the exam scores.
Children whose parents' highest education level was college or high school have noticeably
lower exam scores than their peers. Conversely, parents with a master's or bachelor's degree have
children who score much better in the exams.
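The table of mean values mentioned above amounts to grouping the rows by a categorical feature and averaging each score. A minimal sketch follows; the helper `mean_scores_by`, the shortened field names and the four rows are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict

def mean_scores_by(rows, key, score_fields):
    """Return {category: {score_field: mean}} for the given grouping key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        category: {
            field: sum(r[field] for r in members) / len(members)
            for field in score_fields
        }
        for category, members in groups.items()
    }

# Made-up rows for illustration only; not taken from the dataset.
rows = [
    {"parent_education": "high school", "math": 60, "reading": 64, "writing": 62},
    {"parent_education": "high school", "math": 64, "reading": 66, "writing": 64},
    {"parent_education": "master's degree", "math": 70, "reading": 76, "writing": 75},
    {"parent_education": "master's degree", "math": 72, "reading": 74, "writing": 77},
]

table = mean_scores_by(rows, "parent_education", ["math", "reading", "writing"])
for category, means in table.items():
    print(category, means)
```

The same helper could be applied with gender, lunch type or test preparation course as the grouping key.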
4.1.4. Exam scores based on the Lunch Type
It might be amusing to think that the type of lunch students have is correlated with their exam scores.
However, we can see from the dataset that there are two types of
lunch: standard and free/reduced. So it reflects the parents' financial situation rather than
the type of dish. There might be some correlation here, so let's try to visualize the problem.
According to the visualization, there is a large disparity between students who have
a free/reduced lunch and those who have a standard lunch.
4.1.5. Exam scores based on the Test Preparation Course
The last thing we explore in this dataset is how the completion of the test preparation
course affects the exam scores, using a heat map. This variable has only two
categories: none and completed.
5. Implementation
This dataset is clean and free of unwanted data, so we do not have to go through the process of
cleaning it. We apply the Naïve Bayes classification technique to our Student Performance data
set. The Naïve Bayes classifier is a well-known approach to supervised learning: a model is
trained on labelled training data and then used to classify test data. There are 8 features and
1 label, named Student Performance.
Student Performance is the class label that needs to be predicted. As testing data is not
provided separately, we split this dataset into training and testing parts using a ratio of 70:30.
We then train the Naïve Bayes model on 70% of the dataset and classify the remaining 30% of the
data. After that, we measure the performance parameters, i.e. accuracy, precision and recall, to
show how accurate the model is on the dataset.
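The 70:30 split and the train/classify steps described above can be sketched in plain Python. This is not the process used in this report: it is a simplified Gaussian Naive Bayes on three numeric scores only, run on synthetic data. The function names, the random seeds and the thresholds that turn an average score into a label are all assumptions made for this example.

```python
import math
import random

def split_data(rows, train_ratio=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and testing parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def train_gaussian_nb(rows):
    """Estimate the class prior and a per-feature mean and variance for each class."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    model = {}
    for label, samples in by_label.items():
        stats = []
        for column in zip(*samples):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(samples) / len(rows), stats)
    return model

def predict(model, features):
    """Return the class with the highest log posterior (up to a constant)."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for x, (mean, var) in zip(features, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

def make_row(rng):
    """Synthetic student: three correlated scores and a label from assumed thresholds."""
    ability = rng.gauss(66, 15)
    scores = [min(100, max(0, ability + rng.gauss(0, 5))) for _ in range(3)]
    average = sum(scores) / 3
    # Hypothetical thresholds; the report does not state how the label is defined.
    if average >= 75:
        label = "Good"
    elif average >= 55:
        label = "Average"
    elif average >= 40:
        label = "Bad"
    else:
        label = "Worst"
    return scores, label

rng = random.Random(42)
data = [make_row(rng) for _ in range(1000)]
train, test = split_data(data)
model = train_gaussian_nb(train)
accuracy = sum(predict(model, f) == y for f, y in test) / len(test)
print(f"train={len(train)} test={len(test)} accuracy={accuracy:.2%}")
```

Because the data here is synthetic, the printed accuracy says nothing about the real dataset; the sketch only shows the order of the steps.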
5.1. Operators
The details of the operators that are used for the creation of the process are as follows:
Retrieve
This Operator can access information stored in the Repository and load it into the Process.
Set Role
This Operator is used to change the role of one or more Attributes.
Split data
This operator produces the desired number of subsets of the given Data Set.
Naïve Bayes
This Operator generates a Naive Bayes classification model.
Apply Model
This Operator applies a model on the given Data Set.
Performance
This operator is used for performance evaluation. It delivers a list of performance criteria
values. These performance criteria are automatically determined in order to fit the learning
task type.
6. Results and evaluation/discussion of the results
Confusion Matrix
The result of the process on the “Student Performance” data set is shown in the form of a
confusion matrix. This table shows the accuracy, class precision and class recall.
The following criteria are added for binominal classification tasks:
Accuracy
Precision
Recall
Accuracy is calculated by taking the percentage of correct predictions over the total number of
examples. Correct prediction means examples where the value of the prediction attribute is equal
to the value of the label attribute.
The accuracy of the model on the Student Performance data set is 92.64%.
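To make the three criteria concrete, the sketch below builds a confusion matrix from actual and predicted labels and derives accuracy, class precision and class recall from it. The function `evaluate` and the eight label pairs are made up for illustration; they are not the results of this report.

```python
def evaluate(actual, predicted):
    """Return the confusion matrix, accuracy, class precision and class recall."""
    labels = sorted(set(actual) | set(predicted))
    # matrix[a][p] counts examples with true label a that were predicted as p.
    matrix = {a: {p: 0 for p in labels} for a in labels}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    accuracy = sum(matrix[l][l] for l in labels) / len(actual)
    precision, recall = {}, {}
    for l in labels:
        predicted_as_l = sum(matrix[a][l] for a in labels)
        actually_l = sum(matrix[l].values())
        precision[l] = matrix[l][l] / predicted_as_l if predicted_as_l else 0.0
        recall[l] = matrix[l][l] / actually_l if actually_l else 0.0
    return matrix, accuracy, precision, recall

# Made-up labels for illustration only.
actual = ["Good", "Good", "Average", "Average", "Average", "Bad", "Bad", "Bad"]
predicted = ["Good", "Average", "Average", "Average", "Bad", "Bad", "Bad", "Bad"]

matrix, accuracy, precision, recall = evaluate(actual, predicted)
print("accuracy:", accuracy)
print("precision:", precision)
print("recall:", recall)
```

Precision for a class is the share of predictions of that class that were correct, while recall is the share of true members of that class that were found.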
7. Future directions/ideas how to extend and enhance the technique
Using this process or model, we can predict more about student performance and the factors
involved in it.
In future, this could be implemented at any university: with this process we could estimate a
student's GPA in advance from their previous GPA.
In schools, we could identify the weakest-performing students so that teachers can focus more
on them.
8. Conclusion
We have already seen the insights from the data; the summary is as follows:
135 students failed in mathematics, 90 students failed in the reading examination, 114
students failed in the writing examination and overall 103 students failed the examination.
Reading score and Writing score are positively linearly correlated, with a correlation
coefficient of approximately 0.95.
Students who belong to group D in ethnicity performed very well.
The Test Preparation Course is very effective. We saw that fewer of the students who had
completed their test preparation course failed.
Students who take the standard lunch performed better than others.
In the case of parental education level, parents with a master's or bachelor's degree have
children who score much better in the exams.
The accuracy on the Student Performance data set is 92.64%, as calculated by the naïve
Bayes classifier process.
9. References
https://www.kaggle.com/spscientist/students-performance-in-exams#StudentsPerformance.csv