Mining the LET Performance in Generating Prediction Models for OTDSS
1. Mining the Teachers’ Licensure
Examination Performance in
Generating Prediction Models for
Online Test and Decision Support
System
IVY M. TARUN
DR. BOBBY D. GERARDO
DISSERTATION ADVISER
FINAL PRESENTATION - MARCH 7, 2015 1
2. OVERVIEW
INTRODUCTION
◦ BACKGROUND OF THE STUDY
◦ OBJECTIVES OF THE STUDY
◦ SIGNIFICANCE OF THE STUDY
◦ SCOPE AND DELIMITATIONS
THEORETICAL FRAMEWORK
◦ RESEARCH GAPS
◦ FRAMEWORK
OPERATIONAL FRAMEWORK
RESULTS AND DISCUSSION
CONCLUSION AND RECOMMENDATION
4. BACKGROUND OF THE STUDY
The education community is at the forefront of most changes in ICT.
Databases are used to accumulate and store data.
Data mining is one of the major breakthroughs in discovering useful information.
Data mining has been applied in the educational domain only in recent years.
Only a few studies have been undertaken in the Philippines and other countries.
This study explores data mining in the context of licensure examination performance prediction.
5. OBJECTIVES
1. Develop licensure examination performance prediction
models based on the Multiple Regression and PART
classification techniques;
2. Develop an Online Test and Decision Support System
integrating the licensure examination performance
prediction models;
3. Test and implement the licensure examination
performance prediction models using real institutional
data; and
4. Evaluate the Online Test and Decision Support System.
6. SIGNIFICANCE
The use of the models will allow initial identification of
reviewees who are likely to have difficulty in passing the
LET, therefore providing sufficient time and opportunities
for appropriate interventions during review sessions.
The Online Test and Decision Support System will pave the
way for a contemporary approach to pre-board exam
management.
This study likewise provides information for researchers,
data mining enthusiasts and information technologists on
the processes and application of data mining and decision
support system integration.
7. SCOPE AND DELIMITATIONS
Main points: generation of Multiple Regression and PART
models and the development of the Online Test and
Decision Support System
Secondary: testing and implementation of the system
using real institutional data; and system evaluation
System Features: repeated generation of MR Models
(without subset evaluation), mock board exam,
performance prediction for decision making
Criteria for evaluation: adapted from ISO/IEC 9126.
Maintainability was not included since it could not be
measured at the time of the evaluation.
9. RESEARCH GAPS
Most research on educational data mining focuses on predicting academic performance, specifically on identifying which students are likely to fail or succeed in their studies or, more particularly, in their subjects.
Many studies on predicting academic performance have used web-based system data logs, socio-demographic variables, and behavioral and enrollment data, but very few have explored the academic records of students, including learning styles, particularly to predict their performance in the licensure exam.
Most studies focused only on business and business-related
data and processes with the objective of delivering customer-
centered and marketplace support.
12. MODEL GENERATION
Algorithms used: Multiple Regression and PART
Multiple Regression (MR): the value of the response variable, Y, is predicted using a linear function of the predictor variables X_1, X_2, …, X_n.
MR Equation:
$Y - \bar{Y} = \beta_1 (X_1 - \bar{X}_1) + \beta_2 (X_2 - \bar{X}_2) + \dots + \beta_n (X_n - \bar{X}_n)$
MR Model form:
$Y = \beta_1 X_1 - \beta_1 \bar{X}_1 + \beta_2 X_2 - \beta_2 \bar{X}_2 + \dots + \beta_n X_n - \beta_n \bar{X}_n + \bar{Y}$
Unknown coefficients are computed using the covariance equation:
$\mathrm{cov}(X, Y) = \sum_i (X_i - \bar{X})(Y_i - \bar{Y}) / (n - 1)$
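The slide does not show how the covariances are combined into coefficients. One common route, assumed here, is to solve the system S·β = s, where S holds the covariances between predictors and s the covariance of each predictor with Y, and then recover the constant term from the means as in the model form. The sketch below uses made-up data and hypothetical helper names (`cov`, `solve`, `fit_mr`); it is not the OTDSS implementation.

```python
from statistics import mean

def cov(xs, ys):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_mr(columns, y):
    """columns: one list of values per predictor. Returns (betas, constant)."""
    s = [[cov(ci, cj) for cj in columns] for ci in columns]
    s_xy = [cov(ci, y) for ci in columns]
    betas = solve(s, s_xy)
    # Constant term from the model form: Y-bar minus sum of beta_j * X-bar_j
    constant = mean(y) - sum(b * mean(c) for b, c in zip(betas, columns))
    return betas, constant

# Made-up data generated from y = 2*x1 + 3*x2 + 1 (illustration only)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2 * a + 3 * b + 1 for a, b in zip(x1, x2)]
betas, constant = fit_mr([x1, x2], y)
print(betas, constant)
```

Because the toy data are noise-free, the fit recovers the generating coefficients up to floating-point error; with real records the same steps give the least-squares estimates.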
13. MODEL GENERATION
PART is a rule-based classifier that outputs a rule from a partial
decision tree in each iteration (Pentaho Data Mining, n.d.).
PART model form:
(Condition) → y
[Figure: Sample PART Model]
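A model of the form (Condition) → y can be read as an ordered rule list in which the first matching rule decides the class and a default applies when nothing matches. The sketch below illustrates only that evaluation order. The attribute names, categories and rules are hypothetical and are not the sample model from the slide.

```python
def classify(record, rules, default):
    """Return the label of the first rule whose condition holds for the record."""
    for condition, label in rules:
        if condition(record):
            return label
    return default

# Hypothetical rules over discretized attributes (illustration only)
rules = [
    (lambda r: r["mock_exam"] == "high", "Passed"),
    (lambda r: r["gwa_major"] == "low" and r["mock_exam"] == "low", "Failed"),
]

reviewee_a = {"mock_exam": "high", "gwa_major": "low"}
reviewee_b = {"mock_exam": "low", "gwa_major": "low"}
reviewee_c = {"mock_exam": "low", "gwa_major": "high"}

print(classify(reviewee_a, rules, default="Passed"))
print(classify(reviewee_b, rules, default="Passed"))
print(classify(reviewee_c, rules, default="Passed"))
```

Rule order matters: reviewee_a matches the first rule, so the second rule is never checked for that record.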
18. OTDSS DEVELOPMENT (cont’d)
System Architecture
[Figure: OTDSS system architecture. Data Mining Stage: Data Sources (MDB, XLS/XLSX and CSV files), Clean Data, Data Repository, Select and Transform, Datasets, Apply MR Algorithm, MR Model. Decision Support Stage: Manage Dialogs and Interfaces, Manage Models and Knowledge, User Interface, Knowledge Model Repository. Arrow labels: Integrate, Update. Actors: System Administrator, System User.]
19. OTDSS DEVELOPMENT (cont’d)
Context DFD
[Figure: Context DFD. Process: Online Test and Decision Support System. External entities: System Administrator, Reviewee. Data flows: Raw Data, MR Model, Test Item, Mock Board Exam, Test Answer, Mock Board Exam Result, Prediction Data, Performance Prediction.]
20. TESTING PROCEDURES
Testing was conducted per unit.
The whole OTDSS was then subjected to testing using real
institutional data.
The MR model generated by the OTDSS was compared with the
models generated by Weka and Excel.
21. EVALUATION PROCEDURES
System evaluation was done by assessing the system
objectives against the following criteria as defined in
ISO/IEC 9126, the international standard for the
evaluation of software quality.
1. Functionality
2. Efficiency
3. Usability
4. Reliability
5. Portability
22. EVALUATION PROCEDURES (cont’d)
Response | Numerical Value
Strongly Agree | 1
Agree | 2
Uncertain | 3
Disagree | 4
Strongly Disagree | 5
A five-point Likert scale was used to measure the opinions of
the respondents.
Data were summarized and analyzed using the mode,
frequency and percentages to determine the central
tendency of the data, while the inter-quartile range was used
to measure the dispersion of the data.
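The summary described above can be sketched for a single survey item as follows. The responses are made up, and two choices are assumptions rather than details from the slides: the reported percentage is taken to be the share of respondents who chose the modal response, and the quartiles use the default method of Python's `statistics.quantiles`.

```python
from collections import Counter
from statistics import quantiles

def summarize_item(responses):
    """Summarize Likert responses (1 = Strongly Agree ... 5 = Strongly Disagree)."""
    counts = Counter(responses)
    mode, mode_count = counts.most_common(1)[0]
    q1, _, q3 = quantiles(responses, n=4)  # quartile cut points
    return {
        "mode": mode,
        "frequency": dict(counts),
        # Assumption: percentage = share of respondents giving the modal response
        "mode_percentage": 100 * mode_count / len(responses),
        "iqr": q3 - q1,
    }

# Made-up responses: 24 "Strongly Agree" and 3 "Agree"
responses = [1] * 24 + [2] * 3
summary = summarize_item(responses)
print(summary)
```

Quartile conventions differ between tools, so an inter-quartile range computed elsewhere may differ slightly for small samples.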
25. PART Models (cont’d)
PART Model B
Ong, Palompon, Bañico: “Student’s Academic Performance and their performance in the pre-board examination are significant determinants of the success and failure of their licensure examination performance.”
26. PART Models (cont’d)
PART Confusion Matrices
PART Model A
Actual Class | Predicted Passed | Predicted Failed | Percent Correct
Passed | 46 | 1 | 97.87%
Failed | 5 | 4 | 44.44%
Overall Percentage | 90.20% | 80.00% | 89.29%
PART Model B
Actual Class | Predicted Passed | Predicted Failed | Percent Correct
Passed | 44 | 3 | 93.62%
Failed | 4 | 5 | 55.56%
Overall Percentage | 91.67% | 62.50% | 87.50%
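The percentages in the confusion matrices follow directly from the four cell counts. As a check, the sketch below recomputes them from the counts reported for both models; the function and key names are illustrative.

```python
def summarize(pp, pf, fp, ff):
    """Percentages from a 2x2 confusion matrix.

    pp: actual Passed, predicted Passed   pf: actual Passed, predicted Failed
    fp: actual Failed, predicted Passed   ff: actual Failed, predicted Failed
    """
    def pct(num, den):
        return round(100 * num / den, 2)

    return {
        "passed_row": pct(pp, pp + pf),   # share of actual Passed classified correctly
        "failed_row": pct(ff, fp + ff),   # share of actual Failed classified correctly
        "passed_col": pct(pp, pp + fp),   # share of predicted Passed that were correct
        "failed_col": pct(ff, pf + ff),   # share of predicted Failed that were correct
        "overall": pct(pp + ff, pp + pf + fp + ff),
    }

model_a = summarize(pp=46, pf=1, fp=5, ff=4)
model_b = summarize(pp=44, pf=3, fp=4, ff=5)
print(model_a)
print(model_b)
```

Model A classifies more instances correctly overall, while Model B identifies more of the reviewees who actually failed (5 of 9 against 4 of 9).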
27. PART Models (cont’d)
[Figure: Performance After Testing. Correctly and incorrectly classified instances for PART Model A and PART Model B. Data labels: 71.43%, 28.57%, 78.57%, 21.43%.]
29. MR Models (cont’d)
MR Model B
Decrease effect?
30. MR Models (cont’d)
Error Measures of the MR Models
Measure | MR Model A | MR Model B
Correlation Coefficient | 0.3531 | 0.4899
Mean Absolute Error (MAE) | 3.1906 | 2.9525
Root Mean Squared Error (RMSE) | 4.0494 | 3.4841
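For reference, the three measures can be computed from actual and predicted values as sketched below. The data are made up for illustration and are not the study's records; the correlation is assumed to be Pearson's r between actual and predicted values.

```python
from math import sqrt
from statistics import mean

def pearson_r(actual, predicted):
    """Pearson correlation between actual and predicted values."""
    ma, mp = mean(actual), mean(predicted)
    num = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    den = sqrt(sum((a - ma) ** 2 for a in actual) * sum((p - mp) ** 2 for p in predicted))
    return num / den

def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))

# Made-up ratings (illustration only)
actual = [75.0, 78.0, 81.0, 84.0]
predicted = [76.0, 78.0, 82.0, 84.0]

r = pearson_r(actual, predicted)
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
print(r, mae_value, rmse_value)
```

A higher correlation and lower MAE and RMSE indicate a better fit, which is the sense in which MR Model B outperforms MR Model A in the table.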
31. MR Models (cont’d)
[Figures: MR Model A Evaluation on Test Set; MR Model B Evaluation on Test Set]
32. Testing of OTDSS
[Figures: Excel Output; WEKA Output; OTDSS Output]
34. OTDSS Evaluation
OTDSS Functionality
Statement | Mode | Percentage | Inter-Quartile Range
The set of functions on a specified task is appropriate. | 1 | 88.89% | 0
The system provides the right and agreed results. | 1 | 74.07% | 0.5
The system has the ability to prevent unauthorized access, whether accidental or deliberate, to programs or data. | 1 | 81.48% | 0
The system works as per intended application. | 1 | 96.30% | 0
The system provides output that conforms to the base requirements. | 1 | 85.19% | 0
The system generates output the same with that of the expected output. | 1 | 92.59% | 0
The system has no broken links and spelling mistakes. | 1 | 92.59% | 0
There is no confusing application flow and crashes. | 1 | 88.89% | 0
The response time is fast. | 1 | 88.89% | 0
The system serves its purpose well. | 1 | 96.30% | 0
35. OTDSS Evaluation (cont’d)
OTDSS Efficiency
Statement | Mode | Percentage | Inter-Quartile Range
User specifications are achieved by the system. | 1 | 85.19% | 0
The system produces desired output with optimum time. | 1 | 88.89% | 0
The system does the required processing on least amount of hardware. | 1 | 88.89% | 0
The system uses an optimum amount of memory and disk space. | 1 | 85.19% | 0
The system performs greater useful work transactions. | 1 | 88.89% | 0
36. OTDSS Evaluation (cont’d)
OTDSS Usability
Statement | Mode | Percentage | Inter-Quartile Range
The system is easy to use. | 1 | 88.89% | 0
The system enables the user to learn how to use it. | 1 | 96.30% | 0
The system has an attractive interface. | 1 | 66.67% | 1
The system is easy to understand. | 1 | 96.30% | 0
The system is fit to be used by both reviewers and reviewees. | 1 | 96.30% | 0
37. OTDSS Evaluation (cont’d)
OTDSS Reliability
Statement | Mode | Percentage | Inter-Quartile Range
The system handles errors systematically. | 1 | 85.19% | 0
Transactions are simple. | 1 | 85.19% | 0
The system has the ability to continue operating properly in the event of the failure of some of its components. | 1 | 85.19% | 0
The system provides consistent results. | 1 | 92.59% | 0
The system performs consistently well. | 1 | 88.89% | 0
38. OTDSS Evaluation (cont’d)
OTDSS Portability
Statement | Mode | Percentage | Inter-Quartile Range
The system is re-useable. | 1 | 96.30% | 0
The system is easy to install. | 1 | 81.48% | 0
The system can be transferred from one computer to another. | 1 | 85.19% | 0
The system can run on different operating systems without requiring major rework. | 1 | 81.48% | 0
The system can be easily integrated into another environment with consistent functional correctness. | 1 | 88.89% | 0
40. CONCLUSION
1. The GWA of the reviewees in their General Education
subjects, the result of the Mock Board Exam and the
instance when the reviewee is conducting a self-review
are good predictors of licensure examination
performance. Based on MR Model B, the GWA of the
reviewees in their Major or Content courses is the best
predictor of licensure examination performance.
41. CONCLUSION (cont’d)
2. This paper has demonstrated, using the developed
Online Test and Decision Support System, that the
fusion of data mining in education and decision support
systems can be readily used in practice. The approach of
integrating data mining into a decision support system
can therefore be used for applications other than
performance prediction.
3. The developed Online Test and Decision Support System
produced accurate Multiple Regression models. This was
shown during testing and simulation with real
institutional data, where the system produced the same
output as two reliable application programs, Microsoft
Excel and WEKA.
42. CONCLUSION (cont’d)
4. The study has shown that the developed Online Test and
Decision Support System satisfied its implied functions and is
efficient, usable, reliable and portable. During the evaluation,
the respondents showed enthusiasm for using the system to
enhance their licensure examination review. This suggests
that the system could be used by the reviewees during their
review. This would allow initial identification of reviewees
who are likely to have difficulty in passing the licensure
examination, therefore providing sufficient time and
opportunities for appropriate interventions during review
sessions. It should, however, be used with caution, as the
system was built to encourage the reviewees to review well
and not as a substitute for face-to-face review sessions.
43. RECOMMENDATIONS
On the basis of the promising findings presented in
this paper, the following are recommended:
1. Further research will be needed to validate the models
generated.
2. Future research should concentrate on adding
instances to the dataset. Attribute selection can be
considered once a larger dataset is available.
44. RECOMMENDATIONS
3. Other variables could also be considered that relate to
academic or licensure examination performance such as
study habits and motivations.
4. The Decision Support System should be upgraded by
incorporating more complex features in both the data
mining and decision support stages. For example, statistical
computations of the standard error, t-statistic and p-value
may be included together with the generation of the
MR Model.
46. Photo Acknowledgment
The images used in this presentation are not owned by the author and
were used only to add aesthetics to the slides and to better demonstrate the
system architecture.
At this juncture, the author would like to acknowledge the following
sources of the images:
http://danmaycock.com/wp-content/uploads/2014/07/DataMining.jpg
http://learn.burnside.school.nz/pluginfile.php/35905/mod_label/intro/Introduction.jpg
http://blogs.adobe.com/captivate/files/2013/01/Picture1.jpg
http://thumbs.dreamstime.com/z/concept-data-mining-24810792.jpg
http://content.quizfactor.com/quizzes/quiz_00005.jpg;width=230;height=164
http://www.iconeasy.com/icon/thumbnail/System/Fast%20Icon%20Users/Fast%20Icon%20Users%20icon%20thumbnail.jpg
http://www.grace-fp7.eu/sites/default/files/imagecache/Article-popup/article-images/Database_iStock_000020783950XSmall_0.jpg
http://pierrot-peladeau.net/wp-content/uploads/2010/08/Free_Database_Icons_by_artistsvalley.jpg
http://www.creattor.com/files/10/617/database-icons-screenshots-1.png
https://rfclipart.com/image/big/de-d0-62/folder-icon-Download-Royalty-free-Vector-File-EPS-3135.jpg
http://d2ed9jytfu8qln.cloudfront.net/wp-content/uploads/2013/03/Data-Analysis-image.png
https://lh4.googleusercontent.com/iyvmiSijnrpHPawMVc9SAhzYGP-9FQU1KaHUGDjH7tBj4O4uUyIYTE37oaQWLBSFT8gUqylFMtraaY2YoeUlgCS3H7OY5N-MhZDNNDLol4INSNK0iMlWcoTEWAtBBf28
http://nowiknow.com/wp-content/uploads/Thank-You.jpg
Editor's Notes
Maintainability - Characteristic of design and installation which determines the probability that a failed equipment, machine, or system can be restored to its normal operable state within a given timeframe, using the prescribed practices and procedures. Its two main components are serviceability (ease of conducting scheduled inspections and servicing) and reparability (ease of restoring service after a failure). http://www.businessdictionary.com/definition/maintainability.html#ixzz3O9oCqi6w
The original dataset consisted of 73 instances. The attributes were discretized from numeric to categorical ones for PART and converted from categorical to numeric for MR, producing two (2) separate datasets from the same data.
The transformation process reduced the original dataset from 73 to 70 instances for PART and to 55 instances for MR.
CfsSubsetEval of WEKA (Correlation-based Feature Selection) considers the worth of individual features for predicting the class together with the level of inter-correlation among them.
RAD was chosen because of its suitability to the project’s scope, data, decisions, team, technical architecture and requirements.
The figure examines the difference between the performance of the two PART models, as far as classifier accuracy is concerned, when subjected to testing using the supplied test dataset.