Question 1
The Uniform Commercial Code incorporates some of the same
elements as the Statute of Frauds. Under the Statute of Frauds,
certain contracts must be in writing to be enforceable. Research
the types of contracts that must be in writing under the Statute
of Frauds.
Do you agree with the contracts that need to be in writing and
explain why or why not? Imagine that you were asked to be part
of a team to draft revisions to the Statute of Frauds. What
changes or proposals would you make? Why?
Respond to this… The Statute of Frauds requires that certain
types of contracts be in writing to be enforceable. These
types of contracts include sales of goods priced at $500 or more,
interests in land, promises to pay the debt of another, and
contracts that cannot be performed within one year, all of which
must be signed by the defendant to be enforceable. I do think that all of
these contracts should be in writing because it is a type of
safeguard of the resource to ensure that each party is
responsible for whatever the contract is regarding. For
example, if we did not have to sign for a car loan, the
responsible party that needs to pay the loan back could walk
away, and without a signature of agreement to the terms of the
loan, it would be hard for the company to fight for their money,
as there is no signature enforcing the agreement.
If I had to revise something with the Statute of Frauds, I would
change the rule for contracts that cannot be performed within one year. I
think one year is a long time to let a contract slide. I feel that
six months sounds more reasonable. I guess if I was a business
and I did not get commitment to a contract for a whole year, I
feel this would greatly affect my business. I also think it might
be a harder fight to get whatever the other party is responsible
for as it was a year ago. As a business, I think I would want to
pursue a breach of contract in three or four months even. That
is a long time to not pay up.
Question 2
Let’s assume that you are interested in doing a statistical survey
and you use confidence intervals for your conclusion. Describe
a possible scenario and indicate what the population is, and
what measure of the population you would try to estimate
(proportion or mean) by using a sample.
· What is your estimate of the population size?
· What sample size will you use?
· How will you gather information for your sample?
· What confidence percentage will you use?
Let’s assume that you have completed the survey and now state
your results using a confidence interval statement. You can
make up the numbers based on a reasonable result.
Respond to this… had found a study in Australia and New
Zealand where they wanted to see if there was efficient care
when dealing with people that suffered from acute coronary
syndrome, that required an understanding of the sources of
variation in their care. The researchers wanted to see whether
patients who did not speak English well received the same
level of care as English-proficient patients. Of 4387 patients,
the 294 with limited English proficiency (LEP) were older
(70.9 vs 66.3 years; P < 0.001) and had a higher prevalence of
high blood pressure (71.1% vs 62.8%; P = 0.007), diabetes
(40.5% vs 24.3%; P < 0.001), and kidney damage (16.3% vs
11.1%; P = 0.007) than the other 4093 (Hyun, et al., 2017).
Once they were in the hospital, there was no difference in the
care they received; they were not treated differently. Patient
demographics, medical history, in-hospital care, and acute and
late outcomes were used in this study to compare the two groups.
A multiple-adjusted regression model was used for length of stay,
and multiple-adjusted logistic regression models were used for
each of the outcomes to estimate the odds ratios and
corresponding 95% confidence intervals (Hyun, et al., 2017). This
conclusion made sense to me. Someone who doesn’t speak English
(or whatever language is native to the country they are in) may
not seek out medical help for an issue because they will not
necessarily know what is happening and would rather take their
chances. I am at least glad to know that once such a patient
reaches the hospital, they are treated just as fairly as the
English-speaking patients.
Hyun, K., Redfern, J., Woodward, M., Briffa, T., Cher, D.,
Ellis, C., . . . . (2017, May/June). Is There Inequity in Hospital
Care Among Patients With Acute Coronary Syndrome Who Are
Proficient and Not Proficient in English Language?: Analysis of
the SNAPSHOT ACS Study. Journal of Cardiovascular Nursing,
288-295. doi:10.1097/JCN.0000000000000342
Article
DOI: 10.1111/exsy.12138
Ordinal regression by a gravitational model in the field of
educational data mining
Pilar Gómez-Rey,1* Francisco Fernández-Navarro2 and
Elena Barberà1
(1) eLearn Center, Open University of Catalunya, Barcelona,
Spain
(2) Department of Mathematics and Engineering, Universidad
Loyola Andalucia, Andalucia, Spain
Abstract: Educational data mining (EDM) is a research area
where the goal is to develop data mining methods to examine data
critically from educational environments. Traditionally, EDM
has addressed the following problems: clustering, classification,
regression, anomaly detection and association rule mining. In
this paper, the ordinal regression (OR) paradigm is introduced in
the field of EDM. The goal of OR problems is the classification
of items in an ordinal scale. For instance, the prediction of
students’ performance in categories (where the different grades
could be ordered according to A ≻ B ≻ C ≻ D) is a classical
example of an OR problem. The EDM community has not yet explored
this paradigm (despite the importance of these problems in the
field of EDM). Furthermore, an amenable and interpretable OR
model based on the concept of gravitation is proposed. The model
is an extension of a recently proposed gravitational model that
tackles imbalanced nominal classification problems. The model is
carefully adapted to the ordinal scenario and validated with
four EDM datasets. The results obtained were compared with
state-of-the-art OR algorithms and nominal classification ones.
The proposed models can be used to better understand the
learning–teaching process in higher education environments.
Keywords: educational data mining, ordinal regression models,
students satisfaction, gravitational models
1. Introduction
Educational data mining (EDM) is a recent framework based on
the application of data mining (DM) techniques to educational
problems (Oberreuter & Velasquez, 2013; Romero et al., 2013).
The main goal of EDM is to analyse educational data to find
patterns that can improve the quality of the learning process and
guide students’ learning (Romero & Ventura, 2007, 2010). The
knowledge discovered by EDM techniques may be useful for
teachers/instructors to manage their classes, understand their
students and reflect on their teaching methodologies. EDM has
contributed to the development of learning theories typically
investigated in the educational psychology field (Baker, 2010).
Siemens and Baker (2012) described the similarities and
differences between learning analytics and EDM and concluded
that both fields are closely tied. EDM techniques can be applied
to data from both traditional classroom educational systems
(based on face-to-face contact) and distance education
environments (e-learning). It is important to note that
every type of education differs in nature and has different
objectives. Therefore, the conclusions obtained in these
environments will also be different. Currently, EDM techniques
have been used to address the following problems (Romero &
Ventura, 2007): data visualization and analysis, clustering,
classification, regression, outlier detection and association rule
learning. An explanation of each problem type can be found in
Appendix A.
On the other hand, ordinal regression (OR) problems are
those problems where the objective is to classify patterns in
an ordinal scale. For example, student satisfaction surveys
usually involve rating teachers based on an ordinal scale
{poor, average, good, very good and excellent}. Hence, the
class label has a natural order: a pattern labelled average
has a higher rating than one labelled poor, and a pattern
labelled good has a higher rating than both. This problem falls
between nominal classification, in which the labels form an
unordered set, and regression, in which the targets form a
continuous, totally ordered set. OR problems are also closely related to
the learning to rank problems (Moreira et al., 2013).
Many real problems require the classification of patterns
(items) into naturally ordered classes. In fact, many EDM
problems demand the classification of items in an ordinal
scale. However, so far, OR problems have been addressed
in the EDM community as regression or classification
problems. As proof of this, we could highlight the recent
EDM review by Peña-Ayala (2014), where the proposed OR
paradigm is not included within the EDM problems
(because of the lack of studies in this direction). For
example, the prediction of student performance is one of
the oldest and most popular applications of EDM. This
problem has been traditionally tackled using either
regression analysis techniques (Nebot et al., 2006) (assuming
an equal distance among the different classes) or nominal
© 2015 Wiley Publishing Ltd Expert Systems, April 2016, Vol.
33, No. 2 161
classification approaches (Romero et al., 2008) (ignoring the
order among the different classes). One of the main goals of
this research work is to introduce the OR paradigm in the
EDM community. To show the importance of this
paradigm for the EDM community, four EDM datasets
are considered and addressed with OR models.
On the other hand, a data gravitational model (DGM) that
learns the parameters of a weighted Euclidean metric for
nominal classification problems has been recently proposed
by Cano et al. (2013) within the field of distance metric
learning (DML). One of the main advantages of this model
is its high interpretability, which makes it especially useful
for educational purposes. The DGM proposed is based on
the idea of a data-driven gravitational law (Wang & Chen,
2005; Zong-chang, 2008; Peng et al., 2009; Cano et al.,
2013). The underlying ideas of the DGMs are the following:
(a) there exists a force between any two patterns; (b) this force
follows Newton’s law of universal gravitation where the body
masses are substituted by a set of data points; and (c) the class
value of a test pattern is determined by comparing the force of
attraction between the pattern and the different classes.
Another goal of this paper is to propose a generalized force-based
model (GFM) specifically designed for OR problems
with educational purposes, extending the state-of-the-art DGMs
in several ways. The outputs of the model are assumed
to be unimodal (da Costa et al., 2008). To impose this
constraint, the error function has been redefined to penalize
non-unimodal outputs. The proposed method extends the
DGMs previously presented by considering, besides an
attribute-class weight matrix, a vector representing different
scaling of the class pattern interaction with the distance. The
model has been adapted to the characteristics of the problem
considered. Finally, the model parameters have been
optimized through the covariance matrix adaptation
evolution strategy (CMA-ES) global optimization algorithm
(Hansen & Ostermeier, 2001).
Summarizing, the main contributions of this paper are as
follows:
• To introduce the OR paradigm in the EDM community.
Most EDM problems require the classification of
objects in an ordinal scale. Despite this, the EDM
community has not yet explored this paradigm.
• To propose a GFM that considers the particularities of
OR problems. The performance of the model proposed
was validated using two publicly available datasets and
one real-world educational problem used to analyse
students’ perceptions about online learning success
factors.
• For EDM problems, the accuracy of the model is equally
as important as its interpretability because EDM
techniques should be applied by practitioners (not just
by researchers) (Lin et al., 2013). Therefore, it is
important to apply and develop interpretable and
amenable models. Accordingly, the high interpretability
of the proposed models was also demonstrated
considering four OR EDM problems.
The remainder of the paper is organized as follows: a brief
analysis of some OR educational problems that were treated
as non-ordinal ones is provided in Section 2. Section 3
describes the case studies considered in this research
work. Section 4 depicts the main ideas of the model
proposed. Section 5 presents the experimental framework
and the results obtained, while the model interpretability is
discussed in Section 6. Section 7 summarizes the
achievements and outlines some future developments of
the proposed methodology. Finally, a short but useful
glossary of technical terms that may be encountered in the
world of expert systems and artificial intelligence is included
in Appendix C.
2. Some examples of ordinal regression educational
problems addressed without an ordinal regression technique
In this section, some examples of educational ordinal
problems that were addressed with an inappropriate
technique are described. As we will discuss later, OR
problems can be easily simplified to other standard data
mining problems. In the EDM community, OR problems
have been traditionally tackled using classification or
standard regression approaches that generally involve
making some assumptions leading to the underperformance
of the final classifier/regressor model.
One very simple way to tackle ordinal regression is to cast all
the different labels {C1, C2, …, CJ} (where J is the number of
classes) into real values {r1, r2, …, rJ}, where ri ∈ ℝ, and then
to apply standard regression techniques (Torra et al., 2006).
The main problem of this approach is that the real values
used for the labels may hinder the performance of the final
regressor, and there is no principled way of deciding which
value a label should have. On the other hand, OR problems
have also been tackled with standard nominal classification
models. The main problem of this approach is that the order
information provided by the labels is ignored. Therefore, the
classifier implemented does not consider this information in
the parameter estimation stage.
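The sensitivity to the chosen real values can be seen in a small sketch. Everything in it is hypothetical and only for illustration: the three class names, the two codings, the neighbour labels and the helper `predict_by_regression`, which stands in for a regressor by averaging the coded labels and snapping the result to the closest class.

```python
def predict_by_regression(neighbour_labels, coding):
    """Average the coded labels, then snap to the class with the closest code."""
    mean_value = sum(coding[label] for label in neighbour_labels) / len(neighbour_labels)
    return min(coding, key=lambda label: abs(coding[label] - mean_value))


# Hypothetical labels of three nearby training patterns.
neighbours = ["C1", "C3", "C3"]

# Two arbitrary ways of casting the same ordinal labels into real values.
equal_spacing = {"C1": 1.0, "C2": 2.0, "C3": 3.0}
stretched_spacing = {"C1": 1.0, "C2": 2.0, "C3": 10.0}

pred_equal = predict_by_regression(neighbours, equal_spacing)
pred_stretched = predict_by_regression(neighbours, stretched_spacing)
print(pred_equal, pred_stretched)
```

The same neighbours lead to different predicted classes under the two codings, which is the sense in which there is no principled way to pick the values.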
Next, some examples of educational problems treated as
regression or classification problems are highlighted. Without
loss of generality, we will focus on two kinds of problems: the
prediction of students’ satisfaction and the prediction of
students’ performance. A summary of the literature review
and its characteristics is included in Table 1.
• Ordinal regression problems addressed with a
classification approach:
○ Predicting students’ satisfaction: Firstly, it is worth
mentioning the work of Atay and Yildirim (2010).
Their paper reports the factors that affect the levels
of student satisfaction, using a dataset with 1734
students. For their study, they considered
undergraduate tourism students. The classification
tree (CT) revealed that the job considered to be
accomplished after graduating was the most
important variable to explain the student’s
satisfaction variable. From a different perspective,
Roberts and Styron Jr (2010) analysed students’
perceptions of services, interactions and experiences,
taking into account students from the College of
Education and Psychology (University of Southern
Mississippi, United States). The questionnaire was
related to academic advising, social connectedness,
involvement and engagement, faculty and staff
approachability and others. The application of
discriminant analysis to these data revealed that
the learning experience variables were the most
significant to be considered in the evaluation of
students’ satisfaction, while the Social
Connectedness and Involvement and Engagement
variables were the least significant ones in the
determination of students’ satisfaction.
○ Predicting students’ performance: Minaei-Bidgoli and
Punch (2003) proposed a genetic algorithm (GA) to
optimize a combination of classifiers such as quadratic
Bayesian classifier, 1-nearest neighbour (1-NN), k-
nearest neighbour (k-NN), Parzen-window, multi-
layer perceptron (MLP) and decision tree (DT) to
predict the students’ final grade. They took into
account features extracted from data logged in an
education web-based system. Some of these features
were the success rate, the number of attempts before
the correct answer is provided or the difference
between time of the last submission and the first time
the problem was examined. The final assessment
showed that the total number of correct answers and
the total number of tries are the most important
factors for the classification. From a different point
of view, Bhardwaj and Pal (2011) also tried to predict
the performance of students. They proposed a
categorization of students of the current year based
on the analysis carried out with the students of the
previous year. To evaluate the effectiveness of their
Bayesian classifier, the study used variables such as
the mother’s qualification, the student’s habits, the
annual family income, the students’ family status or
the living location, among others. Their main finding
was that the academic performance of students does
not only depend on their own effort.
• Ordinal regression problems addressed with a regression
approach:
○ Predicting students’ satisfaction: Analysing the paper of
Sun et al. (2008) allows us to discover the main factors
affecting learner satisfaction in e-learning. Factors such
as the attitude and the motivation of the learners, the
instructor’s performance, the design of the courses, the
available technology and the environment were
considered in this study. The findings support that
learner computer anxiety, instructor attitude towards
e-learning, e-learning course flexibility, e-learning
course quality, perceived usefulness, perceived ease of
use and diversity in assessments are the critical variables
affecting learners’ perceived satisfaction. This study
employed a stepwise multiple regression analysis.
Additionally, the research of Chang and Smith (2008)
explored the correlation between students’ perceptions
of course-related interaction and their course
satisfaction within the learner-centred paradigm in
distance education. The results demonstrated that
student–instructor personal interaction, student–
student personal interaction and student–content
interaction, along with students’ perceptions of WebCT
features and gender, really matter. A multiple linear
regression was used to prove the significance of the
variables and to model the educational problem.
○ Predicting students’ performance: Through a sample of
71 schools, Tanner (2009) analysed student performance
across three school design factors: movement and
circulation, day lighting and views. Hence, reading
comprehension, reading vocabulary, language arts,
mathematics, social studies and science were the
variables considered in the study. The prediction of
student performance was carried out through regression
analysis. The main conclusions of the paper were the
following: (a) a crowded school has a negative influence
on student performance; (b) day lighting impinges on the
scores obtained in science and reading
vocabulary; and (c) views affect reading
vocabulary, language arts and mathematics scores by allowing
students to rest their eyes. Akiri
and Ugborugbo (2009) also investigated this topic. Their
paper attempts to model the influence of teachers’
classroom effectiveness on students’ academic
performance in public secondary schools in Delta State,
Nigeria. Factors such as lesson preparation and
Table 1: Summary of the literature review results

Predicting students’ satisfaction
Paper                            Approach        Models
Atay and Yildirim (2010)         Classification  CT
Roberts and Styron Jr (2010)     Classification  Discriminant analysis
Sun et al. (2008)                Regression      Multiple linear regression
Chang and Smith (2008)           Regression      Multiple linear regression

Predicting students’ performance
Paper                            Approach        Models
Minaei-Bidgoli and Punch (2003)  Classification  Quadratic Bayesian, K-NN, Parzen-window, MLP, DT
Bhardwaj and Pal (2011)          Classification  Bayesian classifier
Tanner (2009)                    Regression      Linear regression
Akiri and Ugborugbo (2009)       Regression      Linear regression
presentation, punctuality and attendance in class, clear
communication, adequate use of instructional materials,
creativity and resourcefulness among others were used to
demonstrate that the effectiveness of teachers is not the
sole determinant of students’ academic performance.
The problem was addressed using regression analysis.
To the best of our knowledge, OR models have been
scarcely applied for educational purposes. Specifically, they
were used in just two research papers in the field of EDM
(Liu, 2009; Yay & Akıncı, 2009). In other papers, the OR
problem was addressed as a nominal one (using a classification
approach) or as a continuous one (using a regression
approach). Moreover, Yay and Akıncı (2009) and Liu (2009)
applied a classical and linear statistical model.1 This algorithm
has several drawbacks. The most obvious one is its inability to
model non-linear relations (very frequent in educational data).
Furthermore, as a classical statistical model, it assumes some
hypotheses that are difficult to satisfy in real-world problems.
The model proposed for EDM problems is described in Section 4. The
proposed methodology is able to model the non-linear
relations existing in the input space without any
assumptions on the data. One of the advantages of the
model proposed is its high level of interpretability. This will
allow educational experts to gain insights about the
problem. This knowledge could be used to improve the
learning–teaching process (as discussed in Section 6).
3. Educational data mining datasets considered
3.1. Turkiye student evaluation
The Turkiye Student Evaluation (TSE) dataset is composed
of 5820 evaluation scores provided by students from Gazi
University in Ankara (Turkey) (Gündüz & Fokoué, 2013).
The dataset is publicly available in the UCI Machine
Learning repository.2 Each participant was asked 28
education-related questions. The questionnaire is listed in
Appendix B. In the original dataset, 2835 students out of
the total 5820 students provided the same score for every question. These
evaluators were called single-minded evaluators (taking into
account the zero variation nature). Following the
recommendations of Gündüz and Fokoué (2013), two
datasets were considered in this study: the TSE dataset
including the single-minded evaluators (TSE-I-SME) and
the TSE dataset without including the single-minded
evaluators (TSE-W-SME). Furthermore, five attributes
were also taken into account in the study. These attributes
were the instructor’s identifier, the course code, the number
of times the student took the course, the level of attendance
and, finally, the level of difficulty of the course as perceived
by the student. The complete set of attributes considered in
this study are the following:
• Professor (P). This is a nominal attribute composed of three
values (three professors were considered for the study).
• Subject (S). The course code is also a nominal variable. In
this case, the variable was defined with 13 values (13
subjects were considered for the study).
• Repetitions of the course (R). It is an integer attribute with
values ranging from 0 to 4 (the student with the most
repetitions had four).
• Attendance level (A). The Attendance attribute is defined
in ordinal scale with the following possible values: {poor,
minimal, good, very good, excellent}.
• Difficulty level (D). The Difficulty attribute examined in the
study is an ordinal variable as well. This ordinal variable
ranges from Too easy to Too difficult, with the following five
possible values: {Too easy, Easy, Normal, Difficult, Too
difficult}.
In order to compare our results and discussions with those
obtained by Gündüz and Fokoué (2013), the dependent
variable is built through a clustering process for this specific
problem. Therefore, cluster analysis is applied to identify
potential groups in the way students rate their professors.
For the sake of simplicity, the k-Means algorithm is used
to determine the degree of satisfaction of each student. The
number of clusters (classes) was determined according to
the accumulated variance explained by the number of factors
selected. In our case, the optimum number of clusters was
three. After analysing the scores associated with each class,
we proceeded to label the three clusters. The labels for each
cluster were as follows: {Dissatisfied, neutral, satisfied}
modelling in that way the students’ satisfaction level.
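The labelling step can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it assumes each student is reduced to the mean of their questionnaire answers, uses a plain one-dimensional k-means with fixed starting centroids, and runs on made-up scores. The names `kmeans_1d`, `mean_scores` and `satisfaction` are hypothetical.

```python
def kmeans_1d(values, centroids, iterations=20):
    """Plain Lloyd iterations on scalar values, starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids


# Made-up mean questionnaire scores (1 to 5 scale), one value per student.
mean_scores = [1.2, 1.5, 1.1, 3.0, 2.9, 3.2, 4.8, 4.6, 5.0]

# Three clusters, sorted so that the cluster order matches the label order.
centroids = sorted(kmeans_1d(mean_scores, [1.0, 3.0, 5.0]))
LABELS = ["dissatisfied", "neutral", "satisfied"]


def satisfaction(score):
    """Assign the ordinal label of the closest cluster centroid."""
    nearest = min(range(len(centroids)), key=lambda i: abs(score - centroids[i]))
    return LABELS[nearest]


print([satisfaction(s) for s in mean_scores])
```

Sorting the centroids before attaching labels is what turns the unordered cluster indices into an ordinal target.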
3.2. Teaching assistant evaluation
Teaching Assistant Evaluation (TAE) is composed of
evaluations of teaching performance over three regular
semesters and two summer semesters of 151 teaching assistant
(TA) assignments at the Statistics Department of the
University of Wisconsin-Madison. The dataset is publicly
available in the UCI Machine Learning repository.3 The
performance of each teacher is measured with an ordinal
variable with three different levels: low performance, medium
performance and high performance. The independent
variables considered to model the teaching performance are
the following:
• A binary variable defining whether the teacher is a native
English speaker or not;
• A nominal variable to define the code of the course
(26 categories);
• A nominal variable to define the course instructor
(25 categories);
• A binary variable defining whether the semester is a
regular or a summer one;
• The size of the class.

1 This algorithm is called the proportional odds model (POM)
(McCullagh, 1980).
2 Available at
http://archive.ics.uci.edu/ml/datasets/Turkiye+Student+Evaluation
3 https://archive.ics.uci.edu/ml/datasets/Teaching+Assistant+Evaluation
3.3. Culture and learners satisfaction
This research was carried out with a sample of students in
four online universities: the Open University of Catalonia
in Spain, the University of New Mexico in the United States,
the University of Peking in China and the Autonomous
Popular University of the State of Puebla in Mexico. The
majority of the participants were enrolled in online social
sciences courses (mainly Education or Psychology studies).
Data were collected through a survey of 709 participants.
This dataset was analysed by Barbera and Linder-Van
Berschot (2011) using statistical tests. This study will use an
OR approach. The dependent variable is learner satisfaction
(LST), while the independent ones are eight institutional
factors as follows:
(a) Learner support (LS);
(b) Social presence (SP) measuring the degree to which the
instructor seems to be concerned about the learners’ needs;
(c) The degree of effectiveness of the teaching strategies of
the instructor (also called Instruction (I));
(d) The quality of the Learning Platform (LP);
(e) Instructor interaction (II);
(f) Learner interaction (LI);
(g) Learning content (LC);
(h) Course design (CD).
Finally, it is also important to note that all variables are
measured with a four-point Likert scale with the following
options: strongly disagree (SD), disagree (D), agree (A)
and strongly agree (SA).
4. The method proposed: a gravitational model for ordinal
regression
This section presents the proposed algorithm. Firstly, the
ordinal regression scenario is described. Secondly, the
definition of force and distance and the probabilistic
interpretation of the force model are presented. Thirdly,
the error function used as objective function is introduced
and motivated, and finally, the procedures used to estimate
the parameters of the model are discussed.
4.1. Ordinal regression scenario
In the ordinal regression problem, a training sample set
$D = \{(\mathbf{x}_n, y_n)\}_{n=1}^{N}$ is available, where
$\mathbf{x}_n = (x_{1n}, \ldots, x_{Kn})$ is the vector of input
variables taking values in the input space
$\Omega \subset \mathbb{R}^{K}$ and the label, $y_n$, belongs to a
finite set $C = \{C_1, \ldots, C_J\}$. Moreover, there is an order
relation between these labels, such as
$C_1 \prec C_2 \prec \ldots \prec C_J$, where $\prec$ denotes
the given order between different ranks. For the proposal, the
‘1-of-J’ encoding vector is adopted. For that reason, each target
has been encoded as
$\mathbf{y}_n = (y_n^{(1)}, y_n^{(2)}, \ldots, y_n^{(J)})$ with
$y_n^{(j)} = 1$ if the pattern is from class $j$, and
$y_n^{(j)} = 0$ if it is not. Clearly, it holds that
$\sum_{j=1}^{J} y_n^{(j)} = 1$ for every $n \in \{1, \ldots, N\}$.
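As a minimal sketch of the 1-of-J encoding, the snippet below encodes one label from the five-level rating scale mentioned in the Introduction. The helper name `one_of_j` and the constant `ORDERED_CLASSES` are hypothetical; the only assumption is that the classes are listed in rank order.

```python
def one_of_j(label, ordered_classes):
    """Return the 1-of-J vector for `label`, with classes listed in rank order."""
    if label not in ordered_classes:
        raise ValueError(f"unknown label: {label!r}")
    return [1 if label == c else 0 for c in ordered_classes]


# Classes in increasing order, so position j corresponds to class C_j.
ORDERED_CLASSES = ["poor", "average", "good", "very good", "excellent"]

encoded = one_of_j("good", ORDERED_CLASSES)
rank = encoded.index(1) + 1  # 1-based rank j of the encoded class
print(encoded, rank)
```

Each encoded vector contains exactly one 1, so its components sum to 1 as stated above.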
4.2. Force definition
4.2.1. Definition of force as defined in Cano et al. (2013)
The definition of force as proposed by Cano et al. (2013) is
described first. Cano et al. (2013) weighted the gravitation
of a class by its number of patterns and the total number
of patterns.4 In that way, the gravitation of a pattern $\mathbf{x}$
for a class $j$ was defined as

$$g(\mathbf{x}, j) = G \sum_{n=1}^{N_j} \frac{1}{d(\mathbf{x}_n, \mathbf{x}, j)^2}, \quad \mathbf{x}_n \in C_j \qquad (1)$$

$$G := 1 - \frac{N_j - 1}{N} \qquad (2)$$

where $N_j$ is the number of patterns of the class $j$ and $N$ is the
total number of patterns. Furthermore, Cano et al. (2013) used
a weight matrix $W \in \mathbb{R}^{J} \times \mathbb{R}^{K}$ to define
the importance of each attribute in each class. This matrix was
applied to the distance calculation in their gravitational model.
Therefore, the distance proposed in their work is defined as follows:

$$d(\mathbf{x}_1, \mathbf{x}_2, j) = \sqrt{\sum_{k=1}^{K} w_{j,k} \left(x_{1,k} - x_{2,k}\right)^2}, \qquad (3)$$
where x1 and x2 are two patterns and wj,k is the weight of
input variable k for class j. The attribute-class weight matrix
W was optimized by an evolutionary algorithm (trying to
maximize the performance of the final model). All the
nominal variables were transformed to binary variables
generating k-1 variables per attribute (dummy variables),
where k is the number of possible values of the nominal
attribute. Ordinal variables are treated as continuous
variables to compute the Euclidean distance. Please note
that several similar measures exist to determine the distance
between two ordered vectors like the correlation coefficient-
based metrics (such as the Spearman distance or the Kendall
distance). None of the correlation coefficient-based distance
functions satisfy the triangle inequality and hence are
known as semi-metric, and therefore, they were not included
in the gravitational proposed model (Monjardet, 1997).
Additionally, normal practice is to treat Likert scales as a
continuous variable even though they are not. As long as
they have more than five possible values, the bias from
discreteness is not large.
Finally, once the weight matrix is optimized, the class
label of each pattern is determined by comparing the
gravitational forces existing between the pattern considered
and the different classes. The pattern will adopt the label of
the class with the highest gravitational force.

4 With the goal of enhancing the accuracy of prediction for the
minority classes
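Equations (1) to (3) and the decision rule can be put together in a short sketch. It is an illustration under stated assumptions, not the implementation of Cano et al. (2013): the weights are fixed to 1 rather than optimized by an evolutionary algorithm, the two-class toy data are made up, and the function names are hypothetical. The sketch also does not handle a test pattern that coincides with a training pattern, where the distance would be zero.

```python
import math


def weighted_distance(x1, x2, class_weights):
    """Equation (3): Euclidean distance weighted by the attribute weights of one class."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(class_weights, x1, x2)))


def gravitation(x, class_patterns, class_weights, n_total):
    """Equations (1) and (2): summed inverse squared distances scaled by G."""
    g_factor = 1.0 - (len(class_patterns) - 1) / n_total
    return g_factor * sum(
        1.0 / weighted_distance(xn, x, class_weights) ** 2 for xn in class_patterns
    )


def classify(x, patterns_by_class, weights_by_class):
    """Return the class with the highest gravitation, plus all class forces."""
    n_total = sum(len(p) for p in patterns_by_class.values())
    forces = {
        label: gravitation(x, pats, weights_by_class[label], n_total)
        for label, pats in patterns_by_class.items()
    }
    return max(forces, key=forces.get), forces


# Made-up training patterns: a majority class and a minority class.
patterns = {"low": [(0.0, 0.0), (1.0, 0.0)], "high": [(4.0, 4.0)]}
weights = {"low": (1.0, 1.0), "high": (1.0, 1.0)}  # assumed, not optimized

label, forces = classify((3.0, 3.0), patterns, weights)
print(label, forces)
```

The factor G shrinks the force of the larger class, which is how the original model favours minority classes.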
4.2.2. Definition of force for the ordinal regression case
In the OR case, the model must take into account the ordinal
information of the labels. For the sake of simplicity, suppose
a problem with J = 3 classes ordered as C1 ≺ C2 ≺ C3.
Figure 1 represents the input space of this hypothetical ordinal
regression problem.
Given a new test pattern x1, according to the gravitational
models, the gravitational forces of this pattern with respect
to the three classes have to be estimated. Assuming a
Euclidean distance, the gravitational forces (computed as
proposed in Cano et al. (2013)) for the test pattern x1 are
g(x, C1) = 20.132, g(x, C2) = 1.734 and g(x, C3) = 23.006.
The highest gravitational force is g(x, C3). C3 is then
attributed to the test pattern. If there is no order relation
between the classes, the second highest gravitational force
is g(x, C1). However, if the classes are ordered, C2 is closer
to C3 than C1, and therefore, the second highest gravitational
force should be attained in this class. To generalize, the
forces associated to each class should follow a unimodal
distribution, that is, they should present only one maximum,
which should be absolute. This idea was already applied in
the context of neural networks (da Costa et al., 2008).
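The unimodality requirement can be checked with a short sketch applied to the three forces of the example above. The helper name is hypothetical, and allowing ties on either side of the peak is an assumption made here for simplicity; the text asks for a single absolute maximum.

```python
def is_unimodal(forces):
    """True if the values rise (weakly) to a single peak and then fall (weakly)."""
    peak = forces.index(max(forces))
    rising = all(forces[i] <= forces[i + 1] for i in range(peak))
    falling = all(forces[i] >= forces[i + 1] for i in range(peak, len(forces) - 1))
    return rising and falling

# Forces of the example: g(x, C1), g(x, C2), g(x, C3).
example_forces = [20.132, 1.734, 23.006]
predicted_class = example_forces.index(max(example_forces)) + 1  # 1-based class index
example_is_unimodal = is_unimodal(example_forces)
```

For the example, the predicted class is C3, but the sequence of forces dips at C2 before reaching its maximum, so it is not unimodal with respect to the class order.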
To preserve the ordinal information of the different
classes, two approaches could be applied as follows:
• Modification of the distance: The first possibility to modify
the force is to directly modify the distance allowing the
ordering of the class labels for the pattern considered. There
are many possible choices for the definition of this distance.
The most natural choice is to employ a matrix $G \in \mathbb{R}^{K \times K}$
so that the distance between two patterns is computed as
$\mathbf{x}^{T} G \mathbf{x}$, as in the Mahalanobis distance case. Another
possibility is to adopt the definition of distance of Cano
et al. (2013) (also called the weighted Euclidean distance).
Depending on the choice of the distance, the interpretability
of the model will be different. In fact, the elements of the
matrix G are a measure of the correlation between the
different attributes of the given dataset, whereas the
elements of the matrix $W \in \mathbb{R}^{J \times K}$ indicate the importance
of an attribute in the classification with respect to a certain
class. In our study, both possibilities will be considered
(extending in this direction previous works that only consider
the weighted Euclidean distance for gravitational models).
• Modification of the force: Another way to reduce the
value of gravitation in C1 in the example of Figure 1 is
to act on the definition of the force itself. For example, one
could define a general force law of a pattern x for a class j as
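$$g(\mathbf{x}, j) = \left(1 - \frac{N_j - 1}{N}\right) \sum_{n=1}^{N_j} \left(\frac{1}{d(\mathbf{x}_n, \mathbf{x} \mid j) + a_j}\right)^{v_j}, \quad \mathbf{x}_n \in C_j, \tag{4}$$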
where Nj is the number of patterns of the class j, N is the total
number of patterns and the distance is defined as in Equation
(3).
Note that in the aforementioned definition, one parameter
vj for each class is considered and that, when the distance
tends to zero, the force tends to infinity. To have a proper
control over the force value, the aj ∈ ℝ parameter is introduced
in the definition of force and is calculated for each class as
$$a_j = \left(\frac{1}{\mathit{maxForce}}\right)^{\frac{1}{v_j}}, \tag{5}$$
where maxForce is the maximum value of force allowed.
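A minimal sketch of Equations (4) and (5) may help to see how the a_j parameter bounds the force. The function names and the numerical values are hypothetical; the distance used inside is the weighted Euclidean distance of Equation (3).

```python
import math

def offset(max_force, v_j):
    """a_j of Equation (5): makes a zero-distance term equal to max_force."""
    return (1.0 / max_force) ** (1.0 / v_j)

def force(x, class_patterns, weights, v_j, n_total, max_force):
    """Force of pattern x towards one class, following Equation (4)."""
    a_j = offset(max_force, v_j)
    n_j = len(class_patterns)
    class_factor = 1.0 - (n_j - 1) / n_total
    total = 0.0
    for x_n in class_patterns:
        # Weighted Euclidean distance of Equation (3) between x_n and x.
        d = math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, x_n, x)))
        total += (1.0 / (d + a_j)) ** v_j
    return class_factor * total

# Hypothetical setting: one training pattern in the class, 10 patterns overall.
x = [0.0, 0.0]
weights = [1.0, 1.0]
force_at_zero = force(x, [[0.0, 0.0]], weights, v_j=2.0, n_total=10, max_force=100.0)
force_far = force(x, [[3.0, 0.0]], weights, v_j=2.0, n_total=10, max_force=100.0)
```

In this sketch, a training pattern that coincides with x contributes exactly maxForce instead of an unbounded value, and the contribution decays as the distance grows.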
4.3. Probabilistic interpretation of the forces
The order is included in the model following a cost-sensitive
approach, penalizing non-unimodal distributions of the
force outputs. After this procedure, a multinomial logit
formulation could be applied to define the probabilities of
each force. Therefore, for robustness of the optimization
process, the force for each class is normalized according to
the softmax activation function (Bishop, 2007). The softmax
activation function maps the range of the force for the j-th
class, into the interval [0, 1] with the additional property that
the sum of the forces of a pattern towards all classes is one.
This transformation can be seen as an estimation of the a
posteriori probability that a pattern belongs to
each class. The softmax function for the proposed
force-based model is defined as
$$P(C_l \mid \mathbf{x}) = \frac{\exp\left(g(\mathbf{x}, l)\right)}{\sum_{j=1}^{J} \exp\left(g(\mathbf{x}, j)\right)}, \tag{6}$$
where P(Cl|x) is the a posteriori probability that the pattern x
belongs to Cl and g(x,j) is defined as in Equation (4). This
transformation puts the forces on the same scale
as the target labels (because the 1-of-J encoding is adopted
in this work).
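The normalization of Equation (6) can be sketched as follows. Subtracting the largest force before exponentiating is a numerical-stability choice made in this sketch, not something stated in the text; it leaves the resulting probabilities unchanged.

```python
import math

def softmax(forces):
    """Softmax of Equation (6), shifted by the maximum for numerical stability."""
    shift = max(forces)
    exps = [math.exp(f - shift) for f in forces]
    total = sum(exps)
    return [e / total for e in exps]

# Forces of the three-class example in Section 4.2.2.
probs = softmax([20.132, 1.734, 23.006])
```

The outputs lie in [0, 1] and sum to one, and the class with the highest force keeps the highest probability.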
4.4. Error function formulation
As previously stated, the forces (or the a posteriori
probabilities) obtained for a given pattern x must follow a
Figure 1: An example of classification using gravitational-based models assuming a Euclidean distance (axes: x1 and x2).
unimodal distribution. The unimodality constraint is
imposed redefining the error function with a penalization
term for non-unimodal distributions. In our research, the
error function of the model proposed is defined as
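$$E(W \text{ or } G, \mathbf{v}) = \frac{1}{N} \sum_{n=1}^{N} \sum_{j=1}^{J} \left[ y_n^{(j)} \left( P(C_j \mid \mathbf{x}_n) - 1 \right)^{2} + \left( 1 - y_n^{(j)} \right) c_{nj}\, P(C_j \mid \mathbf{x}_n)^{2} \right] \tag{7}$$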
where cnj is the cost associated with the pattern n for the jth
class. As can be seen in Equation (7), the error function
penalizes non-unimodal outputs. The total cost matrix is
obtained as C = Y × M, where Y is the matrix representing
the ‘1-of-J’ encoding and M is a well-known cost matrix,
for example, the absolute cost matrix ($m_{ij} = |i - j|$), the
quadratic cost matrix ($m_{ij} = |i - j|^{2}$) or the zero–one cost
matrix. Note that the zero–one cost matrix is the one
assumed in nominal classification. In this study, the
penalization function with quadratic cost terms achieved
the best trade-off between convergence of the optimization
problem, quality of the solution and the related classification
performance. Therefore, the quadratic cost matrix is used for
the proposed model.
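A small sketch of Equation (7) with the quadratic cost matrix shows how the penalization favours outputs whose probability mass stays close to the true class. The function names and the two probability vectors are hypothetical; class indices are 0-based in the code.

```python
def quadratic_cost_matrix(n_classes):
    """Cost matrix M with m_ij = |i - j|^2."""
    return [[(i - j) ** 2 for j in range(n_classes)] for i in range(n_classes)]

def one_of_j(label, n_classes):
    """1-of-J encoding of a 0-based class label."""
    return [1 if j == label else 0 for j in range(n_classes)]

def error(probabilities, labels, cost_matrix):
    """Error of Equation (7), averaged over the patterns."""
    n_classes = len(cost_matrix)
    total = 0.0
    for probs, label in zip(probabilities, labels):
        y = one_of_j(label, n_classes)
        costs = cost_matrix[label]  # row of C = Y x M for this pattern
        for j in range(n_classes):
            total += y[j] * (probs[j] - 1.0) ** 2
            total += (1 - y[j]) * costs[j] * probs[j] ** 2
    return total / len(labels)

m = quadratic_cost_matrix(3)
# Same probability on the true class (index 0), different placement of the rest.
err_adjacent = error([[0.6, 0.3, 0.1]], [0], m)
err_distant = error([[0.6, 0.1, 0.3]], [0], m)
```

Both outputs assign 0.6 to the true class, but the one that places more of the remaining mass on the most distant class receives the larger error.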
4.5. Parameter estimation
The optimization of the W or the G matrices and the v vector is a
continuous optimization problem whose dimension, J × K + J or
K × K + J, depends on the number of dimensions and the number
of classes. To estimate the parameters of the model, an
evolutionary algorithm is considered. Evolutionary algorithms
have been successfully applied to estimate the parameters of
machine-learning models in recent years (Fernández-Navarro
et al., 2012; Mirchevska et al., 2014). Specifically, the CMA-ES
algorithm (Hansen and Ostermeier, 2001) was used to
determine the optimization variables (the W or the G matrix
and the v vector). The CMA-ES algorithm is an evolutionary
algorithm (global optimization procedure) for difficult
non-linear non-convex optimization problems in continuous domains.
Furthermore, the initial values for W are set to 1.0
and those for v to 2.0, that is, all dimensions are initially
considered equally relevant and the Euclidean distance is
assumed. The correlation matrix, G, was initialized to have
zero correlation between the different input variables (the
matrix was initialized to be equal to the identity matrix).
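The initialization described above can be sketched as follows. The helper names are hypothetical, and the quadratic form is applied to the difference between two patterns without a square root, which is an assumption of this sketch. The evolutionary search itself (CMA-ES) is not reproduced here.

```python
def initial_parameters(n_classes, n_inputs, use_mahalanobis):
    """Initial matrix (W of ones, or G as the identity) and v vector of 2.0."""
    v = [2.0] * n_classes
    if use_mahalanobis:
        matrix = [[1.0 if r == c else 0.0 for c in range(n_inputs)]
                  for r in range(n_inputs)]
    else:
        matrix = [[1.0] * n_inputs for _ in range(n_classes)]
    return matrix, v

def dimension(matrix, v):
    """Number of free parameters handed to the optimizer."""
    return sum(len(row) for row in matrix) + len(v)

def quadratic_form(x1, x2, g_matrix):
    """(x1 - x2)^T G (x1 - x2) for a K x K matrix G."""
    diff = [a - b for a, b in zip(x1, x2)]
    size = len(diff)
    return sum(diff[r] * g_matrix[r][c] * diff[c]
               for r in range(size) for c in range(size))

# Hypothetical problem size: J = 3 classes, K = 4 input variables.
w0, v0 = initial_parameters(3, 4, use_mahalanobis=False)
g0, _ = initial_parameters(3, 4, use_mahalanobis=True)
dim_w = dimension(w0, v0)   # J x K + J
dim_g = dimension(g0, v0)   # K x K + J
sq_dist = quadratic_form([1.0, 2.0, 0.0, 0.0], [4.0, 6.0, 0.0, 0.0], g0)
```

With G equal to the identity, the quadratic form coincides with the squared Euclidean distance, which is consistent with starting the search from uncorrelated input variables.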
4.6. Summary of methodologies
The algorithm proposed has several variations according to
the cost matrix used (nominal or ordinal classification) and
to the distance considered (the weighted Euclidean or the
Mahalanobis distance). The different combinations are
summarized as follows:
• Nominal approaches:
○ Generalized force-based model with a zero–one cost
and the Mahalanobis distance (GFMMZOC).
○ Generalized force-based model using a zero–one cost
and the weighted Euclidean distance (GFMWEZOC).
• Ordinal regression approaches:
○ Generalized force-based model assuming a quadratic
cost and the Mahalanobis distance (GFMMQC).
○ Generalized force-based model considering a quadratic
cost and the weighted Euclidean distance (GFMWEQC).
5. Computational experiments and results
This section presents the experimental study performed to
validate the new algorithms. In Section 5.1, the measures
employed to evaluate the performance of the algorithms and
the description of the algorithms chosen for the comparison
and their relevant parameters are given. The results of the
different methods selected are provided in Section 5.2.
5.1. Experimental design
For comparison purposes, different state-of-the-art methods
have been included in the experimentation. These methods
are the following:
• Nominal classifiers
○ The multi-logistic regression (MLR) algorithm. It is
based on applying the LogitBoost algorithm with
simple regression functions and determining the
optimum number of iterations by a fivefold cross-
validation (Landwehr et al., 2005).
○ An MLP with sigmoid units as hidden nodes,
obtained by means of the back-propagation algorithm
(Witten & Frank, 2005).
○ A support vector machine (SVM) (Vapnik, 1999)
nominal classifier is included in the experiments in
order to validate the contributions of our proposal. The
Cost support vector classification (SVC) available in
libSVM 3.0 (Chang & Lin, 2001) is used as the SVM
classifier implementation.
• Regression approaches
○ Regression neural network model (RNN): as stated in
Section 2, regression models can be applied to solve
the classification of ordinal data. A common
technique for ordered classes is to estimate by
regression any ordered scores $s_1 \preceq s_2 \preceq \dots \preceq s_{J-1} \preceq s_J$
by replacing the target class Ci with the score si. The
simplest case would be setting si = i; i = 1, …, J. A
neural network with a single output was trained to
estimate the scores.
• Ordinal regression approaches
○ The proportional odds model (POM) (McCullagh, 1980)
is an extension of the binary logistic regression model
for ordinal multi-class categorization problems. This is
one of the first models specifically designed for ordinal
regression, and it arose from a statistical background.
For educational purposes, this model was used in Yay
and Akıncı (2009) and Liu (2009).
○ Support vector ordinal regression (SVOR): Chu and
Keerthi (2005, 2007) proposed two support vector
approaches for ordinal regression. In this study,
both approaches are considered: the SVOR
with explicit constraints algorithm (SVOREX) and the
SVOR with implicit constraints method (SVORIM).
All SVM classifiers were run using tools available in the
libsvm library (version 3.0) (Chang & Lin, 2001). The authors
of SVOREX and SVORIM provide software tools of their
methods.5 The mnrfit function of MATLAB (MathWorks,
Natick, MA, United States) was used for training the POM
model. The MLP and MLR methods were run using Weka’s
tools.6 Finally, the RNN method was implemented following
the suggestions of Fernández-Navarro et al. (2013).
Regarding the hyper-parameters of different algorithms,
the following procedure has been applied. For the support
vector algorithms, that is, SVC, SVOREX and SVORIM,
the corresponding hyper-parameters (regularization
parameter, C and width of the Gaussian functions, γ) were
adjusted using a grid search with a fivefold cross-validation,
with the following ranges: $C \in \{10^{3}, 10^{1}, \dots, 10^{-3}\}$ and
$\gamma \in \{10^{3}, 10^{0}, \dots, 10^{-3}\}$. For the neural network algorithms,
that is, MLP and RNN, the corresponding hyper-parameters
(number of hidden neurons, H, and number of
iterations of the local search procedure, iterations 7) were
adjusted using a grid search with a fivefold cross-validation,
considering the following ranges: H ∈ {5, 10, 15, 20, 30, 40}
and iterations ∈ {25, 50, …, 500}. For the MLP method,
the learning rate was set to 0.3 and the momentum to 0.2.
Two evaluation metrics were considered to validate the
performance of the different models: (a) the Accuracy
(Acc) and (b) the mean absolute error (MAE). Acc is the
correct classification rate
$$\mathrm{Acc} = 1 - \frac{1}{N} \sum_{i=1}^{N} I\left(y_i^{*} \neq y_i\right) = 1 - \mathrm{MZE}, \tag{8}$$
where $y_i$ is the true label, $y_i^{*}$ is the predicted label, N is the
number of patterns and I(·) corresponds to the zero–one loss
function. Hence, MZE is the mean zero–one error. The MAE is
the average deviation in absolute value of the predicted rank
from the true one
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| O(y_i) - O(y_i^{*}) \right|, \tag{9}$$
where $|O(y_i) - O(y_i^{*})|$ is the distance between the true and
predicted ranks. The first measure is simply the fraction of
correct predictions on individual samples. The second
metric is defined as the average deviation of the prediction
from the true targets. These two measures aim to evaluate
different aspects when an OR problem is considered:
accuracy measures whether patterns are generally well classified,
whereas the MAE measures whether the classifier tends to predict a
class as close to the real class as possible.
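The two metrics of Equations (8) and (9) can be sketched directly on lists of ranks. The function names and the toy labels are hypothetical, and classes are assumed to be encoded by their integer rank.

```python
def accuracy(true_ranks, predicted_ranks):
    """Acc of Equation (8): one minus the mean zero-one error."""
    misses = sum(1 for t, p in zip(true_ranks, predicted_ranks) if t != p)
    return 1.0 - misses / len(true_ranks)

def mean_absolute_error(true_ranks, predicted_ranks):
    """MAE of Equation (9): mean absolute difference between ranks."""
    deviations = [abs(t - p) for t, p in zip(true_ranks, predicted_ranks)]
    return sum(deviations) / len(true_ranks)

true_ranks = [1, 2, 3, 3]
near_miss = [1, 2, 3, 2]   # one error, one rank away from the true class
far_miss = [1, 2, 3, 1]    # one error, two ranks away from the true class

acc_near, acc_far = accuracy(true_ranks, near_miss), accuracy(true_ranks, far_miss)
mae_near = mean_absolute_error(true_ranks, near_miss)
mae_far = mean_absolute_error(true_ranks, far_miss)
```

Both predictions have the same accuracy, but the MAE separates them because it accounts for how far the wrong label is from the true one.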
Finally, regarding the evaluation of the performance of
the different methods, multiple random splits of the
datasets were considered. For the educational OR
problems, 30 splits with 50 % and 50 % of the instances
in the training and test sets were considered, respectively.
All the partitions were the same for all the methods
evaluated, and one model was trained and evaluated for each
split. A similar experimental setup was used in a recent
review of ordinal models (Gutiérrez et al., 2012).
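One way to obtain identical partitions for every method is to generate the splits once from a fixed seed, as in the sketch below. The function name, the seed and the use of plain (non-stratified) random shuffling are assumptions; the text does not state how the splits were drawn.

```python
import random

def make_splits(n_patterns, n_splits=30, train_fraction=0.5, seed=0):
    """Return a list of (train_indices, test_indices) pairs, reproducible by seed."""
    rng = random.Random(seed)
    n_train = int(round(n_patterns * train_fraction))
    splits = []
    for _ in range(n_splits):
        indices = list(range(n_patterns))
        rng.shuffle(indices)
        splits.append((sorted(indices[:n_train]), sorted(indices[n_train:])))
    return splits

# Hypothetical dataset with 10 patterns.
splits = make_splits(n_patterns=10)
```

Calling the function again with the same seed yields the same partitions, so every method can be trained and evaluated on exactly the same 30 splits.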
5.2. Results
The gravitation-based methods were compared with the
well-known nominal classification, OR and regression
techniques described in Section 5.1, using the OR metrics.
Table 2 shows the overall generalization results obtained
with the different techniques tested. A descriptive analysis
of the results leads to the following remarks: (a) The
GFMMQC method achieved the best results in two datasets
and the second best result in one case using the MAEG
metric as the test variable, while the GFMWEQC achieved the
best performance in one dataset and the second best result
in another problem using the same metric. (b) The
gravitational ordinal models are still competitive in AccG,
achieving the best results in two problems and the second
best results in two other problems. As can be observed,
the POM model is not able to reflect non-linear
relationships among input variables, necessary for
performing a realistic classification task. It is important to
highlight that this was the model that was tested in Yay
and Akıncı (2009) and in Liu (2009). In general, OR
models tended to outperform their nominal counterparts
(the SVORIM and the SVOREX methods obtained better
results than their nominal version, the SVC).
Finally, each pair of algorithms is compared by means of
the Wilcoxon test (Demsar, 2006). A level of significance of
α = 0.05 was considered, and the corresponding correction
for the number of comparisons was also included. The
control method was the GFMMQC method because it obtained
the best mean ranking, especially in the MAEG metric
(which is particularly useful in ordinal problems). As shown in Table 2,
the GFMMQC achieves state-of-the-art performance in the OR field.
6. Discussions
In this section, we analyse the force-based ordinal models
(both the model based on the Mahalanobis distance, the
GFMMQC method and the one based on the weighted Euclidean
5 SVOREX and SVORIM methods source code available at http://gatsby.ucl.ac.uk/ chuwei/svor.html
6 Weka: http://www.cs.waikato.ac.nz/ml/weka/
7 The iterations in the MLP method correspond to the training time required
distance, the GFMWEQC method) of the first split out of the
thirty splits in the TSE-W-SME dataset. The most important
attributes in the classification of student satisfaction
obtained according to the GFMWEQC model are extracted. The
most significant correlations detected by the GFMMQC model
are examined. Both analyses are useful to extract meaningful
knowledge to improve the teaching-learning process.
Finally, some educational recommendations based on the
interpretation of the models are provided.
6.1. Analysis of the best GFMWEQC model
This section provides an interpretation of the GFMWEQC model
of the first split out of the 30 splits in the TSE-W-SME
dataset. Firstly, the statistical properties of the model are
described. Table 3 shows the statistical results of the model
implemented including the confusion matrices as well.8
Table 2: Generalization results of the AccG and MAEG of the methods proposed compared with those obtained using different statistical and artificial intelligence methods. Results and p-values of the Wilcoxon rank sum test

TSE-I-SME
Method      AccG            MAEG             p-value (Acc)   p-value (MAE)
MLR         89.07 (1.82)    0.1226 (0.02)    3.0E-11∘        2.8E-11∘
MLP         87.98 (2.10)    0.1358 (0.01)    3.0E-11∘        3.0E-11∘
SVC         93.71 (0.89)    0.0947 (0.01)    0.0724          2.9E-9∘
GFMMZOC     93.03 (1.42)    0.0834 (0.01)    1.7E-4∘         3.1E-5∘
GFMWEZOC    93.44 (1.39)    0.0914 (0.01)    0.0063∘         3.3E-8∘
RNN         89.56 (2.03)    0.1688 (0.02)    4.0E-11∘        3.0E-11∘
POM         90.89 (1.67)    0.1117 (0.02)    7.3E-11∘        3.6E-11∘
SVOREX      93.16 (1.33)    0.0797 (0.01)    0.0011∘         0.0042∘
SVORIM      93.69 (1.22)    0.0761 (0.01)    0.0679          0.0963
GFMMQC      94.18 (1.01)    0.0719 (0.01)    —               —
GFMWEQC     93.57 (1.84)    0.0822 (0.01)    —               —

TSE-W-SME
Method      AccG            MAEG             p-value (Acc)   p-value (MAE)
MLR         87.25 (1.27)    0.1432 (0.02)    1.3E-8∘         1.3E-10∘
MLP         86.43 (2.12)    0.1667 (0.01)    2.4E-9∘         3.0E-11∘
SVC         88.43 (0.08)    0.1101 (0.03)    1.6E-7∘         2.8E-04∘
GFMMZOC     88.73 (1.78)    0.1088 (0.01)    1.3E-6∘         1.7E-04∘
GFMWEZOC    88.93 (1.33)    0.1091 (0.01)    5.4E-5∘         2.3E-04∘
RNN         82.56 (1.43)    0.1907 (0.01)    1.1E-11∘        3.0E-11∘
POM         88.05 (0.22)    0.1232 (0.02)    4.4E-7∘         1.2E-7∘
SVOREX      89.76 (0.87)    0.1221 (0.01)    0.0016∘         2.4E-6∘
SVORIM      89.95 (1.11)    0.1209 (0.02)    0.0456∘         3.7E-7∘
GFMMQC      91.65 (2.35)    0.0935 (0.01)    —               —
GFMWEQC     90.57 (2.18)    0.0957 (0.01)    —               —

TAE
Method      AccG            MAEG             p-value (Acc)   p-value (MAE)
MLR         49.42 (5.77)    0.5053 (0.05)    3.0E-11∘        7.7E-9∘
MLP         55.33 (3.26)    0.4697 (0.08)    3.0E-11∘        2.6E-6∘
SVC         59.60 (7.39)    0.4483 (0.07)    0.2971          1.7E-6∘
GFMMZOC     58.51 (4.59)    0.4493 (0.07)    0.0031∘         3.9E-4∘
GFMWEZOC    58.96 (5.73)    0.4386 (0.05)    0.3722          0.0023∘
RNN         54.56 (4.76)    0.4797 (0.05)    3.0E-11∘        2.2E-7∘
POM         50.44 (7.73)    0.4962 (0.07)    3.0E-11∘        7.7E-6∘
SVOREX      57.89 (5.82)    0.4411 (0.06)    0.0040∘         0.0300∘
SVORIM      57.63 (5.71)    0.4011 (0.07)    1.94E-4∘        0.9823
GFMMQC      59.18 (6.40)    0.3983 (0.05)    —               —
GFMWEQC     59.40 (6.40)    0.3911 (0.05)    —               —

CLS
Method      AccG            MAEG             p-value (Acc)   p-value (MAE)
MLR         56.67 (1.21)    0.5730 (0.08)    3.0E-11∘        3.0E-11∘
MLP         62.22 (1.89)    0.4818 (0.09)    3.0E-11∘        5.6E-8∘
SVC         69.30 (1.71)    0.4603 (0.05)    0.0850          2.2E-5∘
GFMMZOC     69.25 (1.76)    0.4636 (0.05)    0.2519          3.1E-4∘
GFMWEZOC    68.85 (0.69)    0.4795 (0.03)    0.9823          1.7E-7∘
RNN         60.12 (1.03)    0.5054 (0.06)    3.0E-11∘        8.1E-10∘
POM         57.23 (1.43)    0.5689 (0.05)    3.0E-11∘        3.0E-11∘
SVOREX      67.80 (0.89)    0.4235 (0.07)    5.9E-5∘         0.6843
SVORIM      68.71 (1.34)    0.4269 (0.03)    0.4733          0.9589
GFMMQC      68.90 (2.14)    0.4274 (0.05)    —               —
GFMWEQC     67.12 (1.99)    0.4312 (0.07)    —               —

MLR, multi-logistic regression; MLP, multi-layer perceptron; SVC, support vector classification; RNN, regression neural network model; POM, proportional odds model; SVOREX, support vector ordinal regression with explicit constraints algorithm; SVORIM, support vector ordinal regression with implicit constraints; TSE-I-SME, Turkiye student dataset including the single-minded evaluation; TSE-W-SME, Turkiye student dataset without including the single-minded evaluation; TAE, Teaching assistant evaluation; CLS, Culture and learners satisfaction.
The best result is in bold face and the second one in italics.
∘: The null hypothesis that the results provided by the comparison method and the results of GFMMQC are samples from continuous distributions with equal medians is rejected.
8 The contingency or confusion matrix CM for a classification problem with J classes and N training or generalization patterns is given by the following expression:
$$M = \left\{ n_{ij};\ \sum_{i,j=1}^{J} n_{ij} = N \right\} \tag{10}$$
where $n_{ij}$ represents the number of times the patterns are predicted by classifier g to be in class j when they really belong to class i. The diagonal corresponds to correctly classified patterns and the off-diagonal to mistakes in the classification task.
Table 3: Statistical values of the best GFMWEQC model
Best GFMWEQC ordinal regression model
AccT = 100.00%; AccG = 92.80%
MAET = 0.0000; MAEG = 0.0832
$$CM_T = \begin{pmatrix} 648 & 0 & 0 \\ 0 & 507 & 0 \\ 0 & 0 & 213 \end{pmatrix}; \quad CM_G = \begin{pmatrix} 678 & 18 & 3 \\ 44 & 452 & 10 \\ 13 & 14 & 185 \end{pmatrix}$$
Table 3 includes the following information: accuracy on the
training set (AccT), accuracy on the generalization (test) set
(AccG), MAE on the training set (MAET) and MAE on the
generalization set (MAEG), confusion matrix (CM) for the
training set (CMT) and CM for the generalization set
(CMG). As can be seen in the CMs in Table 3, the model in
general is able to reflect the order among the different classes.
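The accuracy and the MAE can be recomputed from a confusion matrix, as the following sketch does for the generalization matrix CMG of Table 3. It assumes that rows are true classes and columns are predicted classes (as in the definition of the confusion matrix given in the footnote) and that adjacent classes differ by one rank; the function name is hypothetical.

```python
def metrics_from_confusion(cm):
    """Return (accuracy, MAE) for a square confusion matrix.

    Rows are assumed to be true classes and columns predicted classes,
    with classes indexed by consecutive ranks.
    """
    size = len(cm)
    n = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(size))
    abs_dev = sum(abs(i - j) * cm[i][j] for i in range(size) for j in range(size))
    return correct / n, abs_dev / n

# Generalization confusion matrix of Table 3.
cm_g = [[678, 18, 3],
        [44, 452, 10],
        [13, 14, 185]]
acc_g, mae_g = metrics_from_confusion(cm_g)
```

Under these assumptions, the recomputed values agree with the AccG and MAEG reported in Table 3 to within rounding in the last reported digit.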
Next, we interpret the coefficients of the $W \in \mathbb{R}^{J \times K}$
matrix. The elements of the W matrix indicate the importance
of an attribute in the classification with respect to a certain
class. Thus, the model detects the most important attributes
in the classification of the student’s satisfaction level (taking
into account the ordinal nature of the dependent variable).
The algorithm detected that the seven most influential variables
in the determination of student satisfaction were the following
(in this order): instructor’s knowledge (Q13), instructor’s
effective use of the class hours (Q19), instructor’s coherence
with lesson plan (Q15), openness and respect of the instructor
to students’ views (Q22), instructor’s positive approach to
students (Q21), instructor readiness for classes (Q14), instructor
explanations about the course and instructor helpfulness (Q20).
On the other hand, the least important variables to explain
student satisfaction ratings were the following (in this order):
new perspective of students’ life and world (Q12), clearness of
course aims (Q1), subject (S), difficulty (D), attendance (A),
professor (P) and number of repetitions of the course (R). It
can be seen from these results that the best indicators of student
satisfaction are those related to professor competencies.
Specifically, the students tend to consider the variables related
to the professor more important, giving less importance to the
variables related to the effect of learning and the course design.
These results align with previous research that claims that the
learner’s satisfaction is positively correlated with quality of
learning outcomes. For example, Palmer and Holt (2009)
justified the importance of adopting an interactive learning
approach instead of a planned learning approach where the
learners’ satisfaction is promoted mainly through the elements
existing in the educational interaction (instead of basing the
learners’ satisfaction on the preparation of the lectures and on
the content taught during the lectures). These results also
validate the work of Chang and Smith (2008), where the
importance of the educational interaction for learner
satisfaction is strongly emphasized. Furthermore, our work is
also in line with the works of Bangert (2008) and Shea and
Bidjerano (2009), where the importance of the social presence
in educational environments is highlighted. Differing from the
study of Atay and Yildirim (2010), our learners do not consider
the elements of learning transfer for their academic satisfaction
important; instead, they focus on the elements of the
instructional moment. It is also worth highlighting that this
study contradicts the traditional belief that the student
satisfaction is highly correlated with the difficulty of the course
as perceived by the learners. Unfortunately, this study has not
considered all the variables affecting learner satisfaction
reported in the learner satisfaction literature (Yukselturk,
2009). For example, the educational level, the self-efficacy or
the locus of control variables were not included in this study.
Finally, we compare our work with that of Gündüz and
Fokoué (2013); in particular, we discuss the similarities and
differences of the two studies. Gündüz and Fokoué (2013)
concluded that the Q10, Q14, Q20 and Q24 questions were the
most important variables to explain student satisfaction ratings.
In the study of Gündüz and Fokoué (2013), learners give more
importance to individual questions and structural aspects related
to the design of the learning process, in contrast to what has been
found in our study. Their study also shares some similarities with
our study. For example, both studies consider the instructor
readiness for classes and the instructor’s positive approach to
students to be very important. The differences between the two
studies can be explained by the following reasons. Firstly,
Gündüz and Fokoué (2013) included the single-minded
evaluators in their dataset, while in the TSE-W-SME, these
evaluators were discarded. Secondly, Gündüz and Fokoué
(2013) applied a nominal classifier to detect the most
important variables, ignoring the ordinal information existing
in the dependent variable. Taking into account the ordering
information, our study was able to outperform the base
classifier adopted in the study of Gündüz and Fokoué (2013).
6.2. Analysis of the best GFMMQC model
In this section, we analyse the performance of the best GFMMQC
model, interpreting its coefficients as a way of improving the
learning–teaching process. Table 4 shows the statistical results
of the best model implemented. As can be seen in the CMs in
Table 4, the model promotes the ordering among the different
classes. There are fewer errors between non-adjacent classes than
between adjacent ones. For example, considering the test set,
eight Satisfied students were classified as Neutral and
just two Satisfied students were classified as Dissatisfied.
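The comparison between adjacent and non-adjacent errors can be made explicit by grouping the off-diagonal counts of the generalization confusion matrix of Table 4 by rank distance. The function name is hypothetical, and rows are assumed to be true classes and columns predicted classes.

```python
def errors_by_rank_distance(cm):
    """Count misclassifications grouped by |true rank - predicted rank|."""
    counts = {}
    for i, row in enumerate(cm):
        for j, n in enumerate(row):
            if i != j:
                distance = abs(i - j)
                counts[distance] = counts.get(distance, 0) + n
    return counts

# Generalization confusion matrix of Table 4.
cm_g = [[689, 8, 2],
        [19, 482, 5],
        [6, 10, 196]]
by_distance = errors_by_rank_distance(cm_g)
```

For this matrix, the errors between adjacent classes (distance 1) clearly outnumber those between the two extreme classes (distance 2).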
On the other hand, we also analyse the coefficients of the
$G \in \mathbb{R}^{K \times K}$ matrix. The elements of the matrix G are a
measure of the correlation between the different attributes
of the given dataset. The G matrix represents the existing
covariance between the independent variables. In this
section, we focus our attention on the existing correlations
detected by the algorithm for the ordered variables (numeric
and Likert variables).9 It is important to note that these
9 Note that all nominal variables were transformed to binary variables generating k variables per attribute, where k is the number of possible values of the nominal attribute.
Table 4: Statistical values of the best GFMMQC model
Best GFMMQC ordinal regression model
AccT = 100.00%; AccG = 96.47%
MAET = 0.0000; MAEG = 0.0409
$$CM_T = \begin{pmatrix} 648 & 0 & 0 \\ 0 & 507 & 0 \\ 0 & 0 & 213 \end{pmatrix}; \quad CM_G = \begin{pmatrix} 689 & 8 & 2 \\ 19 & 482 & 5 \\ 6 & 10 & 196 \end{pmatrix}$$
correlations are needed to perform the mapping from the
input space to the output one. They are not necessarily the
existing correlations in the original input space. In other
words, two variables that are correlated in the G matrix
are not necessarily correlated in the input space (there is
not necessarily a problem with the multicollinearity of the
input data). The multicollinearity problem affects linear
models specifically. The proposed model is capable of better
modelling of the problem in this scenario as observed in the
experimental results.
Then, we proceed to analyse the meaning and the impact
of the existing correlations among the most important
variables detected by the previous algorithm (Q13, Q19,
Q15, Q22, Q21, Q14 and Q20 variables). The instructor’s
knowledge (Q13) variable is highly correlated to the
instructor’s effective use of the class hours (Q19) variable.
Accordingly, having top-level knowledge allows
teachers to use their teaching hours effectively. This
synergy has an important effect on the final student
satisfaction rating as shown in the previous section. On the
other hand, the instructor’s coherence with the lesson plan
(Q15) variable is significantly correlated to the following
variables: openness and respect of the instructor to students’
views (Q22), instructor’s positive approach to students
(Q21), instructor readiness for classes (Q14) and instructor
explanations about the course and instructor helpfulness
(Q20). Taking into account the existing correlations
between the most important variables, we recommend that
the instructors focus on the two following ones:
• The instructor’s knowledge (Q13): By improving this
variable, we can also improve the second most important
variable (instructor’s effective use of the class hours
(Q19)). This finding offers a new point of view on the
traditional image of a university professor. Subject
matter knowledge is important; however, in the
teaching–learning process, it is also necessary
to have a professor who can communicate his/her
knowledge effectively (an effective professor). The work
of Gibbs and Coffey (2004) showed that there were
significant positive changes in effectiveness among
trained teachers and negative changes among
untrained teachers. In fact, this is one of the reasons
why professor training is so appreciated in universities
around the world. It is therefore critical to pay special
attention to the training of professors. In line with this,
students give the effective use of the class hours
variable an organizational and personal status because they
do not feel that they are wasting time in class. The duo composed
of the knowledge of the professor variable and the
methodology applied variable is extended to a triplet in this
study (by the inclusion of the effective use of hours
variable). In this new framework of learning, the feeling
of learning governs the experience of learning. To assess
the scope of these findings, these
correlations should be linked to learning outcomes. Thus,
we would have an external measure of the students’
perceptions about what gives them greater academic
satisfaction.
• The instructor’s coherence with lesson plan (Q15):
By controlling this variable, the instructor may also achieve
high ratings in the next four most important variables
(Q22, Q21, Q14 and Q20). The teachers’ coherence in
their teaching–learning method directly affects the
student’s academic satisfaction. Empirically, we have
shown that the participation that teachers allow their
students is a powerful variable in the determination of
the student’s satisfaction rating. Thus, students do not
have a good perception of a professor who gives a
traditional lecture (without any interaction). They
appreciate it when the professor follows the learning plan
rigorously, interacts with them taking into account their
previous knowledge and experiences in a positive way
and is willing to help in their learning. This stresses the
importance of leaving the traditional teaching method
(where the only interaction with the learners is the
transfer of the professor’s knowledge to the students) to
adopt a more interactive teaching method (focused on
enhancing students’ skills and enabling students to
continue the quest for knowledge throughout their
studies). On the other hand, the inquiry-based learning
approach promotes the social presence, the cognitive
presence and the teaching presence (Garrison, 2011). All
these variables were highlighted as key variables in the
prediction of students’ performance. Therefore, the
adoption of this learning approach will allow
practitioners to improve their student evaluations.
7. Conclusions
The presented work introduces the OR paradigm in the field
of EDM, enlarging the techniques available in the EDM
framework (mainly focused on nominal classification and
regression approaches). The presented paradigm differs
from existing nominal classification or regression techniques
in the nature of the variable of study. The main particularity
of an OR problem is that its variable of study (also called
dependent variable) is discrete and that the labels of its
different classes have an intrinsic (natural) order. The
proposed OR models could be used to:
• Predict students’ future learning behaviour using an
ordinal regression approach;
• Study the effects of different kinds of technological-
pedagogical support;
• Advance scientific knowledge about learning and
learners.
After presenting some OR problems that were improperly
addressed with nominal classification or regression methods,
more attention has been given to describing an interpretable
and amenable model based on the concept of gravitation.
The model was specially designed for this research. It took
into account all the specific characteristics of the problem
© 2015 Wiley Publishing Ltd Expert Systems, April 2016, Vol.
33, No. 2 171
to be tackled. The proposed method extends the state-of-
the-art of gravitational models by generalizing the definition
of force in its mathematical expressions. Furthermore, the
model was adapted to the ordinal scenario (imposing
the well-known unimodal constraint in the outputs of the
model). The proposed models are easily interpretable, which
makes them especially interesting for educational purposes,
enabling use by educational practitioners, not just by
researchers. To exhibit the importance of this paradigm
for the EDM community and also the interpretability of
the proposed models, the methods were tested with four
OR EDM datasets. The gravitational ordinal models
achieved competitive performance, especially when
compared with state-of-the-art classification models.
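The unimodal constraint mentioned above can be illustrated with a small check on the class probabilities of an ordinal model: over the ordered classes, the probabilities should rise to a single peak and then fall. The following is a minimal sketch and not the model proposed in this work; the function names `softmax` and `is_unimodal` and the example scores are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Map raw class scores to values in [0, 1] that sum to one."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def is_unimodal(probs, tol=1e-12):
    """Return True if probs never rises again once it has started to fall."""
    decreasing = False
    for prev, curr in zip(probs, probs[1:]):
        if curr < prev - tol:
            decreasing = True
        elif curr > prev + tol and decreasing:
            return False
    return True


# Hypothetical scores for five ordered classes (illustrative values only).
unimodal_probs = softmax([0.0, 1.0, 2.0, 1.0, 0.0])  # single peak at class 3
bimodal_probs = softmax([2.0, 0.0, 2.0, 0.0, 0.0])   # peaks at classes 1 and 3

print(is_unimodal(unimodal_probs))
print(is_unimodal(bimodal_probs))
```

A model that outputs the second vector would assign high probability to two non-adjacent ordered classes at once, which is the situation the unimodal constraint is meant to rule out.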
Finally, it is worth mentioning that the interpretation of
the model allows us to extract some important conclusions
for educational environments. The main educational
findings of this study are that the two key factors in the
prediction of learners’ satisfaction are the instructor’s
knowledge and the instructor’s coherence with the lesson
plan. The remaining most important variables are strongly
correlated with these ones. From these correlations, we show
the importance of leaving traditional teaching methods to
adopt a teaching method that analyses and appreciates the
students’ knowledge and skills and promotes the interaction
between the main actors in the learning–teaching experience.
Acknowledgements
The research work of F. Fernández-Navarro was partially
supported by the TIN2014-54583-C2-1-R project of the
Spanish Ministry of Economy and Competitiveness
(MINECO), FEDER funds and the P2011-TIC-7508
project of the “Junta de Andalucia” (Spain).
Appendix A. EDM techniques implemented nowadays
Currently, EDM techniques have been used mainly to address
the following types of problems (Romero & Ventura, 2007):
(a) Data analysis and visualization: The objective of data
analysis and visualization is to summarize useful
information in a visual way and to support the
decision-making process. Statistics and information
visualization are the two main techniques used for this
task. Statistics on students’ usage are a powerful tool
to evaluate the impact of an e-learning system. Usage
statistics may be extracted using standard tools
designed to analyse web server logs (Zaïane et al.,
1998). Other general statistics may also represent the
connected student distribution through time or the
most frequently accessed courses (Zorrilla et al., 2005).
(b) Clustering: It is the task of grouping a set of patterns in
such a way that patterns in the same group (called a
cluster) are more similar (in general according to a
distance criterion) to each other than those in the other
groups (clusters). For example, in Tang et al. (2000),
data clustering is used to promote group-based
collaborative learning. They found clusters of students
with similar learning characteristics based on the
sequence and the contents of the pages they visited.
(c) Classification: The main objective of classification is to
identify to which of a set of categories (sub-populations)
a new pattern (also called observation or instance)
belongs, on the basis of a training set of data containing
patterns whose category membership is known (Zafra
et al., 2011). An example of this task could be the
classification of the final grade of the students based
on features extracted from web-logs, as proposed in
Minaei-Bidgoli and Punch (2003). In this research work,
the dependent variable is estimated as a discrete variable
(the final grade is represented as A, B, C and D). One
problem associated with this approach is that the
ordinal nature of the dependent variable was not taken
into account in the design of the classifier. This can
result in the underperformance of the final classifier.
(d) Regression: The main objective of regression analysis is the
prediction of the value of a continuous dependent variable
according to the values of several independent variables.
In classification problems, the dependent variable is
discrete, while in regression analysis, the dependent
variable is continuous. An example of regression analysis
is the prediction of the final grade of certain students. In
this specific problem, the final grade should be represented
as a continuous variable (ranging from 0 to 10).
(e) Outlier detection (or anomaly detection): It is the
identification of items, events or patterns that do not
conform to an expected pattern or to other items in a
dataset. Typically, the anomalous items will translate
to some kind of problem such as bank fraud or a
structural defect. Anomalies are also referred to as
outliers, novelties, noise, deviations and exceptions.
Ueno (2003) proposes to use the response time data
from e-learning environments as a means of detecting
outliers or irregular learning patterns in learners. The
outlier statistics are developed considering both
students’ abilities and content difficulties.
(f) Association rule learning: It is a method for discovering
interesting relations between variables in large datasets.
For example, the rule { morning, high flexibility } → {A}
found in e-learning systems would indicate that if the
student of an e-learning system has a high flexibility in
his/her study time and also he/she studies during the
morning, then he or she is likely to achieve the best
grade in his/her studies. In this context, it is worth
highlighting the work of Romero et al. (2004), where
grammar-based genetic programming with multi-objective
optimization techniques is proposed for providing
feedback to courseware authors.
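As an illustration of item (f), a rule such as { morning, high flexibility } → {A} is usually judged by its support (the fraction of records containing every item of the rule) and its confidence (the fraction of records matching the antecedent that also contain the consequent). The sketch below computes both measures; the student records are invented for illustration and are not taken from any dataset used in this work, and the helper names are assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return both / base if base > 0 else 0.0


# Hypothetical student records: study time, flexibility and final grade.
records = [
    {"morning", "high flexibility", "A"},
    {"morning", "high flexibility", "A"},
    {"morning", "high flexibility", "B"},
    {"evening", "high flexibility", "C"},
    {"morning", "low flexibility", "B"},
]

antecedent = {"morning", "high flexibility"}
consequent = {"A"}
rule_support = support(records, antecedent | consequent)
rule_confidence = confidence(records, antecedent, consequent)

print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

In these toy records, two of the five students match the whole rule and three match the antecedent, so the rule holds for two out of the three students who study in the morning with high flexibility.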
Appendix B. Questionnaire for the Turkiye Student
Evaluation Dataset
The questions answered by the students are again on an ordinal scale.
Concretely, they are defined with a 5-point Likert scale with
the following values {strongly disagree, disagree, neutral,
agree, strongly agree}. Specifically, the students answered
the following questions:
• Q1: The semester course content, teaching method and
evaluation system were provided at the start.
• Q2: The course aims and objectives were clearly stated at
the beginning of the period.
• Q3: The course was worth the amount of credit assigned to it.
• Q4: The course was taught according to the syllabus
announced on the first day of class.
• Q5: The class discussions, homework assignments,
applications and studies were satisfactory.
• Q6: The textbook and other course resources were
sufficient and up to date.
• Q7: The course allowed field work, applications,
laboratory, discussion and other studies.
• Q8: The quizzes, assignments, projects and exams
contributed to helping the learning.
• Q9: I greatly enjoyed the class and was eager to actively
participate during the lectures.
• Q10: My initial expectations about the course were met at
the end of the period or year.
• Q11: The course was relevant and beneficial to my
professional development.
• Q12: The course helped me look at life and the world with
a new perspective.
• Q13: The Instructor’s knowledge was relevant and up to
date.
• Q14: The Instructor came prepared for classes.
• Q15: The Instructor taught in accordance with the
announced lesson plan.
• Q16: The Instructor was committed to the course and was
understandable.
• Q17: The Instructor arrived on time for classes.
• Q18: The Instructor has a smooth and easy to follow
delivery/speech.
• Q19: The Instructor made effective use of class hours.
• Q20: The Instructor explained the course and was eager
to be helpful to students.
• Q21: The Instructor demonstrated a positive approach to
students.
• Q22: The Instructor was open and respectful of the views
of students about the course.
• Q23: The Instructor encouraged participation in the
course.
• Q24: The Instructor gave relevant homework
assignments/projects, and helped/guided students.
• Q25: The Instructor responded to questions about the
course inside and outside of the course.
• Q26: The Instructor’s evaluation system (midterm and final
questions, projects, assignments, etc.) effectively
measured the course objectives.
• Q27: The Instructor provided solutions to exams and
discussed them with students.
• Q28: The Instructor treated all students in a fair and
objective manner.
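Because the answers to Q1–Q28 lie on an ordered 5-point scale, a common preprocessing step is to encode each label as an integer that preserves the order. The sketch below shows one such encoding; the codes 1 to 5, the names `LIKERT_CODE` and `encode_responses`, and the sample answers are assumptions for illustration and are not taken from the dataset documentation.

```python
# Ordered labels of the 5-point Likert scale, from lowest to highest.
LIKERT_LEVELS = [
    "strongly disagree",
    "disagree",
    "neutral",
    "agree",
    "strongly agree",
]

# Assumed integer coding 1..5 that preserves the natural order of the labels.
LIKERT_CODE = {label: rank for rank, label in enumerate(LIKERT_LEVELS, start=1)}


def encode_responses(responses):
    """Convert {question: label} into {question: ordinal code}."""
    return {q: LIKERT_CODE[answer.strip().lower()] for q, answer in responses.items()}


# Hypothetical answers of one student to three of the questions.
sample = {"Q13": "Strongly agree", "Q15": "agree", "Q9": "neutral"}
encoded = encode_responses(sample)
print(encoded)
```

The integer codes only express order; the distance between consecutive codes is not claimed to be meaningful, which is the reason an ordinal rather than a standard regression treatment is discussed in this work.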
Appendix C. Glossary of Terms
A list of technical words related to the manuscript is given in
the following.
• Accuracy: Accuracy is the percentage of patterns correctly
classified by the model. It is also known as the Correct
Classification Rate (CCR).
• Educational data mining: Educational Data Mining is a field
of study where the goal is to develop new methods for
exploring educational data.
• Confusion matrix (or contingency matrix): A confusion matrix,
also known as a contingency matrix, is a table that contains
information about actual and predicted classifications
carried out by a classification model. Each column of the
matrix encompasses the instances in a predicted class, while
each row includes the instances in an actual class.
• Classification: In supervised learning, classification is the
problem of determining to which of a set of categories a new
pattern belongs, on the basis of a training set where the
category to which each pattern belongs and its
characterization are known.
• Clustering: Clustering is the task of grouping a set of
patterns in such a way that patterns in the same category are
more similar to each other than to those in other categories.
• Error term: In econometrics, the error term is a variable
that represents the differences between the real data and the
predicted ones. The error term is also known as the
‘residual’ term.
• Mean absolute error: Mean absolute error is the average
absolute deviation of the predictions from the actual targets.
• Distance metric learning: Distance metric learning is a task
where the goal is to learn a metric for the input data space
from a given set of pairs of similar/dissimilar patterns that
preserves the distance relation among the training set.
• Ordinal regression: The learning task of ordinal regression
is to assign patterns into a set of finite ordered classes.
• Regression: In statistics, regression is a task where the
goal is to estimate the relationship among one or more input
variables and a scalar (continuous) dependent variable.
• Softmax function: The softmax activation function maps the
output for each class into the interval [0, 1] such that the
outputs over all classes sum to one.
• Unimodal distribution: The unimodal distribution is a kind of
distribution where the data have only one clear peak.
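The glossary entries for the confusion matrix, accuracy and mean absolute error can be made concrete with a short sketch. It also illustrates why accuracy alone can hide ordinal errors: two sets of predictions with the same accuracy may differ greatly in how far their mistakes lie from the true class. The grade coding (D=0, C=1, B=2, A=3) and the example predictions are assumptions for illustration, not results from this work.

```python
def confusion_matrix(actual, predicted, n_classes):
    """Rows index the actual class, columns index the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(actual, predicted):
    """Fraction of patterns classified correctly (CCR)."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average absolute distance, in class ranks, between prediction and target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical grades coded as D=0, C=1, B=2, A=3.
actual = [0, 1, 2, 3, 3, 2]
near_miss = [1, 1, 2, 2, 3, 2]  # two errors, each one rank away
far_miss = [3, 1, 2, 0, 3, 2]   # two errors, each three ranks away

for name, pred in (("near_miss", near_miss), ("far_miss", far_miss)):
    print(name, accuracy(actual, pred), mean_absolute_error(actual, pred))

print(confusion_matrix(actual, near_miss, n_classes=4))
```

Both prediction sets classify four of the six patterns correctly, so their accuracy is identical, while the mean absolute error separates the predictions that confuse adjacent grades from those that confuse the lowest grade with the highest.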
References
AKIRI, A.A. and N.M. UGBORUGBO (2009) Teachers’ effectiveness and students’ academic performance in public secondary schools in Delta State, Nigeria, Stud Home Comm Sci, 3, 107–113.
ATAY, L. and H.M. YILDIRIM (2010) Determining the factors that affect the satisfaction of students having undergraduate tourism education with the department by means of the method of classification tree, Tourismos: An International Multidisciplinary Journal of Tourism, 5, 73–87.
BAKER, R. (2010) Data mining for education, International Encyclopedia of Education, 7, 112–118.
BANGERT, A. (2008) The influence of social presence and teaching presence on the quality of online critical inquiry, Journal of Computing in Higher Education, 20, 34–61.
BARBERA, E. and J. LINDER-VAN BERSCHOT (2011) Systemic multicultural model for online education: Tracing connections among learner inputs, instructional processes and outcomes, Quarterly Review of Distance Education, 12, 167–180.
BHARDWAJ, B.K. and S. PAL (2011) Data mining: a prediction for performance improvement using classification, International Journal of Computer Science and Information Security, 9, 136–140.
BISHOP, C.M. (2007) Pattern Recognition and Machine Learning, 1st edn., Springer.
CANO, A., A. ZAFRA and S. VENTURA (2013) Weighted data gravitation classification for standard and imbalanced data, Cybernetics, IEEE Transactions on, 43, 1672–1687.
CHANG, C.-C. and C.-J. LIN (2001) LIBSVM: a library for support vector machines. Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html (Accessed 25th October 2015).
CHANG, S.-H.H. and R.A. SMITH (2008) Effectiveness of personal interaction in a learner-centered paradigm distance education class based on student satisfaction, Journal of Research on Technology in Education, 40, 407–426.
CHU, W. and S.S. KEERTHI (2005) New approaches to support vector ordinal regression, in ICML ’05: Proceedings of the 22nd International Conference on Machine Learning, 145–152.
CHU, W. and S.S. KEERTHI (2007) Support vector ordinal regression, Neural Computation, 19, 792–815.
DA COSTA, J.F.P., H. ALONSO and J.S. CARDOSO (2008) The unimodal model for the classification of ordinal data, Neural Networks, 21, 78–91.
DEMSAR, J. (2006) Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1–30.
FERNÁNDEZ-NAVARRO, F., P. GUTIERREZ, C. HERVÁS-MARTÍNEZ and X. YAO (2013) Negative correlation ensemble learning for ordinal regression, IEEE Transactions on Neural Networks and Learning Systems, 24, 1836–1849.
FERNÁNDEZ-NAVARRO, F., C. HERVÁS-MARTÍNEZ, R. RUIZ and J.C. RIQUELME (2012) Evolutionary generalized radial basis function neural networks for improving prediction accuracy in gene classification using feature selection, Applied Soft Computing, 12, 1787–1800.
GARRISON, D.R. (2011) E-learning in the 21st Century: A Framework for Research and Practice, Taylor & Francis.
GIBBS, G. and M. COFFEY (2004) The impact of training of university teachers on their teaching skills, their approach to teaching and the approach to learning of their students, Active Learning in Higher Education, 5, 87–100.
GÜNDÜZ, N. and E. FOKOUÉ (2013) Data mining and machine learning techniques for extracting patterns in students’ evaluations of instructors. Working Paper CQAS-2013-2, Rochester Institute of Technology, Center for Quality and Applied Statistics, 98 Lomb Memorial Drive, Rochester, NY 14623, USA.
GUTIÉRREZ, P., M. PÉREZ-ORTIZ, F. FERNÁNDEZ-NAVARRO, J. SÁNCHEZ-MONEDERO and C. HERVÁS-MARTÍNEZ (2012) An experimental study of different ordinal regression methods and measures, in Hybrid Artificial Intelligent Systems. Vol. 7209 of Lecture Notes in Computer Science, 296–307.
HANSEN, N. and A. OSTERMEIER (2001) Completely derandomized self-adaptation in evolution strategies, Evolutionary Computation, 9, 159–195.
LANDWEHR, N., M. HALL and E. FRANK (2005) Logistic model trees, Machine Learning, 59, 161–205.
LIN, C.F., Y.-C. YEH, Y.H. HUNG and R.I. CHANG (2013) Data mining for providing a personalized learning path in creativity: an application of decision trees, Computers & Education, 68, 199–210.
LIU, X. (2009) Ordinal regression analysis: Fitting the proportional odds model using Stata, SAS and SPSS, Journal of Modern Applied Statistical Methods, 8, 632–645.
MCCULLAGH, P. (1980) Regression models for ordinal data, Journal of the Royal Statistical Society: Series B: Methodological, 42, 109–142.
MINAEI-BIDGOLI, B. and W.F. PUNCH (2003) Using genetic algorithms for data mining optimization in an educational web-based system. In Genetic and Evolutionary Computation-GECCO 2003, Springer, 2252–2263.
MIRCHEVSKA, V., M. LUŠTREK and M. GAMS (2014) Combining domain knowledge and machine learning for robust fall detection, Expert Systems, 31, 163–175.
MONJARDET, B. (1997) Concordance between two linear orders: the Spearman and Kendall coefficients revisited, Journal of Classification, 14, 269–295.
MOREIRA, C., P. CALADO and B. MARTINS (2013) Learning to rank academic experts in the DBLP dataset, Expert Systems, In Press, 10.1111/exsy.12062
NA AYALA, A.P. (2014) Educational data mining: a survey and a data mining-based analysis of recent works, Expert Systems with Applications, 41(4, Part 1), 1432–1462.
NEBOT, A., F. CASTRO, A. VELLIDO and F. MUGICA (2006) Identification of fuzzy models to predict students performance in an e-learning environment, in The Fifth IASTED International Conference on Web-Based Education, WBE, 74–79.
OBERREUTER, G. and J.D. VELASQUEZ (2013) Text mining applied to plagiarism detection: the use of words for detecting deviations in the writing style, Expert Systems with Applications, 40, 3756–3763.
PALMER, S.R. and D.M. HOLT (2009) Examining student satisfaction with wholly online learning, Journal of Computer Assisted Learning, 25, 101–113.
PENG, L., B. YANG, Y. CHEN and A. ABRAHAM (2009) Data gravitation based classification, Information Sciences, 179, 809–819.
ROBERTS, J. and R. STYRON JR. (2010) Student satisfaction and persistence: factors vital to student retention, Research in Higher Education Journal, 6, 1–18.
ROMERO, C. and S. VENTURA (2007) Educational data mining: a survey from 1995 to 2005, Expert Systems with Applications, 33, 135–146.
ROMERO, C. and S. VENTURA (2010) Educational data mining: a review of the state of the art, Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 40, 601–618.
ROMERO, C., S. VENTURA and P. DE BRA (2004) Knowledge discovery with genetic programming for providing feedback to courseware authors, User Modeling and User-Adapted Interaction, 14, 425–464.
ROMERO, C., S. VENTURA, P.G. ESPEJO and C. HERVÁS (2008) Data mining algorithms to classify students, in International Conference on Educational Data Mining, Montreal, Canada, 8–17.
ROMERO, C., A. ZAFRA, J.M. LUNA and S. VENTURA (2013) Association rule mining using genetic programming to provide feedback to instructors from multiple-choice quiz data, Expert Systems, 30, 162–172.
SHEA, P. and T. BIDJERANO (2009) Community of inquiry as a theoretical framework to foster “epistemic engagement” and “cognitive presence” in online education, Computers & Education, 52, 543–553.
SIEMENS, G. and D R.S. BAKER (2012) Learning analytics and educational data mining: towards communication and collaboration, in Proceedings of the 2nd International Conference on Learning Analytics and Knowledge. ACM, 252–254.
SUN, P.-C., R.J. TSAI, G. FINGER, Y.-Y. CHEN and D. YEH (2008) What drives a successful e-learning? An empirical investigation of the critical factors influencing learner satisfaction, Computers & Education, 50, 1183–1202.
TANG, C., R.W. LAU, Q. LI, H. YIN, T. LI and D. KILIS (2000) Personalized courseware construction based on web data mining, in Web Information Systems Engineering, 2000. Proceedings of the First International Conference on. Vol. 2. IEEE, 204–211.
TANNER, C.K. (2009) Effects of school design on student outcomes, Journal of Educational Administration, 47, 381–399.
TORRA, V., J. DOMINGO-FERRER, J.M. MATEO-SANZ and M. NG (2006) Regression for ordinal variables without underlying continuous variables, Information Sciences, 176, 465–474.
UENO, M. (2003) On-line statistical outlier detection of irregular learning processes for e-learning, in World Conference on Educational Multimedia, Hypermedia and Telecommunications. Vol. 2003, 227–234.
VAPNIK, V.N. (1999) The Nature of Statistical Learning Theory, Springer.
WANG, C. and Y. CHEN (2005) Improving nearest neighbor classification with simulated gravitational collapse. In Wang, L., K. Chen and Y. Ong (editors), Advances in Natural Computation. Vol. 3612 of Lecture Notes in Computer Science, Springer, Berlin Heidelberg, 845–854.
WITTEN, I.H. and E. FRANK (2005) Data Mining: Practical Machine Learning Tools and Techniques. In Data Management Systems, 2nd edn., Morgan Kaufmann (Elsevier).
YAY, M. and E.D. AKINCI (2009) Application of ordinal logistic regression and artificial neural networks in a study of student satisfaction, Cypriot Journal of Educational Sciences, 4, 58–69.
YUKSELTURK, E. (2009) Do entry characteristics of online learners affect their satisfaction?, International Journal on E-Learning, 8, 263–281.
ZAFRA, A., C. ROMERO and S. VENTURA (2011) Multiple instance learning for classifying students in learning management systems, Expert Systems with Applications, 38, 15020–15031.
ZAÏANE, O.R., M. XIN and J. HAN (1998) Discovering web access patterns and trends by applying olap and data mining technology on web logs, in Research and Technology Advances in Digital Libraries, 1998. ADL 98. Proceedings. IEEE International Forum on. IEEE, 19–29.
ZONG-CHANG, Y. (2008) A vector gravitational force model for classification, Pattern Analysis and Applications, 11, 169–177.
ZORRILLA, M.E., E. MENASALVAS, D. MARIN, E. MORA and J. SEGOVIA (2005) Web usage mining project for improving web-based learning sites. In Computer Aided Systems Theory–EUROCAST 2005, Springer, 205–210.
The authors
Pilar Gómez-Rey
Pilar Gómez-Rey received the MSc degree in Business
Administration from University ETEA, Spain, in 2012 and
the MSc degree in Teaching Economics for Pre-Higher
Education from the International University of La Rioja,
Spain, in 2014. Currently, she is a PhD candidate at the
Open University of Catalonia where she is developing her
thesis through the Doctoral Programme in Education and
ICT (e-learning). Her main research interests include Higher
Education, e-learning, students’ perceptions as well as
quality education.
Francisco Fernández-Navarro
Francisco Fernández-Navarro received the MSc degree in
computer science from the University of Cordoba, Spain,
in 2008, the MSc degree in artificial intelligence from the
University of Malaga, Spain, in 2009, and the PhD degree
in computer science and artificial intelligence from the
University of Malaga in 2011. He was a research fellow in
computational management with the European Space
Agency, Noordwijk, The Netherlands, and he currently
works as an Associate Professor at the Universidad Loyola
Andalucia. His current research interests include neural
networks, ordinal regression, imbalanced classification,
and hybrid algorithms. He is a member of the IEEE.
Elena Barberà
Elena Barberà holds a PhD in Educational Psychology (1995) and
is a senior researcher at eLearn Center (Open University of
Catalonia, Barcelona). She is currently Director of the
PhD program ‘Education and ICT’ at OUC. Her research
activity is focused on the area of educational psychology.
As head of the e-DUS (Distance School and University
e-ducation) research group, she currently participates in
national and international projects, and she is an external
evaluator of national and European research projects. She
is also an editor of two journals of impact in the field of
education and technology.

Academic Research Team Project PaperCOVID-19 Open Research Datas.docx
 
AbstractVoice over Internet Protocol (VoIP) is an advanced t.docx
AbstractVoice over Internet Protocol (VoIP) is an advanced t.docxAbstractVoice over Internet Protocol (VoIP) is an advanced t.docx
AbstractVoice over Internet Protocol (VoIP) is an advanced t.docx
 
Abstract                                 Structure of Abstra.docx
Abstract                                 Structure of Abstra.docxAbstract                                 Structure of Abstra.docx
Abstract                                 Structure of Abstra.docx
 

Recently uploaded

Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Akanksha trivedi rama nursing college kanpur.
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
adhitya5119
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
Dr. Mulla Adam Ali
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
heathfieldcps1
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
eBook.com.bd (প্রয়োজনীয় বাংলা বই)
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
Israel Genealogy Research Association
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
TechSoup
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
Celine George
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
simonomuemu
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
Nicholas Montgomery
 
PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.
Dr. Shivangi Singh Parihar
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
AyyanKhan40
 
World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
ak6969907
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
David Douglas School District
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 

Recently uploaded (20)

Natural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama UniversityNatural birth techniques - Mrs.Akanksha Trivedi Rama University
Natural birth techniques - Mrs.Akanksha Trivedi Rama University
 
Main Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docxMain Java[All of the Base Concepts}.docx
Main Java[All of the Base Concepts}.docx
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 
A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17How to Make a Field Mandatory in Odoo 17
How to Make a Field Mandatory in Odoo 17
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.PCOS corelations and management through Ayurveda.
PCOS corelations and management through Ayurveda.
 
PIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf IslamabadPIMS Job Advertisement 2024.pdf Islamabad
PIMS Job Advertisement 2024.pdf Islamabad
 
World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
 
Pride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School DistrictPride Month Slides 2024 David Douglas School District
Pride Month Slides 2024 David Douglas School District
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 

Question 1The Uniform Commercial Code incorporates some of the s.docx

  • 1. Question 1 The Uniform Commercial Code incorporates some of the same elements as the Statute of Frauds. Under the Statute of Frauds, certain contracts must be in writing to be enforceable. Research the types of contracts that must be in writing under the Statute of Frauds. Do you agree with the contracts that need to be in writing? Explain why or why not. Imagine that you were asked to be part of a team to draft revisions to the Statute of Frauds. What changes or proposals would you make? Why? Respond to this… The Statute of Frauds requires that certain types of contracts be in writing to be enforceable. These include contracts for the sale of goods priced at $500 or more, contracts for an interest in land, promises to pay another’s debt, and contracts that cannot be performed within one year, all of which must be signed by the defendant to be enforceable. I do think that all of these contracts should be in writing because a written contract is a safeguard that holds each party responsible for what was agreed. For example, if we did not have to sign for a car loan, the borrower could walk away, and without a signed agreement to the terms of the loan, it would be hard for the lender to recover its money. If I had to revise something in the Statute of Frauds, I would change the rule for contracts that cannot be performed within one year. I think one year is a long time to let a contract slide; six months sounds more reasonable. If I were a business and did not get a commitment to a contract for a whole year, I feel this would greatly affect my business. I also think it might be harder to recover whatever the other party owes once a year has passed. As a business, I think I would want to pursue a breach of contract after even three or four months. That is a long time to not pay up.
  • 2. Question 2 Let’s assume that you are interested in doing a statistical survey and you use confidence intervals for your conclusion. Describe a possible scenario and indicate what the population is, and what measure of the population you would try to estimate (proportion or mean) by using a sample. · What is your estimate of the population size? · What sample size will you use? · How will you gather information for your sample? · What confidence percentage will you use? Let’s assume that you have completed the survey and now state your results using a confidence interval statement. You can make up the numbers based on a reasonable result. Respond to this… I found a study from Australia and New Zealand that looked at whether people with acute coronary syndrome received equitable care, which required an understanding of the sources of variation in their care. Basically, the researchers wanted to see whether patients who did not speak English well received the same care as English-proficient patients. Of the 4387 patients, the 294 with limited English proficiency (LEP) were older (70.9 vs 66.3 years; P < 0.001) and had a higher prevalence of high blood pressure (71.1% vs 62.8%; P = 0.007), diabetes (40.5% vs 24.3%; P < 0.001), and kidney damage (16.3% vs 11.1%; P = 0.007) than the other 4093 (Hyun et al., 2017). Once they were in the hospital, there was no difference in the care they received; they were not treated differently. Patient demographics, medical history, in-hospital care, and acute and late outcomes were used to compare the two groups. A multiple-adjusted regression model was used for length of stay, and multiple-adjusted logistic regression models were used for each of the outcomes to estimate the odds ratios and corresponding 95% confidence intervals (Hyun et al., 2017). This conclusion makes sense to me. Someone who doesn’t speak
  • 3. English (or whatever language is native to the country that you are in) will not seek out medical help for any issues because they will not necessarily know what is happening and would rather take their chances. I am at least glad to know that once that patient reaches the hospital they are treated just as fairly as the English-speaking patients.
    Hyun, K., Redfern, J., Woodward, M., Briffa, T., Cher, D., Ellis, C., . . . (2017, May/June). Is There Inequity in Hospital Care Among Patients With Acute Coronary Syndrome Who Are Proficient and Not Proficient in English Language?: Analysis of the SNAPSHOT ACS Study. Journal of Cardiovascular Nursing, 288-295. doi:10.1097/JCN.0000000000000342
    Article DOI: 10.1111/exsy.12138
    Ordinal regression by a gravitational model in the field of educational data mining
    Pilar Gómez-Rey,1* Francisco Fernández-Navarro2 and Elena Barberà1
    (1) eLearn Center, Open University of Catalunya, Barcelona, Spain
    (2) Department of Mathematics and Engineering, Universidad Loyola Andalucia, Andalucia, Spain
    Abstract: Educational data mining (EDM) is a research area where the goal is to develop data mining methods to examine
  • 4. data critically from educational environments. Traditionally, EDM has addressed the following problems: clustering, classification, regression, anomaly detection and association rule mining. In this paper, the ordinal regression (OR) paradigm is introduced in the field of EDM. The goal of OR problems is the classification of items in an ordinal scale. For instance, the prediction of students’ performance in categories (where the different grades could be ordered according to A ≻ B ≻ C ≻ D) is a classical example of an OR problem. The EDM community has not yet explored this paradigm (despite the importance of these problems in the field of EDM). Furthermore, an amenable and interpretable OR model based on the concept of gravitation is proposed. The model is an extension of a recently proposed gravitational model that tackles imbalanced nominal classification problems. The model is carefully adapted to the ordinal scenario and validated with four EDM datasets. The results obtained were compared with state-of-the-art OR algorithms and nominal classification ones. The proposed models can be used to better understand the learning–teaching process in higher education environments.
    Keywords: educational data mining, ordinal regression models, students satisfaction, gravitational models
    1. Introduction
    Educational data mining (EDM) is a recent framework based on the application of data mining (DM) techniques to educational problems (Oberreuter & Velasquez, 2013; Romero et al., 2013).
  • 5. The main goal of EDM is to analyse educational data to find patterns that can improve the quality of the learning process and guide students’ learning (Romero & Ventura, 2007, 2010). The knowledge discovered by EDM techniques may be useful for teachers/instructors to manage their classes, understand their students and reflect on their teaching methodologies. EDM has contributed to the development of learning theories typically investigated in the educational psychology field (Baker, 2010). Siemens and Baker (2012) described the similarities and differences between learning analytics and EDM and concluded that both fields are closely tied. EDM techniques can be applied both to data from traditional classroom educational systems (based on face-to-face contact) and to data coming from distance education environments (e-learning). It is important to note that every type of education differs in nature and has different objectives. Therefore, the conclusions obtained in these environments will also differ. Currently, EDM techniques have been used to address the following problems (Romero & Ventura, 2007): data visualization and analysis, clustering, classification, regression, outlier detection and association rule learning. An explanation of each problem type can be found in Appendix A. On the other hand, ordinal regression (OR) problems are those problems where the objective is to classify patterns in an ordinal scale. For example, student satisfaction surveys usually involve rating teachers based on an ordinal scale {poor, average, good, very good and excellent}. Hence, the class label has a natural order, that is, a pattern associated with class label average has a higher (or better) rating than another having class poor, while class good is better than both. This problem falls between nominal classification, in which the labels form an unordered set, and regression, in which the targets form a continuous, totally ordered set.
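To make the distinction concrete, the following minimal Python sketch scores two sets of predictions on the five-level satisfaction scale above. It is an illustration only, not part of the paper's method: the function names and the toy labels are hypothetical, and the rank-based error assumes unit spacing between adjacent labels, which is itself a simplification.

```python
# Ordinal scale from the survey example; the position in this list carries meaning.
SCALE = ["poor", "average", "good", "very good", "excellent"]
RANK = {label: i for i, label in enumerate(SCALE)}


def zero_one_error(y_true, y_pred):
    """Nominal view: every misclassification costs the same."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_rank_error(y_true, y_pred):
    """Ordinal view: the cost grows with the distance between ranks.

    Assumption (not from the paper): adjacent labels are one unit apart.
    """
    return sum(abs(RANK[t] - RANK[p]) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: both predictors get every label wrong,
# but the first is always off by one level and the second by two or more.
y_true = ["good", "good", "excellent", "poor"]
near_miss = ["very good", "average", "very good", "average"]
far_miss = ["poor", "excellent", "poor", "excellent"]

for name, y_pred in [("near_miss", near_miss), ("far_miss", far_miss)]:
    print(name, zero_one_error(y_true, y_pred), mean_absolute_rank_error(y_true, y_pred))
```

Under the nominal measure the two predictors look identical, whereas the rank-based measure separates them. This is the order information that a nominal classifier discards.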
OR problems are also closely related to
  • 6. the learning to rank problems (Moreira et al., 2013). Many real problems require the classification of patterns (items) into naturally ordered classes. In fact, many EDM problems demand the classification of items in an ordinal scale. However, so far, OR problems have been addressed in the EDM community as regression or classification problems. As evidence of this, we could highlight the recent EDM review by Peña-Ayala (2014), where the proposed OR paradigm is not included within the EDM problems (because of the lack of studies in this direction). For example, the prediction of student performance is one of the oldest and most popular applications of EDM. This problem has been traditionally tackled using either regression analysis techniques (Nebot et al., 2006) (assuming an equal distance among the different classes) or nominal classification approaches (Romero et al., 2008) (ignoring the order among the different classes).
    © 2015 Wiley Publishing Ltd, Expert Systems, April 2016, Vol. 33, No. 2
    One of the main goals of this research work is to introduce the OR paradigm in the EDM community. To show the importance of this paradigm for the EDM community, four EDM datasets are considered and addressed with OR models. On the other hand, a data gravitational model (DGM) that learns the parameters of a weighted Euclidean metric for nominal classification problems has been recently proposed by Cano et al. (2013) within the field of distance metric learning (DML). One of the main advantages of this model is its high interpretability, which makes it especially useful for educational purposes. The DGM proposed is based on
  • 7. the idea of a data-driven gravitational law (Wang & Chen, 2005; Zong-chang, 2008; Peng et al., 2009; Cano et al., 2013). The underlying ideas of the DGMs are the following: (a) there exists a force between any two patterns; (b) this force follows Newton’s law of universal gravitation where the body masses are substituted by a set of data points; and (c) the class value of a test pattern is determined by comparing the force of attraction between the pattern and the different classes. Another goal of this paper is to propose a generalized force-based model (GFM) specifically designed for OR problems with educational purposes, extending the state-of-the-art DGMs in several ways. The outputs of the model are assumed to be unimodal (da Costa et al., 2008). To impose this constraint, the error function has been redefined to penalize non-unimodal outputs. The proposed method extends the DGMs previously presented by considering, besides an attribute-class weight matrix, a vector representing different scaling of the class pattern interaction with the distance. The model has been adapted to the characteristics of the problem considered. Finally, the model parameters have been optimized through the covariance matrix adaptation evolution strategy (CMA-ES) global optimization algorithm (Hansen & Ostermeier, 2001). Summarizing, the main contributions of this paper are as follows:
    • To introduce the OR paradigm in the EDM community. Most of the EDM problems require the classification of objects in an ordinal scale. Despite this, the EDM community has not yet explored this paradigm.
    • To propose a GFM that considers the particularities of OR problems. The performance of the model proposed was validated using two publicly available datasets and one real-world educational problem used to analyse
  • 8. students’ perceptions about online learning success factors.
    • For EDM problems, the accuracy of the model is equally as important as its interpretability because EDM techniques should be applied by practitioners (not just by researchers) (Lin et al., 2013). Therefore, it is important to apply and develop interpretable and amenable models. Accordingly, the high interpretability of the proposed models was also demonstrated considering four OR EDM problems.
    The remainder of the paper is organized as follows: a brief analysis of some OR educational problems that were treated as non-ordinal ones is provided in Section 2. Section 3 describes the case studies considered in this research work. Section 4 depicts the main ideas of the model proposed. Section 5 presents the experimental framework and the results obtained, while the model interpretability is discussed in Section 6. Section 7 summarizes the achievements and outlines some future developments of the proposed methodology. Finally, a short but useful glossary of technical terms that may be encountered in the world of expert systems and artificial intelligence is included in Appendix C.
    2. Some examples of ordinal regression educational problems addressed without an ordinal regression technique
    In this section, some examples of educational ordinal problems that were addressed with an inappropriate technique will be described. As we will discuss later, OR problems can be easily simplified to other standard data mining problems. In the EDM community, OR problems have been traditionally tackled using classification or standard regression approaches that generally involve
  • 9. making some assumptions leading to the underperformance of the final classifier/regressor model. One very simple way to approach ordinal regression is to cast all the different labels {C1, C2, …, CJ} (where J is the number of classes) into real values {r1, r2, …, rJ}, where ri ∈ ℝ, and then to apply standard regression techniques (Torra et al., 2006). The main problem of this approach is that the real values used for the labels may hinder the performance of the final regressor, and there is no principled way of deciding which value a label should have. On the other hand, OR problems have also been tackled with standard nominal classification models. The main problem of this approach is that the order information provided by the labels is ignored. Therefore, the classifier implemented does not consider this information in the parameter estimation stage. Next, some examples of educational problems treated as regression or classification problems are highlighted. Without loss of generality, we will focus on two kinds of problems: the prediction of students’ satisfaction and the prediction of students’ performance. A summary of the literature review and its characteristics is included in Table 1.
    • Ordinal regression problems addressed with a classification approach:
    ○ Predicting students’ satisfaction: Firstly, it is worth mentioning the work of Atay and Yildirim (2010). Their paper reports the factors that affect the levels of student satisfaction, using a dataset with 1734 students. For their study, they considered undergraduate tourism students. The classification tree (CT) revealed that the job considered to be
  • 10. accomplished after graduating was the most important variable to explain the student’s satisfaction variable. From a different perspective, Roberts and Styron Jr (2010) analysed students’ perceptions of services, interactions and experiences, taking into account students from the College of Education and Psychology (Southern University of Mississippi, United States). The questionnaire was related to academic advising, social connectedness, involvement and engagement, faculty and staff approachability and others. The application of discriminant analysis to these data revealed that the learning experience variables were the most significant to be considered in the evaluation of students’ satisfaction, while the Social Connectedness and Involvement and Engagement variables were the least significant ones in the determination of students’ satisfaction.
    ○ Predicting students’ performance: Minaei-Bidgoli and Punch (2003) proposed a genetic algorithm (GA) to optimize a combination of classifiers such as quadratic Bayesian classifier, 1-nearest neighbour (1-NN), k-nearest neighbour (k-NN), Parzen-window, multi-layer perceptron (MLP) and decision tree (DT) to predict the students’ final grade. They took into account features extracted from data logged in an education web-based system. Some of these features were the success rate, the number of attempts before the correct answer is provided or the difference between time of the last submission and the first time the problem was examined. The final assessment
  • 11. showed that the total number of correct answers and the total number of tries are the most important factors for the classification. From a different point of view, Bhardwaj and Pal (2011) also tried to predict the performance of students. They proposed a categorization of students of the current year based on the analysis carried out with the students of the previous year. To evaluate the effectiveness of their Bayesian classifier, the study used variables such as the mother’s qualification, the student’s habits, the annual family income, the students’ family status or the living location, among others. Their main finding was that the academic performance of students does not only depend on their own effort. • Ordinal regression problems addressed with a regression approach: ○ Predicting students’ satisfaction: Analysing the paper of Sun et al. (2008) allows us to discover the main factors affecting learner satisfaction in e-learning. Factors such as the attitude and the motivation of the learners, the instructor’s performance, the design of the courses, the available technology and the environment were considered in this study. The findings support that learner computer anxiety, instructor attitude towards e-learning, e-learning course flexibility, e-learning course quality, perceived usefulness, perceived ease of use and diversity in assessments are the critical variables affecting learners’ perceived satisfaction. This study employed a stepwise multiple regression analysis. Additionally, the research of Chang and Smith (2008) explored the correlation between students’ perceptions of course-related interaction and their course satisfaction within the learner-centred paradigm in
  • 12. distance education. The results demonstrated that student–instructor personal interaction, student–student personal interaction and student–content interaction, along with students’ perceptions of WebCT features and gender, really matter. A multiple linear regression was used to prove the significance of the variables and to model the educational problem.
    ○ Predicting students’ performance: Through a sample of 71 schools, Tanner (2009) analysed student performance across three school design factors: movement and circulation, day lighting and views. Hence, reading comprehension, reading vocabulary, language arts, mathematics, social studies and science were the variables considered in the study. The prediction of student performance was carried out through regression analysis. The main conclusions of the paper were the following: (a) a crowded school has a negative influence on student performance; (b) day lighting impinges on the variables in the scores obtained in science and reading vocabulary; and (c) views affect patterns of reading vocabulary, language arts and mathematics allowing the provision for the students to rest their eyes. Akiri and Ugborugbo (2009) also investigated this topic. Their paper attempts to model the influence of teachers’ classroom effectiveness on students’ academic performance in public secondary schools in Delta State, Nigeria. Factors such as lesson preparation and
  • 13. Table 1: Summary of the literature review results
    Predicting students’ satisfaction
    Paper | Approach | Models
    Atay and Yildirim (2010) | Classification | CT
    Roberts and Styron Jr (2010) | Classification | Discriminant analysis
    Sun et al. (2008) | Regression | Multiple linear regression
    Chang and Smith (2008) | Regression | Multiple linear regression
    Predicting students’ performance
    Paper | Approach | Models
    Minaei-Bidgoli and Punch (2003) | Classification | Quadratic Bayesian, K-NN, Parzen-window, MLP, DT
    Bhardwaj and Pal (2011) | Classification | Bayesian classifier
    Tanner (2009) | Regression | Linear regression
    Akiri and Ugborugbo (2009) | Regression | Linear regression
  • 14. presentation, punctuality and attendance in class, clear communication, adequate use of instructional materials, creativity and resourcefulness among others were used to demonstrate that the effectiveness of teachers is not the sole determinant of students’ academic performance. The problem was addressed using regression analysis. To the best of our knowledge, OR models have been scarcely applied for educational purposes. Specifically, they were used in just two research papers in the field of EDM (Liu, 2009; Yay & Akıncı, 2009). In other papers, the OR problem was addressed as a nominal one (using a classification approach) or as a continuous one (using a regression approach). Moreover, Yay and Akıncı (2009) and Liu (2009) applied a classical and linear statistical model.1 This algorithm has several drawbacks. The most obvious one is its inability to model non-linear relations (very frequent in educational data). Furthermore, as a classical statistical model, it assumes some hypotheses that are difficult to satisfy in real-world problems. The model proposed for EDM problems is described. The proposed methodology is able to model the non-linear relations existing in the input space without any assumptions on the data. One of the advantages of the model proposed is its high level of interpretability. This will allow educational experts to gain insights about the problem. This knowledge could be used to improve the learning–teaching process (as discussed in Section 6).
    3. Educational data mining datasets considered
  • 15. 3.1. Turkiye student evaluation
    The Turkiye Student Evaluation (TSE) dataset is composed of 5820 evaluation scores provided by students from Gazi University in Ankara (Turkey) (Gündüz & Fokoué, 2013). The dataset is publicly available in the UCI Machine Learning repository.² Each participant was asked 28 education-related questions. The questionnaire is listed in Appendix B. In the original dataset, 2835 students out of the total 5820 students gave the same score to every question. These evaluators were called single-minded evaluators (taking into account the zero-variation nature of their answers). Following the recommendations of Gündüz and Fokoué (2013), two datasets were considered in this study: the TSE dataset including the single-minded evaluators (TSE-I-SME) and the TSE dataset without the single-minded evaluators (TSE-W-SME). Furthermore, five additional attributes were taken into account in the study: the instructor’s identifier, the course code, the number of times the student took the course, the level of attendance and, finally, the level of difficulty of the course as perceived by the student. The complete set of attributes considered in this study is the following:
    • Professor (P). This is a nominal attribute composed of three values (three professors were considered for the study).
    • Subject (S). The course code is also a nominal variable. In this case, the variable was defined with 13 values (13 subjects were considered for the study).
    • Repetitions of the course (R). This is an integer attribute with values ranging from 0 to 4 (the student with the most repetitions had four repetitions).
  • 16. • Attendance level (A). The Attendance attribute is defined on an ordinal scale with the following possible values: {poor, minimal, good, very good, excellent}.
    • Difficulty level (D). The Difficulty attribute examined in the study is an ordinal variable as well. This ordinal variable ranges from Too easy to Too difficult, with the following five possible values: {Too easy, Easy, Normal, Difficult, Too difficult}.
    In order to compare our results and discussions with those obtained by Gündüz and Fokoué (2013), the dependent variable is built through a clustering process for this specific problem. Therefore, cluster analysis is applied to identify potential groups in the way students rate their professors. For the sake of simplicity, the k-Means algorithm is used to determine the degree of satisfaction of each student. The number of clusters (classes) was determined according to the accumulated variance explained by the number of factors selected. In our case, the optimum number of clusters was three. After analysing the scores associated with each class, we proceeded to label the three clusters. The labels for the clusters were {dissatisfied, neutral, satisfied}, thereby modelling the students’ satisfaction level.
    3.2. Teaching assistant evaluation
    The Teaching Assistant Evaluation (TAE) dataset is composed of evaluations of teaching performance over three regular semesters and two summer semesters of 151 teaching assistant (TA) assignments at the Statistics Department of the University of Wisconsin-Madison. The dataset is publicly available in the UCI Machine Learning repository.³ The performance of each teacher is measured with an ordinal variable with three different levels: low performance, medium
  • 17. performance and high performance. The independent variables considered to model the teaching performance are the following:
    • A binary variable defining whether the teacher is a native English speaker or not;
    • A nominal variable to define the code of the course (26 categories);
    • A nominal variable to define the course instructor (25 categories);
    • A binary variable defining whether the semester is a regular or a summer one;
    • The size of the class.
    ¹ This algorithm is called the proportional odds model (POM) (McCullagh, 1980).
    ² Available at http://archive.ics.uci.edu/ml/datasets/Turkiye+Student+Evaluation
    ³ https://archive.ics.uci.edu/ml/datasets/Teaching+Assistant+Evaluation
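    The five TAE attributes listed above mix binary, nominal and numeric types, so they have to be turned into one numeric vector before any distance can be computed. The following is a minimal sketch of one way to do that, assuming the k-1 dummy coding that the paper later describes for nominal variables. The function names, the choice of category 1 as the reference level and the sample record are illustrative assumptions, not taken from the paper.

```python
# Category counts (25 instructors, 26 courses) come from the text above;
# function names and the reference-level choice are assumptions.

def dummy_encode(category, n_categories):
    """k-1 dummy coding; category 1 is the all-zeros reference level."""
    if not 1 <= category <= n_categories:
        raise ValueError("category out of range")
    vec = [0] * (n_categories - 1)
    if category > 1:
        vec[category - 2] = 1
    return vec


def encode_tae_record(native_speaker, instructor, course, summer, class_size):
    """Concatenate the five TAE attributes into one numeric vector."""
    return (
        [1 if native_speaker else 0]
        + dummy_encode(instructor, 25)
        + dummy_encode(course, 26)
        + [1 if summer else 0]
        + [class_size]
    )


# Made-up record, for illustration only.
x = encode_tae_record(native_speaker=True, instructor=3, course=1,
                      summer=False, class_size=19)
print(len(x))  # 1 + 24 + 25 + 1 + 1 = 52
```

    With this coding, each record becomes a 52-dimensional vector, which gives a sense of how many weights a per-class attribute weighting would have to learn for this dataset.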
  • 18. 3.3. Culture and learners satisfaction
    This research was carried out with a sample of students in four online universities: the Open University of Catalonia in Spain, the University of New Mexico in the United States, the University of Peking in China and the Autonomous Popular University of the State of Puebla in Mexico. The majority of the participants were enrolled in online social sciences courses (mainly Education or Psychology studies). Data were collected through a survey of 709 participants. This dataset was analysed by Barbera and Linder-Van Berschot (2011) using statistical tests. In the present study, an OR approach is used instead. The dependent variable is learner satisfaction (LST), while the independent ones are eight institutional factors as follows: (a) Learner support (LS); (b) Social presence (SP), measuring the degree to which the instructor seems to be concerned about the learners’ needs; (c) The degree of effectiveness of the teaching strategies of the instructor (also called Instruction (I)); (d) The quality of the Learning Platform (LP); (e) Instructor interaction (II); (f) Learner interaction (LI); (g) Learning content (LC); (h) Course design (CD). Finally, it is also important to note that all variables are measured with a four-point Likert scale with the following options: strongly disagree (SD), disagree (D), agree (A) and strongly agree (SA).
    4. The method proposed: a gravitational model for ordinal regression
  • 19. This section presents the proposed algorithm. Firstly, the ordinal regression scenario is described. Secondly, the definition of force and distance and the probabilistic interpretation of the force model are presented. Thirdly, the error function used as objective function is introduced and motivated, and finally, the procedures used to estimate the parameters of the model are discussed.
    4.1. Ordinal regression scenario
    In the ordinal regression problem, a training sample set $D = \{(\mathbf{x}_n, y_n)\}_{n=1}^{N}$ is available, where $\mathbf{x}_n = (x_{1n}, \ldots, x_{Kn})$ is the vector of input variables taking values in the input space $\Omega \subset \mathbb{R}^{K}$ and the label, $y_n$, belongs to a finite set $C = \{C_1, \ldots, C_J\}$. Moreover, there is an order relation between these labels, such that $C_1 \prec C_2 \prec \cdots \prec C_J$, where $\prec$ denotes the given order between different ranks. For the proposal, the ‘1-of-J’ encoding vector is adopted. For that reason, each target has been encoded as $\mathbf{y}_n = (y_n^{(1)}, y_n^{(2)}, \ldots, y_n^{(J)})$ with $y_n^{(j)} = 1$ if the pattern is from class j, and $y_n^{(j)} = 0$ if it is not. Clearly, $\sum_{j=1}^{J} y_n^{(j)} = 1$ holds for every $n \in \{1, \ldots, N\}$.
    4.2. Force definition
    4.2.1. Definition of force as defined in Cano et al. (2013)
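    As a small illustration of the ‘1-of-J’ encoding introduced in Section 4.1, the sketch below encodes an ordinal label as a binary vector. The function name and the three label names are assumptions chosen for the example; only the encoding rule itself comes from the text.

```python
def one_of_j(label_rank, n_classes):
    """Return the 1-of-J vector for a class rank in 1..J."""
    if not 1 <= label_rank <= n_classes:
        raise ValueError("rank out of range")
    return [1 if j == label_rank else 0 for j in range(1, n_classes + 1)]


# Ordered labels C1 < C2 < C3 (hypothetical names for illustration).
labels = ["dissatisfied", "neutral", "satisfied"]
rank = {name: i + 1 for i, name in enumerate(labels)}

y = one_of_j(rank["neutral"], len(labels))
print(y)  # [0, 1, 0]
```

    The encoding by itself carries no order information: the vectors for C1 and C3 are as far from each other as either is from C2. This is why the ordering has to be brought in elsewhere in the model.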
  • 20. The definition of force as proposed by Cano et al. (2013) is described first. Cano et al. (2013) weighted the gravitation of a class by its number of patterns and the total number of patterns.⁴ In that way, the gravitation of a pattern x for a class j was defined as
    $$g(\mathbf{x}, j) = G \sum_{n=1}^{N_j} \frac{1}{d(\mathbf{x}_n, \mathbf{x}, j)^2}, \quad \mathbf{x}_n \in C_j, \qquad (1)$$
    $$G := 1 - \frac{N_j - 1}{N}, \qquad (2)$$
    where $N_j$ is the number of patterns of the class j and N is the total number of patterns. Furthermore, Cano et al. (2013) used a weight matrix $W \in \mathbb{R}^{J \times K}$ to define the importance of each attribute in each class. This matrix was applied to the distance calculation in their gravitational model. Therefore, the distance proposed in their work is defined as follows:
    $$d(\mathbf{x}_1, \mathbf{x}_2, j) = \sqrt{\sum_{k=1}^{K} w_{j,k}\,(x_{1,k} - x_{2,k})^2}, \qquad (3)$$
  • 21. where $\mathbf{x}_1$ and $\mathbf{x}_2$ are two patterns and $w_{j,k}$ is the weight of input variable k for class j. The attribute-class weight matrix W was optimized by an evolutionary algorithm (trying to maximize the performance of the final model). All the nominal variables were transformed to binary variables, generating k-1 variables per attribute (dummy variables), where k is the number of possible values of the nominal attribute. Ordinal variables are treated as continuous variables to compute the Euclidean distance. Please note that several similar measures exist to determine the distance between two ordered vectors, such as the correlation coefficient-based metrics (for example, the Spearman distance or the Kendall distance). None of the correlation coefficient-based distance functions satisfies the triangle inequality; they are therefore known as semi-metrics and were not included in the proposed gravitational model (Monjardet, 1997). Additionally, normal practice is to treat Likert scales as continuous variables even though they are not. As long as they have more than five possible values, the bias from discreteness is not large.
    ⁴ With the goal of enhancing the accuracy of prediction for the minority classes.
  • 22. Finally, once the weight matrix is optimized, the class label of each pattern is determined by comparing the gravitational forces existing between the pattern considered and the different classes. The pattern will adopt the label of the class with the highest gravitational force.
    4.2.2. Definition of force for the ordinal regression case
    In the OR case, the model must take into account the ordinal information of the labels. For the sake of simplicity, suppose you have a problem with J = 3 classes ordered as $C_1 \prec C_2 \prec C_3$. Figure 1 represents the input space of this hypothetical ordinal regression problem. Given a new test pattern x, according to the gravitational models, the gravitational forces of this pattern with respect to the three classes have to be estimated. Assuming a Euclidean distance, the gravitational forces (computed as proposed in Cano et al. (2013)) for the test pattern are g(x, C1) = 20.132, g(x, C2) = 1.734 and g(x, C3) = 23.006. The highest gravitational force is g(x, C3), so C3 is attributed to the test pattern. If there is no order relation between the classes, the second highest gravitational force is g(x, C1). However, if the classes are ordered, C2 is closer to C3 than C1 is, and therefore, the second highest gravitational force should be attained in this class. To generalize, the forces associated with each class should follow a unimodal distribution, that is, they should present only one maximum, which should be absolute. This idea was already applied in the context of neural networks (da Costa et al., 2008). To preserve the ordinal information of the different classes, two approaches could be applied as follows:
    • Modification of the distance: The first possibility to modify the force is to directly modify the distance allowing the ordering of the class labels for the pattern considered. There
  • 23. are many possible choices for the definition of this distance. The most natural choice is to employ a matrix $G \in \mathbb{R}^{K \times K}$ so that the distance between two patterns is computed as $\mathbf{x}^{T} G \mathbf{x}$, as in the Mahalanobis distance case. Another possibility is to adopt the definition of distance of Cano et al. (2013) (also called the weighted Euclidean distance). Depending on the choice of the distance, the interpretability of the model will be different. In fact, the elements of the matrix G are a measure of the correlation between the different attributes of the given dataset, whereas the elements of the matrix $W \in \mathbb{R}^{J \times K}$ indicate the importance of an attribute in the classification with respect to a certain class. In our study, both possibilities will be considered (extending in this direction previous works that only consider the weighted Euclidean distance for gravitational models).
    • Modification of the force: Another possibility, in the example of Figure 1, to reduce the value of gravitation in C1 is to act on the definition of the force itself. For example, one could define a general force law of a pattern x for a class j as
    $$g(\mathbf{x}, j) = \left(1 - \frac{N_j - 1}{N}\right) \sum_{n=1}^{N_j} \frac{1}{\left(d(\mathbf{x}_n, \mathbf{x}, j) + a_j\right)^{v_j}}, \quad \mathbf{x}_n \in C_j, \qquad (4)$$
  • 24. where $N_j$ is the number of patterns of the class j, N is the total number of patterns and the distance is defined as in Equation (3). Note that in the aforementioned definition, one parameter $v_j$ is considered for each class and that, without a correction, the force would tend to infinity when the distance tends to zero. To have proper control over the force value, the parameter $a_j \in \mathbb{R}$ is introduced in the definition of force and is calculated for each class as
    $$a_j = \left(\frac{1}{\mathrm{maxForce}}\right)^{\frac{1}{v_j}}, \qquad (5)$$
    where maxForce is the maximum value of force allowed.
    4.3. Probabilistic interpretation of the forces
    The order is included in the model following a cost-sensitive approach, penalizing non-unimodal distributions of the force outputs. After this procedure, a multinomial logit formulation could be applied to define the probabilities of each force. Therefore, for robustness of the optimization process, the force for each class is normalized according to the softmax activation function (Bishop, 2007). The softmax activation function maps the range of the force for the j-th class into the interval [0, 1], with the additional property that the sum of the forces of a pattern towards all classes is one. This transformation can be seen as an estimation of the a posteriori probability of a pattern to be classified as a
  • 25. member of each class. The softmax function for the force-based model proposed is defined as
    $$P(C_l \mid \mathbf{x}) = \frac{\exp\left(g(\mathbf{x}, l)\right)}{\sum_{j=1}^{J} \exp\left(g(\mathbf{x}, j)\right)}, \qquad (6)$$
    where $P(C_l \mid \mathbf{x})$ is the a posteriori probability of the pattern x to belong to $C_l$ and $g(\mathbf{x}, j)$ is defined as in Equation (4). This transformation allows us to have the forces on the same scale as the target labels (because the 1-of-J encoding is adopted in this work).
    4.4. Error function formulation
    As previously stated, the forces (or the a posteriori probabilities) obtained for a given pattern x must follow a unimodal distribution.
  • 26. [Figure 1: An example of classification using gravitational-based models assuming a Euclidean distance. Axes: x1 and x2.]
    The unimodality constraint is imposed by redefining the error function with a penalization term for non-unimodal distributions. In our research, the error function of the model proposed is defined as
    $$E(W \text{ or } G, \mathbf{v}) = \frac{1}{N} \sum_{n=1}^{N} \sum_{j=1}^{J} \left[ y_n^{(j)} \left( P(C_j \mid \mathbf{x}_n) - 1 \right)^2 + \left( 1 - y_n^{(j)} \right) c_{nj}\, P(C_j \mid \mathbf{x}_n)^2 \right], \qquad (7)$$
  • 27. where $c_{nj}$ is the cost associated with the pattern n for the jth class. As can be seen in Equation (7), the error function penalizes non-unimodal outputs. The total cost matrix is obtained as C = Y × M, where Y is the matrix representing the ‘1-of-J’ encoding and M is a well-known cost matrix. Examples are the absolute cost matrix ($m_{ij} = |i - j|$), the quadratic cost matrix ($m_{ij} = |i - j|^2$) and the zero–one cost matrix. Note that the zero–one cost matrix is the one assumed in nominal classification. In this study, the penalization function with quadratic cost terms achieved the best trade-off between convergence of the optimization problem, quality of the solution and the related classification performance. Therefore, the quadratic cost matrix is used for the proposed model.
    4.5. Parameter estimation
    The optimization of the W or the G matrices and the v vector is a continuous optimization problem whose dimension ($J \cdot K + J$ or $K \cdot K + J$) depends on the number of dimensions and the number of classes. To estimate the parameters of the model, an evolutionary algorithm is considered. Evolutionary algorithms have been successfully applied to estimate the parameters of
  • 28. machine-learning models in recent years (Fernández-Navarro et al., 2012; Mirchevska et al., 2014). Specifically, the CMA-ES algorithm (Hansen & Ostermeier, 2001) was used to determine the optimization variables (the W or the G matrix and the v vector). The CMA-ES algorithm is an evolutionary algorithm (global optimization procedure) for difficult non-linear non-convex optimization problems in continuous domains. Furthermore, the initial values for W are set to 1.0 and for v to 2.0, that is, all dimensions are initially considered equally relevant and the Euclidean distance is assumed. The correlation matrix, G, was initialized to have zero correlation between the different input variables (the matrix was initialized to be equal to the identity matrix).
    4.6. Summary of methodologies
    The algorithm proposed has several variations according to the cost matrix used (nominal or ordinal classification) and to the distance considered (the weighted Euclidean or the Mahalanobis distance). The different combinations are summarized as follows:
    • Nominal approaches:
      ○ Generalized force-based model with a zero–one cost and the Mahalanobis distance (GFMMZOC).
      ○ Generalized force-based model using a zero–one cost and the weighted Euclidean distance (GFMWEZOC).
  • 29. • Ordinal regression approaches:
      ○ Generalized force-based model assuming a quadratic cost and the Mahalanobis distance (GFMMQC).
      ○ Generalized force-based model considering a quadratic cost and the weighted Euclidean distance (GFMWEQC).
    5. Computational experiments and results
    This section presents the experimental study performed to validate the new algorithms. Section 5.1 gives the measures employed to evaluate the performance of the algorithms, together with the description of the algorithms chosen for the comparison and their relevant parameters. The results of the different methods selected are provided in Section 5.2.
    5.1. Experimental design
    For comparison purposes, different state-of-the-art methods have been included in the experimentation. These methods are the following:
    • Nominal classifiers
      ○ The multi-logistic regression (MLR) algorithm. It is based on applying the LogitBoost algorithm with
  • 30. simple regression functions and determining the optimum number of iterations by a fivefold cross-validation (Landwehr et al., 2005).
      ○ An MLP with sigmoid units as hidden nodes, obtained by means of the back-propagation algorithm (Witten & Frank, 2005).
      ○ A support vector machine (SVM) (Vapnik, 1999) nominal classifier is included in the experiments in order to validate the contributions of our proposal. Cost support vector classification (SVC) available in libSVM 3.0 (Chang & Lin, 2001) is used as the SVM classifier implementation.
    • Regression approaches
      ○ Regression neural network model (RNN): as stated in Section 2, regression models can be applied to solve the classification of ordinal data. A common technique for ordered classes is to estimate by regression any ordered scores $s_1 \preceq s_2 \preceq \cdots \preceq s_{J-1} \preceq s_J$ by replacing the target class $C_i$ with the score $s_i$. The simplest case would be setting $s_i = i$, $i = 1, \ldots, J$. A neural network with a single output was trained to estimate the scores.
    • Ordinal regression approaches
      ○ The proportional odds model (POM) (McCullagh, 1980) is an extension of the binary logistic regression model
  • 31. for ordinal multi-class categorization problems. This is one of the first models specifically designed for ordinal regression, and it arose from a statistical background. For educational purposes, this model was used in Yay and Akıncı (2009) and Liu (2009).
      ○ Support vector ordinal regression (SVOR): Chu and Keerthi (2005, 2007) proposed two support vector approaches for ordinal regression. In this study, both approaches are considered: the SVOR with explicit constraints algorithm (SVOREX) and the SVOR with implicit constraints method (SVORIM).
    All SVM classifiers were run using tools available in the libsvm library (version 3.0) (Chang & Lin, 2001). The authors of SVOREX and SVORIM provide software tools for their methods.⁵ The mnrfit function of MATLAB (MathWorks, Natick, MA, United States) was used for training the POM model. The MLP and MLR methods were run using Weka’s tools.⁶ Finally, the RNN method was implemented following the suggestions of Fernández-Navarro et al. (2013).
    Regarding the hyper-parameters of the different algorithms, the following procedure has been applied. For the support vector algorithms, that is, SVC, SVOREX and SVORIM, the corresponding hyper-parameters (regularization parameter, C, and width of the Gaussian functions, γ) were adjusted using a grid search with a fivefold cross-validation, with the following ranges: $C \in \{10^{3}, 10^{1}, \ldots, 10^{-3}\}$ and $\gamma \in \{10^{3}, 10^{0}, \ldots, 10^{-3}\}$. For the neural network algorithms, that is, MLP and RNN, the corresponding hyper-parameters (number of hidden neurons, H, and number of iterations of the local search procedure, iterations⁷) were adjusted using a grid search with a fivefold cross-validation, considering the following ranges: H ∈ {5, 10, 15, 20, 30, 40}
  • 32. and iterations ∈ {25, 50, …, 500}. For the MLP method, the learning rate was set to 0.3 and the momentum to 0.2.
    Two evaluation metrics were considered to validate the performance of the different models: (a) the accuracy (Acc) and (b) the mean absolute error (MAE). Acc is the correct classification rate
    $$\mathrm{Acc} = 1 - \frac{1}{N} \sum_{i=1}^{N} I\left(y_i^{*} \neq y_i\right) = 1 - \mathrm{MZE}, \qquad (8)$$
    where $y_i$ is the true label, $y_i^{*}$ is the predicted label, N is the number of patterns and $I(\cdot)$ corresponds to the zero–one loss function. Hence, MZE is the mean zero–one error. The MAE is the average deviation in absolute value of the predicted rank from the true one
    $$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| O(y_i) - O(y_i^{*}) \right|, \qquad (9)$$
  • 33. where $\left| O(y_i) - O(y_i^{*}) \right|$ is the distance between the true and predicted ranks. The first measure is simply the fraction of correct predictions on individual samples. The second metric is defined as the average deviation of the prediction from the true targets. These two measures evaluate different aspects of an OR problem: accuracy measures whether patterns are generally well classified, and the MAE measures whether the classifier tends to predict a class as close to the real class as possible.
    Finally, regarding the evaluation of the performance of the different methods, multiple random splits of the datasets were considered. For the educational OR problems, 30 splits with 50% of the instances in the training set and 50% in the test set were considered. All the partitions were the same for all the methods evaluated, and one model was trained and evaluated for each split. A similar experimental setup was used in a recent review of ordinal models (Gutiérrez et al., 2012).
    5.2. Results
    The gravitation-based methods were compared with the well-known nominal classification, OR and regression techniques described in Section 5.1, using the OR metrics. Table 2 shows the overall generalization results obtained with the different techniques tested. A descriptive analysis of the results leads to the following remarks: (a) The GFMMQC method achieved the best results in two datasets and the second best result in one case using the MAEG metric as the test variable, while the GFMWEQC achieved the best performance in one dataset and the second best results
  • 34. in another problem using the same metric. (b) The gravitational ordinal models are still competitive in AccG, achieving the best results in two problems and the second best results in two other problems. As can be observed, the POM model is not able to reflect non-linear relationships among input variables, which are necessary for performing a realistic classification task. It is important to highlight that this was the model that was tested in Yay and Akıncı (2009) and in Liu (2009). In general, OR models tended to outperform their nominal counterparts (the SVORIM and the SVOREX methods obtained better results than their nominal version, the SVC). Finally, each pair of algorithms is compared by means of the Wilcoxon test (Demsar, 2006). A level of significance of α = 0.05 was considered, and the corresponding correction for the number of comparisons was also included. The control method was the GFMMQC method because it obtained the best mean ranking, especially in the MAEG metric (which is especially useful in ordinal problems). As shown in Table 2, the GFMMQC yields state-of-the-art performance in the OR field.
    ⁵ SVOREX and SVORIM methods source code available at http://gatsby.ucl.ac.uk/~chuwei/svor.html
    ⁶ Weka: http://www.cs.waikato.ac.nz/ml/weka/
    ⁷ The iterations in the MLP method correspond to the training time required
    6. Discussions
    In this section, we analyse the force-based ordinal models (both the model based on the Mahalanobis distance, the GFMMQC method and the one based on the weighted Euclidean
  • 35. distance, the GFMWEQC method) of the first split out of the 30 splits in the TSE-W-SME dataset. The most important attributes in the classification of student satisfaction obtained according to the GFMWEQC model are extracted. The most significant correlations detected by the GFMMQC model are examined. Both analyses are useful to extract meaningful knowledge to improve the teaching-learning process. Finally, some educational recommendations based on the interpretation of the models are provided.
    6.1. Analysis of the best GFMWEQC model
    This section provides an interpretation of the GFMWEQC model of the first split out of the 30 splits in the TSE-W-SME dataset. Firstly, the statistical properties of the model are described. Table 3 shows the statistical results of the model implemented including the confusion matrices as well.⁸
    Table 2: Generalization results of the AccG and MAEG of the methods proposed compared with those obtained using different statistical and artificial intelligence methods. Results and p-values of the Wilcoxon rank sum test. The value in parentheses after each AccG and MAEG entry is the figure printed as a subscript in the source table.
    TSE-I-SME
    Method | AccG | MAEG | p-value (Acc) | p-value (MAE)
    MLR | 89.07 (1.82) | 0.1226 (0.02) | 3.0E-11∘ | 2.8E-11∘
    MLP | 87.98 (2.10) | 0.1358 (0.01) | 3.0E-11∘ | 3.0E-11∘
    SVC | 93.71 (0.89) | 0.0947 (0.01) | 0.0724 | 2.9E-9∘
    GFMMZOC | 93.03 (1.42) | 0.0834 (0.01) | 1.7E-4∘ | 3.1E-5∘
    GFMWEZOC | 93.44 (1.39) | 0.0914 (0.01) | 0.0063∘ | 3.3E-8∘
    RNN | 89.56 (2.03) | 0.1688 (0.02) | 4.0E-11∘ | 3.0E-11∘
    POM | 90.89 (1.67) | 0.1117 (0.02) | 7.3E-11∘ | 3.6E-11∘
    SVOREX | 93.16 (1.33) | 0.0797 (0.01) | 0.0011∘ | 0.0042∘
    SVORIM | 93.69 (1.22) | 0.0761 (0.01) | 0.0679 | 0.0963
    GFMMQC | 94.18 (1.01) | 0.0719 (0.01) | - | -
    GFMWEQC | 93.57 (1.84) | 0.0822 (0.01) | - | -
    TSE-W-SME
    Method | AccG | MAEG | p-value (Acc) | p-value (MAE)
    MLR | 87.25 (1.27) | 0.1432 (0.02) | 1.3E-8∘ | 1.3E-10∘
    MLP | 86.43 (2.12) | 0.1667 (0.01) | 2.4E-9∘ | 3.0E-11∘
    SVC | 88.43 (0.08) | 0.1101 (0.03) | 1.6E-7∘ | 2.8E-04∘
    GFMMZOC | 88.73 (1.78) | 0.1088 (0.01) | 1.3E-6∘ | 1.7E-04∘
    GFMWEZOC | 88.93 (1.33) | 0.1091 (0.01) | 5.4E-5∘ | 2.3E-04∘
    RNN | 82.56 (1.43) | 0.1907 (0.01) | 1.1E-11∘ | 3.0E-11∘
    POM | 88.05 (0.22) | 0.1232 (0.02) | 4.4E-7∘ | 1.2E-7∘
    SVOREX | 89.76 (0.87) | 0.1221 (0.01) | 0.0016∘ | 2.4E-6∘
    SVORIM | 89.95 (1.11) | 0.1209 (0.02) | 0.0456∘ | 3.7E-7∘
    GFMMQC | 91.65 (2.35) | 0.0935 (0.01) | - | -
    GFMWEQC | 90.57 (2.18) | 0.0957 (0.01) | - | -
  • 36. TAE
    Method | AccG | MAEG | p-value (Acc) | p-value (MAE)
    MLR | 49.42 (5.77) | 0.5053 (0.05) | 3.0E-11∘ | 7.7E-9∘
    MLP | 55.33 (3.26) | 0.4697 (0.08) | 3.0E-11∘ | 2.6E-6∘
    SVC | 59.60 (7.39) | 0.4483 (0.07) | 0.2971 | 1.7E-6∘
    GFMMZOC | 58.51 (4.59) | 0.4493 (0.07) | 0.0031∘ | 3.9E-4∘
    GFMWEZOC | 58.96 (5.73) | 0.4386 (0.05) | 0.3722 | 0.0023∘
    RNN | 54.56 (4.76) | 0.4797 (0.05) | 3.0E-11∘ | 2.2E-7∘
    POM | 50.44 (7.73) | 0.4962 (0.07) | 3.0E-11∘ | 7.7E-6∘
    SVOREX | 57.89 (5.82) | 0.4411 (0.06) | 0.0040∘ | 0.0300∘
    SVORIM | 57.63 (5.71) | 0.4011 (0.07) | 1.94E-4∘ | 0.9823
    GFMMQC | 59.18 (6.40) | 0.3983 (0.05) | - | -
    GFMWEQC | 59.40 (6.40) | 0.3911 (0.05) | - | -
    CLS
    Method | AccG | MAEG | p-value (Acc) | p-value (MAE)
    MLR | 56.67 (1.21) | 0.5730 (0.08) | 3.0E-11∘ | 3.0E-11∘
    MLP | 62.22 (1.89) | 0.4818 (0.09) | 3.0E-11∘ | 5.6E-8∘
    SVC | 69.30 (1.71) | 0.4603 (0.05) | 0.0850 | 2.2E-5∘
    GFMMZOC | 69.25 (1.76) | 0.4636 (0.05) | 0.2519 | 3.1E-4∘
    GFMWEZOC | 68.85 (0.69) | 0.4795 (0.03) | 0.9823 | 1.7E-7∘
    RNN | 60.12 (1.03) | 0.5054 (0.06) | 3.0E-11∘ | 8.1E-10∘
    POM | 57.23 (1.43) | 0.5689 (0.05) | 3.0E-11∘ | 3.0E-11∘
    SVOREX | 67.80 (0.89) | 0.4235 (0.07) | 5.9E-5∘ | 0.6843
    SVORIM | 68.71 (1.34) | 0.4269 (0.03) | 0.4733 | 0.9589
    GFMMQC | 68.90 (2.14) | 0.4274 (0.05) | - | -
    GFMWEQC | 67.12 (1.99) | 0.4312 (0.07) | - | -
  • 37. MLR, multi-logistic regression; SVC, support vector classification; MLP, multi-layer perceptron; RNN, regression neural network model; POM, proportional odds model; SVOREX, support vector ordinal regression with explicit constraints algorithm; SVORIM, support vector ordinal regression with implicit constraints; TSE-I-SME, Turkiye student dataset including the single-minded evaluators; TSE-W-SME, Turkiye student dataset without the single-minded evaluators; TAE, Teaching assistant evaluation; CLS, Culture and learners satisfaction.
    The best result is in bold face and the second one in italics.
    ∘: The null hypothesis that the results provided by the comparison method and the results of GFMMQC are samples from continuous distributions with equal medians is rejected.
    ⁸ The contingency or confusion matrix CM for a classification problem with J classes and N training or generalization patterns is given by the following expression:
    $$\mathrm{CM} = \left\{ n_{ij} ;\ \sum_{i,j=1}^{J} n_{ij} = N \right\}, \qquad (10)$$
    where $n_{ij}$ represents the number of times the patterns are predicted by classifier g to be in class j when they really belong to class i. The diagonal corresponds to correctly classified patterns and the off-diagonal to mistakes in the classification task.
  • 38. Table 3: Statistical values of the best GFMWEQC model
    Best GFMWEQC ordinal regression model
    AccT = 100.00%; AccG = 92.80%
    MAET = 0.0000; MAEG = 0.0832
    $$\mathrm{CM}_T = \begin{pmatrix} 648 & 0 & 0 \\ 0 & 507 & 0 \\ 0 & 0 & 213 \end{pmatrix}, \qquad \mathrm{CM}_G = \begin{pmatrix} 678 & 18 & 3 \\ 44 & 452 & 10 \\ 13 & 14 & 185 \end{pmatrix}$$
  • 39. Table 3 includes the following information: accuracy on the training set (AccT), accuracy on the generalization (test) set (AccG), MAE on the training set (MAET) and MAE on the generalization set (MAEG), confusion matrix (CM) for the training set (CMT) and CM for the generalization set (CMG). As can be seen in the CMs in Table 3, the model in general is able to reflect the order among the different classes. Next, we interpret the coefficients of the $W \in \mathbb{R}^{J \times K}$ matrix. The elements of the W matrix indicate the importance of an attribute in the classification with respect to a certain class. Thus, the model detects the most important attributes in the classification of the student’s satisfaction level (taking into account the ordinal nature of the dependent variable). The algorithm detected that the seven most influential variables in the determination of student satisfaction were the following (in this order): instructor’s knowledge (Q13), instructor’s effective use of the class hours (Q19), instructor’s coherence with lesson plan (Q15), openness and respect of the instructor
  • 40. to students’ views (Q22), instructor’s positive approach to students (Q21), instructor readiness for classes (Q14), instructor explanations about the course and instructor helpfulness (Q20). On the other hand, the least important variables to explain student satisfaction ratings were the following (in this order): new perspective of students’ life and world (Q12), clearness of course aims (Q1), subject (S), difficulty (D), attendance (A), professor (P) and number of repetitions of the course (R). It can be seen from these results that the best indicators of student satisfaction are those related to professor competencies. Specifically, the students tend to consider the variables related to the professor more important, giving less importance to the variables related to the effect of learning and the course design. These results align with previous research that claims that the learner’s satisfaction is positively correlated with quality of learning outcomes. For example, Palmer and Holt (2009) justified the importance of adopting an interactive learning approach instead of a planned learning approach where the learners’ satisfaction is promoted mainly through the elements existing in the educational interaction (instead of basing the learners’ satisfaction in the preparation of the lectures and in the content taught during the lectures). These results also validate the work of Chang and Smith (2008), where the importance of the educational interaction for learner satisfaction is strongly emphasized. Furthermore, our work is also in line with the works of Bangert (2008) and Shea and Bidjerano (2009), where the importance of the social presence in educational environments is highlighted. Differing from the study of Atay and Yildirim (2010), our learners do not consider the elements of learning transfer for their academic satisfaction important; instead, they focus on the elements of the instructional moment. 
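    The attribute ranking discussed above is read off the class-by-attribute weight matrix W. The sketch below shows one possible way to obtain such a ranking, assuming that the per-class weights of each attribute are simply summed. The paper does not state its aggregation rule, so this rule, the function name and the toy weights are all illustrative assumptions rather than the authors' procedure or results.

```python
def rank_attributes(W, names):
    """Rank attributes by total weight across classes (assumed aggregation).

    W     -- list of J rows (one per class), each holding K attribute weights
    names -- list of K attribute names
    """
    totals = [sum(row[k] for row in W) for k in range(len(names))]
    order = sorted(range(len(names)), key=lambda k: totals[k], reverse=True)
    return [(names[k], totals[k]) for k in order]


# Toy weights for J = 3 classes and K = 3 attributes (made up for illustration).
W_toy = [
    [0.9, 0.1, 0.4],
    [0.8, 0.2, 0.5],
    [0.7, 0.0, 0.6],
]
ranking = rank_attributes(W_toy, ["attr_a", "attr_b", "attr_c"])
for name, total in ranking:
    print(name, round(total, 2))
```

    Other aggregation rules (for instance, the maximum weight over classes) could reorder attributes whose importance is concentrated in a single class, so any ranking of this kind should be read together with the rule that produced it.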
It is also worth highlighting that this study contradicts the traditional belief that the student satisfaction is highly correlated with the difficulty of the course as perceived by the learners. Unfortunately, this study has not considered all the variables affecting learner satisfaction
  • 41. reported in the learner satisfaction literature (Yukselturk, 2009). For example, the educational level, the self-efficacy or the locus of control variables were not included in this study. Finally, we compare our work with that of Gündüz and Fokoué (2013); in particular, we discuss the similarities and differences of the two studies. Gündüz and Fokoué (2013) concluded that the Q10, Q14, Q20 and Q24 questions were the most important variables to explain student satisfaction ratings. In the study of Gündüz and Fokoué (2013), learners give more importance to individual questions and structural aspects related to the design of the learning process, in contrast to what has been found in our study. Their study also shares some similarities with ours. For example, both studies consider the instructor readiness for classes and the instructor’s positive approach to students to be very important. The differences between the two studies can be explained by the following reasons. Firstly, Gündüz and Fokoué (2013) included the single-minded evaluators in their dataset, while in the TSE-W-SME, these evaluators were discarded. Secondly, Gündüz and Fokoué (2013) applied a nominal classifier to detect the most important variables, ignoring the ordinal information existing in the dependent variable. By taking into account the ordering information, our study was able to outperform the base classifier adopted in the study of Gündüz and Fokoué (2013).
6.2. Analysis of the best GFMMQC model
In this section, we analyse the performance of the best GFMMQC model, interpreting its coefficients as a way of improving the learning–teaching process. Table 4 shows the statistical results of the best model implemented. As can be seen in the CMs in Table 4, the model promotes the ordering among the different classes. There are fewer errors between non-adjacent classes than
  • 42. between adjacent ones. For example, considering the test set, there are eight students classified as Neutral when they should be classified as Satisfied and just two students who were classified as Dissatisfied despite being Satisfied. On the other hand, we also analyse the coefficients of the matrix $G \in \mathbb{R}^{K \times K}$. The elements of the matrix G are a measure of the correlation between the different attributes of the given dataset. The G matrix represents the existing covariance between the independent variables. In this section, we focus our attention on the existing correlations detected by the algorithm for the ordered variables (numeric and Likert variables).⁹
⁹ Note that all nominal variables were transformed to binary variables generating k variables per attribute, where k is the number of possible values of the nominal attribute.
Table 4: Statistical values of the best GFMMQC model
Best GFMMQC ordinal regression model
AccT = 100.00%; AccG = 96.47%
MAET = 0.0000; MAEG = 0.0409
CMT = $\begin{pmatrix} 648 & 0 & 0 \\ 0 & 507 & 0 \\ 0 & 0 & 213 \end{pmatrix}$; CMG = $\begin{pmatrix} 689 & 8 & 2 \\ 19 & 482 & 5 \\ 6 & 10 & 196 \end{pmatrix}$
  • 43. It is important to note that these correlations are needed to perform the mapping from the input space to the output one. They are not necessarily the existing correlations in the original input space. In other words, two variables that are correlated in the G matrix are not necessarily correlated in the input space (there is not necessarily a problem with the multicollinearity of the input data). The multicollinearity problem affects linear models specifically. The proposed model is capable of better modelling of the problem in this scenario, as observed in the experimental results. Next, we analyse the meaning and the impact of the existing correlations among the most important variables detected by the previous algorithm (Q13, Q19, Q15, Q22, Q21, Q14 and Q20 variables). The instructor’s knowledge (Q13) variable is highly correlated to the instructor’s effective use of the class hours (Q19) variable.
  • 44. According to that, having top-level knowledge allows teachers to use their teaching hours effectively. This synergy has an important effect on the final student satisfaction rating, as shown in the previous section. On the other hand, the instructor’s coherence with the lesson plan (Q15) variable is significantly correlated to the following variables: openness and respect of the instructor to students’ views (Q22), instructor’s positive approach to students (Q21), instructor readiness for classes (Q14) and instructor explanations about the course and instructor helpfulness (Q20). Taking into account the existing correlations between the most important variables, we recommend that the instructors focus on the following two:
• The instructor’s knowledge (Q13): By improving this variable, we can also improve the second most important variable (instructor’s effective use of the class hours (Q19)). This finding offers a new point of view on the traditional picture of a university professor. Subject matter knowledge is important; however, in the teaching–learning process, it is necessary to have a professor who can communicate his/her knowledge effectively (an effective professor). The work of Gibbs and Coffey (2004) showed significant positive changes in effectiveness among trained teachers and negative changes among untrained teachers. In fact, this is one of the reasons why professor training is so appreciated in universities around the world. So, it is critical to pay special attention to the training of professors. Because of this, students give the effective use of the class hours variable an organizational and personal status, because they do not feel that they are wasting time in class. The duo composed of the knowledge of the professor variable and the methodology applied variable is extended to a triplet in this study (by the inclusion of the effective use of hours
  • 45. variable). In this new framework of learning, the feeling of learning governs the experience of learning. To establish the scope of these findings, these correlations should be linked to learning outcomes. Thus, we would have an external measure of the students’ perceptions about what gives them greater academic satisfaction.
• The instructor’s coherence with lesson plan (Q15): By controlling this variable, the instructor may also obtain high ratings in the next four most important variables (Q22, Q21, Q14 and Q20). The teachers’ coherence in their teaching–learning method directly affects the student’s academic satisfaction. Empirically, we have shown that the participation that teachers allow their students is a powerful variable in the determination of the student’s satisfaction rating. Thus, students do not have a good perception of a professor who gives a traditional lecture (without any interaction). They appreciate it when the professor follows the learning plan rigorously, interacts with them taking into account their previous knowledge and experiences in a positive way and is willing to help in their learning. This stresses the importance of leaving the traditional teaching method (where the only interaction with the learners is in the transfer of the professor’s knowledge to the students) to adopt a more interactive teaching method (focused on enhancing students’ skills and enabling students to continue the quest for knowledge throughout their studies). On the other hand, the inquiry-based learning approach promotes the social presence, the cognitive presence and the teaching presence (Garrison, 2011). All these variables were highlighted as key variables in the prediction of students’ performance. Therefore, the adoption of this learning approach will allow
  • 46. practitioners to improve their student evaluations.
7. Conclusions
The presented work introduces the OR paradigm in the field of EDM, enlarging the set of techniques available in the EDM framework (mainly focused on nominal classification and regression approaches). The presented paradigm differs from existing nominal classification or regression techniques in the nature of the variable of study. The main particularity of an OR problem is that its variable of study (also called dependent variable) is discrete and that the labels of its different classes have an intrinsic (natural) order. The proposed OR models could be used for the following:
• Predict students’ future learning behaviour using an ordinal regression approach;
• Study the effects of different kinds of technological-pedagogical support;
• Advance scientific knowledge about learning and learners.
After presenting some OR problems that were improperly addressed with nominal classification or regression methods, more attention has been given to describing an interpretable and amenable model based on the concept of gravitation. The model was specially designed for this research. It took into account all the specific characteristics of the problem
  • 47. to be tackled. The proposed method extends the state of the art of gravitational models by generalizing the definition of force in its mathematical expressions. Furthermore, the model was adapted to the ordinal scenario (imposing the well-known unimodal constraint in the outputs of the model). The proposed models are easily interpretable, which makes them especially interesting for educational purposes, enabling use by educational practitioners and not just by researchers. To exhibit the importance of this paradigm for the EDM community and also the interpretability of the proposed models, the methods were tested with four OR EDM datasets. The gravitational ordinal models achieved a competitive performance, especially when compared with state-of-the-art classification models. Finally, it is worth mentioning that the interpretation of the model allows us to extract some important conclusions for educational environments. The main educational findings of this study are that the two key factors in the prediction of learners’ satisfaction are the instructor’s knowledge and the instructor’s coherence with the lesson plan. The remaining most important variables are strongly correlated with these two. From these correlations, we show the importance of leaving traditional teaching methods to adopt a teaching method that analyses and appreciates the students’ knowledge and skills and promotes the interaction between the main actors in the learning–teaching experience.
Acknowledgements
The research work of F. Fernández-Navarro was partially supported by the TIN2014-54583-C2-1-R project of the Spanish Ministry of Economy and Competitiveness (MINECO), FEDER funds and the P2011-TIC-7508 project of the “Junta de Andalucia” (Spain).
  • 48. Appendix A. EDM techniques implemented nowadays
Currently, EDM techniques have been used to address mainly the following types of problems (Romero & Ventura, 2007):
(a) Analysis and visualization of data: The objective of the analysis and visualization of data is to summarize useful information in a visual way and to support the decision-making process. Statistics and information visualization are the two main techniques used for this task. Statistics on students’ usage are a powerful tool to evaluate the impact of an e-learning system. Usage statistics may be extracted using standard tools designed to analyse web server logs (Zaïane et al., 1998). Other general statistics may also represent the connected student distribution through time or the most frequently accessed courses (Zorrilla et al., 2005).
(b) Clustering: It is the task of grouping a set of patterns in such a way that patterns in the same group (called a cluster) are more similar (in general according to a distance criterion) to each other than to those in the other groups (clusters). For example, in Tang et al. (2000), data clustering is used to promote group-based collaborative learning. They found clusters of students with similar learning characteristics based on the sequence and the contents of the pages they visited.
(c) Classification: The main objective of classification is to identify to which of a set of categories (sub-populations) a new pattern (also called observation or instance) belongs, on the basis of a training set of data containing patterns whose category membership is known (Zafra et al., 2011). An example of this task could be the classification of the final grade of the students based
  • 49. on features extracted from web-logs, as proposed in Minaei-Bidgoli and Punch (2003). In this research work, the dependent variable is estimated as a discrete variable (the final grade is represented as A, B, C and D). One problem associated with this approach is that the ordinal nature of the dependent variable was not taken into account in the design of the classifier. This can result in the underperformance of the final classifier.
(d) Regression: The main objective of regression analysis is the prediction of the value of a continuous dependent variable according to the values of several independent variables. In classification problems, the dependent variable is discrete, while in regression analysis, the dependent variable is continuous. An example of regression analysis is the prediction of the final grade of certain students. In this specific problem, the final grade should be represented as a continuous variable (ranging from 0 to 10).
(e) Outlier detection (or anomaly detection) is the identification of items, events or patterns that do not conform to an expected pattern or to other items in a dataset. Typically, the anomalous items will translate to some kind of problem such as bank fraud or a structural defect. Anomalies are also referred to as outliers, novelties, noise, deviations and exceptions. Ueno (2003) proposes to use the response time data from e-learning environments as a means of detecting outliers or irregular learning patterns in learners. The outlier statistics are developed considering both students’ abilities and content difficulties.
(f) Association rule learning is a method for discovering interesting relations between variables in large datasets. For example, the rule { morning, high flexibility } → {A} found in e-learning systems would indicate that if the
  • 50. student of an e-learning system has a high flexibility in his/her study time and also studies during the morning, then he or she is likely to achieve the best grade in his/her studies. In this context, it is worth highlighting the work of Romero et al. (2004), where grammar-based genetic programming with multi-objective optimization techniques is proposed for providing feedback to courseware authors.
Appendix B. Questionnaire for the Turkiye Student Evaluation Dataset
Questions answered by students are again on an ordinal scale. Concretely, they are defined on a 5-point Likert scale with the following values: {strongly disagree, disagree, neutral, agree, strongly agree}. Specifically, the students answered the following questions:
• Q1: The semester course content, teaching method and evaluation system were provided at the start.
• Q2: The course aims and objectives were clearly stated at the beginning of the period.
• Q3: The course was worth the amount of credit assigned to it.
• Q4: The course was taught according to the syllabus announced on the first day of class.
• Q5: The class discussions, homework assignments,
  • 51. applications and studies were satisfactory.
• Q6: The textbook and other course resources were sufficient and up to date.
• Q7: The course allowed field work, applications, laboratory, discussion and other studies.
• Q8: The quizzes, assignments, projects and exams contributed to helping the learning.
• Q9: I greatly enjoyed the class and was eager to actively participate during the lectures.
• Q10: My initial expectations about the course were met at the end of the period or year.
• Q11: The course was relevant and beneficial to my professional development.
• Q12: The course helped me look at life and the world with a new perspective.
• Q13: The Instructor’s knowledge was relevant and up to date.
• Q14: The Instructor came prepared for classes.
• Q15: The Instructor taught in accordance with the announced lesson plan.
• Q16: The Instructor was committed to the course and was understandable.
• Q17: The Instructor arrived on time for classes.
• Q18: The Instructor has a smooth and easy to follow delivery/speech.
  • 52. • Q19: The Instructor made effective use of class hours.
• Q20: The Instructor explained the course and was eager to be helpful to students.
• Q21: The Instructor demonstrated a positive approach to students.
• Q22: The Instructor was open and respectful of the views of students about the course.
• Q23: The Instructor encouraged participation in the course.
• Q24: The Instructor gave relevant homework assignments/projects, and helped/guided students.
• Q25: The Instructor responded to questions about the course inside and outside of the course.
• Q26: The Instructor’s evaluation system (midterm and final questions, projects, assignments, etc.) effectively measured the course objectives.
• Q27: The Instructor provided solutions to exams and discussed them with students.
• Q28: The Instructor treated all students in a fair and objective manner.
Appendix C. Glossary of Terms
A list of technical words related to the manuscript is given in the following.
Accuracy: Accuracy is the percentage of patterns
  • 53. correctly classified by the model. It is also known as the Correct Classification Rate (CCR).
Educational data mining: Educational Data Mining is a field of study where the goal is to develop new methods for exploring educational data.
Confusion matrix (or contingency matrix): A confusion matrix, also known as a contingency matrix, is a table that contains information about actual and predicted classifications carried out by a classification model. Each column of the matrix encompasses the instances in a predicted class, while each row includes the instances in an actual class.
Classification: In supervised learning, classification is the problem of determining to which of a set of categories a new pattern belongs, on the basis of a training set where the category to which each pattern belongs and its characterization are known.
Clustering: Clustering is the task of grouping a set of patterns in such a way that patterns in the same category are more similar to each other than to those in other categories.
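To make the Accuracy and Confusion matrix entries concrete, the sketch below recomputes the accuracy and the mean absolute error (defined further down in this glossary) from the two confusion matrices reported in Table 4. It is a minimal illustration rather than the authors' code: the function names are hypothetical, and the MAE is computed under the assumption that the three ordered classes are coded as consecutive integer ranks, so that an error between adjacent classes counts as 1 and an error across two classes counts as 2.

```python
# Confusion matrices as reported in Table 4 (rows: actual class, columns: predicted class).
CMT = [[648, 0, 0], [0, 507, 0], [0, 0, 213]]      # training set
CMG = [[689, 8, 2], [19, 482, 5], [6, 10, 196]]    # generalization (test) set


def accuracy_from_cm(cm):
    """Fraction of patterns on the main diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mae_from_cm(cm):
    """Mean absolute error, assuming classes are coded as ranks 0, 1, 2, ...

    Each cell (i, j) contributes |i - j| per pattern: the distance in ranks
    between the actual class i and the predicted class j.
    """
    total = sum(sum(row) for row in cm)
    n = len(cm)
    weighted = sum(abs(i - j) * cm[i][j] for i in range(n) for j in range(n))
    return weighted / total


print(f"AccT = {accuracy_from_cm(CMT):.2%}, MAET = {mae_from_cm(CMT):.4f}")
print(f"AccG = {accuracy_from_cm(CMG):.2%}, MAEG = {mae_from_cm(CMG):.4f}")
```

Under the rank-coding assumption, the values obtained for the generalization set agree with the AccG = 96.47% and MAEG = 0.0409 figures in Table 4, which suggests that the reported MAE penalizes errors by the number of classes separating the prediction from the actual label.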
  • 54. Error term: In econometrics, the error term is a variable that represents the differences between the real data and the predicted ones. The error term is also known as the ‘residual’ term.
Mean absolute error: Mean absolute error is the average deviation of the predictions from the actual targets.
Distance metric learning: Distance metric learning is a task where the goal is to learn a metric for the input data space from a given set of pairs of similar/dissimilar patterns that preserves the distance relation among the training set.
Ordinal regression: The learning task of ordinal regression is to assign patterns into a set of finite ordered classes.
Regression: In statistics, regression is a task where the goal is to estimate the relationship between one or more input variables and a scalar (continuous) dependent variable.
Softmax function: The softmax activation function maps the range of the data for each class into the
  • 55. interval [0, 1], such that the outputs over all classes sum to one.
Unimodal distribution: The unimodal distribution is a kind of distribution where the data have only one clear peak.
References
AKIRI, A.A. and N.M. UGBORUGBO (2009) Teachers’ effectiveness and students’ academic performance in public secondary schools in Delta State, Nigeria, Stud Home Comm Sci, 3, 107–113.
ATAY, L. and H.M. YILDIRIM (2010) Determining the factors that affect the satisfaction of students having undergraduate tourism education with the department by means of the method of classification tree, Tourismos: An International Multidisciplinary Journal of Tourism, 5, 73–87.
BAKER, R. (2010) Data mining for education, International Encyclopedia of Education, 7, 112–118.
BANGERT, A. (2008) The influence of social presence and teaching
  • 56. presence on the quality of online critical inquiry, Journal of Computing in Higher Education, 20, 34–61.
BARBERA, E. and J. LINDER-VAN BERSCHOT (2011) Systemic multicultural model for online education: Tracing connections among learner inputs, instructional processes and outcomes, Quarterly Review of Distance Education, 12, 167–180.
BHARDWAJ, B.K. and S. PAL (2011) Data mining: a prediction for performance improvement using classification, International Journal of Computer Science and Information Security, 9, 136–140.
BISHOP, C.M. (2007) Pattern Recognition and Machine Learning, 1st edn., Springer.
CANO, A., A. ZAFRA and S. VENTURA (2013) Weighted data gravitation classification for standard and imbalanced data, Cybernetics, IEEE Transactions on, 43, 1672–1687.
CHANG, C.-C. and C.-J. LIN (2001) LIBSVM: a library for support vector machines. Software available at: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html (Accessed 25th October 2015).
CHANG, S.-H.H. and R.A. SMITH (2008) Effectiveness of personal interaction in a learner-centered paradigm distance education class based on student satisfaction, Journal of Research on Technology in Education, 40, 407–426.
CHU, W. and S.S. KEERTHI (2005) New approaches to support vector ordinal regression, in ICML ’05: Proceedings of the 22nd International Conference on Machine Learning, 145–152.
  • 57. CHU, W. and S.S. KEERTHI (2007) Support vector ordinal regression, Neural Computation, 19, 792–815.
DA COSTA, J.F.P., H. ALONSO and J.S. CARDOSO (2008) The unimodal model for the classification of ordinal data, Neural Networks, 21, 78–91.
DEMSAR, J. (2006) Statistical comparisons of classifiers over multiple data sets, Journal of Machine Learning Research, 7, 1–30.
FERNÁNDEZ-NAVARRO, F., P. GUTIERREZ, C. HERVÁS-MARTÍNEZ and X. YAO (2013) Negative correlation ensemble learning for ordinal regression, IEEE Transactions on Neural Networks and Learning Systems, 24, 1836–1849.
FERNÁNDEZ-NAVARRO, F., C. HERVÁS-MARTÍNEZ, R. RUIZ and J.C. RIQUELME (2012) Evolutionary generalized radial basis function neural networks for improving prediction accuracy in gene classification using feature selection, Applied Soft Computing, 12, 1787–1800.
GARRISON, D.R. (2011) E-learning in the 21st Century: A Framework for Research and Practice, Taylor & Francis.
GIBBS, G. and M. COFFEY (2004) The impact of training of university teachers on their teaching skills, their approach to teaching and the approach to learning of their students, Active Learning in Higher Education, 5, 87–100.
  • 58. GÜNDÜZ, N. and E. FOKOUÉ (2013) Data mining and machine learning techniques for extracting patterns in students’ evaluations of instructors. Working Paper CQAS-2013-2, Rochester Institute of Technology, Center for Quality and Applied Statistics, 98 Lomb Memorial Drive, Rochester, NY 14623, USA.
GUTIÉRREZ, P., M. PÉREZ-ORTIZ, F. FERNÁNDEZ-NAVARRO, J. SÁNCHEZ-MONEDERO and C. HERVÁS-MARTÍNEZ (2012) An experimental study of different ordinal regression methods and measures, in Hybrid Artificial Intelligent Systems. Vol. 7209 of Lecture Notes in Computer Science, 296–307.
HANSEN, N. and A. OSTERMEIER (2001) Completely derandomized self-adaptation in evolution strategies, Evolutionary Computation, 9, 159–195.
LANDWEHR, N., M. HALL and E. FRANK (2005) Logistic model trees, Machine Learning, 59, 161–205.
LIN, C.F., Y.-C. YEH, Y.H. HUNG and R.I. CHANG (2013) Data mining for providing a personalized learning path in creativity: an application of decision trees, Computers & Education, 68, 199–210.
  • 59. LIU, X. (2009) Ordinal regression analysis: Fitting the proportional odds model using Stata, SAS and SPSS, Journal of Modern Applied Statistical Methods, 8, 632–645.
MCCULLAGH, P. (1980) Regression models for ordinal data, Journal of the Royal Statistical Society: Series B: Methodological, 42, 109–142.
MINAEI-BIDGOLI, B. and W.F. PUNCH (2003) Using genetic algorithms for data mining optimization in an educational web-based system. In Genetic and Evolutionary Computation–GECCO 2003, Springer, 2252–2263.
MIRCHEVSKA, V., M. LUŠTREK and M. GAMS (2014) Combining domain knowledge and machine learning for robust fall detection, Expert Systems, 31, 163–175.
MONJARDET, B. (1997) Concordance between two linear orders: the Spearman and Kendall coefficients revisited, Journal of Classification, 14, 269–295.
MOREIRA, C., P. CALADO and B. MARTINS (2013) Learning to rank academic experts in the DBLP dataset, Expert Systems, In Press, 10.1111/exsy.12062.
NA AYALA, A.P. (2014) Educational data mining: a survey and a data mining-based analysis of recent works, Expert Systems with Applications, 41(4, Part 1), 1432–1462.
  • 60. NEBOT, A., F. CASTRO, A. VELLIDO and F. MUGICA (2006) Identification of fuzzy models to predict students performance in an e-learning environment, in The Fifth IASTED International Conference on Web-Based Education, WBE, 74–79.
OBERREUTER, G. and J.D. VELASQUEZ (2013) Text mining applied to plagiarism detection: the use of words for detecting deviations in the writing style, Expert Systems with Applications, 40, 3756–3763.
PALMER, S.R. and D.M. HOLT (2009) Examining student satisfaction with wholly online learning, Journal of Computer Assisted Learning, 25, 101–113.
PENG, L., B. YANG, Y. CHEN and A. ABRAHAM (2009) Data gravitation based classification, Information Sciences, 179, 809–819.
ROBERTS, J. and R. STYRON JR. (2010) Student satisfaction and persistence: factors vital to student retention, Research in Higher Education Journal, 6, 1–18.
ROMERO, C. and S. VENTURA (2007) Educational data mining: a survey from 1995 to 2005, Expert Systems with Applications, 33, 135–146.
ROMERO, C. and S. VENTURA (2010) Educational data mining: a
  • 61. review of the state of the art, Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on, 40, 601–618.
ROMERO, C., S. VENTURA and P. DE BRA (2004) Knowledge discovery with genetic programming for providing feedback to courseware authors, User Modeling and User-Adapted Interaction, 14, 425–464.
ROMERO, C., S. VENTURA, P.G. ESPEJO and C. HERVÁS (2008) Data mining algorithms to classify students, in International Conference on Educational Data Mining, Montreal, Canada, 8–17.
ROMERO, C., A. ZAFRA, J.M. LUNA and S. VENTURA (2013) Association rule mining using genetic programming to provide feedback to instructors from multiple-choice quiz data, Expert Systems, 30, 162–172.
SHEA, P. and T. BIDJERANO (2009) Community of inquiry as a theoretical framework to foster “epistemic engagement” and “cognitive presence” in online education, Computers & Education, 52, 543–553.
  • 62. SIEMENS, G. and D R.S. BAKER (2012) Learning analytics and educational data mining: towards communication and collaboration, in Proceedings of the 2nd International Conference on Learning Analytics and Knowledge. ACM, 252–254.
SUN, P.-C., R.J. TSAI, G. FINGER, Y.-Y. CHEN and D. YEH (2008) What drives a successful e-learning? An empirical investigation of the critical factors influencing learner satisfaction, Computers & Education, 50, 1183–1202.
TANG, C., R.W. LAU, Q. LI, H. YIN, T. LI and D. KILIS (2000) Personalized courseware construction based on web data mining, in Web Information Systems Engineering, 2000. Proceedings of the First International Conference on. Vol. 2. IEEE, 204–211.
TANNER, C.K. (2009) Effects of school design on student outcomes, Journal of Educational Administration, 47, 381–399.
TORRA, V., J. DOMINGO-FERRER, J.M. MATEO-SANZ and M. NG (2006) Regression for ordinal variables without underlying continuous variables, Information Sciences, 176, 465–474.
UENO, M. (2003) On-line statistical outlier detection of irregular learning processes for e-learning, in World Conference on Educational Multimedia, Hypermedia and Telecommunications. Vol. 2003, 227–234.
  • 63. VAPNIK, V.N. (1999) The Nature of Statistical Learning Theory, Springer.
WANG, C. and Y. CHEN (2005) Improving nearest neighbor classification with simulated gravitational collapse. In Wang, L., K. Chen and Y. Ong (editors), Advances in Natural Computation. Vol. 3612 of Lecture Notes in Computer Science, Springer, Berlin Heidelberg, 845–854.
WITTEN, I.H. and E. FRANK (2005) Data Mining: Practical Machine Learning Tools and Techniques. In Data Management Systems, 2nd edn., Morgan Kaufmann (Elsevier).
YAY, M. and E.D. AKINCI (2009) Application of ordinal logistic regression and artificial neural networks in a study of student satisfaction, Cypriot Journal of Educational Sciences, 4, 58–69.
YUKSELTURK, E. (2009) Do entry characteristics of online learners affect their satisfaction?, International Journal on E-Learning, 8, 263–281.
ZAFRA, A., C. ROMERO and S. VENTURA (2011) Multiple instance learning for classifying students in learning management systems, Expert Systems with Applications, 38, 15020–15031.
ZAÏANE, O.R., M. XIN and J. HAN (1998) Discovering web
  • 64. access patterns and trends by applying OLAP and data mining technology on web logs, in Research and Technology Advances in Digital Libraries, 1998. ADL 98. Proceedings. IEEE International Forum on. IEEE, 19–29.
ZONG-CHANG, Y. (2008) A vector gravitational force model for classification, Pattern Analysis and Applications, 11, 169–177.
ZORRILLA, M.E., E. MENASALVAS, D. MARIN, E. MORA and J. SEGOVIA (2005) Web usage mining project for improving web-based learning sites. In Computer Aided Systems Theory–EUROCAST 2005, Springer, 205–210.
The authors
Pilar Gómez-Rey
Pilar Gómez-Rey received the MSc degree in Business Administration from University ETEA, Spain, in 2012 and the MSc degree in Teaching Economics for Pre-Higher Education from the International University of La Rioja, Spain, in 2014. Currently, she is a PhD candidate at the Open University of Catalonia where she is developing her thesis through the Doctoral Programme in Education and ICT (e-learning). Her main research interests include Higher Education, e-learning, students’ perceptions as well as quality education.
Francisco Fernández-Navarro
Francisco Fernández-Navarro received the MSc degree in computer science from the University of Cordoba, Spain,
  • 65. in 2008, the MSc degree in artificial intelligence from the University of Malaga, Spain, in 2009, and the PhD degree in computer science and artificial intelligence from the University of Malaga in 2011. He was a research fellow in computational management with the European Space Agency, Noordwijk, The Netherlands, and he is currently an Associate Professor at the Universidad Loyola Andalucia. His current research interests include neural networks, ordinal regression, imbalanced classification, and hybrid algorithms. He is a member of the IEEE.
Elena Barberà
Elena Barberà holds a PhD in Educational Psychology (1995) and is a senior researcher at the eLearn Center (Open University of Catalonia, Barcelona). She is currently Director of the PhD program ‘Education and ICT’ at OUC. Her research activity is focused on the area of educational psychology. As head of the e-DUS (Distance School and University e-ducation) research group, she currently participates in national and international projects, and she is an external evaluator of national and European research projects. She is also an editor of two journals of impact in the field of education and technology.
Copyright of Expert Systems is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for