1. In many situations where a supervised classifier is employed, we face the problem of labelling data at a high cost. In such cases we can apply active learning techniques, in which the learning model is allowed to select the next instance or set of instances to be labelled. The model thus tries to choose the most informative instances so that it can learn a good classifier using fewer labels than the traditional paradigm would require. Diverse approaches have been proposed for active learning, most of them designed for binary classification problems.
One of these approaches, proposed by [Cohn 1], consists of choosing instances within the uncertainty region so that the region shrinks in every iteration. This approach can be too expensive, since the uncertainty region must be recalculated explicitly in every iteration. Other approaches have been proposed in which the uncertainty region does not need to be computed explicitly. One of them is Uncertainty Sampling, proposed by [Lewis], which consists of choosing the instance with the greatest uncertainty, as measured with the current classifier. It is only applicable to probabilistic learning models that return, for an instance, a membership probability for every possible label. A general-purpose alternative is Query By Committee, proposed by [QBC]. Its main idea is to keep a committee of classifiers that infer the label of every candidate instance; the instance on which the committee disagrees most is then chosen. Other papers apply active learning to further kinds of problems. The next step is the multiclass problem, for which several criteria have been proposed, such as those in [multiclass] and [video multiclass]. Active learning has also been applied to regression problems with successful results: [Cohn 2] studied several ways of valuing the information of instances in this setting, such as the variance or the KL-divergence. Finally, active learning has been applied to other kinds of problems such as class-based ranking; one example is [bootstrap-LV], which uses a committee whose criterion for choosing a new instance is based on the variance.
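As a minimal sketch of the Uncertainty Sampling idea mentioned above (all names and data here are hypothetical), we assume only a classifier that returns a membership probability for every possible label, and we query the instance whose most probable label has the lowest probability:

```python
def least_confident(prob_fn, unlabelled):
    """Pick the unlabelled instance whose most probable label
    has the lowest probability, i.e. the highest uncertainty."""
    def confidence(x):
        return max(prob_fn(x))
    return min(unlabelled, key=confidence)

# Toy probabilistic model: membership probabilities over two labels.
probs = {"a": [0.9, 0.1], "b": [0.55, 0.45], "c": [0.7, 0.3]}
chosen = least_confident(lambda x: probs[x], ["a", "b", "c"])
# "b" has the least confident prediction (0.55), so it is queried next.
```

Other uncertainty measures (e.g. the entropy of the label distribution) fit the same skeleton by replacing `confidence`.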
We face two different kinds of situations in our system:
• Learning the weights of the metaheuristics: we must train a model that infers an adequate weight for every instance and every metaheuristic. Each weight lies in the range [0, 1], and the weights of all metaheuristics for a given instance sum to 1. This problem cannot be classed as pure regression or as pure ranking, but it shares several properties of both.
• Learning the parameters of every metaheuristic: given an instance, we need to infer the best parameters for every metaheuristic. This is therefore a set of classification problems whose labels are the possible values a parameter can take. It is important to note that more than two labels may be available to choose from.
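As a toy illustration of the weight constraint in the first task (the function name and data are hypothetical, not part of our system), raw per-metaheuristic scores can be normalized into weights that lie in [0, 1] and sum to 1:

```python
def normalize_weights(scores):
    """Turn non-negative raw scores for the metaheuristics of one
    instance into weights in [0, 1] that sum to 1."""
    total = sum(scores)
    if total == 0:
        # No preference at all: fall back to a uniform weighting.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]

weights = normalize_weights([2.0, 1.0, 1.0])   # -> [0.5, 0.25, 0.25]
```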
2. In the first situation we have chosen a committee strategy. In every iteration, each candidate instance is evaluated by all members of the committee, and the variance of the resulting distribution of predictions is calculated. We therefore obtain, for every instance, a variance vector of length n, where n is the number of metaheuristics considered. Finally, we select the instance whose sum of variances is largest; this sum is our measure of the disagreement among the members of the committee.
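The variance-based selection just described can be sketched as follows; the committee, instances, and names are hypothetical, and each member is assumed to predict a weight vector with one entry per metaheuristic:

```python
def select_by_variance(committee_preds, instances):
    """committee_preds[m][i] is the weight vector (one entry per
    metaheuristic) that committee member m predicts for instance i.
    Select the instance whose summed per-metaheuristic variance,
    taken across members, is largest."""
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    def disagreement(i):
        n = len(committee_preds[0][i])
        return sum(
            variance([member[i][k] for member in committee_preds])
            for k in range(n)
        )
    return max(instances, key=disagreement)

# Two committee members, two instances, two metaheuristics each.
preds = [
    {0: [0.5, 0.5], 1: [0.9, 0.1]},
    {0: [0.5, 0.5], 1: [0.1, 0.9]},
]
chosen = select_by_variance(preds, [0, 1])
# The members agree on instance 0 but disagree on instance 1,
# so instance 1 is selected for labelling.
```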
In the second situation we have chosen to build a separate committee for every parameter, but with a different selection criterion. Here we do not need to infer weights but to decide among a set of labels, so we are interested in choosing the instance with the greatest uncertainty. [multiclass] proposes a method that has reported very good scores on multiclass problems: choosing the instance that minimizes the difference between the committee's output for the most and the second most popular class label. That is the option we have chosen, and it has produced very good results.
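This margin-based criterion can be sketched as follows (the names and the toy data are hypothetical); each committee member casts one vote per instance, and the instance with the smallest gap between the two most voted labels is queried:

```python
from collections import Counter

def select_by_margin(committee_labels, instances):
    """committee_labels[i] is the list of labels that the committee
    members predict for instance i (possibly more than two classes).
    Select the instance with the smallest margin between the most
    and the second most voted label."""
    def margin(i):
        votes = Counter(committee_labels[i]).most_common()
        top = votes[0][1]
        second = votes[1][1] if len(votes) > 1 else 0
        return top - second
    return min(instances, key=margin)

# Three committee members, three possible parameter values.
labels = {
    0: ["small", "small", "small"],   # unanimous vote: margin 3
    1: ["small", "large", "medium"],  # split vote: margin 0
}
chosen = select_by_margin(labels, [0, 1])
# Instance 1 has the smallest margin, so it is the most uncertain.
```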
It is important to point out that in the first situation we cannot use this last approach, because that problem is not a classification one. For example, if in the first situation an instance shows a very small margin between the most and the second most popular label, this does not mean that the instance carries much uncertainty, because the estimated values may be very close to the real ones.