1. Forms of Learning
Supervised learning [learn input/output patterns from the correct outputs given for
some inputs]
In supervised learning, the correct answer for each example is given. The answer can be a
numeric variable, a categorical variable, etc. An agent tries to find a function that matches
the examples from a sample set. Each example provides an input together with the correct
output. The goal is to build a general model that will produce the correct output on novel input.
e.g. the following set of labeled example pictures is given:
M M F
F F
The learner may then be shown any picture and asked whether it is M or F.
Thus, in supervised learning of a concept, the examples are already properly classified, and
the task is to learn the hidden standard of classification. The process often consists of
cycles of "hypothesis generating" followed by "hypothesis testing".
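The generate-and-test cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single numeric feature, the example values, and the helper names (predict, count_errors, learn_threshold) are hypothetical and do not come from the notes. Each hypothesis is a threshold on the feature, and the learner keeps the one that makes the fewest mistakes on the labeled examples.

```python
from typing import List, Tuple

# Hypothetical labeled examples: (feature value, correct label).
# Describing each picture by one number is an assumption for illustration.
examples: List[Tuple[float, str]] = [
    (1.0, "F"), (1.5, "F"), (2.0, "F"), (3.5, "M"), (4.0, "M"),
]

def predict(threshold: float, x: float) -> str:
    """Hypothesis: inputs above the threshold are 'M', the rest are 'F'."""
    return "M" if x > threshold else "F"

def count_errors(threshold: float, data: List[Tuple[float, str]]) -> int:
    """Hypothesis testing: count the examples this hypothesis gets wrong."""
    return sum(1 for x, label in data if predict(threshold, x) != label)

def learn_threshold(data: List[Tuple[float, str]]) -> float:
    xs = sorted(x for x, _ in data)
    # Hypothesis generation: midpoints between neighbouring feature values.
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    # Keep the candidate with the fewest errors on the labeled examples.
    return min(candidates, key=lambda t: count_errors(t, data))

threshold = learn_threshold(examples)
print(threshold, predict(threshold, 3.0), predict(threshold, 1.2))
```

The learned threshold is then applied to inputs that were not in the example set, which is the "novel input" case mentioned above.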
Unsupervised learning
In unsupervised learning, correct answers are not given, just examples (e.g. the same
figures as above, without the labels). The agent tries to learn from patterns without
corresponding output values:
– No pre-classification of training examples
– Learning about data by looking at its features
– No specific feedback from users
– Usually entails clustering data
Thus, in unsupervised learning of concepts, the instances are not labeled, and the task is to
cluster them into classes according to their similarity. One way to do this is to recursively
merge or split the current class(es), aiming to minimize intra-class distances and
maximize inter-class distances.
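The "recursively merge" strategy can be illustrated with a small bottom-up (agglomerative) clustering sketch. The one-dimensional points, the single-linkage distance, and the function names are assumptions chosen to keep the example short; the notes do not prescribe a particular distance measure or stopping rule.

```python
from itertools import combinations

def cluster_distance(a, b):
    """Single-linkage distance: the smallest gap between any two members."""
    return min(abs(x - y) for x in a for y in b)

def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[p] for p in points]  # start with one cluster per instance
    while len(clusters) > k:
        # Find the pair of clusters (i < j) with the smallest distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Unlabeled instances (hypothetical values).
points = [1.0, 1.2, 0.8, 5.0, 5.3, 9.0]
clusters = agglomerate(points, 3)
print(clusters)
```

No labels are used at any point: the grouping comes only from the distances between the instances.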
Reinforcement learning (occasional rewards) refers to learning from
feedback.
Reinforcement learning, or learning by trial and error, is a method used to learn preferences
among alternative actions according to the feedback (reward (+ve) and punishment (-ve))
received at the end of a sequence of steps.
The agent does not know the exact output for an input, but it receives feedback on the
desirability of its behavior. Unlike supervised learning, reinforcement learning
takes place in an environment where the agent cannot directly compare the
results of its actions to a desired result. Instead, it is given some reward or
punishment that relates to its actions. It may win or lose a game, or be told it has made
a good move or a poor one. The job of reinforcement learning is to find a successful
function using these rewards.
One hard problem in this type of learning is credit/blame assignment. When the
feedback is only about a complete sequence of actions, not about each individual action
(delayed reward), it is not always easy to determine which actions were right or wrong.
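One common way to spread a delayed reward back over the earlier actions is to compute a discounted return for every step. The notes do not name this technique, so treat it as one possible approach rather than the method intended here; the discount factor gamma and the function name are assumptions for illustration.

```python
def discounted_returns(rewards, gamma=0.9):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of an episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step sees the credit flowing back from later steps.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Hypothetical episode: no feedback until the final step (e.g. winning a game).
rewards = [0.0, 0.0, 0.0, 1.0]
credit = discounted_returns(rewards, gamma=0.5)
print(credit)
```

Actions closer to the final reward receive more credit, which is a heuristic answer to the assignment problem, not a guarantee that those actions were actually the decisive ones.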
Another issue in reinforcement learning is the tradeoff between exploration and
exploitation. To get the maximum reward in the long run in an uncertain environment,
it is sometimes better to take a less-explored option, even when another option has a
better historical record.
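A simple way to balance the two is an epsilon-greedy rule: with a small probability the agent tries a random option, and otherwise it picks the option with the best record so far. The sketch below applies it to a hypothetical multi-armed bandit; the reward means, the Gaussian noise, and the parameter values are all assumptions made for illustration.

```python
import random

def choose_action(estimates, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))  # explore: any option
    # Exploit: the option with the best estimated reward so far.
    return max(range(len(estimates)), key=lambda a: estimates[a])

def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    estimates = [0.0] * len(true_means)  # running average reward per option
    counts = [0] * len(true_means)       # how often each option was tried
    for _ in range(steps):
        a = choose_action(estimates, epsilon, rng)
        reward = rng.gauss(true_means[a], 1.0)  # noisy feedback
        counts[a] += 1
        estimates[a] += (reward - estimates[a]) / counts[a]  # incremental mean
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.5, 0.8])
print(estimates, counts)
```

Setting epsilon to 0 gives pure exploitation, which can lock onto an option that merely looked good early on; larger values spend more steps on options with a weaker record in exchange for better estimates.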