The broadly different kinds of Machine Learning that are used in practice in
Artificial Intelligence
Historically, there have been several approaches to machine learning for AI, such
as supervised learning, unsupervised learning, reinforcement learning, case-based
reasoning, inductive logic programming, experience-based generalisation and so on.
There have been several waves of machine learning for different AI problems. But
of these, the 3 most important categories of machine learning which are of
practical or business use today happen to be:
1. Supervised Learning
2. Unsupervised Learning
3. Reinforcement learning
Let’s look at each of them separately, with a brief summary of each, so that we
have an idea of what we’re talking about.
Supervised Learning
Supervised learning is today the most mature and, in some sense, probably the
easiest form of machine learning. The idea here is that you have historical data
with some notion of an output variable. What is an output variable? It is the
value you want to predict. The historical data presented to you pairs a
combination of several input variables with a corresponding output value, and
based on that you try to come up with a function which is able to predict an
output given any input. So, the key idea is that the historical data is labelled.
Labelled means that there is a specific output value for every row of data that is
presented. So, in that sense, the limitation of supervised learning is that it can
only work when you have clearly labelled data, with input and output values for
all of the historical data. Once you have that, you can use it to infer a function
between input and output, and that function can then be used on any new input
that comes in to estimate the corresponding output. Whether the output is discrete
or continuous does not matter. Both are supervised learning.
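To make the idea of inferring a function from labelled rows concrete, here is a minimal sketch. It assumes a single input variable and a straight-line relationship, and it uses an ordinary least-squares fit as one possible way of coming up with the function. The data values and the names `history`, `fit_line` and `predict` are made up for illustration.

```python
# Toy labelled history: each row pairs an input x with a known output y.
# The numbers are invented for illustration (they follow y = 2x + 1 exactly).
history = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

def fit_line(rows):
    """Least-squares fit of y = slope * x + intercept to labelled rows."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var_x = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the learned function to an input that was not in the history."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line(history)
print(predict(model, 10.0))  # 21.0 for this toy data
```

The point is only the shape of the workflow: labelled rows go in, a function comes out, and the function is then applied to new inputs.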
Specifically, if the output variable is discrete, the problem is called
CLASSIFICATION, and if it is continuous, it is called REGRESSION. So, in the case
of classification, the learned function takes a new input and assigns it to one of
the possible discrete output values. In the case of REGRESSION, the function takes
the input values and gives the corresponding continuous output value. An example
of the discrete case is a SPAM CLASSIFIER, which takes an input message and
classifies it as spam or non-spam. An example of the continuous case is stock
prediction, where you look at a lot of historical data on stock prices under
varied conditions, learn a function from it, and then use that function in a new
scenario, with different conditions of the environment, to predict the stock
price.
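As a rough illustration of the classification case, the sketch below trains a tiny spam classifier on four made-up labelled messages. It uses a word-count naive Bayes model with add-one smoothing, which is just one possible choice and not something the discussion above prescribes. The messages and the names `training`, `train` and `classify` are assumptions for this example.

```python
import math
from collections import Counter

# Made-up labelled messages: (text, label). Real data would be far larger.
training = [
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("lunch on monday with the team", "non-spam"),
]

def train(rows):
    """Count words per label; these counts are the learned 'function'."""
    word_counts = {}
    label_counts = Counter()
    for text, label in rows:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior plus log likelihood of each word, with add-one smoothing.
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.split():
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

spam_model = train(training)
print(classify(spam_model, "free prize"))      # spam
print(classify(spam_model, "monday meeting"))  # non-spam
```

The output is one of a fixed set of discrete labels, which is what makes this classification rather than regression.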
So, in supervised learning, you have the luxury of having labelled historical
input and output data.
UNSUPERVISED LEARNING
Unsupervised learning DOES NOT have the luxury of labelled historical input and
output data. Instead, all it has is a whole bunch of input data, RAW INPUT DATA.
So what does unsupervised learning give you? It allows us to identify patterns in
the historical input data and to draw interesting insights from those patterns.
So, there is no explicit equation or function which reflects a relationship
between an input and an output. The output here is absent, and all you are asking
is whether there is a pattern visible in the unlabelled set of inputs.
An example: suppose you want to identify, in a large set of supermarket
transactions, which 2 items are often bought together. There is no output here.
We are only asking which collections of items are frequently bought together. So,
it is a kind of extraction of commonly co-occurring items. There is no labelling
involved; there are only explicit inputs and no output. All the inputs are run
through an unsupervised algorithm and the pattern is extracted.
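The supermarket example can be sketched in a few lines. This is a minimal illustration that simply counts how often each pair of items appears in the same basket. The baskets, the threshold `min_count` and the function name `frequent_pairs` are made up for the example, and a real system would use a more scalable method.

```python
from collections import Counter
from itertools import combinations

# Made-up transactions: each basket is just a set of items, with no labels.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
    {"milk", "bread"},
]

def frequent_pairs(baskets, min_count):
    """Count every pair of items that co-occurs in a basket and keep the
    pairs seen at least min_count times."""
    pair_counts = Counter()
    for basket in baskets:
        # Sorting gives each pair one canonical order, e.g. ("bread", "milk").
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

pairs = frequent_pairs(transactions, min_count=2)
print(pairs)  # {('bread', 'butter'): 3, ('bread', 'milk'): 2}
```

Notice that nothing in the input says which pairs are "correct". The pattern comes purely from the inputs themselves.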
The beauty of unsupervised learning is that it lends itself to numerous
combinations of patterns. The problem is that, because of this diverse nature,
there is no one notion of a pattern in the data. A pattern that means one thing in
one context could mean something else in another context of data. So, there is no
standardised notion of what is a good unsupervised algorithm. That’s why
unsupervised problems are tougher and more difficult to deal with.
REINFORCEMENT LEARNING
Reinforcement learning (RL) is an area of machine learning concerned with
how software agents ought to take actions in an environment so as to maximize
some notion of cumulative reward.
Typically, you are required to reach some goal state, and there is a whole set of
steps as you go along from the start position to the goal, which is the end
position. There are several steps in between, and at each step you can take
multiple actions. So, there is some notion of a start state, there is some notion
of a goal state, and there are multiple states in the middle, where at each state
you can take multiple actions. Each action has a corresponding reward or
punishment. So, we are talking about a collection of states: one start state, one
end state, and multiple states in between, not one, with each state mapping each
of its actions to a corresponding reward or punishment. Ultimately, what
reinforcement learning does is allow a machine or an agent to take an action at
each step, which takes it to another state, where it takes its next action. As it
keeps going along, it collects the outcome as experience, whether it was
positively rewarded or punished. Based on that, it optimises the steps and the
path taken, and it incrementally increases its knowledge of which is probably a
better path in terms of better reward, reaching the goal faster, or finishing a
task faster.
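One simple way to put a number on "cumulative reward" along a path is to add up the reward collected at each step. The sketch below also allows an optional discount factor `gamma`, which makes later rewards count less. The discount factor is a common convention and an assumption here, not something stated above, and the reward values and the function name `cumulative_reward` are made up.

```python
def cumulative_reward(rewards, gamma=1.0):
    """Sum the rewards collected along a path, weighting the reward at
    step t by gamma ** t (gamma = 1.0 means a plain sum)."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total

# A made-up path: two punished steps, then the goal reward.
path_rewards = [-1.0, -1.0, 10.0]

print(cumulative_reward(path_rewards))             # 8.0
print(cumulative_reward(path_rewards, gamma=0.5))  # 1.0
```

With a discount below 1, a path that reaches the goal in fewer steps scores higher, which matches the idea of preferring paths that reach the goal faster.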
So, in that sense, reinforcement learning is dynamic and state dependent, and it
constantly keeps updating its knowledge of rewards and punishments as it learns
from experience. History may not be there at the start, but history builds up as
the agent goes through actions, states, rewards and punishments at each step. A
classic example of this is MAZE LEARNING, where an agent goes from a start point
to an end point with various obstacles in between. Hitting an obstacle counts as a
punishment, and the agent retrains itself. In the future, it remembers this and
tries to go through paths which do not lead to getting blocked. So, you are trying
to reach the goal faster by remembering the moves which do not lead you to the
goal and instead take you to a hindrance, and that knowledge is being
accumulated. Reinforcement learning continuously updates its knowledge of rewards
and punishments and yields a system which is able to learn from experience and
become optimal in reaching the goal.
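The maze example can be sketched with tabular Q-learning. The text above does not name a specific algorithm, so Q-learning is used here only as one common choice. The maze layout, the reward values (+10 for the goal, -1 for bumping into an obstacle or the boundary, -0.1 per ordinary move) and the settings `alpha`, `gamma`, `epsilon` and `episodes` are all assumptions made for illustration.

```python
import random

# Hypothetical maze: S = start, G = goal, # = obstacle, . = free cell.
MAZE = [
    "S.#.",
    ".##.",
    "...G",
]
ROWS, COLS = len(MAZE), len(MAZE[0])
START, GOAL = (0, 0), (2, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action_index):
    """Apply an action; return (next_state, reward, done)."""
    dr, dc = ACTIONS[action_index]
    r, c = state[0] + dr, state[1] + dc
    if not (0 <= r < ROWS and 0 <= c < COLS) or MAZE[r][c] == "#":
        return state, -1.0, False   # blocked: punishment, agent stays put
    if (r, c) == GOAL:
        return (r, c), 10.0, True   # goal reached: reward, episode ends
    return (r, c), -0.1, False      # ordinary move: small step cost

def best_action(q, state):
    """Index of the action with the highest learned value in this state."""
    return max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values from experience with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # (state, action_index) -> estimated value, built up over time
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap on the length of one episode
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = best_action(q, state)        # exploit what is known
            nxt, reward, done = step(state, a)
            future = 0.0 if done else q.get((nxt, best_action(q, nxt)), 0.0)
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * future - old)
            state = nxt
            if done:
                break
    return q

def greedy_path(q, max_steps=20):
    """Follow the best learned action from the start state."""
    state, path = START, [START]
    for _ in range(max_steps):
        state, _, done = step(state, best_action(q, state))
        path.append(state)
        if done:
            break
    return path

q_table = train()
learned_path = greedy_path(q_table)
print(learned_path)
```

The table `q_table` starts empty, which mirrors the point that history is not there at the start and is built up from the rewards and punishments the agent experiences.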
So, these 3 broad categories form the basis of modern AI systems, in which
machine learning is deeply embedded.