2. Outlines
1 Introduction to Supervised Learning
2 Motivation
3 Scope of Machine Learning
4 References
3. Introduction to Supervised Learning:
Introduction:
In supervised learning, the aim is to learn a mapping from the input to an
output whose correct values are provided by a supervisor.
Example: identifying a family car, framed as a classification problem.
Attributes:
It should not be too expensive, i.e., p1 < price < p2.
Its engine capacity should be adequate, i.e., e1 hp < ec < e2 hp.
It should be spacious.
Key Points:
Other attributes, such as comfort, mileage, and top speed, have not
been considered in this case. For analytical simplicity, we consider only
price and engine capacity.
Class learning is finding a description that is shared by all positive
examples and none of the negative examples.
4. Positive and Negative Example:
Positive and Negative Example:
When a car's price and engine capacity satisfy the relation
(p1 < price < p2) & (e1 < ec < e2), it is called a positive example;
otherwise it is a negative example.
Training set for the class of a family car.
Figure: Each data point corresponds to one example car. '+' denotes a positive
example of the class, and '-' denotes a negative example.
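As a concrete illustration (not from the slides), this labeling rule can be sketched in Python; the threshold values p1, p2, e1, e2 below are hypothetical placeholders for the true but unknown bounds.

```python
def is_family_car(price, ec, p1=10_000, p2=20_000, e1=100, e2=200):
    """Label a car as positive (1) or negative (0) for the family-car class.

    The bounds p1, p2 (price) and e1, e2 (engine capacity, hp) are
    hypothetical placeholders; the true bounds are unknown in practice.
    """
    return 1 if (p1 < price < p2) and (e1 < ec < e2) else 0

print(is_family_car(15_000, 150))  # 1: positive example
print(is_family_car(30_000, 150))  # 0: negative example (too expensive)
```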
5. Actual Hypothesis H
Actual Hypothesis H
The class of family cars is a rectangle in the price-engine capacity space.
The hypothesis class H contains all possibilities, i.e., every candidate
rectangle under which a car comes under the family-car class at the given
constraints.
6. Mathematical Intuition:
Mathematical representation:
Let the input attributes be expressed as a vector

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad x_1 \to \text{Price}, \quad x_2 \to \text{Engine capacity}$$
Similarly, the label r indicates whether an example is positive (+ve) or negative (-ve):

$$r = \begin{cases} 1, & \text{if } \mathbf{x} \text{ is a positive example} \\ 0, & \text{if } \mathbf{x} \text{ is a negative example} \end{cases}$$
Each car is represented by such an ordered pair (x, r).
Training set: Let X be the training set containing information about N
different cars. Mathematically,

$$X = \{\mathbf{x}^t, r^t\}_{t=1}^{N},$$

where t indexes the different examples in the set.
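As a minimal sketch (with hypothetical class bounds and uniformly sampled attribute values, none of which come from the lecture), such a training set could be generated as follows:

```python
import random

# Hypothetical bounds of the true class C (unknown to the learner in practice).
P1, P2, E1, E2 = 10_000, 20_000, 100, 200

def label(x):
    """Supervisor's label r: 1 for a positive example, 0 for a negative one."""
    price, ec = x
    return 1 if (P1 < price < P2) and (E1 < ec < E2) else 0

# Training set X = {(x^t, r^t)} for t = 1..N, with x^t = (price, engine capacity).
N = 20
cars = [(random.uniform(5_000, 30_000), random.uniform(50, 300)) for _ in range(N)]
X = [(x, label(x)) for x in cars]
```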
7. Learning Hypothesis h:
Aim of learning hypothesis h
Let C be the class of family cars. The aims of learning a particular hypothesis are as follows:
i. To reduce the search space.
ii. To select a single hypothesis h ∈ H from the hypothesis class H.
The learning algorithm finds the particular hypothesis h ∈ H that
approximates C as closely as possible.
Let us say the hypothesis h makes a prediction for an instance x such that

$$h(\mathbf{x}) = \begin{cases} 1, & \text{if } h \text{ classifies } \mathbf{x} \text{ as a positive example} \\ 0, & \text{if } h \text{ classifies } \mathbf{x} \text{ as a negative example} \end{cases}$$
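A rectangle hypothesis h and its prediction rule can be sketched as below; the parameter values are hypothetical, not from the lecture.

```python
def h(x, params):
    """Prediction of a rectangle hypothesis with params = (p1, p2, e1, e2):
    returns 1 if h classifies x as positive, 0 otherwise."""
    price, ec = x
    p1, p2, e1, e2 = params
    return 1 if (p1 < price < p2) and (e1 < ec < e2) else 0

h_params = (11_000, 19_000, 110, 190)  # one candidate rectangle from H
print(h((15_000, 150), h_params))      # 1: classified as positive
```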
Key points:
In real life we do not know C(x).
Hence we cannot evaluate how well h(x) matches C(x).
We only have a training set X, which is a small subset of the full problem space.
8. Error of hypothesis h:
Error of hypothesis h:
The error of hypothesis h given the training set X is

$$E(h \mid X) = \sum_{t=1}^{N} \mathbb{1}\left(h(\mathbf{x}^t) \neq r^t\right),$$

where 1(a ≠ b) is 1 if a ≠ b and is 0 if a = b.
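This error count translates directly into code; a sketch on a tiny hypothetical training set:

```python
def h(x, params):
    """Rectangle hypothesis: 1 if x falls inside the rectangle, else 0."""
    price, ec = x
    p1, p2, e1, e2 = params
    return 1 if (p1 < price < p2) and (e1 < ec < e2) else 0

def empirical_error(params, X):
    """E(h|X): the number of training examples where h(x^t) != r^t."""
    return sum(1 for x, r in X if h(x, params) != r)

# Tiny hypothetical training set: ((price, engine capacity), label).
X = [((15_000, 150), 1), ((25_000, 150), 0), ((12_000, 180), 1)]
print(empirical_error((11_000, 19_000, 110, 190), X))  # 0 on this set
```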
9. Generalization:
Problem of generalization:
Let x1 and x2 be real-valued.
There are infinitely many hypotheses h that satisfy the condition E = 0.
At the boundary between positive and negative examples, different
candidate hypotheses may make different generalization predictions.
The problem of generalization refers to how well our hypothesis will
correctly classify future examples that are not part of the training set.
Most specific hypothesis:
The most specific hypothesis S is the tightest rectangle that includes all
the positive examples and none of the negative examples.
Most general hypothesis:
The most general hypothesis G is the largest rectangle that includes all the
positive examples and none of the negative examples.
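A sketch of computing S on a toy training set: S follows from the minima and maxima of the positive examples' attributes, while G would instead be found by growing the rectangle until just before it includes a negative example.

```python
def most_specific(X):
    """S: the tightest axis-aligned rectangle covering all positive examples.

    Assumes X = [((price, ec), r), ...] with at least one positive example.
    """
    pos = [x for x, r in X if r == 1]
    prices = [price for price, _ in pos]
    ecs = [ec for _, ec in pos]
    return (min(prices), max(prices), min(ecs), max(ecs))

X = [((15_000, 150), 1), ((12_000, 180), 1), ((25_000, 150), 0)]
print(most_specific(X))  # (12000, 15000, 150, 180)
```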
10. Comparison of different hypotheses:
Comparison of different hypotheses:
Any h ∈ H between S and G is a valid hypothesis with no error, and is
said to be consistent with the training set; together they make up the
version space.
With a different training set, S, G, the version space, the parameters,
and thus the learned hypothesis h can all be different.
11. Margin:
Margin:
Actually, depending on X and H, there may be several hypotheses S_i and
G_j that make up the S-set and the G-set.
Margin:
The distance between the boundary and the instances closest to it.
The margin decides the degree of accuracy of the learned hypothesis h.
12. Probably Approximately Correct (PAC) Learning:
PAC learning:
In PAC learning:
Let C be the given class to be learned.
Unknown examples are drawn from a fixed probability distribution p(x).
We want to find the number of examples N such that, with probability
at least 1 − δ, the hypothesis h has error at most ε, for arbitrary
δ ≤ 1/2 and ε > 0:

$$P(C \,\triangle\, h \le \epsilon) \ge 1 - \delta,$$

where C Δ h is the region of difference between C and h.
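For the tightest-rectangle hypothesis, the cited Alpaydin text derives a bound on N by covering the error region C Δ h with four strips; a sketch of that standard argument:

```latex
% Sample-complexity sketch for axis-aligned rectangles
% (following the argument in the cited Alpaydin text).
\begin{align*}
  &\text{Cover } C \,\triangle\, h \text{ by four strips, each required to have} \\
  &\text{probability at most } \epsilon/4 \text{ under } p(\mathbf{x}). \\
  &P(\text{a given strip contains none of the } N \text{ examples})
     \le \left(1 - \tfrac{\epsilon}{4}\right)^{N} \le e^{-N\epsilon/4}. \\
  &\text{Union bound over the four strips: } 4\,e^{-N\epsilon/4} \le \delta
     \;\Longrightarrow\; N \ge \tfrac{4}{\epsilon} \ln \tfrac{4}{\delta}.
\end{align*}
```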
13. References
E. Alpaydin, Introduction to Machine Learning. MIT Press, 2020.
T. M. Mitchell, The Discipline of Machine Learning. Carnegie Mellon University,
School of Computer Science, Machine Learning Department, 2006, vol. 9.
J. Grus, Data Science from Scratch: First Principles with Python. O'Reilly Media,
2019.