2. Outline
1 What is Decision Theory?
2 Usage of Decision Theory
3 Mathematical Description of Bayes' Theorem
4 References
Dr.Varun Kumar Lecture 4 2 / 13
3. What is Decision Theory?
Decision Theory:
For solving a real-world problem, adaptive decision-making capability makes the system more robust.
It is the framework for making decisions under uncertainty.
It allows us to make rational decisions among multiple actions so as to minimize the expected risk.
By learning association rules from data, a proper decision can be framed.
4. Usage of Decision Theory
1 Artificial Intelligence
2 Machine Learning
3 Pattern Recognition
4 Wireless Communication
5 Image Processing
5. Mathematical Description:
Mathematical Description:
Class → Family / Sports / Luxury
↓
Features → Price / Engine capacity / Top speed
↓
Dimension
Let two classes ω1 and ω2 denote the accept and reject phenomena, respectively.
p(ω1) → A priori probability of an event whose outcome falls in class ω1.
p(ω2) → A priori probability of an event whose outcome falls in class ω2.
p(x|ω1), p(x|ω2) → Class-conditional probabilities.
x → A feature. It depends on both classes ω1 and ω2, and it can occur in either class.
6. Continued…
Mathematical Description:
p(ω1|x) → A posteriori probability (depends on the current or future input).
p(ωi, x), i = 1, 2 → Joint probability
Joint Probability:
p(ωi, x) = p(ωi|x) p(x) = p(x|ωi) p(ωi)
Property of the joint probability:
p(x) = ∑_{i=1}^{2} p(ωi, x)
Bayes' theorem therefore gives
p(ωi|x) = p(x|ωi) p(ωi) / p(x) = p(x|ωi) p(ωi) / ∑_{j=1}^{2} p(ωj, x)
p(ω1|x) > p(ω2|x) ⇒ the decision goes in favor of class ω1.
p(ω1|x) < p(ω2|x) ⇒ the decision goes in favor of class ω2.
Note: The a posteriori probability gives the true measure for deciding whether a new sample falls in class ω1 or ω2.
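The two-class Bayes rule above can be computed directly. The following is a minimal sketch; the prior and class-conditional values are assumed for illustration and are not from the lecture.

```python
# Two-class Bayes decision sketch (illustrative values, not from the lecture).
p_w = [0.6, 0.4]          # p(w1), p(w2): a priori probabilities (assumed)
p_x_given_w = [0.3, 0.7]  # p(x|w1), p(x|w2): class-conditional likelihoods (assumed)

# Evidence: p(x) = sum_i p(x|wi) * p(wi)
p_x = sum(l * p for l, p in zip(p_x_given_w, p_w))

# Bayes' theorem: p(wi|x) = p(x|wi) * p(wi) / p(x)
posteriors = [l * p / p_x for l, p in zip(p_x_given_w, p_w)]

# Decide in favor of the class with the larger a posteriori probability
decision = 1 + posteriors.index(max(posteriors))
print(posteriors, decision)
```

Note that even though p(ω1) > p(ω2) here, the likelihood of the observed feature pushes the posterior, and hence the decision, toward ω2.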
7. Continued…
Note:
1 If the relation between the a priori probabilities is p(ω1) > p(ω2) ⇒ the decision goes in favor of class ω1. This decision is less likely to be correct.
2 The above relation does not address the actual scenario or condition of classes ω1 and ω2, since it ignores the observed feature.
3 On the other side, if the relation between the a posteriori probabilities is p(ω1|x) > p(ω2|x) ⇒ the decision goes in favor of class ω1. This decision is more likely to be correct.
4 The random variable x is a function of ω1 and ω2:
x = f(ω1, ω2)
9. Example
Q. A feature x is an essential part of two classes ω1 and ω2. The PDF of this feature is exponentially distributed, such that p(x) = (1/2) e^{-x/2}, x > 0, and the a posteriori PDFs for ω1 and ω2 are 2e^{-2x}, x > 0, and 4e^{-4x}, x > 0, respectively.
1 Find the probability of error when the decision goes in favor of ω1.
2 Find the probability of error when the decision goes in favor of ω2.
3 At what value of the feature x can no decision be made?
Ans.
1 According to the question, p(x) = (1/2) e^{-x/2}, x > 0, p(ω1|x) = 2e^{-2x}, x > 0, and p(ω2|x) = 4e^{-4x}, x > 0. If the decision goes in favor of ω1, then p(error|x) = p(ω2|x):
p(error)|ω1 = ∫₀^∞ 4e^{-4x} · (1/2) e^{-x/2} dx = ∫₀^∞ 2e^{-9x/2} dx = 4/9
2 Similarly, if the decision goes in favor of ω2, then p(error|x) = p(ω1|x):
p(error)|ω2 = ∫₀^∞ 2e^{-2x} · (1/2) e^{-x/2} dx = ∫₀^∞ e^{-5x/2} dx = 2/5
3 No decision can be made when the a posteriori PDFs of classes ω1 and ω2 are equal, i.e.
2e^{-2x} = 4e^{-4x} ⇒ e^{2x} = 2 ⇒ x = (1/2) ln 2
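The closed-form results of this example can be verified with a short script. Each error integrand reduces to the form c·e^{-ax}, whose integral over (0, ∞) is simply c/a; the script below only restates the example's numbers.

```python
from fractions import Fraction
import math

def exp_tail_integral(c, a):
    """Integral of c * exp(-a*x) over x in (0, inf), which equals c / a."""
    return Fraction(c) / Fraction(a)

# Decision in favor of w1: p(error|x) = p(w2|x), so integrate
# 4e^{-4x} * (1/2)e^{-x/2} = 2 e^{-9x/2}
p_err_w1 = exp_tail_integral(2, Fraction(9, 2))   # 4/9

# Decision in favor of w2: p(error|x) = p(w1|x), so integrate
# 2e^{-2x} * (1/2)e^{-x/2} = e^{-5x/2}
p_err_w2 = exp_tail_integral(1, Fraction(5, 2))   # 2/5

# No-decision point: 2e^{-2x} = 4e^{-4x}  =>  e^{2x} = 2  =>  x = (1/2) ln 2
x_star = 0.5 * math.log(2)
print(p_err_w1, p_err_w2, x_star)
```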
10. Multiple Classes, Loss Function:
Multiple Classes/Actions/Features and Loss Function:
{ω1, ω2, ..., ωc} → multiple classes or states of nature
{α1, α2, ..., αa} → multiple actions
Loss Function:
Instead of the probability of error, we use the term loss function in the case of multiple classes and actions. Mathematically, it can be expressed as
L(αi|ωj) = Lij → the loss incurred when action αi is performed under the jth state of nature,
for i = 1, 2, ..., a and j = 1, 2, ..., c
X → d-dimensional feature vector, i.e. X = {x1, x2, ..., xd}
11. Risk Function or Expected Loss:
Risk Function or Expected Loss:
In the case of multiple classes and actions, we require the expected loss for the final decision. Hence, we use the risk function, denoted as
R(αi|X) = ∑_{j=1}^{c} L(αi|ωj) p(ωj|X), i = 1, 2, ..., a
For the two-class case,
R(α1|X) = L11 p(ω1|X) + L12 p(ω2|X)
R(α2|X) = L21 p(ω1|X) + L22 p(ω2|X)
If a risk relation exists such that R(α1|X) < R(α2|X), or equivalently
(L21 - L11) p(ω1|X) > (L12 - L22) p(ω2|X),
where both differences (L21 - L11) and (L12 - L22) are positive (+ve),
Note: The above relation suggests that the decision goes in favor of class ω1.
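The conditional risks above can be evaluated directly once a loss matrix is fixed. The following is a minimal sketch; the loss values and posteriors are assumed for illustration and are not from the lecture.

```python
# Expected-risk sketch for two classes and two actions (values assumed).
# Rows: actions a1, a2; columns: states w1, w2. L[i][j] = loss of action ai in state wj.
L = [[0.0, 3.0],   # L11, L12
     [1.0, 0.0]]   # L21, L22
posteriors = [0.7, 0.3]   # p(w1|X), p(w2|X) (assumed)

# Conditional risk: R(ai|X) = sum_j L[i][j] * p(wj|X)
risks = [sum(Lij * p for Lij, p in zip(row, posteriors)) for row in L]

# Bayes decision rule: take the action with minimum expected risk
best_action = 1 + risks.index(min(risks))
print(risks, best_action)
```

With these numbers the minimum-risk action is α2 even though p(ω1|X) > p(ω2|X): the large loss L12 for wrongly taking action α1 in state ω2 outweighs the posterior advantage, which is exactly why risk minimization differs from posterior maximization under a general loss.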
12. Minimum Error Rate Classification
L(αi|ωj) = 0 for i = j → no loss occurs when the ith action corresponds to the ith class
L(αi|ωj) = 1 for i ≠ j
Risk Function:
R(αi|X) = ∑_{j≠i} p(ωj|X) = 1 - p(ωi|X)
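Under this zero-one loss, minimizing the risk is equivalent to picking the class with the largest a posteriori probability. A minimal sketch (the three posterior values are assumed for illustration):

```python
# Minimum-error-rate sketch: under zero-one loss, R(ai|X) = 1 - p(wi|X),
# so the minimum-risk action matches the maximum-posterior class.
posteriors = [0.2, 0.5, 0.3]   # p(w1|X), p(w2|X), p(w3|X) (assumed, c = 3)

risks = [1 - p for p in posteriors]           # R(ai|X) = 1 - p(wi|X)
min_risk_action = risks.index(min(risks))     # index of the minimum-risk action
max_post_class = posteriors.index(max(posteriors))
print(min_risk_action, max_post_class)
```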
13. References
E. Alpaydin, Introduction to Machine Learning. MIT Press, 2020.
T. M. Mitchell, The Discipline of Machine Learning. Carnegie Mellon University, School of Computer Science, Machine Learning Department, 2006, vol. 9.
J. Grus, Data Science from Scratch: First Principles with Python. O'Reilly Media, 2019.