Free Lessons on Artificial Intelligence and Machine Learning
kindsonthegenius.blogspot.com
What we are going to cover:
• What is decision theory?
• Application of Decision Theory – Cancer Diagnosis
• Formal definition
• False positives/False negatives
• Minimizing misclassification
• Reducing Expected Loss
• Introduction to ROC
What is decision theory? What is the goal?
Decision theory is a mathematical framework, used in Machine Learning, that allows us to
make optimal decisions in situations involving uncertainty.
From the figure we can see that the goal of the physician is to get the
highest possible score, which is 100%. That is the objective of
Decision Theory: to make the optimal decision.
Application of Decision Theory – Cancer Diagnosis
Scenario 1: Cancer is present and the physician decides to perform
surgery. That is a score of 100% because it is the best decision to take.
Scenario 2: Cancer is present but the physician decides not to
perform surgery. That is a score of 0, as it is the worst-case
scenario and the consequences would be very serious.
Scenario 3: Cancer is absent but the physician decides to perform surgery anyway.
This is a low score, but the consequence is less serious than that of a missed cancer.
Scenario 4: Cancer is absent and the physician decides not to perform surgery.
This is a good decision as well.
Formal definition of Decision Theory
Consider that we have an input vector x
and a corresponding target variable t (which can be 1 or 0),
together with two classes C1 and C2 (C1 = presence of cancer, C2 = absence of cancer).
Let t = 1 correspond to class C1 and
t = 0 correspond to class C2.
We need to determine the joint distribution p(x, Ck), where k = 1, 2. This is the same as p(x, t).
Decision theory is concerned with how to make optimal decisions given the appropriate
probabilities.
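As a short worked step (not taken from the slides, but following from the standard product rule and Bayes' theorem), the joint distribution can be related to the class priors p(Ck) and to the posterior probabilities p(Ck | x), which is the form often used when making the decision:

```latex
p(\mathbf{x}, C_k) = p(\mathbf{x} \mid C_k)\, p(C_k),
\qquad
p(C_k \mid \mathbf{x}) = \frac{p(\mathbf{x}, C_k)}{\sum_{j=1}^{2} p(\mathbf{x}, C_j)},
\qquad k = 1, 2.
```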
What are False positives/False negatives?
False Negative: Cancer is present but the physician decides not to perform surgery.
That is a score of 0, as it is the worst-case scenario and the consequences would be very
serious.
False Positive: Cancer is absent but the physician decides to perform surgery
anyway. This is a low score, but the consequence is less serious than that of a false negative.
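The two error types can be counted directly from a list of true states and a list of decisions. The following is a minimal sketch; the function name `count_errors` and the five-patient data are hypothetical, chosen only for illustration.

```python
def count_errors(truth, decisions):
    """Count false positives and false negatives.

    truth[i] is 1 if cancer is present, 0 if absent.
    decisions[i] is 1 if surgery is performed, 0 if not.
    """
    # False positive: cancer absent, surgery performed anyway.
    false_pos = sum(1 for t, d in zip(truth, decisions) if t == 0 and d == 1)
    # False negative: cancer present, no surgery performed.
    false_neg = sum(1 for t, d in zip(truth, decisions) if t == 1 and d == 0)
    return false_pos, false_neg


# Hypothetical data for five patients (an assumption, not from the slides).
truth = [1, 1, 0, 0, 1]
decisions = [1, 0, 1, 0, 1]
fp, fn = count_errors(truth, decisions)
print("false positives:", fp, "false negatives:", fn)
```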
Minimizing misclassification
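The approach is to divide the input space into regions Rk called decision regions, one region for
each class.
Every point in Rk is assigned to class Ck.
Consider the case of two classes C1 and C2. A mistake occurs when an input vector belonging
to class C1 is assigned to C2 (it falls in R2), or a vector belonging to class C2 is assigned to C1 (it falls in R1).
The probability of misclassification is therefore:

$$
p(\text{misclassification}) = p(\mathbf{x} \in R_2, C_1) + p(\mathbf{x} \in R_1, C_2)
= \int_{R_1} p(\mathbf{x}, C_2)\, d\mathbf{x} + \int_{R_2} p(\mathbf{x}, C_1)\, d\mathbf{x}
$$

To minimize misclassification, each x must be placed in the region that gives the smaller
value of the integrand. In other words, x is assigned to C1 if p(x, C1) > p(x, C2), and to C2 otherwise.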
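Minimizing the misclassification probability amounts to assigning each x to the class with the larger joint probability p(x, Ck). The sketch below illustrates this for a one-dimensional input. The Gaussian class-conditional densities, the priors, and the names `PRIORS`, `PARAMS`, `joint` and `decide` are all assumptions made for the example, not part of the original slides.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


# Hypothetical model (assumed for illustration only).
PRIORS = {"C1": 0.3, "C2": 0.7}                  # p(Ck)
PARAMS = {"C1": (2.0, 1.0), "C2": (0.0, 1.0)}    # (mean, std) of p(x | Ck)


def joint(x, k):
    """Joint probability density p(x, Ck) = p(x | Ck) * p(Ck)."""
    mean, std = PARAMS[k]
    return gaussian_pdf(x, mean, std) * PRIORS[k]


def decide(x):
    """Assign x to the class with the larger joint density."""
    return "C1" if joint(x, "C1") > joint(x, "C2") else "C2"


for x in (0.0, 1.0, 2.0, 3.0):
    print(x, decide(x))
```

With these assumed parameters the decision boundary lies where the two joint densities are equal, which is at roughly x = 1.42; points above it fall in R1 and points below it in R2.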
Reducing Expected Loss
The Loss Matrix
The Loss Matrix is a table showing the loss incurred for each decision taken, relative to
the true class.
Assume that for a new value of x we assign it to class Cj, whereas the correct
class is Ck. We have then incurred a loss Lkj, which is the (k, j) element of the loss
matrix.
The average loss function is given by the equation:

$$
E[L] = \sum_{k} \sum_{j} \int_{R_j} L_{kj}\, p(\mathbf{x}, C_k)\, d\mathbf{x}
$$

The best solution is the one that minimizes the average loss function. For a given input
vector x, our uncertainty in the correct class is expressed through the joint
probability distribution p(x, Ck).
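The average loss is minimized by choosing, for each x, the class Cj with the smallest value of the sum over k of Lkj p(x, Ck). Since p(x) is a common factor, the posterior probabilities p(Ck | x) can be used instead. The sketch below illustrates this; the loss values in `LOSS` are hypothetical numbers chosen to make a missed cancer far more costly than an unnecessary surgery, and the function names are assumptions.

```python
# Hypothetical loss matrix L[k][j] (values assumed for illustration):
# row k is the true class, column j is the class we decide on.
# Index 0 = C1 (cancer present), index 1 = C2 (cancer absent).
LOSS = [
    [0.0, 100.0],  # true C1: correct decision, or missed cancer (false negative)
    [1.0, 0.0],    # true C2: unnecessary surgery (false positive), or correct decision
]


def expected_losses(posteriors):
    """Return sum_k L[k][j] * p(Ck | x) for each candidate decision j."""
    n_decisions = len(LOSS[0])
    return [
        sum(LOSS[k][j] * posteriors[k] for k in range(len(posteriors)))
        for j in range(n_decisions)
    ]


def decide(posteriors):
    """Return the index j of the decision with the smallest expected loss."""
    losses = expected_losses(posteriors)
    return min(range(len(losses)), key=losses.__getitem__)


# Even a 5% posterior probability of cancer leads to decision C1 (index 0),
# because a missed cancer is penalized so heavily in this assumed matrix.
print(decide([0.05, 0.95]))
print(decide([0.001, 0.999]))
```

Note the design consequence: with an asymmetric loss matrix the chosen class is not necessarily the most probable one, which is the difference between minimizing expected loss and minimizing the misclassification rate.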
Introduction to ROC Curve
An ROC (receiver operating characteristic) curve is a graphical plot that illustrates the
diagnostic ability of a binary classifier system as its discrimination threshold is varied.
It is created by plotting the true positive rate against the false positive rate at
various threshold settings.
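As a rough illustration of how the points of such a curve can be obtained, the sketch below computes the false positive rate and true positive rate at a few thresholds. The function name `roc_points`, the scores, the labels and the thresholds are all hypothetical, and the convention that a score at or above the threshold counts as a positive prediction is an assumption.

```python
def roc_points(scores, labels, thresholds):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    An example is predicted positive when its score >= threshold.
    labels[i] is 1 for a true positive case and 0 for a true negative case.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for th in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical classifier scores and true labels (assumed for illustration).
scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
for fpr, tpr in roc_points(scores, labels, [0.0, 0.5, 1.0]):
    print("FPR:", fpr, "TPR:", tpr)
```

Lowering the threshold moves the point toward (1, 1), where everything is predicted positive; raising it moves the point toward (0, 0).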
Details of this in another presentation…

Basics of Statistical decision theory
