Lecture 2.
Bayesian Decision Theory
Bayes Decision Rule
Loss function
Decision surface
Multivariate normal and Discriminant Function
Bayes Decision
Bayes decision theory is decision making when all the underlying probability
distributions are known.
It is optimal under that assumption.
For two classes ω1 and ω2 ,
Prior probabilities for an unknown new observation:
P(ω1) : the new observation belongs to class 1
P(ω2) : the new observation belongs to class 2
P(ω1 ) + P(ω2 ) = 1
The priors reflect our prior knowledge. When no features of the new object
are available, the decision rule is:
Classify as class 1 if P(ω1) > P(ω2)
Bayes Decision
We observe features on each object.
P(x | ω1) & P(x | ω2) : the class-conditional densities
The Bayes rule:
P(ωj | x) = P(x | ωj) P(ωj) / P(x),   where P(x) = Σj P(x | ωj) P(ωj)
Bayes Decision
[Figure: the class-conditional densities P(x | ω1) and P(x | ω2), i.e. the likelihood of observing x given the class label.]
Bayes Decision
[Figure: the resulting posterior probabilities P(ω1 | x) and P(ω2 | x).]
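As an illustration (not from the slides), a minimal Python sketch of this rule for two classes; the priors, the 1-D Gaussian class-conditional densities, and the test point are assumptions made up for the example:

```python
import numpy as np
from scipy.stats import norm

# Assumed example setup: two classes with known priors and
# 1-D Gaussian class-conditional densities (illustrative values only).
priors = np.array([0.6, 0.4])                 # P(w1), P(w2)
likelihoods = [norm(loc=0.0, scale=1.0),      # P(x | w1)
               norm(loc=2.0, scale=1.0)]      # P(x | w2)

def posteriors(x):
    """Bayes rule: P(w_j | x) = P(x | w_j) P(w_j) / P(x)."""
    joint = np.array([l.pdf(x) for l in likelihoods]) * priors
    return joint / joint.sum()                # divide by the evidence P(x)

x = 1.3
post = posteriors(x)
print(post, "-> decide class", np.argmax(post) + 1)
```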
Loss function
Loss function:
maps a probability statement --> a decision.
Some classification mistakes can be more costly than others.
The set of c classes: {ω1, ω2, …, ωc}
The set of possible actions: {α1, α2, …, αa}
αi : deciding that an observation belongs to ωi
Loss when taking action αi given that the observation belongs to
hidden class ωj: λ(αi | ωj)
Loss function
The expected loss:
Given an observation with feature vector x, the conditional risk of action αi is
R(αi | x) = Σ_{j=1}^{c} λ(αi | ωj) P(ωj | x)
Our final goal is to minimize the total risk over all x.
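A hedged sketch of evaluating this conditional risk in Python; the loss matrix and the posterior vector are invented for illustration:

```python
import numpy as np

# lam[i, j] = loss for taking action alpha_i when the true class is omega_j
# (illustrative values; any c x c loss matrix works the same way).
lam = np.array([[0.0, 2.0],
                [1.0, 0.0]])

def conditional_risk(posterior, lam):
    """R(alpha_i | x) = sum_j lam(alpha_i | omega_j) P(omega_j | x)."""
    return lam @ posterior

posterior = np.array([0.3, 0.7])       # assumed P(omega_1 | x), P(omega_2 | x)
risks = conditional_risk(posterior, lam)
best = np.argmin(risks)                # Bayes decision: the risk-minimizing action
print(risks, "-> take action", best + 1)
```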
Loss function
The zero-one loss:
All errors are equally costly.
The conditional risk is:
“The risk corresponding to this loss function is the average
probability of error.”
Zero-one loss:  λ(αi | ωj) = 0 if i = j; 1 if i ≠ j,   i, j = 1, …, c
Conditional risk:  R(αi | x) = Σ_{j=1}^{c} λ(αi | ωj) P(ωj | x) = Σ_{j≠i} P(ωj | x) = 1 − P(ωi | x)
Loss function
Let λij = λ(αi | ωj) denote the loss for deciding class i
when the true class is j.
In minimizing the risk, we decide class one if
λ11 P(ω1 | x) + λ12 P(ω2 | x) < λ21 P(ω1 | x) + λ22 P(ω2 | x)
Rearranging, we decide ω1 if
(λ21 − λ11) P(ω1 | x) > (λ12 − λ22) P(ω2 | x)
Loss function
Let θλ = [(λ12 − λ22) / (λ21 − λ11)] · P(ω2) / P(ω1);
then decide ω1 if  P(x | ω1) / P(x | ω2) > θλ
Example:
λ = [0 1; 1 0]  (zero-one loss)  ⇒  θλ = P(ω2) / P(ω1) = θa
λ = [0 2; 1 0]  ⇒  θλ = 2 P(ω2) / P(ω1) = θb
Loss function
[Figure: the likelihood ratio P(x | ω1) / P(x | ω2) plotted against x, with the decision
threshold θa for the zero-one loss and the larger threshold θb used when
misclassifying ω2 is penalized more.]
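Under the two loss matrices of the example above, a small sketch (assumed priors and class densities) that makes the decision by comparing the likelihood ratio with θλ:

```python
import numpy as np
from scipy.stats import norm

p1, p2 = 0.5, 0.5                                 # assumed priors P(w1), P(w2)
f1, f2 = norm(0.0, 1.0), norm(2.0, 1.0)           # assumed class-conditional densities

def theta(lam, p1, p2):
    """theta_lambda = (lam12 - lam22) / (lam21 - lam11) * P(w2) / P(w1)."""
    return (lam[0, 1] - lam[1, 1]) / (lam[1, 0] - lam[0, 0]) * p2 / p1

lam_a = np.array([[0, 1], [1, 0]])                # zero-one loss        -> theta_a
lam_b = np.array([[0, 2], [1, 0]])                # w2 errors cost more  -> theta_b

x = 0.9
ratio = f1.pdf(x) / f2.pdf(x)                     # P(x | w1) / P(x | w2)
for name, lam in [("theta_a", lam_a), ("theta_b", lam_b)]:
    t = theta(lam, p1, p2)
    print(name, "=", t, "-> decide", "w1" if ratio > t else "w2")
```

With these numbers the same x is assigned to ω1 under θa but to ω2 under the more cautious θb.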
Discriminant function & decision surface
Features -> discriminant functions gi(x), i=1,…,c
Assign class i if gi(x) > gj(x) ∀j ≠ i
Decision surface defined by gi(x) = gj(x)
Decision surface
The discriminant functions help partition the feature space
into c decision regions (not necessarily contiguous). Our
interest is to estimate the boundaries between the regions.
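To make the partition concrete, a sketch (all densities and priors invented) that labels a grid of points by the largest discriminant gi(x) = ln P(x | ωi) + ln P(ωi):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Three assumed 2-D Gaussian classes with equal priors (illustrative only).
classes = [multivariate_normal([0, 0], np.eye(2)),
           multivariate_normal([3, 0], np.eye(2)),
           multivariate_normal([0, 3], np.eye(2))]
priors = np.array([1 / 3] * 3)

xx, yy = np.meshgrid(np.linspace(-3, 6, 200), np.linspace(-3, 6, 200))
grid = np.dstack([xx, yy])

# g_i(x) = ln P(x | w_i) + ln P(w_i); the arg-max defines the decision regions.
g = np.stack([c.logpdf(grid) + np.log(p) for c, p in zip(classes, priors)])
regions = np.argmax(g, axis=0)          # 200 x 200 array of region labels
print(np.unique(regions, return_counts=True))
```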
Minimax
Minimizing the
maximum possible
loss.
What happens when
the priors change?
Normal density
Reminder: the covariance matrix is symmetric and
positive semidefinite.
Entropy: a measure of uncertainty.
The normal distribution has the maximum entropy among all
distributions with a given mean and variance.
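For reference (a standard result, not derived on the slides), the differential entropy of a univariate normal, which is the quantity being maximized here:
H(N(μ, σ²)) = −∫ p(x) ln p(x) dx = (1/2) ln(2πe σ²)
and for a d-dimensional normal:  H(N(μ, Σ)) = (1/2) ln[(2πe)^d |Σ|]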
Reminder of some results for random vectors
Let Σ be a k×k symmetric matrix; it then has k pairs of eigenvalues and
eigenvectors, and Σ can be decomposed as:
Σ = λ1 e1 e1′ + λ2 e2 e2′ + … + λk ek ek′ = P Λ P′
Positive-definite matrix:  x′ Σ x > 0, ∀x ≠ 0,  equivalently  λ1 ≥ λ2 ≥ … ≥ λk > 0
Note:  x′ Σ x = λ1 (x′ e1)² + … + λk (x′ ek)²
Normal density
Whitening transform:
P : eigenvector matrix
Λ : diagonal eigenvalue matrix
Aw = P Λ^(−1/2)
Aw^t Σ Aw = Λ^(−1/2) P^t Σ P Λ^(−1/2) = Λ^(−1/2) P^t (P Λ P^t) P Λ^(−1/2) = I
(using Σ = λ1 e1 e1′ + … + λk ek ek′ = P Λ P^t)
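A minimal numpy sketch (the positive-definite Σ is generated at random for the example) that builds Aw = P Λ^(−1/2) and checks that Aw^t Σ Aw = I:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
Sigma = A @ A.T + 4 * np.eye(4)          # assumed positive-definite covariance

lam, P = np.linalg.eigh(Sigma)           # eigenvalues and orthonormal eigenvectors
Aw = P @ np.diag(lam ** -0.5)            # whitening transform A_w = P Lambda^(-1/2)

I = Aw.T @ Sigma @ Aw                    # should be the identity matrix
print(np.allclose(I, np.eye(4)))         # True
```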
Normal density
To make a minimum error rate classification (zero-one loss),
we use the discriminant functions:
gi(x) = ln P(x | ωi) + ln P(ωi)
This is the log of the numerator in the Bayes formula; the log posterior
probability differs from it only by ln P(x), which is the same for every class.
Log is used because we only compare the gi's, and log is monotone.
When a normal density P(x | ωi) = N(μi, Σi) is assumed, we have:
gi(x) = −(1/2)(x − μi)^t Σi^(−1)(x − μi) − (d/2) ln 2π − (1/2) ln|Σi| + ln P(ωi)
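A sketch (class parameters invented) that writes out this normal-density discriminant directly from the formula above:

```python
import numpy as np

def g_normal(x, mu, Sigma, prior):
    """g_i(x) = -1/2 (x-mu)^t Sigma^{-1} (x-mu) - d/2 ln(2*pi)
                - 1/2 ln|Sigma| + ln P(omega_i)"""
    d = len(mu)
    diff = x - mu
    maha = diff @ np.linalg.solve(Sigma, diff)      # squared Mahalanobis distance
    return (-0.5 * maha - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.linalg.slogdet(Sigma)[1] + np.log(prior))

# Illustrative two-class comparison with assumed means, covariances, priors.
x = np.array([1.0, 0.5])
g1 = g_normal(x, np.array([0.0, 0.0]), np.eye(2), 0.6)
g2 = g_normal(x, np.array([2.0, 1.0]), np.eye(2), 0.4)
print("decide class", 1 if g1 > g2 else 2)
```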
Discriminant function for normal density
(1) Σi = σ²I
Linear discriminant function:
gi(x) = wi^t x + wi0,  with  wi = μi / σ²  and  wi0 = −μi^t μi / (2σ²) + ln P(ωi)
Note: the class-independent terms (marked by blue boxes on the slides) are
irrelevant and can be dropped.
Discriminant function for normal density
The decision surface is where gi(x) = gj(x), i.e. w^t (x − x0) = 0 with
w = μi − μj  and  x0 = (μi + μj)/2 − [σ² / ||μi − μj||²] ln[P(ωi)/P(ωj)] (μi − μj)
With equal priors, x0 is the midpoint between the two means.
The decision surface is a hyperplane, perpendicular to the
line between the means.
Discriminant function for normal density
“Linear machine”: decision surfaces are hyperplanes.
Discriminant function for normal density
With unequal prior probabilities, the decision boundary shifts toward the
mean of the less likely class.
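In one dimension this shift is easy to see; a small sketch with assumed means, common variance, and priors (the x0 formula is the Σi = σ²I boundary point from the previous slide):

```python
import numpy as np

def x0(mu1, mu2, sigma2, p1, p2):
    """Boundary point for Sigma_i = sigma^2 I in one dimension:
       x0 = (mu1 + mu2)/2 - sigma^2 / (mu1 - mu2) * ln(P(w1)/P(w2))"""
    return 0.5 * (mu1 + mu2) - sigma2 / (mu1 - mu2) * np.log(p1 / p2)

mu1, mu2, sigma2 = 0.0, 2.0, 1.0          # assumed class means and common variance
print(x0(mu1, mu2, sigma2, 0.5, 0.5))     # 1.0: the midpoint for equal priors
print(x0(mu1, mu2, sigma2, 0.8, 0.2))     # ~1.69: shifted toward mu2, the less likely mean
```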
Discriminant function for normal density
(2) Σi = Σ
Discriminant function for normal density
Set gi(x) = wi^t x + wi0, with wi = Σ^(−1) μi and wi0 = −(1/2) μi^t Σ^(−1) μi + ln P(ωi).
The decision boundary is: w^t (x − x0) = 0, with w = Σ^(−1)(μi − μj) and
x0 = (μi + μj)/2 − [ln(P(ωi)/P(ωj)) / ((μi − μj)^t Σ^(−1) (μi − μj))] (μi − μj)
Discriminant function for normal density
The hyperplane is
generally not
perpendicular to the
line between the
means.
Discriminant function for normal density
(3) Σi is arbitrary
The decision boundaries are hyperquadrics (hyperplanes, pairs of
hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids).
gi(x) = x^t Wi x + wi^t x + wi0
Wi = −(1/2) Σi^(−1)
wi = Σi^(−1) μi
wi0 = −(1/2) μi^t Σi^(−1) μi − (1/2) ln|Σi| + ln P(ωi)
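A direct numpy transcription of the Wi, wi, wi0 formulas above (the class parameters are invented for the example):

```python
import numpy as np

def quadratic_discriminant(mu, Sigma, prior):
    """Return (W_i, w_i, w_i0) for g_i(x) = x^t W_i x + w_i^t x + w_i0."""
    Sinv = np.linalg.inv(Sigma)
    W = -0.5 * Sinv
    w = Sinv @ mu
    w0 = (-0.5 * mu @ Sinv @ mu
          - 0.5 * np.linalg.slogdet(Sigma)[1]
          + np.log(prior))
    return W, w, w0

def g(x, W, w, w0):
    return x @ W @ x + w @ x + w0

# Assumed parameters: class 2 has a different covariance, so the
# resulting decision boundary is a hyperquadric rather than a hyperplane.
W1, w1, w10 = quadratic_discriminant(np.array([0., 0.]), np.eye(2), 0.5)
W2, w2, w20 = quadratic_discriminant(np.array([2., 0.]), np.diag([2., 0.5]), 0.5)
x = np.array([1.0, 1.0])
print("decide class", 1 if g(x, W1, w1, w10) > g(x, W2, w2, w20) else 2)
```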
Discriminant function for normal density
Extension to multiple classes.
Discriminant function for discrete features
Discrete features: x = [x1, x2, …, xd]^t,  xi ∈ {0, 1}
pi = P(xi = 1 | ω1)
qi = P(xi = 1 | ω2)
Assuming conditional independence of the features, the likelihood is:
P(x | ω1) = Π_{i=1}^{d} pi^xi (1 − pi)^(1−xi),   P(x | ω2) = Π_{i=1}^{d} qi^xi (1 − qi)^(1−xi)
Discriminant function for discrete features
The discriminant function:
The likelihood ratio:
Discriminant function for discrete features
g(x) = Σ_{i=1}^{d} wi xi + w0
wi = ln [ pi (1 − qi) / (qi (1 − pi)) ],   i = 1, …, d
w0 = Σ_{i=1}^{d} ln [ (1 − pi) / (1 − qi) ] + ln [ P(ω1) / P(ω2) ]
So the decision surface is again a hyperplane.
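A sketch computing the weights wi and w0 above for independent binary features; the pi, qi, and priors are made-up numbers:

```python
import numpy as np

p = np.array([0.8, 0.6, 0.3])      # assumed P(x_i = 1 | w1)
q = np.array([0.4, 0.5, 0.7])      # assumed P(x_i = 1 | w2)
P1, P2 = 0.5, 0.5                  # assumed priors

w = np.log(p * (1 - q) / (q * (1 - p)))                   # weights w_i
w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(P1 / P2)  # bias w_0

x = np.array([1, 0, 1])            # an observed binary feature vector
g = w @ x + w0                     # decide w1 if g(x) > 0: a linear (hyperplane) rule
print("decide", "w1" if g > 0 else "w2")
```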
Optimality
Consider a two-class case.
Two ways to make a mistake in the classification:
Misclassifying an observation from class 2 as class 1;
Misclassifying an observation from class 1 as class 2.
The feature space is partitioned into two regions by any
classifier: R1 and R2
Optimality
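The two kinds of mistake combine into the overall error probability that the Bayes rule minimizes (the standard expression, filled in here because the corresponding slide shows only the figure):
P(error) = P(x ∈ R1, ω2) + P(x ∈ R2, ω1)
         = ∫_{R1} p(x | ω2) P(ω2) dx + ∫_{R2} p(x | ω1) P(ω1) dx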
Optimality
In the multi-class case, there are numerous ways to make
mistakes. It is easier to calculate the probability of correct
classification.
The Bayes classifier maximizes P(correct); any other partitioning
yields an error probability that is at least as high.
The result is not dependent on the form of the underlying
distributions.