UNIT - 4: Bayes' Theorem
Bayes' Theorem in machine learning is a mathematical result that determines the conditional probability of an event based on new evidence and prior knowledge. It provides a structured approach to reasoning under uncertainty, making it useful in many machine learning applications.
Bayes' Theorem underpins Naïve Bayes classifiers, Bayesian networks, and probabilistic inference models. It improves predictions by combining prior probabilities with observed evidence, allowing models to make well-informed decisions even with limited or uncertain data. From spam filtering and medical diagnosis to fraud detection and natural language processing, Bayes' Theorem supports a wide range of Artificial Intelligence-driven applications.
Understanding Bayes' Theorem
Bayes' Theorem, also known as Bayes' Rule or Bayes' Law, helps us calculate the probability of an event and update that probability as new evidence arrives.
It says:
“The probability of an event A occurring given that event B has occurred is equal to the product of the likelihood of B given A and the prior probability of A, divided by the probability of B occurring.”
P(A|B) = [P(B|A) × P(A)] / P(B)
This theorem updates hypotheses in light of new evidence, making it an essential tool in decision-making. In machine learning and statistics, Bayesian Decision Theory applies these principles to select actions that minimize expected risk.
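As a minimal sketch of that idea, the decision rule below picks the action with the lowest expected loss under the posterior. The posterior and loss values here are made-up assumptions for illustration, not figures from this unit:

```python
# Minimal Bayesian decision rule: choose the action with the lowest
# expected loss under the posterior. The posterior and loss values
# below are illustrative assumptions, not figures from this unit.
posterior = {"disease": 0.2, "healthy": 0.8}

# loss[action][state]: the cost of taking `action` when `state` is true
loss = {
    "treat":    {"disease": 1,  "healthy": 5},
    "no_treat": {"disease": 50, "healthy": 0},
}

def expected_loss(action):
    return sum(posterior[s] * loss[action][s] for s in posterior)

best_action = min(loss, key=expected_loss)
print(best_action)  # "treat": expected loss 4.2 vs 10.0 for "no_treat"
```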
The formula is expressed as:
P(A|B) = [P(B|A) × P(A)] / P(B)
Here’s what each term means:
P(A|B): Posterior probability, the probability of A occurring given that B has occurred.
P(B|A): Likelihood, the probability of B occurring, assuming A is true.
P(A): Prior probability, the initial probability of A.
P(B): The total probability of B (the evidence).
It's worth noting that for independent events, P(B|A) = P(B). This means that if A does not influence B, knowing A occurred doesn't change B's probability.
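Translated directly into code, the formula is a one-liner; the probability values below are placeholders, not data from this unit:

```python
def bayes_posterior(prior_a, likelihood_b_given_a, prob_b):
    """P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood_b_given_a * prior_a / prob_b

# Placeholder values: P(A) = 0.3, P(B|A) = 0.6, P(B) = 0.4
print(round(bayes_posterior(0.3, 0.6, 0.4), 2))  # 0.45
```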
Terms Related to Bayes' Theorem
1. Probability
Probability measures the likelihood of an event occurring. It is a mathematical way
of describing chance. If you are predicting the weather, probability tells you how
likely it is to rain. We express probability as a number between 0 and 1. Here, 0
means an event will never happen, and 1 means an event will definitely happen.
2. Prior Probability
Prior probability represents your initial belief about something before collecting
new evidence. It is your starting point of understanding. Let's say you are
analyzing a medical condition. The prior probability would be the baseline chance
of someone having that condition before running any specific tests. For example, if
a rare disease affects 1 in 1000 people, the prior probability would be 0.001 or
0.1%.
3. Hypotheses
A hypothesis is a proposed explanation or prediction about something that can be
tested and proven true or false. In Bayesian analysis, hypotheses play a central
role. You start with an initial hypothesis (prior hypothesis) and then update your
understanding as new evidence emerges.
4. Likelihood
Likelihood measures how probable the observed evidence is, given a specific
hypothesis. It answers the question: "If my hypothesis is true, how likely are these
specific observations?"
5. Conditional Probability
Conditional probability calculates the chance of an event happening, given that
another event has already occurred. It answers the question: "What is the probability
of X, knowing that Y has happened?" For example, what is the chance of having a
specific disease if you have already tested positive in an initial screening?
6. Posterior Probability
Posterior probability is your updated belief after considering new evidence. It
combines your prior belief with the new information you have discovered. It works
like adjusting a recipe after tasting it. Your initial recipe (prior probability) gets
modified based on the actual taste (new evidence). This results in a refined
understanding (posterior probability).
7. Independent Events
Independent events are occurrences that do not influence each other's probability. If
knowing about one event does not change the likelihood of another, they are
independent. In the case of flipping a coin, each flip is independent. The result of
one flip does not affect the next flip's probability.
8. Random Variables
A random variable represents a quantity with uncertain or probabilistic outcomes.
Unlike fixed values, random variables can take multiple possible values, each with
its own probability. These variables combine mathematical calculations and real-
world uncertainty, allowing precise predictions in unpredictable scenarios.
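To tie several of these terms together, here is a sketch of a posterior update for the rare-disease example from term 2. The 99% sensitivity and 5% false-positive rate below are assumed values for illustration, not figures from this unit:

```python
# Posterior for the rare-disease example (prior of 1 in 1000, as in term 2).
# Sensitivity and false-positive rate are assumed values for illustration.
prior = 0.001            # P(disease): prior probability
sensitivity = 0.99       # P(positive | disease), assumed
false_positive = 0.05    # P(positive | no disease), assumed

# Evidence: total probability of a positive test, P(positive)
evidence = sensitivity * prior + false_positive * (1 - prior)

# Posterior: P(disease | positive), by Bayes' Theorem
posterior = sensitivity * prior / evidence
print(round(posterior, 4))  # ~0.0194
```

Note how the prior dominates: even after a positive test, the posterior stays under 2%, because the disease is so rare to begin with.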
It rains once every ten days in a given area, meaning the probability of rain, P(Rain), is 10% (0.1). Alexa predicts rain accurately 90% of the time when it actually rains: P(RainPrediction|Rain) = 0.9.
False positives occur when Alexa predicts rain but it does not rain; false negatives occur when Alexa fails to predict rain but it actually rains.
We want to determine the probability of rain given that Alexa predicts it, P(Rain|RainPrediction).
Some additional information is given:
Alexa correctly predicts dry weather 80% of the time, but incorrectly predicts rain on 20% of dry days.
Over 100 days, Alexa predicts rain on 27 days: 9 correct predictions (it rains) and 18 incorrect predictions (it does not rain), so P(RainPrediction) = 0.27.
Now, using the formula:
P(Rain|RainPrediction) = P(RainPrediction|Rain) × P(Rain) / P(RainPrediction)
P(Rain|RainPrediction) = (0.9 × 0.1) / 0.27 ≈ 0.33
Therefore, if Alexa predicts rain, there’s about a 33% chance it will actually rain.
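The same arithmetic, checked in a few lines of Python (all numbers come from the example above):

```python
p_rain = 0.1               # P(Rain): rains 1 day in 10
p_pred_given_rain = 0.9    # P(RainPrediction | Rain)
p_pred_given_dry = 0.2     # P(RainPrediction | no Rain)

# P(RainPrediction) by total probability: 9 + 18 = 27 forecasts per 100 days
p_pred = p_pred_given_rain * p_rain + p_pred_given_dry * (1 - p_rain)  # 0.27

# Bayes' Theorem: P(Rain | RainPrediction)
posterior = p_pred_given_rain * p_rain / p_pred
print(round(posterior, 2))  # 0.33
```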
Bayes’ Theorem has many applications in machine learning, including:
Text classification and spam detection (see the sketch after this list)
Medical diagnosis systems
Recommendation engines
Computer vision and object recognition
Natural language processing (NLP)
Anomaly detection
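As a small illustration of the first item, here is a toy spam filter built on scikit-learn's Naïve Bayes classifier. The four training messages are invented, and with so little data the prediction is only indicative:

```python
# Toy spam filter with a Naive Bayes classifier (assumes scikit-learn
# is installed; the four training messages are made-up examples).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win a free prize now", "claim your free reward",
         "meeting moved to 3pm", "lunch tomorrow?"]
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()
X = vec.fit_transform(texts)   # word-count features

clf = MultinomialNB()          # applies Bayes' Theorem per class
clf.fit(X, labels)

print(clf.predict(vec.transform(["free prize inside"])))  # likely ['spam']
```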