Topic 6
Uncertainty Problems and Probabilistic Machine Learning
Dr. Sunu Wibirama
Lecture Module: Artificial Intelligence (Kecerdasan Buatan)
Course code: UGMx 001001132012
June 22, 2022
1 Course Learning Outcomes
This topic fulfills course learning outcome (CPMK) 5: the ability to describe several classic machine learning techniques (linear regression, rule-based machine learning, probabilistic machine learning, clustering) and the basic concepts of deep learning and its implementation in image recognition (convolutional neural network).
The indicators that this outcome has been achieved are: understanding basic probability theory (random variables, probability distributions, joint probability, conditional probability, marginal probability), understanding the terminology of uncertainty theory and probability theory, and understanding Bayes' theorem and its implementation.
2 Topic Coverage
This topic covers the following material:
a) Introduction to Uncertainty: explains the causes of uncertainty in data and the weaknesses of decision-tree-based (rule-based) machine learning.
b) Probability for Machine Learning: explains the foundations of probability theory, including random variables, probability distributions, joint probability, conditional probability, and marginal probability.
c) Bayesian Rules: explains the basic concepts and implementation of conditional, joint, and marginal probability in the Bayesian rule, also known as Bayesian reasoning.
d) Implementation of the Bayesian Rule: explains the implementation of the Bayesian rule in several cases, such as predicting gender, estimating the probability of being infected with Covid-19, and its use in Computer Aided Diagnosis to predict several diseases based on a variety of symptoms.
17/06/2022
sunu@ugm.ac.id
Copyright © 2022 Sunu Wibirama | Do not distribute without permission @sunu_wibirama 1
Sunu Wibirama
sunu@ugm.ac.id
Department of Electrical and Information Engineering
Faculty of Engineering
Universitas Gadjah Mada
INDONESIA
Introduction to Uncertainty (Part 01)
Kecerdasan Buatan | Artificial Intelligence
Version: January 2022
Four types of classification technique: the neural network model, the geometric model, the logical model (rule-based model), and the probabilistic model.
Geometric model (e.g., SVM, linear discriminant analysis, KNN)
[Figure: house price (USD) plotted against distance from the city hospital, with a boundary separating exclusive housing from government-supported housing.]
Logical model / rule-based model (e.g., a decision tree)
Classifying spam email is easy if you know the "features": 'Viagra' and 'lottery' are two important features of spam email.
[Figure: example emails labeled by class (spam or ham), including one reading "SUBSCRIBE CHEAP LOTTERY COUPON FOR ONLY $19. TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT 'REMOVE' IN THE SUBJECT."]
What about sentiment analysis?
There is no definite rule for judging whether a tweet is "positive" or "negative".
Sentiment analysis and uncertainty of political view
• When analyzing Twitter or Facebook data, you observe the sentiment (tone) of the tweet/status.
• There is no exact rule for determining political view; the best you can do is minimize the uncertainty of the tweet/status tone.
• How do we minimize the uncertainty?
• Gather more evidence (in the case of sentiment analysis: words). There are too many words to enumerate by hand, so use a language corpus.
• Measure the probability of occurrence.
• Categorize the tweet/status based on that probability.
• This is a case where the logical machine learning model does not work well, so we use the probabilistic model instead.
End of File
Introduction to Uncertainty (Part 02)
Uncertainty and probabilistic model
• Information can be incomplete, inconsistent, uncertain, or all three. In other words, information is often unsuitable for solving a problem or making an inference.
• Uncertainty is defined as the lack of the exact knowledge that would enable us to reach a perfectly reliable conclusion.
• Classical logic (the logical model) permits only exact reasoning. It assumes that perfect knowledge always exists and that the law of the excluded middle (every proposition is either true or not true) can always be applied:
IF A is true THEN A is not false
IF A is false THEN A is not true
Sources of uncertain knowledge (1)
• Weak implications: domain experts and knowledge engineers have the painful task of establishing concrete correlations between the IF (condition) and THEN (action) parts of the rules.
• Therefore, expert systems need the ability to handle vague (unclear) associations.
• One approach is to accept degrees of correlation as numerical certainty factors (e.g., a strong correlation between two variables represents more certainty).
[Photo caption] Marvin Minsky with a blocks-vision robot at MIT. In 1946, he entered Harvard University after returning from service in the U.S. Navy during World War II. After graduating from Harvard in 1950, he attended Princeton University, earning his Ph.D. in mathematics in 1954. In 1958, Minsky joined the faculty of MIT's Department of Electrical Engineering and Computer Science. A year later, he co-founded the Artificial Intelligence Laboratory.
Sources of uncertain knowledge (2)
• Imprecise language. Our natural language is ambiguous and imprecise. We describe facts with terms such as often and sometimes, frequently and hardly ever.
• As a result, it can be difficult to express knowledge in the precise IF-THEN form of production rules.
• However, if the meaning of the facts is quantified, it can be used in expert systems.
• In 1944, Ray Simpson asked 355 high school and college students to place 20 terms such as "often" on a scale between 1 and 100.
• In 1968, Milton Hakel repeated this experiment.
Ray H. Simpson (1944), "The specific meanings of certain terms indicating differing degrees of frequency," Quarterly Journal of Speech, 30:3, 328-330
Sources of Uncertain Knowledge (2): imprecise language

Term                      Mean value,       Mean value,
                          Simpson (1944)    Hakel (1968)
Always                    99                100
Very often                88                87
Usually                   85                79
Often                     78                74
Generally                 78                74
Frequently                73                72
Rather often              65                72
About as often as not     50                50
Now and then              20                34
Sometimes                 20                29
Occasionally              20                28
Once in a while           15                22
Not often                 13                16
Usually not               10                16
Seldom                    10                9
Hardly ever               7                 8
Very seldom               6                 7
Rarely                    5                 5
Almost never              3                 2
Never                     0                 0
Sources of uncertain knowledge (3)
• Unknown data. When the data is incomplete or missing, the only solution is to accept the value "unknown" and proceed to approximate reasoning with this value.
• Combining the views of different experts. Large expert systems usually combine the knowledge and expertise of a number of experts.
• Unfortunately, experts often have contradictory opinions and produce conflicting rules.
• To resolve the conflict, the knowledge engineer has to attach a weight to each expert and then calculate the composite conclusion. But no systematic method exists to obtain these weights.
End of File
Probability for Machine Learning (Part 01)
Basic statistical measures
• The mean of a vector, usually denoted x̄, is the mean of its elements, that is, the sum of the components divided by the number of components.
• The variance describes how the data is spread around the mean. A dataset with a large variance has data points spread far from the mean; a dataset with a small variance has data points grouped closely around the mean.
• The standard deviation is simply the square root of the variance.
• The covariance between two variables tells whether large values in one variable are associated with large values in the other, and vice versa.
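The four measures above can be sketched in a few lines of pure Python (the sample data is illustrative, and dividing by n, the population convention, is an assumption):

```python
# Basic statistical measures for a small sample dataset.
# Population formulas (divide by n) are used for simplicity.

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def std_dev(xs):
    return variance(xs) ** 0.5

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

prices = [100, 120, 140, 160, 180]   # illustrative house prices
distances = [10, 8, 6, 4, 2]         # illustrative distances from the hospital

m = mean(prices)                     # 140.0
v = variance(prices)                 # 800.0
s = std_dev(prices)
c = covariance(prices, distances)    # negative: price rises as distance falls
```

A negative covariance here matches the geometric-model intuition from earlier: price and distance move in opposite directions.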
Basic probability theory
• The concept of probability has a long history
that goes back thousands of years when words
like “probably”, “likely”, “maybe”, “perhaps” and
“possibly” were introduced into spoken
languages.
• However, the mathematical theory of probability
was formulated only in the 17th century.
• The probability of an event is the proportion of
cases in which the event occurs.
• Probability can also be defined as a scientific
measure of chance.
Basic probability theory
• Probability can be expressed mathematically as a numerical index ranging from zero (an absolute impossibility) to unity (an absolute certainty).
• Most events have a probability index strictly between 0 and 1, which means that each event has at least two possible outcomes: a favourable outcome (success) and an unfavourable outcome (failure).

P(success) = (the number of successes) / (the number of possible outcomes)
P(failure) = (the number of failures) / (the number of possible outcomes)
Basic probability theory
• If s is the number of times success can occur, and f is the number of times failure can occur, then

p(success) = p = s / (s + f)
p(failure) = q = f / (s + f)
p + q = 1

• If we throw a coin, the probability of getting a head equals the probability of getting a tail. In a single throw, s = f = 1, s + f = 2, and therefore the probability of getting a head (or a tail) is 0.5.
Geometric representation of events
[Figure: Venn diagrams. Mutually exclusive: circles A and B (with complements ¬A, ¬B) do not overlap. Non-mutually exclusive: circles A and B overlap in the region A ∩ B.]
Non-mutually exclusive: there is a chance that events A and B occur together. Example: the probability of getting a tail and a 6 when we roll a die and then flip a coin.
Geometric representation of events
Given two events A and B, the probability of A or B is:

p(A ∪ B) = p(A) + p(B) − p(A ∩ B)    if A and B are non-mutually exclusive
p(A ∪ B) = p(A) + p(B)               if A and B are mutually exclusive
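The union rule can be checked by enumerating the die-then-coin sample space from the example above (the event encoding is an illustrative choice):

```python
from fractions import Fraction

# Joint sample space: roll a six-sided die, then flip a coin.
space = [(die, coin) for die in range(1, 7) for coin in ("head", "tail")]

def prob(event):
    """Probability of an event = favourable outcomes / all outcomes."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

A = lambda o: o[1] == "tail"   # event A: the coin shows tail
B = lambda o: o[0] == 6        # event B: the die shows 6

p_union = prob(lambda o: A(o) or B(o))
p_sum = prob(A) + prob(B) - prob(lambda o: A(o) and B(o))
# Both equal 1/2 + 1/6 - 1/12 = 7/12
```

Because A and B are non-mutually exclusive, the overlap 1/12 must be subtracted once.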
End of File
Probability for Machine Learning (Part 02)
Random variables
• A random experiment, or simply an experiment, is a process that gives uncertain results, such as a coin flip. The outcome of a random experiment is the result you obtain.
• A random variable takes a value corresponding to the outcome of a random experiment. We use a capital letter to denote a random variable and the corresponding lowercase letter for one of its values.
• As shown in the figure, if you flip a coin, the two possible outcomes are 'heads' and 'tails'. An example of a random variable would map 'heads' to 0 and 'tails' to 1.
• The event A corresponds to the set of outcomes {'heads'}. This means that the probability that the outcome is 'heads' can be denoted as:

P(X = 0) = P('heads') = P(A)
Sample space
• Discrete sample space:
  • A sample space containing a finite number of possibilities, or an unending sequence with as many elements as there are whole numbers.
  • The variable is called a "discrete random variable".
  • Its values can be counted.
  • Example: the number of people in the room with red shoes on.
• Continuous sample space:
  • A sample space containing an infinite number of possibilities, equal to the number of points on a line segment.
  • Non-discrete: the variable is a "continuous random variable".
  • Its values cannot be counted, but they can be measured.
  • Example: heights of children.
Richard Holzer, Patrick Wüchner, Hermann de Meer,
“Modeling of Self-Organizing Systems: An Overview”,
Universitätsbibliothek TU Berlin, 2010
Discrete probability distributions
• A discrete random variable has a certain probability of equaling each of its possible values.
• Example: tossing a coin 3 times
  • X = number of heads (H)
  • Sample space S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
  • For x = 2, P(X = 2) = 3/8
• Using the counting formula:
  • P(X = x) = C(3, x) / 8
  • P(3) = P(X = 3) = 1/8
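The coin-toss table above can be reproduced by brute-force enumeration, which also confirms the counting formula C(3, x)/8 (a minimal sketch, not part of the original slides):

```python
from fractions import Fraction
from itertools import product
from math import comb

# Toss a fair coin three times; X = number of heads.
outcomes = list(product("HT", repeat=3))   # the 8 equally likely outcomes

pmf = {}
for x in range(4):
    count = sum(1 for o in outcomes if o.count("H") == x)
    pmf[x] = Fraction(count, len(outcomes))

# Enumeration matches the counting formula f(x) = C(3, x) / 8.
assert all(pmf[x] == Fraction(comb(3, x), 8) for x in range(4))

p2 = pmf[2]                 # 3/8
total = sum(pmf.values())   # the probabilities sum to 1
```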
Probability distribution for discrete variables
• The probability distribution of a random variable is a function that takes the sample space as input and returns probabilities: f(x) is the probability that the random variable X takes the value x.
• The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if:

f(x) ≥ 0
Σₓ f(x) = 1
P(X = x) = f(x)
End of File
Probability for Machine Learning (Part 03)
Probability mass functions (1)
• Another example: say you are running a dice-rolling experiment.
• X is the random variable corresponding to this experiment. Assuming the die is fair, each outcome is equiprobable.
• That is, if you run the experiment a large number of times, you will get each outcome approximately the same number of times.
[Figures: the probability mass function of the random variable X for rolling a six-sided die, estimated from 20 rolls and from 100,000 rolls.]
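The two estimates described in the figures can be simulated directly; the seed and roll counts are illustrative choices:

```python
import random
from collections import Counter

# Estimate the pmf of a fair six-sided die from simulated rolls.
random.seed(0)   # fixed seed for reproducibility (an arbitrary choice)

def estimate_pmf(n_rolls):
    counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

rough = estimate_pmf(20)         # noisy estimate, like the 20-roll figure
smooth = estimate_pmf(100_000)   # close to the true value 1/6 for every face
```

With more rolls, each estimated probability settles near 1/6, exactly as the second figure shows.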
Probability mass functions (2)
• A stockroom clerk returns three safety helmets at random to three steel mill employees who had previously checked them.
• If Smith (S), Jones (J), and Brown (B), in that order, each receive one of the three helmets:
  • list the sample points for the possible orders of returning the helmets;
  • find the values of the random variable M that represents the number of correct matches.
Cumulative distribution for a discrete random variable
For the random variable M, the number of correct matches in the previous example, the probability mass function is f(0) = 1/3, f(1) = 1/2, f(3) = 1/6 (two matches are impossible). The cumulative distribution function for M is:

F(m) = 0      for m < 0
F(m) = 1/3    for 0 ≤ m < 1
F(m) = 5/6    for 1 ≤ m < 3
F(m) = 1      for m ≥ 3
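The helmet example and its cumulative distribution can be verified by enumerating all return orders (the owner labels are from the example; the helper names are illustrative):

```python
from fractions import Fraction
from itertools import permutations

# Three helmets are returned at random to Smith, Jones, and Brown.
# M = number of employees who get their own helmet back.
owners = ("S", "J", "B")
orders = list(permutations(owners))   # the 6 equally likely return orders

def matches(order):
    return sum(1 for got, owner in zip(order, owners) if got == owner)

pmf = {}
for m in (0, 1, 2, 3):
    count = sum(1 for order in orders if matches(order) == m)
    pmf[m] = Fraction(count, len(orders))
# pmf: {0: 1/3, 1: 1/2, 2: 0, 3: 1/6} — exactly two matches is impossible

def cdf(m):
    """F(m) = P(M <= m), the running sum of the pmf."""
    return sum(p for value, p in pmf.items() if value <= m)
```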
End of File
Probability for Machine Learning (Part 04)
Another case…
• Statement 1: What is the probability that tomorrow morning's temperature equals exactly 10° Celsius?
• Statement 2: What is the probability that tomorrow morning's temperature is less than or greater than 10° Celsius?
What is the answer to each statement? Which statement is more appropriate to use?
Continuous probability distributions
• Probability density function (PDF)
• A continuous random variable has probability P = 0 of being exactly any single value.
• Example: being exactly 175 cm tall
  • P(height = 175.000) = 0
• However, we can find the probability that height lies within some range. Example: P(height ≥ 175) or P(160 ≤ height ≤ 175)
Remember: the probability of a continuous random variable must be described over a range.
[Figure: the probability of drawing a number between 0 and 0.2 is the highlighted area under the curve.]
Continuous distributions
For a continuous random variable X with density function f(x):

P(a < X < b) = ∫[a to b] f(x) dx
∫[−∞ to ∞] f(x) dx = 1    (the total area under f(x) must equal unity)
P(a < X < b) must be non-negative
P(a ≤ X ≤ b) = P(a < X < b)    (single points contribute zero probability)
Example
• Given a random variable X with density function:
  • f(x) = 2x, for 0 < x < 1
  • f(x) = 0, for all other x
• Verify that the area under the curve equals 1.0:

∫[−∞ to ∞] f(x) dx = ∫[−∞ to 0] 0 dx + ∫[0 to 1] 2x dx
                   = x² evaluated from 0 to 1
                   = 1² − 0² = 1.0
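Both integrals in these examples can be checked numerically with a midpoint Riemann sum (the step count is an arbitrary accuracy choice, not from the slides):

```python
# Numerically verify the density f(x) = 2x on (0, 1) with a midpoint
# Riemann sum, using no external libraries.

def f(x):
    return 2 * x if 0 < x < 1 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

area = integrate(f, -1, 2)     # total area under the curve, close to 1.0
p = integrate(f, -0.25, 0.5)   # P(-1/4 < X < 1/2), close to 0.25
```

The second value matches the analytic answer of Example (2) below: (1/2)² − 0² = 1/4.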
Example (2)
• Given a random variable X with density function:
  • f(x) = 2x, for 0 < x < 1
  • f(x) = 0, for all other x
• What is the probability that −¼ < X < ½?

P(−¼ < X < ½) = ∫[−¼ to 0] 0 dx + ∫[0 to ½] 2x dx
              = x² evaluated from 0 to ½
              = (½)² − 0² = ¼
Cumulative distributions for continuous random variables
• The cumulative distribution F(a) of a continuous random variable X with density function f(x) is:

F(a) = P(X ≤ a) = ∫[−∞ to a] f(x) dx

P(a < X < b) = F(b) − F(a)
             = ∫[−∞ to b] f(x) dx − ∫[−∞ to a] f(x) dx
             = ∫[a to b] f(x) dx
Cumulative distributions for continuous random variables
Differentiating F(x) lets you determine f(x) from F(x):

f(x) = dF(x)/dx
Example
• Suppose the error in the reaction temperature for an experiment is a continuous random variable X having the cumulative distribution function:
  • F(x) = (x³ + 1)/9, for −1 < x < 2
  • F(x) = 0 for x ≤ −1, and F(x) = 1 for x ≥ 2
• Thus, the density function f(x) for −1 < x < 2 is:

f(x) = dF(x)/dx = 3x²/9 = x²/3
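The relation f(x) = dF(x)/dx can be checked numerically with a central difference (the step size and test points are arbitrary choices; the additive constant in F makes F(−1) = 0 and F(2) = 1 and does not affect the derivative):

```python
# Check numerically that f(x) = dF(x)/dx on (-1, 2).

def F(x):
    return (x ** 3 + 1) / 9     # cumulative distribution function

def f(x):
    return x ** 2 / 3           # claimed density: 3x^2 / 9 = x^2 / 3

h = 1e-6                        # finite-difference step (illustrative)
points = (-0.5, 0.0, 0.5, 1.0, 1.5)
max_err = max(abs((F(x + h) - F(x - h)) / (2 * h) - f(x)) for x in points)
# max_err is tiny: the central difference agrees with x^2 / 3
```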
End of File
Probability Distribution
Discrete uniform distribution
If the random variable X assumes the values x1, x2, …, xk with equal probabilities, then the discrete uniform distribution is given by

f(x; k) = 1/k,  for x = x1, x2, …, xk

• When a light bulb is selected at random from a box that contains a 40-watt bulb, a 60-watt bulb, a 75-watt bulb, and a 100-watt bulb, each element of the sample space S = {40, 60, 75, 100} occurs with probability 1/4. Therefore, we have a uniform distribution with

f(x; 4) = 1/4,  for x = 40, 60, 75, 100
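The light-bulb example reduces to a one-line pmf (a minimal sketch; the function name is illustrative):

```python
from fractions import Fraction

# Discrete uniform pmf: f(x; k) = 1/k for each of the k possible values.
def uniform_pmf(values):
    k = len(values)
    return {x: Fraction(1, k) for x in values}

bulbs = uniform_pmf([40, 60, 75, 100])   # the light-bulb sample space
# Every wattage has probability 1/4, and the probabilities sum to 1.
```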
Poisson distribution
Properties of the Poisson process:
1. The number of outcomes occurring in one time interval or specified region is independent of the number that occurs in any other disjoint time interval or region of space. Thus, the Poisson process has no memory.
2. The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region, and does not depend on the number of outcomes occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time interval or fall in such a small region is negligible.
Poisson distribution
• A customer care center receives 100 calls per hour, 8 hours a day.
• The calls are independent of each other.
• The number of calls per minute has a Poisson probability distribution.
• There can be any number of calls per minute, irrespective of the number of calls received in the previous minute.
Source: https://www.cuemath.com/data/poisson-distribution/
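The call-center example can be sketched with the standard Poisson pmf, P(X = k) = e^(−λ) λ^k / k!; converting 100 calls/hour to a per-minute rate is an interpretation of the example, not stated on the slide:

```python
from math import exp, factorial

# Poisson pmf for the call-center example: 100 calls/hour corresponds
# to an average rate of lam = 100/60 calls per minute.

def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)

lam = 100 / 60
p0 = poisson_pmf(0, lam)   # probability of a minute with no calls at all
total = sum(poisson_pmf(k, lam) for k in range(100))   # sums to ~1.0
```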
Binomial distribution
• An experiment often consists of repeated trials, each with two possible outcomes that may be labeled success or failure.
• The most obvious application is the testing of items as they come off an assembly line, where each test or trial may indicate a defective or a non-defective item (or a coin, with head or tail).
• We may choose to define either outcome as a success. The process is referred to as a Bernoulli process, and each trial is called a Bernoulli trial.
Binomial distribution
1. The experiment consists of n identical trials.
2. There are only 2 possible outcomes on each trial. We denote one outcome by S (for Success) and the other by F (for Failure).
3. The probability of S remains the same from trial to trial. This probability is denoted by p, and the probability of F is denoted by q (q = 1 − p).
4. The trials are independent.
5. The binomial random variable X is the number of S's in n trials.
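The five conditions above yield the standard binomial pmf, P(X = x) = C(n, x) p^x q^(n−x); as a sketch, it reproduces the earlier three-coin-toss table:

```python
from fractions import Fraction
from math import comb

# Binomial pmf: probability of x successes in n independent Bernoulli
# trials, each with success probability p.
def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Tossing a fair coin 3 times (success = head):
p2 = binomial_pmf(2, 3, Fraction(1, 2))   # P(X = 2) = 3/8, as before
total = sum(binomial_pmf(x, 3, Fraction(1, 2)) for x in range(4))
```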
Normal distribution
• The most important continuous probability distribution.
• Its graph is called the "normal curve" (bell-shaped).
• The total area under the curve = 1.
• Derived by De Moivre and Gauss; hence, it is also called the "Gaussian" distribution.
• Describes many phenomena in nature, industry, and research.
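The unit-area property can be checked numerically for the standard Gaussian density (the integration range and step count are illustrative accuracy choices):

```python
from math import exp, pi, sqrt

# Normal (Gaussian) density with mean mu and standard deviation sigma.
def normal_pdf(x, mu=0.0, sigma=1.0):
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# Midpoint Riemann sum over [-10, 10]; the tails beyond that are negligible.
n, a, b = 100_000, -10.0, 10.0
h = (b - a) / n
area = sum(normal_pdf(a + (i + 0.5) * h) for i in range(n)) * h
# area is very close to 1.0, the total probability under the bell curve
```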
Other probability distributions
Source: https://www.kdnuggets.com/2020/02/probability-distributions-data-science.html
Why probability distributions?
• Some machine learning models work best under certain distributional assumptions.
• For example, these algorithms and functions assume a normal distribution: Linear Discriminant Analysis (LDA), Gaussian Naive Bayes, Logistic Regression, Linear Regression, and the sigmoid function.
• When you work with a dataset, you are dealing with a sample instead of the population. Probability distributions help you make predictions about the whole population.
• Understanding probability distributions helps you choose appropriate data transformation methods and feature extraction techniques.
Source: https://www.kdnuggets.com/2020/02/probability-distributions-data-science.html
End of File
Conditional Probability and Bayesian Rule
Conditional probability
• Let A be an event in the world and B be another event. Suppose that events A and B are not mutually exclusive, but occur conditionally on the occurrence of each other.
• The probability that event A will occur if event B occurs is called the conditional probability.
• Conditional probability is denoted mathematically as p(A|B), in which the vertical bar represents "given", and the complete probability expression is interpreted as: the conditional probability of event A occurring given that event B has occurred.

p(A|B) = (the number of times A and B can occur) / (the number of times B can occur)
Conditional probability
• The number of times A and B can occur, or the probability that both A and B will occur, is called the joint probability of A and B. It is represented mathematically as p(A ∩ B).
• The number of ways B can occur is the probability of B: p(B).
• The probability of an event A, given that an event B has occurred, is called the conditional probability of A given B, denoted p(A|B):

p(A|B) = p(A ∩ B) / p(B)

• The probability of an event B, given that an event A has occurred, is called the conditional probability of B given A, denoted p(B|A):

p(B|A) = p(A ∩ B) / p(A)
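These definitions can be checked by enumeration on the die-then-coin sample space used earlier; as a sketch, it also illustrates independence, since knowing the coin result does not change the probability of the die showing 6:

```python
from fractions import Fraction

# Conditional probability from the joint: p(A|B) = p(A ∩ B) / p(B).
space = [(die, coin) for die in range(1, 7) for coin in ("head", "tail")]

def prob(event):
    return Fraction(sum(1 for o in space if event(o)), len(space))

def cond_prob(A, B):
    """p(A|B), computed from the joint and marginal probabilities."""
    return prob(lambda o: A(o) and B(o)) / prob(B)

A = lambda o: o[0] == 6        # event A: the die shows 6
B = lambda o: o[1] == "tail"   # event B: the coin shows tail

p_A_given_B = cond_prob(A, B)          # 1/6
independent = p_A_given_B == prob(A)   # True: the coin tells us nothing
```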
Conditional probability and independence
Two events A and B are independent if and only if

p(A|B) = p(A)
p(B|A) = p(B)

Otherwise, A and B are dependent.
Independent: events A and B are non-mutually exclusive (there is a chance both occur together), but the occurrence of A does not affect B, and vice versa. Example: getting a 6 on the die and a tail on the coin when we roll a die, then flip a coin.
Bayesian rule
From the definition of conditional probability,

p(A|B) = p(A ∩ B) / p(B), hence p(A ∩ B) = p(A|B) × p(B)

and

p(B|A) = p(A ∩ B) / p(A), hence p(A ∩ B) = p(B|A) × p(A)

Substituting the second expression for p(A ∩ B) into the first equation yields the Bayesian rule (developed by the statistician Thomas Bayes):

p(A|B) = [p(B|A) × p(A)] / p(B)

Thomas Bayes (1701-1762)
Bayesian rule
• If the occurrence of event A depends on only two mutually exclusive events (B and NOT B), we obtain the marginal probability:

p(A) = p(A|B) × p(B) + p(A|¬B) × p(¬B)

where ¬ is the logical function NOT. Similarly:

p(B) = p(B|A) × p(A) + p(B|¬A) × p(¬A)

• Substituting this expression for p(B) into the Bayesian rule

p(A|B) = [p(B|A) × p(A)] / p(B)

yields:

p(A|B) = [p(B|A) × p(A)] / [p(B|A) × p(A) + p(B|¬A) × p(¬A)]
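The expanded rule can be sketched as a small function. The diagnostic numbers below are illustrative assumptions, not from the slides; they show the classic effect that a rare condition keeps the posterior low even with an accurate test:

```python
# Bayesian rule with the marginal in the denominator:
# p(B|A-style posterior) = p(A|B) p(B) / (p(A|B) p(B) + p(A|¬B) p(¬B)).
# Illustrative reading: B = "person is infected", A = "test is positive".

def bayes(p_A_given_B, p_B, p_A_given_not_B):
    p_not_B = 1 - p_B
    numerator = p_A_given_B * p_B
    marginal = numerator + p_A_given_not_B * p_not_B   # p(A)
    return numerator / marginal

# Assumed values: 95% sensitivity, 1% prevalence, 5% false-positive rate.
posterior = bayes(p_A_given_B=0.95, p_B=0.01, p_A_given_not_B=0.05)
# posterior ≈ 0.161: most positive tests come from the large healthy group
```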
End of File
Implementation of Bayesian Reasoning (Part 01)
Bayesian rule
• If the occurrence of event A depends on only two mutually exclusive events (B and NOT B), we obtain:

p(B) = p(B|A) × p(A) + p(B|¬A) × p(¬A)

where ¬ is the logical function NOT.
• Substituting this expression into the Bayesian rule yields:

p(A|B) = [p(B|A) × p(A)] / [p(B|A) × p(A) + p(B|¬A) × p(¬A)]
Dilemma at the movies
• This person dropped their ticket
in the hallway.
• Do you call out “Excuse me,
ma’am!” or “Excuse me, sir!”
• You have to make a guess.
Courtesy of Brandon Rohrer, 2019
• What if they’re standing in line
for the men’s restroom?
• Bayesian reasoning (a.k.a
Bayesian inference) is a way to
capture common sense.
• It helps you use what you know
to make better guesses.
Dilemma at the movies
Courtesy of Brandon Rohrer, 2019
Put numbers to our dilemma
• Out of 100 men at the movies, 4 have long hair and 96 have short hair.
• Out of 100 women at the movies, 50 have long hair and 50 have short hair.
Put numbers to our dilemma
• About 12 times more women have long hair than men (50 out of 100 women vs. 4 out of 100 men).
Put numbers to our dilemma
• But there are 98 men and 2 women in line for the men's restroom.
• Out of the 98 men in line, 4 have long hair and 94 have short hair.
• Out of the 2 women in line, 1 has long hair and 1 has short hair.
Put numbers to our dilemma
• In the line, 4 times more men have long hair than women (4 men vs. 1 woman).
Out of 100 people at the movies:
• 50 are men: 2 have long hair, 48 have short hair.
• 50 are women: 25 have long hair, 25 have short hair.
Out of 100 people in line for the men’s restroom:
• 98 are men: 4 have long hair, 94 have short hair.
• 2 are women: 1 has long hair, 1 has short hair.
Translate to math
P(something) = # something / # everything

P(woman) = probability that a person is a woman
         = # women / # people = 50 / 100 = .5
P(man)   = probability that a person is a man
         = # men / # people = 50 / 100 = .5

(Out of 100 people at the movies: 50 are men, 50 are women.)
Translate to math
P(woman) = # women / # people = 2 / 100 = .02
P(man)   = # men / # people = 98 / 100 = .98

(Out of 100 people in line for the men’s restroom: 98 are men, 2 are women.)
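The counting definition of probability on these two slides can be written directly in code; a small sketch using the slides' numbers:

```python
# P(something) = # something / # everything
def p(count, total):
    return count / total

# At the movies: 50 men and 50 women out of 100 people.
print(p(50, 100))   # P(woman) = .5
print(p(50, 100))   # P(man) = .5
# In line for the men's restroom: 98 men and 2 women out of 100 people.
print(p(2, 100))    # P(woman) = .02
print(p(98, 100))   # P(man) = .98
```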
End of File
Sunu Wibirama
sunu@ugm.ac.id
Department of Electrical and Information Engineering
Faculty of Engineering
Universitas Gadjah Mada
INDONESIA
Implementation of Bayesian Reasoning (Part 02)
Kecerdasan Buatan | Artificial Intelligence
Version: January 2022
Conditional probabilities
• P(long hair | woman): if I know that a person is a woman, what is the
probability that person has long hair?
• P(long hair | woman) = # women with long hair / # women = 25 / 50 = .5
(Out of 100 people at the movies, 50 are women: 25 have long hair, 25 have short hair.)
Conditional probabilities
• If I know that a person is a man, what is the probability that person has long hair?
• P(long hair | man) = # men with long hair / # men = 2 / 50 = .04
(Out of 100 people at the movies, 50 are men: 2 have long hair, 48 have short hair.)
Conditional probabilities
• P(A | B) is the probability of A, given B.
• “If I know B is the case, what is the
probability that A is also the case?”
• P(A | B) is not the same as P(B | A).
• P(cute | puppy) is not the same as P(puppy | cute)
• If I know the thing I’m holding is a puppy, what is the probability that it is cute?
• If I know the thing I’m holding is cute, what is the probability that it is a puppy?
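The asymmetry between P(A | B) and P(B | A) is easy to see with the theater counts; a short sketch:

```python
# Conditional probability from counts: P(A | B) = #(A and B) / #B.
# Movie-theater counts from the slides: 50 women (25 with long hair),
# 50 men (2 with long hair).
women, women_long = 50, 25
men, men_long = 50, 2

p_long_given_woman = women_long / women        # P(long hair | woman) = .5
p_long_given_man = men_long / men              # P(long hair | man)   = .04

# Conditioning the other way asks a different question:
long_haired = women_long + men_long            # 27 people with long hair
p_woman_given_long = women_long / long_haired  # P(woman | long hair), about .93
print(p_long_given_woman, p_long_given_man, round(p_woman_given_long, 2))
```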
Joint probabilities
What is the probability that a person is both a woman and has short hair?

P(woman with short hair) = P(woman) * P(short hair | woman) = .5 * .5 = .25

(Out of a probability of 1: P(man) = .5, P(woman) = .5, P(woman with short hair) = .25.)
Joint probabilities
P(woman with long hair) = P(woman) * P(long hair | woman) = .5 * .5 = .25
Joint probabilities
P(man with short hair) = P(man) * P(short hair | man) = .5 * .96 = .48
Joint probabilities
P(man with long hair) = P(man) * P(long hair | man) = .5 * .04 = .02

(Out of a probability of 1: P(woman with short hair) = .25, P(woman with long hair) = .25,
P(man with short hair) = .48, P(man with long hair) = .02.)
Joint probabilities
If P(man) = .98 and P(woman) = .02, then the answers change:

P(man with long hair) = P(man) * P(long hair | man) = .98 * .04 ≈ .04
Joint probabilities
P(woman with long hair) = P(woman) * P(long hair | woman) = .02 * .5 = .01

(Out of a probability of 1: P(woman with short hair) = .01, P(woman with long hair) = .01,
P(man with short hair) = .94, P(man with long hair) = .04.)
Joint probabilities
• P(A and B) is the probability that both A and B
are the case.
• Also written P(A, B) or P(A ∩ B)
• P(A and B) is the same as P(B and A)
• The probability that I am having a jelly donut with
my milk is the same as the probability that I am
having milk with my jelly donut.
• P(donut and milk) = P(milk and donut)
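The four joint probabilities on the preceding slides can be generated with the chain rule P(A and B) = P(B) * P(A | B); a quick sketch with the at-the-movies numbers:

```python
# Joint probabilities at the movies via the chain rule.
p_woman, p_man = 0.5, 0.5
p_long_given_woman, p_long_given_man = 0.5, 0.04

joint = {
    ("woman", "short"): p_woman * (1 - p_long_given_woman),  # .25
    ("woman", "long"):  p_woman * p_long_given_woman,        # .25
    ("man", "short"):   p_man * (1 - p_long_given_man),      # .48
    ("man", "long"):    p_man * p_long_given_man,            # .02
}
# The four joints partition the whole probability space, so they sum to 1.
print(sum(joint.values()))
```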
Marginal probabilities
P(long hair) = P(woman with long hair) + P(man with long hair) = .01 + .04 = .05
Marginal probabilities
P(short hair) = P(woman with short hair) + P(man with short hair) = .01 + .94 = .95

(With P(man) = .98 and P(woman) = .02, the four joint probabilities are
P(woman with short hair) = .01, P(woman with long hair) = .01,
P(man with short hair) = .94, P(man with long hair) = .04.)
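Marginalizing just sums the joint probabilities over the variable being dropped; a sketch with the restroom-line numbers:

```python
# Marginal probabilities from the joint table (restroom-line numbers).
joint = {
    ("woman", "long"): 0.01, ("woman", "short"): 0.01,
    ("man", "long"): 0.04,   ("man", "short"): 0.94,
}
p_long = sum(v for (gender, hair), v in joint.items() if hair == "long")
p_short = sum(v for (gender, hair), v in joint.items() if hair == "short")
print(round(p_long, 2), round(p_short, 2))   # 0.05 0.95
```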
What we really care about
• We know the person has long hair.
Are they a man or a woman?
• P(man | long hair)
• We don’t know this answer yet, but
we already learned about joint
probabilities and marginal
probabilities.
Thomas Bayes noticed something cool
• P(man with long hair) = P(long hair) * P(man | long hair)
• P(long hair and man) = P(man) * P(long hair | man)
• Because P(man and long hair) = P(long hair and man):
• P(long hair) * P(man | long hair) = P(man) * P(long hair | man)
• P(man | long hair) = P(man) * P(long hair | man) / P(long hair)
• In general: P(A | B) = P(B | A) * P(A) / P(B)
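The identity derived above is a one-liner in code; a sketch that checks it with the theater numbers (P(long hair) = .25 + .02 = .27 at the movies):

```python
# P(A | B) = P(B | A) * P(A) / P(B)
def posterior(p_b_given_a, p_a, p_b):
    return p_b_given_a * p_a / p_b

# At the movies: P(long hair | man) = .04, P(man) = .5, P(long hair) = .27.
p_man_given_long = posterior(0.04, 0.5, 0.27)
print(round(p_man_given_long, 2))   # 0.07
```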
Back to the movie theater, this time with Bayes

P(man | long hair) = P(man) * P(long hair | man) / P(long hair)

Given: P(man) = .5, P(woman) = .5, P(long hair | man) = .04, P(long hair | woman) = .5,
P(woman with long hair) = .25, P(man with long hair) = .02.
Back to the movie theater, this time with Bayes

P(man | long hair) = P(man) * P(long hair | man) / P(long hair)
                   = P(man) * P(long hair | man) / [P(woman with long hair) + P(man with long hair)]
Back to the movie theater, this time with Bayes

P(man | long hair) = P(man) * P(long hair | man) / [P(woman with long hair) + P(man with long hair)]
                   = .5 * .04 / (.25 + .02) = .02 / .27 ≈ .07
Now, knowing that they are in line for the men’s restroom changes the
probability P(man | long hair).
Back to the movie theater, this time with Bayes

P(man | long hair) = P(man) * P(long hair | man) / [P(woman with long hair) + P(man with long hair)]

Given: P(man) = .98, P(woman) = .02, P(long hair | man) = .04, P(long hair | woman) = .5,
P(woman with long hair) = .01, P(man with long hair) = .04.
Back to the movie theater, this time with Bayes

P(man | long hair) = P(man) * P(long hair | man) / [P(woman with long hair) + P(man with long hair)]
                   = .98 * .04 / (.01 + .04) = .04 / .05 = .80
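Both theater calculations follow the same formula and differ only in the priors; a compact sketch:

```python
# P(man | long hair) with P(long hair) expanded into the two joint terms.
def p_man_given_long(p_man, p_long_given_man, p_woman, p_long_given_woman):
    numerator = p_man * p_long_given_man
    return numerator / (numerator + p_woman * p_long_given_woman)

# At the movies: uniform priors.
print(round(p_man_given_long(0.5, 0.04, 0.5, 0.5), 2))    # 0.07
# In line for the men's restroom: priors shift to .98 / .02.
print(round(p_man_given_long(0.98, 0.04, 0.02, 0.5), 2))  # 0.8
```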
Dilemma at the movies
This person dropped their ticket in front of the men’s restroom. Seeing the
long hair, you can confidently call out “Excuse me, sir!”: the probability
that this person is a man is 0.80.
End of File
Implementation of Bayesian Reasoning (Part 03)
Kecerdasan Buatan | Artificial Intelligence
Version: January 2022
Bayesian reasoning
The Bayesian rule expressed in terms of hypothesis (H) and evidence (E) looks like this:

p(H | E) = p(E | H) × p(H) / [p(E | H) × p(H) + p(E | ¬H) × p(¬H)]

where:
p(H | E) is the posterior probability that hypothesis H is true given evidence E;
p(H) is the prior probability of hypothesis H being true;
p(E | H) is the likelihood: the probability that hypothesis H being true will result in evidence E;
p(¬H) is the prior probability of hypothesis H being false;
p(E | ¬H) is the probability of finding evidence E even when hypothesis H is false.
The denominator is the marginal probability of observing evidence E.

Thomas Bayes (1701-1762)
Computer Aided Diagnosis for Covid-19 detection
Abraham, Bejoy, and Madhu S. Nair. "Computer-aided detection of COVID-19 from
X-ray images using multi-CNN and Bayesnet classifier." Biocybernetics and
biomedical engineering 40, no. 4 (2020): 1436-1445.
Computer Aided Diagnosis for Covid-19 detection
• Suppose you recently decided to have a CT-scan test for Covid-19.
If the test is positive, what is the probability you are infected?
• Suppose you are told the test has a sensitivity of 80%: if you are
infected by Covid-19, the test will be positive with probability 0.8.
In other words,

p(x = 1 | y = 1) = 0.8

where x = 1 is the event the test is positive, and y = 1 is
the event you are infected by Covid-19.
Note:
x is evidence (E)
y is hypothesis (H)
Source: https://spectrum.ieee.org/hospitals-deploy-ai-tools-detect-covid19-chest-scans
• Many people conclude they are therefore 80% likely to be infected by
Covid-19. But this is false! It ignores the prior probability of having
Covid-19, p(y = 1): the prior probability that someone is infected by
Covid-19, which fortunately was quite low in June 2022.
• We also need to take into account the fact that the test may produce a
false positive or false alarm, p(x = 1 | y = 0): the probability that the
test result is positive although you aren’t infected. Unfortunately, such
false positives are quite likely with current screening technology.
Computer Aided Diagnosis for Covid-19 detection
Source: https://spectrum.ieee.org/hospitals-deploy-ai-tools-detect-covid19-chest-scans
• Combining these three terms using Bayes’ rule, we can compute the correct
answer as follows:

p(y=1 | x=1) = p(x=1 | y=1) × p(y=1) / [p(x=1 | y=1) × p(y=1) + p(x=1 | y=0) × p(y=0)]

where p(y = 0) = 1 − p(y = 1) = 0.996.
In other words, if the test’s result is positive, you only have about a 3%
chance of actually being infected by Covid-19!

Note (x is evidence E, y is hypothesis H):
p(x=1 | y=1): the probability that the test is positive given that the person is infected by Covid-19
p(y=1): the probability that someone is infected by Covid-19
p(y=0): the probability that someone is NOT infected by Covid-19
p(x=1 | y=0): false alarm; the person is not infected, but the test says they are positive for Covid-19
Computer Aided Diagnosis for Covid-19 detection
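The 3% result can be reproduced as follows. The sensitivity (0.8) and the prior (p(y=1) = 1 − 0.996 = 0.004) come from the slides; the false-positive rate p(x=1 | y=0) = 0.1 is not visible in the extracted text and is an assumed value, chosen because it reproduces the slide's "about 3%" answer:

```python
# Posterior probability of Covid-19 infection given a positive test.
sensitivity = 0.8   # p(x=1 | y=1), from the slides
prior = 0.004       # p(y=1); p(y=0) = 0.996, from the slides
fpr = 0.1           # p(x=1 | y=0), ASSUMED (not shown in the extracted text)

posterior = sensitivity * prior / (
    sensitivity * prior + fpr * (1 - prior))
print(round(posterior, 3))   # 0.031, i.e. about 3%
```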
• The Computer Aided Diagnosis (CAD) system can then determine whether you are
infected by Covid-19 using a threshold (e.g. a corona score) on the probability value.
• For example:
• IF p(y = 1 | x = 1) > threshold (e.g. 0.80)
THEN the CAD will notify the medical doctor that you are infected by Covid-19.
Computer Aided Diagnosis for Covid-19 detection
Source: https://spectrum.ieee.org/hospitals-deploy-ai-tools-detect-covid19-chest-scans
End of File
Implementation of Bayesian Reasoning (Part 04)
Kecerdasan Buatan | Artificial Intelligence
Version: January 2022
Bayesian reasoning with multiple hypotheses and evidences
• We can take into account both multiple hypotheses H1, H2, ..., Hm and
multiple evidences E1, E2, ..., En. The hypotheses as well as the
evidences must be mutually exclusive and exhaustive.
• Single evidence E and multiple hypotheses follow:

p(Hi | E) = p(E | Hi) × p(Hi) / Σ(k=1..m) p(E | Hk) × p(Hk)

• Multiple evidences and multiple hypotheses follow:

p(Hi | E1 E2 ... En) = p(E1 E2 ... En | Hi) × p(Hi) / Σ(k=1..m) p(E1 E2 ... En | Hk) × p(Hk)

In both formulas, p(Hi) is the prior probability, p(... | Hi) is the
likelihood, p(Hi | ...) is the posterior probability, and the denominator is
the marginal probability.
• However, this method requires obtaining the conditional probabilities of
all possible combinations of evidences for all hypotheses, which places an
enormous burden on the expert.
• Therefore, in Bayesian reasoning with multiple hypotheses and evidences,
conditional independence among different evidences is assumed.
• Thus, instead of the unworkable equation, we obtain:

p(Hi | E1 E2 ... En) = p(E1 | Hi) × p(E2 | Hi) × ... × p(En | Hi) × p(Hi)
                       / Σ(k=1..m) p(E1 | Hk) × p(E2 | Hk) × ... × p(En | Hk) × p(Hk)

Again, p(Hi) is the prior probability, the product of p(Ej | Hi) terms is the
likelihood, p(Hi | E1 E2 ... En) is the posterior probability, and the
denominator is the marginal probability.
• Let us consider a simple example:
• Suppose a CAD (computer aided diagnosis) system is given three conditionally
independent evidences E1, E2, E3, considers three mutually exclusive and
exhaustive hypotheses H1, H2, H3, and is provided with prior probabilities
for these hypotheses: p(H1), p(H2) and p(H3).

Hypotheses (diseases): H1 = dengue, H2 = flu, H3 = malaria
Evidences (symptoms): E1 = headache, E2 = cough, E3 = fever

• The CAD system also determines the conditional probabilities of observing
each evidence for all possible hypotheses.
The prior and conditional probabilities
• H1 = dengue, H2 = flu, H3 = malaria
• E1 = headache, E2 = cough, E3 = fever

Probability  | i = 1 | i = 2 | i = 3
p(Hi)        | 0.40  | 0.35  | 0.25
p(E1 | Hi)   | 0.3   | 0.8   | 0.5
p(E2 | Hi)   | 0.9   | 0.0   | 0.7
p(E3 | Hi)   | 0.6   | 0.7   | 0.9

p(Hi): the probability of a case of dengue, flu, or malaria
p(E1 | Hi): the probability of headache (E1) in a patient with dengue, flu, or malaria
p(E2 | Hi): the probability of cough (E2) in a patient with dengue, flu, or malaria
p(E3 | Hi): the probability of fever (E3) in a patient with dengue, flu, or malaria
Assume that we first observe evidence E3 (fever) in a patient. The CAD system
computes the posterior probability of each hypothesis as:

p(Hi | E3) = p(E3 | Hi) × p(Hi) / Σ(k=1..3) p(E3 | Hk) × p(Hk),   i = 1, 2, 3

thus:

p(H1 | E3) = 0.6 × 0.40 / (0.6 × 0.40 + 0.7 × 0.35 + 0.9 × 0.25) ≈ 0.34
p(H2 | E3) = 0.7 × 0.35 / (0.6 × 0.40 + 0.7 × 0.35 + 0.9 × 0.25) ≈ 0.34
p(H3 | E3) = 0.9 × 0.25 / (0.6 × 0.40 + 0.7 × 0.35 + 0.9 × 0.25) ≈ 0.32

After evidence E3 is observed, belief in hypothesis H2 decreases and becomes
equal to belief in hypothesis H1 (p(H2): from 0.35 to 0.34). Belief in
hypothesis H3 increases and nearly reaches the beliefs in hypotheses H1 and
H2 (p(H3): from 0.25 to 0.32).
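The single-evidence update can be reproduced numerically from the table; a short sketch:

```python
# Posterior over the three diseases after observing only E3 (fever).
priors = {"dengue": 0.40, "flu": 0.35, "malaria": 0.25}   # p(Hi)
p_fever = {"dengue": 0.6, "flu": 0.7, "malaria": 0.9}     # p(E3 | Hi)

numerators = {h: p_fever[h] * priors[h] for h in priors}
marginal = sum(numerators.values())                       # p(E3) = 0.71
post = {h: numerators[h] / marginal for h in priors}
print({h: round(v, 3) for h, v in post.items()})
```

The exact values are about 0.338, 0.345 and 0.317, which the slides report rounded as 0.34, 0.34 and 0.32.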
Suppose now that we also observe evidence E1 (headache) in the patient.
The posterior probabilities are calculated as follows:

p(Hi | E1 E3) = p(E1 | Hi) × p(E3 | Hi) × p(Hi) / Σ(k=1..3) p(E1 | Hk) × p(E3 | Hk) × p(Hk),   i = 1, 2, 3

hence:

p(H1 | E1 E3) = 0.3 × 0.6 × 0.40 / (0.3 × 0.6 × 0.40 + 0.8 × 0.7 × 0.35 + 0.5 × 0.9 × 0.25) ≈ 0.19
p(H2 | E1 E3) = 0.8 × 0.7 × 0.35 / (0.3 × 0.6 × 0.40 + 0.8 × 0.7 × 0.35 + 0.5 × 0.9 × 0.25) ≈ 0.52
p(H3 | E1 E3) = 0.5 × 0.9 × 0.25 / (0.3 × 0.6 × 0.40 + 0.8 × 0.7 × 0.35 + 0.5 × 0.9 × 0.25) ≈ 0.29

Hypothesis H2 (flu) has now become the most likely one (0.52 compared with
0.19 and 0.29).
After observing evidence E2 (cough), the final posterior probabilities for
all hypotheses are calculated:

p(Hi | E1 E2 E3) = p(E1 | Hi) × p(E2 | Hi) × p(E3 | Hi) × p(Hi)
                   / Σ(k=1..3) p(E1 | Hk) × p(E2 | Hk) × p(E3 | Hk) × p(Hk),   i = 1, 2, 3

hence:

p(H1 | E1 E2 E3) = 0.3 × 0.9 × 0.6 × 0.40 / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) ≈ 0.45
p(H2 | E1 E2 E3) = 0.8 × 0.0 × 0.7 × 0.35 / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0
p(H3 | E1 E2 E3) = 0.5 × 0.7 × 0.9 × 0.25 / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) ≈ 0.55

Although the initial ranking was H1, H2 and H3, only hypotheses H1 and H3
remain under consideration after all evidences (E1, E2 and E3) were observed.
In the end, the doctor should then decide whether the patient has dengue or
malaria (malaria is more likely, as its probability is higher than dengue’s:
H3 > H1).
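The whole three-step diagnosis can be sketched as a small naive-Bayes routine over the table's priors and likelihoods; observing all of E1, E2, E3 reproduces the final posteriors:

```python
# Naive-Bayes posterior over diseases given a set of observed symptoms.
priors = {"dengue": 0.40, "flu": 0.35, "malaria": 0.25}
likelihood = {                          # p(Ej | Hi) from the table
    "dengue":  {"E1": 0.3, "E2": 0.9, "E3": 0.6},
    "flu":     {"E1": 0.8, "E2": 0.0, "E3": 0.7},
    "malaria": {"E1": 0.5, "E2": 0.7, "E3": 0.9},
}

def diagnose(evidences):
    scores = {}
    for h, prior in priors.items():
        score = prior
        for e in evidences:
            score *= likelihood[h][e]   # conditional independence assumed
        scores[h] = score
    marginal = sum(scores.values())
    return {h: s / marginal for h, s in scores.items()}

final = diagnose(["E1", "E2", "E3"])
print({h: round(v, 2) for h, v in final.items()})
# dengue 0.45, flu 0.0, malaria 0.55
```

Calling `diagnose(["E3"])` or `diagnose(["E1", "E3"])` reproduces the intermediate posteriors from the earlier slides as well.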
What about sentiment analysis using social media?
You can do a similar computation, given that:
H1 = neutral
H2 = positive sentiment
H3 = negative sentiment
E1 = occurrence of the word “illegal”
E2 = occurrence of the word “reform”
E3 = occurrence of the word “crime”
Ek = occurrence of the word .......
The engineer should provide the likelihood and prior probability information.
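The same routine carries over to sentiment analysis: hypotheses become sentiment classes and evidences become word occurrences. Every probability below is a made-up placeholder; in practice the engineer estimates likelihoods and priors from labeled data:

```python
# Naive-Bayes sentiment sketch; ALL probabilities are invented placeholders.
priors = {"neutral": 0.5, "positive": 0.2, "negative": 0.3}
word_likelihood = {   # p(word occurs | class), assumed values
    "neutral":  {"illegal": 0.05, "reform": 0.10, "crime": 0.05},
    "positive": {"illegal": 0.01, "reform": 0.30, "crime": 0.01},
    "negative": {"illegal": 0.40, "reform": 0.05, "crime": 0.35},
}

observed = ["illegal", "crime"]            # words found in a post
scores = dict(priors)
for word in observed:
    for c in scores:
        scores[c] *= word_likelihood[c][word]
total = sum(scores.values())
post = {c: round(scores[c] / total, 2) for c in scores}
print(post)   # "negative" dominates for these words
```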
End of File
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 

Recently uploaded (20)

Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
internship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developerinternship ppt on smartinternz platform as salesforce developer
internship ppt on smartinternz platform as salesforce developer
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data_Math 4-Q4 Week 5.pptx Steps in Collecting Data
_Math 4-Q4 Week 5.pptx Steps in Collecting Data
 
A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Blooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docxBlooming Together_ Growing a Community Garden Worksheet.docx
Blooming Together_ Growing a Community Garden Worksheet.docx
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 

Modul Topik 6 - Kecerdasan Buatan.pdf

Introduction to Uncertainty (Part 01)

Four types of classification technique:
• Neural network model
• Geometric model
• Logical model / rule-based model
• Probabilistic model
Geometric model (e.g., SVM, linear discriminant analysis, KNN)
• Example: houses plotted by price (USD) against distance from the city hospital separate into two classes: exclusive housing and government-supported housing.

Logical model / rule-based model (e.g., decision tree)
• Classifying spam email is easy if you know the "features": 'viagra' and 'lottery' are two important features of spam email.
• Example (class: spam): "SUBSCRIBE CHEAP LOTTERY COUPON FOR ONLY $19. TO BE REMOVED FROM FUTURE MAILINGS, SIMPLY REPLY TO THIS MESSAGE AND PUT 'REMOVE' IN THE SUBJECT." An ordinary message without these features is classified as ham.
What about sentiment analysis? There is no definite rule to judge a "positive" or "negative" tweet.

Sentiment analysis and uncertainty of political view
• When analyzing Twitter or Facebook data, you observe the sentiment (tone) of the tweet or status.
• There is no exact rule for determining political view; the best you can do is minimize the uncertainty of the tweet's or status's tone.
• How to minimize the uncertainty?
  • Gather more evidence (in the case of sentiment analysis: words). But there are so many words? Use a language corpus!
  • Measure the probability of the occurrence.
  • Categorize the tweet or status based on that probability.
• This is a case where the logical (rule-based) machine learning model does not work well. We will use the probabilistic model instead.
Introduction to Uncertainty (Part 02)

Uncertainty and probabilistic model
• Information can be incomplete, inconsistent, uncertain, or all three. In other words, information is often unsuitable for solving a problem or making an inference.
• Uncertainty is defined as the lack of the exact knowledge that would enable us to reach a perfectly reliable conclusion.
• Classical logic (the logical model) permits only exact reasoning. It assumes that perfect knowledge always exists and that the law of the excluded middle (every proposition is either true or not true) can always be applied:
  IF A is true THEN A is not false
  IF A is false THEN A is not true
Sources of uncertain knowledge (1)
• Weak implications: domain experts and engineers have the painful task of establishing concrete correlations between the IF (condition) and THEN (action) parts of the rules.
• Therefore, expert systems need the ability to handle vague (lacking clarity) associations.
• For example, by accepting the degree of correlation as a numerical certainty factor (i.e., a strong correlation between two variables represents more certainty).

(Photo caption: Marvin Minsky with the Blocks Vision Robot at MIT. In 1946, he entered Harvard University after returning from service in the U.S. Navy during World War II. After graduating from Harvard in 1950, he attended Princeton University, earning his Ph.D. in mathematics in 1954. In 1958, Minsky joined the faculty of MIT's Department of Electrical Engineering and Computer Science. A year later, he co-founded the Artificial Intelligence Laboratory.)

Sources of uncertain knowledge (2)
• Imprecise language. Our natural language is ambiguous and imprecise. We describe facts with terms such as often and sometimes, frequently and hardly ever (= almost never).
• As a result, it can be difficult to express knowledge in the precise IF-THEN form of production rules.
• However, if the meaning of the facts is quantified, it can be used in expert systems.
• In 1944, Ray Simpson asked 355 high school and college students to place 20 terms such as "often" on a scale between 1 and 100. In 1968, Milton Hakel repeated this experiment.

Reference: Ray H. Simpson (1944), "The specific meanings of certain terms indicating differing degrees of frequency", Quarterly Journal of Speech, 30:3, 328-330.
Sources of uncertain knowledge (2): imprecise language

Mean value assigned to each frequency term (scale of 1-100):

Term                      Simpson (1944)   Hakel (1968)
Always                          99             100
Very often                      88              87
Usually                         85              79
Often                           78              74
Generally                       78              74
Frequently                      73              72
Rather often                    65              72
About as often as not           50              50
Now and then                    20              34
Sometimes                       20              29
Occasionally                    20              28
Once in a while                 15              22
Not often                       13              16
Usually not                     10              16
Seldom                          10               9
Hardly ever                      7               8
Very seldom                      6               7
Rarely                           5               5
Almost never                     3               2
Never                            0               0

Sources of uncertain knowledge (3)
• Unknown data. When the data is incomplete or missing, the only solution is to accept the value "unknown" and proceed to approximate reasoning with this value.
• Combining the views of different experts. Large expert systems usually combine the knowledge and expertise of a number of experts. Unfortunately, experts often hold contradictory opinions and produce conflicting rules. To resolve the conflict, the engineer has to attach a weight to each expert and then calculate the composite conclusion. But no systematic method exists to obtain these weights.
Probability for Machine Learning (Part 01)

Basic statistical measures
• The mean of a vector, usually denoted x̄, is the mean of its elements: the sum of the components divided by the number of components.
• The variance describes how the data is spread around the mean. A dataset with a large variance has data points spread far away from the mean; a dataset with a small variance has data points grouped closely around the mean.
• The standard deviation is simply the square root of the variance.
• The covariance between two variables tells whether large values in one variable are associated with large values in the other, and vice versa.
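These four measures can be computed directly with Python's standard library; the numbers below are illustrative, not from the module:

```python
import statistics

data  = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
other = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 6.0]

mean = statistics.mean(data)       # sum of the components / number of components
var  = statistics.pvariance(data)  # average squared deviation from the mean
std  = statistics.pstdev(data)     # square root of the variance

# Covariance: do large values of one variable go with large values of the other?
m2 = statistics.mean(other)
cov = sum((x - mean) * (y - m2) for x, y in zip(data, other)) / len(data)

print(mean, var, std, cov)   # 5.0 4.0 2.0 3.0
```

A positive covariance (here 3.0) means the two samples tend to rise and fall together.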
Basic probability theory
• The concept of probability has a long history that goes back thousands of years, when words like "probably", "likely", "maybe", "perhaps" and "possibly" were introduced into spoken languages. However, the mathematical theory of probability was formulated only in the 17th century.
• The probability of an event is the proportion of cases in which the event occurs. Probability can also be defined as a scientific measure of chance.
• Probability can be expressed mathematically as a numerical index ranging from zero (an absolute impossibility) to unity (an absolute certainty).
• Most events have a probability index strictly between 0 and 1, which means that each event has at least two possible outcomes: a favourable outcome (success) and an unfavourable outcome (failure):

  P(success) = the number of successes / the number of possible outcomes
  P(failure) = the number of failures / the number of possible outcomes
• If s is the number of times success can occur, and f is the number of times failure can occur, then

  P(success) = p = s / (s + f)
  P(failure) = q = f / (s + f)
  p + q = 1

• If we throw a coin, the probability of getting a head equals the probability of getting a tail. In a single throw, s = f = 1, s + f = 2, and therefore the probability of getting a head (or a tail) is 0.5.

Geometric representation of events
(Figure: Venn diagrams of mutually exclusive events A and B, drawn as disjoint regions, and non-mutually exclusive events, whose overlapping region is A ∩ B.)
• Non-mutually exclusive: there is a chance that events A and B occur together. Example: the probability of getting a tail and the number 6 when we roll a die, then flip a coin.
• Given two events, A and B, we define the probability of A or B as follows:

  P(A ∪ B) = P(A) + P(B) − P(A ∩ B)   if A and B are non-mutually exclusive
  P(A ∪ B) = P(A) + P(B)              if A and B are mutually exclusive
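The union rule P(A ∪ B) = P(A) + P(B) − P(A ∩ B) can be checked by brute-force enumeration of the die-then-coin example; this sketch is illustrative and not part of the original slides:

```python
from fractions import Fraction

# Sample space for rolling a die then flipping a coin: 12 equally likely outcomes.
space = [(die, coin) for die in range(1, 7) for coin in ("head", "tail")]

A = {o for o in space if o[1] == "tail"}   # event A: the coin shows tail
B = {o for o in space if o[0] == 6}        # event B: the die shows 6

p = lambda event: Fraction(len(event), len(space))

# Non-mutually exclusive events: subtract the overlap once.
p_union = p(A) + p(B) - p(A & B)
assert p_union == p(A | B)                 # matches direct counting of A or B
print(p(A), p(B), p(A & B), p_union)       # 1/2 1/6 1/12 7/12
```

The overlap {(6, 'tail')} would otherwise be counted twice, which is exactly why the − P(A ∩ B) term is needed.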
Probability for Machine Learning (Part 02)

Random variables
• A random experiment, or simply experiment, describes a process that gives uncertain results, for instance a coin flip. The outcome of a random experiment is the result you obtain.
• A random variable takes a value corresponding to the outcome of a random experiment. Use a capital letter to denote a random variable and the corresponding small letter for one of its values.
• If you flip a coin, the two possible outcomes are 'heads' and 'tails'. An example of a random variable X would map 'heads' to 0 and 'tails' to 1.
• The event A corresponds to the set of outcomes {'heads'}. The probability that the outcome is 'heads' can therefore be denoted as

  P(X = 0) = P(outcome = 'heads') = P(A)
Sample space
• Discrete sample space: a sample space containing a finite number of possibilities, or an unending sequence with as many elements as there are whole numbers. The variable is called a discrete random variable; its values can be counted. Example: the number of people in the room with red shoes on.
• Continuous sample space: a sample space containing an infinite number of possibilities, equal to the number of points on a line segment. The variable is a continuous random variable; its values cannot be counted but can be measured. Example: heights of children.

(Reference: Richard Holzer, Patrick Wüchner, Hermann de Meer, "Modeling of Self-Organizing Systems: An Overview", Universitätsbibliothek TU Berlin, 2010.)

Discrete probability distributions
• A discrete random variable has a certain probability of equaling each of its possible values.
• Example: tossing a coin 3 times, with X = number of heads (H).
  Sample space: {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
  For x = 2: P(X = 2) = 3/8
  Using the formula f(x) = P(X = x): f(3) = P(X = 3) = 1/8
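The coin-tossing PMF above can be reproduced by enumerating the sample space; a minimal sketch, not part of the original slides:

```python
from itertools import product
from fractions import Fraction

# All 8 equally likely outcomes of tossing a fair coin 3 times.
outcomes = list(product("HT", repeat=3))

# PMF of X = number of heads: count the outcomes with each head count.
pmf = {x: Fraction(sum(1 for o in outcomes if o.count("H") == x), len(outcomes))
       for x in range(4)}
print(pmf[2], pmf[3])   # 3/8 1/8, matching the slide
```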
Probability distribution for discrete variables
• The probability distribution of a random variable is a function that takes the sample space as input and returns probabilities: f(x) is the probability that the random variable takes the value x.
• The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if:

  f(x) ≥ 0
  Σₓ f(x) = 1
  P(X = x) = f(x)
Probability for Machine Learning (Part 03)

Probability mass functions (1)
• Another example: say you are running a dice-rolling experiment. X is the random variable corresponding to this experiment. Assuming that the die is fair, each outcome is equiprobable.
• That is, if you run the experiment a large number of times, you will get each outcome approximately the same number of times.

(Figures: the probability mass function of X for a six-sided die, estimated from 20 rolls and from 100,000 rolls; the larger sample is much closer to the uniform value of 1/6 per face.)
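The two figures can be reproduced by simulation; this sketch (with an arbitrary seed for reproducibility) shows how the estimated PMF converges to 1/6 per face as the number of rolls grows:

```python
import random
from collections import Counter

random.seed(0)

def estimate_pmf(n_rolls):
    """Estimate the PMF of a fair six-sided die from n_rolls simulated rolls."""
    counts = Counter(random.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}

small, large = estimate_pmf(20), estimate_pmf(100_000)
# With many rolls, every relative frequency approaches the true value 1/6.
print(max(abs(p - 1 / 6) for p in large.values()))
```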
Probability mass functions (2)
• A stockroom clerk returns three safety helmets at random to three steel mill employees who had previously checked them.
• If Smith (S), Jones (J), and Brown (B), in that order, receive one of the three hats:
  • list the sample points for the possible orders of returning the helmets;
  • find the value of the random variable M that represents the number of correct matches.
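The sample points and the PMF of the number of correct matches can be found by enumerating all return orders; a small sketch, not part of the original slides:

```python
from itertools import permutations
from fractions import Fraction

owners = ("S", "J", "B")   # Smith, Jones, Brown, in checking order

# Each of the 3! = 6 orders of handing the helmets back is equally likely.
orders = list(permutations(owners))
matches = lambda order: sum(got == owner for got, owner in zip(order, owners))

# PMF of M = number of correct matches.
pmf = {m: Fraction(sum(1 for o in orders if matches(o) == m), len(orders))
       for m in range(4)}
print(pmf)   # M = 2 is impossible: fixing two helmets fixes the third
```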
Cumulative distributions for discrete random variables
• For the random variable M, the number of correct matches in the previous example, the cumulative distribution function is F(x) = P(M ≤ x): sum the probability mass function over all values up to x. Below the smallest value of M, F(x) = 0.
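Accumulating the helmet-example PMF gives the discrete CDF as a step function; an illustrative sketch, not from the original slides:

```python
from fractions import Fraction

# PMF of M, the number of correct helmet matches (M = 2 has probability 0).
pmf = {0: Fraction(1, 3), 1: Fraction(1, 2), 3: Fraction(1, 6)}

def cdf(x):
    """F(x) = P(M <= x): sum the probability mass of every value up to x."""
    return sum(p for m, p in pmf.items() if m <= x)

# F steps up at 0, 1 and 3 and is flat at 2, where no mass lies.
print([str(cdf(x)) for x in (0, 1, 2, 3)])   # ['1/3', '5/6', '5/6', '1']
```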
Probability for Machine Learning (Part 04)

Another case...
• Statement 1: what is the probability that tomorrow morning's temperature equals exactly 10° Celsius?
• Statement 2: what is the probability that tomorrow morning's temperature is less or more than 10° Celsius?
• What is the answer to each statement? Which statement is more appropriate to use?
Continuous probability distributions
• Probability density function (PDF): a continuous random variable has probability P = 0 of being exactly any single value. Example: being exactly 175 cm tall, P(height = 175.000) = 0.
• However, we can find the probability that the value lies within some range, e.g. P(height ≥ 175) or P(160 ≤ height ≤ 175).
• Remember: the probability of a continuous random variable must be described over a range. (Figure: the probability of drawing a number between 0 and 0.2 is the highlighted area under the curve.)

Continuous distributions

  P(a < X < b) = ∫ₐᵇ f(x) dx
  ∫₋∞^∞ f(x) dx = 1   (the total area under f(x) must equal unity)
  P(a < X < b) must be non-negative
  P(a ≤ X ≤ b) = P(a < X < b)
Example
• Given a random variable X with density function
  f(x) = 2x for 0 < x < 1, and f(x) = 0 for all other x,
  verify that the area under the curve equals 1.0:

  ∫₋∞^∞ f(x) dx = ∫₀¹ 2x dx = x² |₀¹ = 1² − 0² = 1.0

Example (2)
• Given the same density function, what is the probability that −¼ < x < ½?
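The unit-area check can also be done numerically; a sketch using a simple midpoint rule (an assumption of this example, not a method from the slides):

```python
def f(x):
    # Density from the example: f(x) = 2x on (0, 1), zero elsewhere.
    return 2.0 * x if 0.0 < x < 1.0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

area = integrate(f, -1.0, 2.0)
print(round(area, 4))   # 1.0: the density integrates to unity
```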
• Solution:

  P(−¼ < x < ½) = ∫₋¼⁰ 0 dx + ∫₀^½ 2x dx = x² |₀^½ = (½)² − 0² = ¼

Cumulative distributions for continuous random variables
• The cumulative distribution F(a) of a continuous random variable X with density function f(x) is

  F(a) = P(X ≤ a) = ∫₋∞ᵃ f(x) dx

• Hence

  P(a < x < b) = F(b) − F(a) = ∫₋∞ᵇ f(x) dx − ∫₋∞ᵃ f(x) dx = ∫ₐᵇ f(x) dx
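For f(x) = 2x on (0, 1) the CDF is F(x) = x², so the same probability drops out of F(b) − F(a); an illustrative sketch:

```python
def F(x):
    """CDF of the density f(x) = 2x on (0, 1): F(x) = x**2 inside the interval."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x * x

# P(a < X < b) = F(b) - F(a); the region below 0 carries no probability.
p = F(0.5) - F(-0.25)
print(p)   # 0.25
```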
• Differentiation of F(x) lets you determine f(x) from F(x):

  f(x) = dF(x)/dx

Example
• Suppose the error in the reaction temperature for an experiment is a continuous random variable X with cumulative distribution function
  F(x) = x³/9 for −1 < x < 2, F(x) = 0 for x ≤ −1, and F(x) = 1 for x ≥ 2
  (strictly, F(x) = (x³ + 1)/9 on (−1, 2), so that F(−1) = 0; the derivative is unchanged).
• Thus, the expression for the density function f(x) for −1 < x < 2 is

  f(x) = dF(x)/dx = d(x³/9)/dx = x²/3
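The relation f(x) = dF(x)/dx can be sanity-checked numerically with a central difference; this sketch uses F(x) = (x³ + 1)/9, the constant shifted so that F(−1) = 0, which leaves the derivative unaffected:

```python
def F(x):
    # CDF from the example, written so that F(-1) = 0 and F(2) = 1.
    return (x ** 3 + 1) / 9

def f(x):
    # Density obtained by differentiating F: f(x) = x**2 / 3 on (-1, 2).
    return x ** 2 / 3

# A central difference approximates dF/dx; it should match f inside (-1, 2).
h = 1e-6
errors = [abs((F(x + h) - F(x - h)) / (2 * h) - f(x)) for x in (-0.5, 0.3, 1.0, 1.8)]
print(max(errors))   # ~0: the numerical derivative agrees with the density
```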
Probability Distribution

Discrete uniform distribution
• If the random variable X assumes the values x₁, x₂, ..., xₖ with equal probabilities, the discrete uniform distribution is given by

  f(x; k) = 1/k,  for x = x₁, x₂, ..., xₖ

• When a light bulb is selected at random from a box that contains a 40-watt bulb, a 60-watt bulb, a 75-watt bulb, and a 100-watt bulb, each element of the sample space S = {40, 60, 75, 100} occurs with probability 1/4. Therefore, we have a uniform distribution with f(x; 4) = 1/4 for x = 40, 60, 75, 100.
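The light-bulb example in code; a minimal sketch of f(x; k) = 1/k:

```python
from fractions import Fraction

def discrete_uniform_pmf(values):
    """f(x; k) = 1/k: each of the k listed values is equally likely."""
    return {x: Fraction(1, len(values)) for x in values}

# One bulb drawn at random from four wattages, S = {40, 60, 75, 100}.
pmf = discrete_uniform_pmf([40, 60, 75, 100])
print(pmf[60])   # 1/4
```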
Poisson distribution
Properties of a Poisson process:
1. The number of outcomes occurring in one time interval or specified region is independent of the number that occurs in any other disjoint time interval or region of space. Thus, the Poisson process has no memory.
2. The probability that a single outcome will occur during a very short time interval or in a small region is proportional to the length of the time interval or the size of the region, and does not depend on the number of outcomes occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time interval or fall in such a small region is negligible.

Example:
• A customer care center receives 100 calls per hour, 8 hours a day.
• The calls are independent of each other, so the number of calls per minute has a Poisson probability distribution.
• There can be any number of calls per minute, irrespective of the number of calls received in the previous minute.
(Source: https://www.cuemath.com/data/poisson-distribution/)
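For the call-center example the per-minute rate is λ = 100/60, and the Poisson PMF P(X = k) = λᵏ e^(−λ) / k! gives the chance of each call count per minute; an illustrative sketch:

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean rate lam per interval."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

# Call-center example: 100 calls per hour is a rate of 100/60 calls per minute.
lam = 100 / 60
probs = {k: poisson_pmf(k, lam) for k in range(5)}
print(probs[0], probs[1])   # chance of 0 calls, 1 call in a given minute
```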
Binomial distribution
• An experiment often consists of repeated trials, each with two possible outcomes that may be labeled success or failure.
• The most obvious application deals with the testing of items as they come off an assembly line, where each test or trial may indicate a defective or a non-defective item (or a coin, with head or tail).
• We may choose to define either outcome as a success. The process is referred to as a Bernoulli process; each trial is called a Bernoulli trial.

A binomial experiment satisfies:
1. The experiment consists of n identical trials.
2. There are only 2 possible outcomes on each trial. We will denote one outcome by S (for Success) and the other by F (for Failure).
3. The probability of S remains the same from trial to trial. This probability will be denoted by p, and the probability of F will be denoted by q (q = 1 − p).
4. The trials are independent.
5. The binomial random variable is the number of S's in n trials.
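The binomial PMF, P(X = x) = C(n, x) pˣ q^(n−x), follows directly from the conditions above; the assembly-line numbers below (10 items, P(defective) = 0.1) are illustrative assumptions, not from the module:

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x): probability of x successes in n independent Bernoulli trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Assembly-line sketch: probability of exactly 2 defectives among 10 items.
prob_two_defective = binomial_pmf(2, 10, 0.1)
print(round(prob_two_defective, 4))   # 0.1937
```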
  • 30. Normal distribution
• The most important continuous probability distribution.
• Its graph, called the "normal curve", is bell-shaped, and the total area under the curve equals 1.
• Derived by De Moivre and Gauss; hence it is also called the "Gaussian" distribution.
• Describes many phenomena in nature, industry, and research.
Other probability distributions: https://www.kdnuggets.com/2020/02/probability-distributions-data-science.html
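The claim that the total area under the normal curve equals 1 can be verified with a crude numerical integration. This is a sketch with an assumed helper name `normal_pdf`; a simple Riemann sum over [−6, 6] captures essentially all of the mass of the standard normal.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of the normal (Gaussian) distribution at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

# Riemann-sum check that the area under the standard bell curve is close to 1.
step = 0.001
area = sum(normal_pdf(-6 + i * step) * step for i in range(int(12 / step)))
```

The peak of the standard normal sits at x = 0 with height 1/sqrt(2π) ≈ 0.399.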
  • 31. Why probability distributions?
• Some machine learning models work best under certain distribution assumptions.
• For example, these algorithms and functions assume a normal distribution: Linear Discriminant Analysis (LDA), Gaussian Naive Bayes, Logistic Regression, Linear Regression, the Sigmoid function.
• When you work with a dataset, you are dealing with a sample instead of the population. Probability distributions help you make predictions about the whole population.
• Understanding probability distributions helps you choose appropriate data transformation methods and feature extraction techniques.
Source: https://www.kdnuggets.com/2020/02/probability-distributions-data-science.html
  • 32. Conditional Probability and Bayesian Rule
Kecerdasan Buatan | Artificial Intelligence (Version: January 2022)
Conditional probability
• Let A be an event in the world and B be another event. Suppose that events A and B are not mutually exclusive, but occur conditionally on the occurrence of the other.
• The probability that event A will occur if event B occurs is called the conditional probability.
• Conditional probability is denoted mathematically as p(A|B), in which the vertical bar represents "given"; the complete expression is read as: the conditional probability of event A occurring given that event B has occurred.
(Figure: Venn diagram of overlapping events A and B.)
  • 33. Conditional probability
• The number of times A and B can occur together, or the probability that both A and B will occur, is called the joint probability of A and B. It is represented mathematically as p(A ∩ B).
• The number of ways B can occur is the probability of B: p(B).
• The probability of an event A, given that an event B has occurred, is called the conditional probability of A given B:
p(A|B) = p(A ∩ B) / p(B)
• The probability of an event B, given that an event A has occurred, is called the conditional probability of B given A:
p(B|A) = p(A ∩ B) / p(A)
Conditional probability and independence
Two events A and B are independent if and only if
p(A|B) = p(A) and p(B|A) = p(B)
Otherwise, A and B are dependent.
Independent: events A and B are non-mutually exclusive (there is a chance both occur together), but the occurrence of A does not affect B, and vice versa. Example: rolling a 6 on a die and getting a tail on a coin, when we roll a die and then flip a coin.
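The die-and-coin example above can be verified by enumerating the joint sample space. This is a sketch (the helper name `prob` is my own); it checks that p(A ∩ B) = p(A) × p(B) for the two independent events.

```python
from itertools import product
from fractions import Fraction

# Sample space of (die, coin) outcomes, all 12 equally likely.
outcomes = list(product(range(1, 7), ["head", "tail"]))
total = len(outcomes)

def prob(event) -> Fraction:
    """Exact probability of an event, given as a predicate over outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), total)

p_a = prob(lambda o: o[0] == 6)                       # A: die shows 6
p_b = prob(lambda o: o[1] == "tail")                  # B: coin shows tail
p_ab = prob(lambda o: o[0] == 6 and o[1] == "tail")   # A and B together
# Independence: p(A ∩ B) == p(A) * p(B), equivalently p(A | B) == p(A).
```

Using `Fraction` keeps the comparison exact, avoiding floating-point round-off.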
  • 34. Bayesian rule
From the two forms of the conditional-probability definition:
p(A ∩ B) = p(A|B) × p(B)
p(A ∩ B) = p(B|A) × p(A)
Equating the two yields the Bayesian rule (developed by the statistician Thomas Bayes, 1701-1762):
p(A|B) = p(B|A) × p(A) / p(B)
• If the occurrence of event A depends on only two mutually exclusive events (B and NOT B), we obtain the marginal probability:
p(A) = p(A|B) × p(B) + p(A|¬B) × p(¬B)
where ¬ is the logical NOT. Similarly,
p(B) = p(B|A) × p(A) + p(B|¬A) × p(¬A)
• Substituting this expansion into the Bayesian rule yields:
p(B|A) = p(A|B) × p(B) / [p(A|B) × p(B) + p(A|¬B) × p(¬B)]
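The final formula above, with the marginal expanded in the denominator, translates directly into code. A minimal sketch (the function name `bayes` is my own):

```python
def bayes(p_a_given_b: float, p_b: float, p_a_given_not_b: float) -> float:
    """p(B | A) via Bayes' rule, expanding p(A) with the total-probability rule."""
    p_not_b = 1.0 - p_b
    p_a = p_a_given_b * p_b + p_a_given_not_b * p_not_b  # marginal p(A)
    return p_a_given_b * p_b / p_a

# Sanity check: if A carries no information about B (same likelihood either way),
# the posterior equals the prior.
posterior = bayes(p_a_given_b=0.5, p_b=0.3, p_a_given_not_b=0.5)
```

A second sanity check: if A can only occur when B is true, observing A makes B certain.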
  • 36. Implementation Bayesian Reasoning (Part 01)
Kecerdasan Buatan | Artificial Intelligence (Version: January 2022)
Bayesian rule (recap)
• If the occurrence of event A depends on only two mutually exclusive events (B and NOT B), we obtain:
p(A) = p(A|B) × p(B) + p(A|¬B) × p(¬B)
where ¬ is the logical NOT. Similarly,
p(B) = p(B|A) × p(A) + p(B|¬A) × p(¬A)
• Substituting this equation into the Bayesian rule yields:
p(B|A) = p(A|B) × p(B) / [p(A|B) × p(B) + p(A|¬B) × p(¬B)]
  • 37. Dilemma at the movies
• This person dropped their ticket in the hallway.
• Do you call out "Excuse me, ma'am!" or "Excuse me, sir!"?
• You have to make a guess.
• What if they're standing in line for the men's restroom?
• Bayesian reasoning (a.k.a. Bayesian inference) is a way to capture common sense.
• It helps you use what you know to make better guesses.
Courtesy of Brandon Rohrer, 2019
  • 38. Put numbers to our dilemma
Out of 100 men at the movies: 4 have long hair, 96 have short hair.
Out of 100 women at the movies: 50 have long hair, 50 have short hair.
• About 12 times more women than men have long hair.
Courtesy of Brandon Rohrer, 2019
  • 39. Put numbers to our dilemma
• But there are 98 men and 2 women in line for the men's restroom.
Out of 98 men in line: 4 have long hair, 94 have short hair.
Out of 2 women in line: 1 has long hair, 1 has short hair.
• In the line, 4 times more men than women have long hair.
Courtesy of Brandon Rohrer, 2019
  • 40. Out of 100 people at the movies: 50 are men (2 men have long hair, 48 have short hair) and 50 are women (25 women have long hair, 25 have short hair).
Out of 100 people in line for the men's restroom: 98 are men (4 have long hair, 94 have short hair) and 2 are women (one has long hair, one has short hair).
Courtesy of Brandon Rohrer, 2019
  • 41. Translate to math
P(something) = # something / # everything
At the movies (out of 100 people):
P(woman) = probability that a person is a woman = # women / # people = 50 / 100 = .5
P(man) = probability that a person is a man = # men / # people = 50 / 100 = .5
In line for the men's restroom (out of 100 people):
P(woman) = # women / # people = 2 / 100 = .02
P(man) = # men / # people = 98 / 100 = .98
Courtesy of Brandon Rohrer, 2019
  • 43. Implementation Bayesian Reasoning (Part 02)
Kecerdasan Buatan | Artificial Intelligence (Version: January 2022)
Conditional probabilities
• If I know that a person is a woman, what is the probability that person has long hair?
• P(long hair | woman) = # women with long hair / # women = 25 / 50 = .5
Courtesy of Brandon Rohrer, 2019
  • 44. Conditional probabilities
• If I know that a person is a man, what is the probability that person has long hair?
• P(long hair | man) = # men with long hair / # men = 2 / 50 = .04
• P(A | B) is the probability of A, given B: "If I know B is the case, what is the probability that A is also the case?"
• P(A | B) is not the same as P(B | A). For example, P(cute | puppy) is not the same as P(puppy | cute):
• If I know the thing I'm holding is a puppy, what is the probability that it is cute?
• If I know the thing I'm holding is cute, what is the probability that it is a puppy?
Courtesy of Brandon Rohrer, 2019
  • 45. Joint probabilities
What is the probability that a person is both a woman and has short hair?
P(woman with short hair) = P(woman) * P(short hair | woman) = .5 * .5 = .25
P(woman with long hair) = P(woman) * P(long hair | woman) = .5 * .5 = .25
Courtesy of Brandon Rohrer, 2019
  • 46. Joint probabilities
P(man with short hair) = P(man) * P(short hair | man) = .5 * .96 = .48
P(man with long hair) = P(man) * P(long hair | man) = .5 * .04 = .02
Courtesy of Brandon Rohrer, 2019
  • 47. Joint probabilities
If P(man) = .98 and P(woman) = .02 (in line for the men's restroom), then the answers change:
P(man with long hair) = P(man) * P(long hair | man) = .98 * .04 ≈ .04
P(man with short hair) = .98 * .96 ≈ .94
P(woman with long hair) = P(woman) * P(long hair | woman) = .02 * .5 = .01
P(woman with short hair) = .02 * .5 = .01
Courtesy of Brandon Rohrer, 2019
  • 48. Joint probabilities
• P(A and B) is the probability that both A and B are the case.
• Also written P(A, B) or P(A ∩ B).
• P(A and B) is the same as P(B and A): the probability that I am having a jelly donut with my milk is the same as the probability that I am having milk with my jelly donut, so P(donut and milk) = P(milk and donut).
Marginal probabilities
In line for the men's restroom:
P(long hair) = P(woman with long hair) + P(man with long hair) = .01 + .04 = .05
Courtesy of Brandon Rohrer, 2019
  • 49. Marginal probabilities
P(short hair) = P(woman with short hair) + P(man with short hair) = .01 + .94 = .95
What we really care about
• We know the person has long hair. Are they a man or a woman? That is, P(man | long hair).
• We don't know this answer yet, but we have already learned about joint probabilities and marginal probabilities.
Courtesy of Brandon Rohrer, 2019
  • 50. Thomas Bayes noticed something cool
• P(man and long hair) = P(long hair) * P(man | long hair)
• P(long hair and man) = P(man) * P(long hair | man)
• Because P(man and long hair) = P(long hair and man),
• P(long hair) * P(man | long hair) = P(man) * P(long hair | man)
Courtesy of Brandon Rohrer, 2019
  • 51. Thomas Bayes noticed something cool
• P(long hair) * P(man | long hair) = P(man) * P(long hair | man)
• Dividing both sides by P(long hair):
• P(man | long hair) = P(man) * P(long hair | man) / P(long hair)
• In general: P(A | B) = P(B | A) * P(A) / P(B)
Courtesy of Brandon Rohrer, 2019
  • 52. Back to the movie theater, this time with Bayes
With P(man) = .5, P(woman) = .5, P(long hair | man) = .04, P(long hair | woman) = .5, P(woman with long hair) = .25, P(man with long hair) = .02:
P(man | long hair) = P(man) * P(long hair | man) / P(long hair)
= P(man) * P(long hair | man) / [P(woman with long hair) + P(man with long hair)]
Courtesy of Brandon Rohrer, 2019
  • 53. Back to the movie theater, this time with Bayes
P(man | long hair) = (.5 * .04) / (.25 + .02) = .02 / .27 ≈ .07
• Now, knowing that they are in line for the men's restroom changes the probability P(man | long hair).
Courtesy of Brandon Rohrer, 2019
  • 54. Back to the movie theater, this time with Bayes
With P(man) = .98, P(woman) = .02, P(long hair | man) = .04, P(long hair | woman) = .5, P(woman with long hair) = .01, P(man with long hair) = .04:
P(man | long hair) = P(man) * P(long hair | man) / [P(woman with long hair) + P(man with long hair)]
= (.98 * .04) / (.01 + .04) ≈ .04 / .05 ≈ .80
Courtesy of Brandon Rohrer, 2019
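Both movie-theater computations can be reproduced with one small function. This is a sketch using the slides' numbers; the function name `p_man_given_long_hair` is my own.

```python
def p_man_given_long_hair(p_man: float, p_lh_given_man: float,
                          p_lh_given_woman: float) -> float:
    """Posterior that a long-haired person is a man (two mutually exclusive classes)."""
    p_woman = 1.0 - p_man
    # Marginal: P(long hair) = P(man with long hair) + P(woman with long hair)
    p_long_hair = p_lh_given_man * p_man + p_lh_given_woman * p_woman
    return p_lh_given_man * p_man / p_long_hair

at_the_movies = p_man_given_long_hair(0.5, 0.04, 0.5)   # ≈ 0.07
in_mens_line = p_man_given_long_hair(0.98, 0.04, 0.5)   # ≈ 0.80 (the slides round intermediate values)
```

Only the prior P(man) changes between the two scenarios; the likelihoods stay the same, yet the posterior flips from "almost certainly a woman" to "almost certainly a man".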
  • 55. Dilemma at the movies
This person dropped their ticket in front of the men's restroom, so you can confidently call out: "Excuse me, sir!" Given their long hair, the probability that this person is a man is about 0.80.
Courtesy of Brandon Rohrer, 2019
  • 56. Implementation Bayesian Reasoning (Part 03)
Kecerdasan Buatan | Artificial Intelligence (Version: January 2022)
Bayesian reasoning
The Bayesian rule expressed in terms of hypothesis (H) and evidence (E) looks like this:
p(H|E) = p(E|H) × p(H) / [p(E|H) × p(H) + p(E|¬H) × p(¬H)]
where:
p(H|E) is the posterior probability that hypothesis H is true given evidence E;
p(H) is the prior probability of hypothesis H being true;
p(E|H) is the likelihood: the probability that hypothesis H being true will result in evidence E;
p(¬H) is the prior probability of hypothesis H being false;
p(E|¬H) is the probability of finding evidence E even when hypothesis H is false;
the denominator is the marginal probability of the evidence.
Thomas Bayes (1701-1762)
  • 57. Computer Aided Diagnosis for Covid-19 detection
Abraham, Bejoy, and Madhu S. Nair. "Computer-aided detection of COVID-19 from X-ray images using multi-CNN and Bayesnet classifier." Biocybernetics and Biomedical Engineering 40, no. 4 (2020): 1436-1445.
• Recently you decide to have a CT-scan test for Covid-19. If the test is positive, what is the probability you are infected?
• Suppose you are told the test has a sensitivity of 80%: if you are infected by Covid-19, the test will be positive with probability 0.8. In other words,
p(x = 1 | y = 1) = 0.8
where x = 1 is the event that the test is positive, and y = 1 is the event that you are infected by Covid-19. Note: x is the evidence (E) and y is the hypothesis (H).
Source: https://spectrum.ieee.org/hospitals-deploy-ai-tools-detect-covid19-chest-scans
  • 58. Computer Aided Diagnosis for Covid-19 detection
• Many people conclude they are therefore 80% likely to be infected by Covid-19. But this is false! It ignores the prior probability of having Covid-19, which fortunately is quite low in June 2022:
p(y = 1) = 0.004 (the prior probability that someone is infected by Covid-19)
• We also need to take into account the fact that the test may produce a false positive, or false alarm. Unfortunately, such false positives are quite likely with current screening technology:
p(x = 1 | y = 0) = 0.1 (the probability that the test result is positive although you are not infected)
• Combining these three terms using the Bayes rule, we can compute the correct answer as follows:
p(y = 1 | x = 1) = p(x = 1 | y = 1) × p(y = 1) / [p(x = 1 | y = 1) × p(y = 1) + p(x = 1 | y = 0) × p(y = 0)]
= (0.8 × 0.004) / (0.8 × 0.004 + 0.1 × 0.996) ≈ 0.031
where p(y = 0) = 1 − p(y = 1) = 0.996. In other words, if the test result is positive, you only have about a 3% chance of actually being infected by Covid-19!
Notes (x is the evidence E, y is the hypothesis H):
p(x = 1 | y = 1): the probability that an infected person tests positive
p(y = 1): the probability that someone is infected by Covid-19
p(y = 0): the probability that someone is NOT infected by Covid-19
p(x = 1 | y = 0): false alarm; the person is not infected, but the test says they are positive
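The surprising ~3% result can be reproduced with a few lines. This sketch uses sensitivity 0.8 and a prior of 0.004 (implied by the slides' p(y = 0) = 0.996); the false-positive rate of 0.1 is the value consistent with the stated ≈3% answer, and the function name is my own.

```python
def posterior_infected(sensitivity: float, prior: float, false_pos: float) -> float:
    """p(y = 1 | x = 1): probability of infection given a positive test result."""
    # Marginal p(x = 1): positives among the infected plus false alarms among the healthy.
    evidence = sensitivity * prior + false_pos * (1.0 - prior)
    return sensitivity * prior / evidence

p = posterior_infected(sensitivity=0.8, prior=0.004, false_pos=0.1)  # ≈ 0.031
```

The low prior dominates: even a reasonably sensitive test cannot overcome how rare the disease is, so most positives are false alarms.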
  • 59. Computer Aided Diagnosis for Covid-19 detection
• The Computer Aided Diagnosis (CAD) system can then determine whether you are infected by Covid-19 using a threshold (e.g. a corona score) on the probability value.
• For example: IF p(y = 1 | x = 1) > 0.80 THEN the CAD will notify the medical doctor that you are infected by Covid-19.
Source: https://spectrum.ieee.org/hospitals-deploy-ai-tools-detect-covid19-chest-scans
  • 60. Implementation Bayesian Reasoning (Part 04)
Kecerdasan Buatan | Artificial Intelligence (Version: January 2022)
Bayesian reasoning with multiple hypotheses and evidences
• We can take into account both multiple hypotheses H1, H2, ..., Hm and multiple evidences E1, E2, ..., En. The hypotheses as well as the evidences must be mutually exclusive and exhaustive.
• A single evidence E and multiple hypotheses follow:
p(Hi | E) = p(E | Hi) × p(Hi) / Σ(k=1..m) p(E | Hk) × p(Hk)
• Multiple evidences and multiple hypotheses follow:
p(Hi | E1 E2 ... En) = p(E1 E2 ... En | Hi) × p(Hi) / Σ(k=1..m) p(E1 E2 ... En | Hk) × p(Hk)
(posterior = likelihood × prior / marginal)
  • 61. Bayesian reasoning with multiple hypotheses and evidences
• However, this method requires obtaining the conditional probabilities of all possible combinations of evidences for all hypotheses, which places an enormous burden on the expert.
• Therefore, in Bayesian reasoning with multiple hypotheses and evidences, conditional independence among different evidences is assumed.
• Thus, instead of the unworkable equation, we obtain:
p(Hi | E1 E2 ... En) = p(E1 | Hi) × p(E2 | Hi) × ... × p(En | Hi) × p(Hi) / Σ(k=1..m) p(E1 | Hk) × p(E2 | Hk) × ... × p(En | Hk) × p(Hk)
• Let us consider a simple example. Suppose a CAD (computer aided diagnosis) system is given three conditionally independent evidences E1, E2, E3, creates three mutually exclusive and exhaustive hypotheses H1, H2, H3, and provides prior probabilities for these hypotheses: p(H1), p(H2), and p(H3), respectively.
Diseases (hypotheses): H1 = dengue, H2 = flu, H3 = malaria
Symptoms (evidences): E1 = headache, E2 = cough, E3 = fever
• The CAD system also determines the conditional probabilities of observing each evidence for all possible hypotheses.
  • 62. The prior and conditional probabilities
H1 = dengue, H2 = flu, H3 = malaria; E1 = headache, E2 = cough, E3 = fever.

Probability    i = 1   i = 2   i = 3
p(Hi)          0.40    0.35    0.25
p(E1 | Hi)     0.3     0.8     0.5
p(E2 | Hi)     0.9     0.0     0.7
p(E3 | Hi)     0.6     0.7     0.9

p(Hi): the prior probability of dengue, flu, and malaria cases.
p(E1 | Hi): the probability of headache in patients with dengue, flu, and malaria.
p(E2 | Hi): the probability of cough in patients with dengue, flu, and malaria.
p(E3 | Hi): the probability of fever in patients with dengue, flu, and malaria.

Assume that we first observe evidence E3 (fever) in a patient. The CAD system computes the posterior probability of each hypothesis as:
p(Hi | E3) = p(E3 | Hi) × p(Hi) / Σ(k=1..3) p(E3 | Hk) × p(Hk), for i = 1, 2, 3
thus
p(H1 | E3) = (0.6 × 0.40) / (0.6 × 0.40 + 0.7 × 0.35 + 0.9 × 0.25) = 0.34
p(H2 | E3) = (0.7 × 0.35) / (0.6 × 0.40 + 0.7 × 0.35 + 0.9 × 0.25) = 0.34
p(H3 | E3) = (0.9 × 0.25) / (0.6 × 0.40 + 0.7 × 0.35 + 0.9 × 0.25) = 0.32
After evidence E3 is observed, belief in hypothesis H2 decreases and becomes equal to belief in hypothesis H1 (p(H2): from 0.35 to 0.34), while belief in hypothesis H3 increases and nearly reaches the beliefs in hypotheses H1 and H2 (p(H3): from 0.25 to 0.32).
  • 63. Suppose now that we also observe evidence E1 (headache) in the patient. The posterior probabilities are calculated as follows:
p(Hi | E1 E3) = p(E1 | Hi) × p(E3 | Hi) × p(Hi) / Σ(k=1..3) p(E1 | Hk) × p(E3 | Hk) × p(Hk), for i = 1, 2, 3
hence
p(H1 | E1 E3) = (0.3 × 0.6 × 0.40) / (0.3 × 0.6 × 0.40 + 0.8 × 0.7 × 0.35 + 0.5 × 0.9 × 0.25) = 0.19
p(H2 | E1 E3) = (0.8 × 0.7 × 0.35) / (0.3 × 0.6 × 0.40 + 0.8 × 0.7 × 0.35 + 0.5 × 0.9 × 0.25) = 0.52
p(H3 | E1 E3) = (0.5 × 0.9 × 0.25) / (0.3 × 0.6 × 0.40 + 0.8 × 0.7 × 0.35 + 0.5 × 0.9 × 0.25) = 0.29
Hypothesis H2 (flu) has now become the most likely one (0.52 compared with 0.19 and 0.29).
After observing evidence E2 (cough), the final posterior probabilities for all hypotheses are calculated:
p(Hi | E1 E2 E3) = p(E1 | Hi) × p(E2 | Hi) × p(E3 | Hi) × p(Hi) / Σ(k=1..3) p(E1 | Hk) × p(E2 | Hk) × p(E3 | Hk) × p(Hk), for i = 1, 2, 3
hence
p(H1 | E1 E2 E3) = (0.3 × 0.9 × 0.6 × 0.40) / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0.45
p(H2 | E1 E2 E3) = (0.8 × 0.0 × 0.7 × 0.35) / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0
p(H3 | E1 E2 E3) = (0.5 × 0.7 × 0.9 × 0.25) / (0.3 × 0.9 × 0.6 × 0.40 + 0.8 × 0.0 × 0.7 × 0.35 + 0.5 × 0.7 × 0.9 × 0.25) = 0.55
Although the initial ranking was H1, H2, and H3, only hypotheses H1 and H3 remain under consideration after all evidences (E1, E2, and E3) were observed. In the end, the doctor should decide whether the patient has dengue or malaria (malaria is more likely, since its probability is higher than dengue's: H3 > H1).
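The whole dengue/flu/malaria calculation above can be replayed in a few lines under the same naive (conditional independence) assumption; the data structure and function name below are my own choices, the numbers are the lecture's.

```python
# Priors and likelihoods from the lecture's dengue/flu/malaria example.
priors = {"dengue": 0.40, "flu": 0.35, "malaria": 0.25}
likelihood = {                       # p(E | H) for E1=headache, E2=cough, E3=fever
    "dengue":  {"headache": 0.3, "cough": 0.9, "fever": 0.6},
    "flu":     {"headache": 0.8, "cough": 0.0, "fever": 0.7},
    "malaria": {"headache": 0.5, "cough": 0.7, "fever": 0.9},
}

def posteriors(evidences):
    """p(H | E1 ... En) for each hypothesis, assuming conditionally independent evidences."""
    scores = {}
    for h, prior in priors.items():
        score = prior
        for e in evidences:          # multiply in each observed evidence's likelihood
            score *= likelihood[h][e]
        scores[h] = score
    total = sum(scores.values())     # normalizing constant (marginal probability)
    return {h: s / total for h, s in scores.items()}

after_all = posteriors(["fever", "headache", "cough"])   # flu drops to exactly 0
```

Note how a single zero likelihood (flu patients never cough in this table) eliminates a hypothesis outright, which is why experts should be careful assigning hard zeros.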
  • 64. What about sentiment analysis using social media?
You can do a similar computation, given that:
H1 = neutral, H2 = positive sentiment, H3 = negative sentiment
E1 = occurrence of the word "illegal", E2 = occurrence of the word "reform", E3 = occurrence of the word "crime", Ek = occurrence of the word .......
The engineer should provide the likelihoods and prior probabilities.