Text Book slides modified by Prof M.Shashi as per the
AU syllabus
Data Generation Process
• The process that generated the data is not completely known and is therefore modelled as a random process.
• The outcome of a random process is modelled as a random variable.
• Given the available information or features, the value of a random variable cannot be predicted with certainty; it is non-deterministic.
• Probability theory deals with the study and analysis of such random processes.
Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e © The MIT Press (V1.0)
Probability and Inference
• Result of tossing a coin ∈ {Heads, Tails}
• Random variable $X \in \{1, 0\}$ (1 = Heads, 0 = Tails)
• $p_o$ denotes the probability of heads, $P(X = 1)$
• This implies $P(X = 0) = 1 - p_o$
• X is Bernoulli distributed, and its probability is expressed as
$$P(X) = p_o^{X} (1 - p_o)^{1 - X}$$
• Data sample: $\mathcal{X} = \{x^t\}_{t=1}^{N}$
  Estimation: $\hat{p}_o = \#\{\text{Heads}\} / \#\{\text{Tosses}\} = \sum_t x^t / N$
• Prediction of next toss: infer Heads if $p_o > 1/2$, Tails otherwise
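A minimal Python sketch of the estimate and the prediction rule above. The function names and the sample tosses are illustrative assumptions, not part of the original slides.

```python
def estimate_p_heads(tosses):
    """Estimate p_o as the fraction of heads (1 = heads, 0 = tails)."""
    if not tosses:
        raise ValueError("need at least one toss")
    return sum(tosses) / len(tosses)


def predict_next(p_o):
    """Infer Heads if the estimated p_o exceeds 1/2, Tails otherwise."""
    return "Heads" if p_o > 0.5 else "Tails"


# Hypothetical sample: 4 heads in 6 tosses.
sample = [1, 0, 1, 1, 0, 1]
p_hat = estimate_p_heads(sample)
prediction = predict_next(p_hat)
```

With this made-up sample the estimate is 4/6, so the rule predicts Heads for the next toss.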
Classification
• The values of the observable input variables are the basis for prediction.
• Credit scoring: inputs are income and savings; output is low-risk vs. high-risk.
• Input: $\mathbf{x} = [x_1, x_2]^T$, Output: $C \in \{0, 1\}$
• Prediction:
$$\text{choose } \begin{cases} C = 1 & \text{if } P(C = 1 \mid x_1, x_2) > 0.5 \\ C = 0 & \text{otherwise} \end{cases}$$
or
$$\text{choose } \begin{cases} C = 1 & \text{if } P(C = 1 \mid x_1, x_2) > P(C = 0 \mid x_1, x_2) \\ C = 0 & \text{otherwise} \end{cases}$$
• Error $= 1 - \max\{P(C = 1 \mid x_1, x_2),\ P(C = 0 \mid x_1, x_2)\}$
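The two-class prediction rule and its error probability can be sketched in Python as follows. This assumes the posterior P(C=1|x1,x2) is already available; the function name `decide` is a hypothetical helper, not something defined in the slides.

```python
def decide(p_c1_given_x):
    """Return (chosen class, probability of error) for a two-class problem.

    p_c1_given_x is the posterior P(C=1 | x1, x2); P(C=0 | x1, x2) is its
    complement because the two posteriors sum to 1.
    """
    p_c0_given_x = 1.0 - p_c1_given_x
    choice = 1 if p_c1_given_x > p_c0_given_x else 0
    error = 1.0 - max(p_c1_given_x, p_c0_given_x)
    return choice, error
```

For example, a posterior of 0.8 for C=1 leads to choosing C=1 with an error probability of about 0.2.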
Bayes’ Rule
$$P(C \mid \mathbf{x}) = \frac{P(C)\, p(\mathbf{x} \mid C)}{p(\mathbf{x})}$$
• $P(C \mid \mathbf{x})$: posterior
• $p(\mathbf{x} \mid C)$: likelihood of $\mathbf{x}$ in C
• $P(C)$: prior
• $p(\mathbf{x})$: probability of the evidence $\mathbf{x}$, irrespective of C
For binary classification:
$$P(C = 0) + P(C = 1) = 1$$
$$p(\mathbf{x}) = p(\mathbf{x} \mid C = 1)\, P(C = 1) + p(\mathbf{x} \mid C = 0)\, P(C = 0)$$
$$P(C = 0 \mid \mathbf{x}) + P(C = 1 \mid \mathbf{x}) = 1$$
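A small Python sketch of Bayes' rule for the binary case. The prior and likelihood values in the example are made up for illustration, and `posterior_c1` is a hypothetical helper name.

```python
def posterior_c1(prior_c1, lik_c1, lik_c0):
    """Posterior P(C=1 | x) from the prior P(C=1) and the two likelihoods.

    lik_c1 = p(x | C=1), lik_c0 = p(x | C=0).
    """
    prior_c0 = 1.0 - prior_c1                      # P(C=0) + P(C=1) = 1
    evidence = lik_c1 * prior_c1 + lik_c0 * prior_c0  # p(x)
    if evidence == 0.0:
        raise ValueError("evidence p(x) is zero; posterior is undefined")
    return lik_c1 * prior_c1 / evidence


# Hypothetical numbers: P(C=1) = 0.2, p(x|C=1) = 0.6, p(x|C=0) = 0.1.
p1 = posterior_c1(0.2, 0.6, 0.1)
p0 = 1.0 - p1
```

Here the evidence is 0.6 × 0.2 + 0.1 × 0.8 = 0.2, so the posterior for C=1 is 0.12 / 0.2 = 0.6 even though the prior was only 0.2.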
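Bayes’ Rule: K>2 Classes
$$P(C_i \mid \mathbf{x}) = \frac{p(\mathbf{x} \mid C_i)\, P(C_i)}{p(\mathbf{x})} = \frac{p(\mathbf{x} \mid C_i)\, P(C_i)}{\sum_{k=1}^{K} p(\mathbf{x} \mid C_k)\, P(C_k)}$$
$$P(C_i) \ge 0 \quad \text{and} \quad \sum_{i=1}^{K} P(C_i) = 1$$
$$\text{choose } C_i \text{ if } P(C_i \mid \mathbf{x}) = \max_k P(C_k \mid \mathbf{x})$$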
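The K-class version can be sketched the same way. This is an illustrative Python sketch: the three priors and likelihoods are made-up values, and classes are indexed from 0 rather than 1.

```python
def posteriors(priors, likelihoods):
    """Posteriors P(C_i | x) from priors P(C_i) and likelihoods p(x | C_i)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # p(x) = sum_k p(x | C_k) P(C_k)
    if evidence == 0.0:
        raise ValueError("evidence p(x) is zero; posteriors are undefined")
    return [j / evidence for j in joint]


def choose_class(post):
    """Index of the class with the maximum posterior."""
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical three-class example.
post = posteriors([0.5, 0.3, 0.2], [0.1, 0.4, 0.5])
chosen = choose_class(post)
```

In this example the class with the largest prior (index 0) is not chosen, because its likelihood for the observed x is low.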
Decision Making considering
Losses and Risks
• The loss incurred by false positives and false negatives may not be equal in domains like finance, health and disaster management.
• Action to assign an input to $C_i$: $\alpha_i$
• Loss of $\alpha_i$ when the true state is $C_k$: $\lambda_{ik}$
• Expected risk in taking the action $\alpha_i$:
$$R(\alpha_i \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid \mathbf{x})$$
$$\text{choose } \alpha_i \text{ if } R(\alpha_i \mid \mathbf{x}) = \min_k R(\alpha_k \mid \mathbf{x})$$
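A short Python sketch of minimum-risk decision making. The loss matrix below is a made-up example in which one kind of error is ten times costlier than the other; `loss[i][k]` plays the role of λ_ik, with 0-based indices.

```python
def expected_risks(loss, post):
    """R(alpha_i | x) = sum_k loss[i][k] * P(C_k | x) for every action i."""
    return [sum(l_ik * p_k for l_ik, p_k in zip(row, post)) for row in loss]


def min_risk_action(loss, post):
    """Index of the action with the smallest expected risk."""
    risks = expected_risks(loss, post)
    return min(range(len(risks)), key=risks.__getitem__)


# Hypothetical losses: choosing class 0 when the truth is class 1 costs 10,
# choosing class 1 when the truth is class 0 costs 1.
loss = [[0, 10],
        [1, 0]]
post = [0.8, 0.2]  # P(C_0 | x), P(C_1 | x)
risks = expected_risks(loss, post)
action = min_risk_action(loss, post)
```

The risks come out as about 2.0 and 0.8, so the minimum-risk action is to choose class 1 even though class 0 is more probable. This is the effect of unequal losses.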
Losses and Risks: 0/1 Loss Case
All errors are equally costly:
$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ 1 & \text{if } i \ne k \end{cases}$$
$$R(\alpha_i \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda_{ik}\, P(C_k \mid \mathbf{x}) = \sum_{k \ne i} P(C_k \mid \mathbf{x}) = 1 - P(C_i \mid \mathbf{x})$$
For minimum risk, choose the most probable class.
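Losses and Risks: Reject as $C_{K+1}$
(if misclassification is costlier than manual work, e.g., sorting mail with an optical digit recognizer)
$$\lambda_{ik} = \begin{cases} 0 & \text{if } i = k \\ \lambda & \text{if } i = K + 1,\ 0 < \lambda < 1 \\ 1 & \text{otherwise} \end{cases}$$
$$R(\alpha_{K+1} \mid \mathbf{x}) = \sum_{k=1}^{K} \lambda\, P(C_k \mid \mathbf{x}) = \lambda$$
$$R(\alpha_i \mid \mathbf{x}) = \sum_{k \ne i} P(C_k \mid \mathbf{x}) = 1 - P(C_i \mid \mathbf{x})$$
$$\text{choose } C_i \text{ if } P(C_i \mid \mathbf{x}) > P(C_k \mid \mathbf{x})\ \forall k \ne i \text{ and } P(C_i \mid \mathbf{x}) > 1 - \lambda; \text{ reject otherwise}$$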
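The reject rule can be sketched in Python as below. Returning `None` to signal a reject, and the helper name `classify_with_reject`, are assumptions made for this illustration; class indices are 0-based.

```python
def classify_with_reject(post, lam):
    """Choose the most probable class, or reject (return None).

    post holds the posteriors P(C_k | x); lam is the reject loss, 0 < lam < 1.
    A class is chosen only if its posterior is strictly larger than all
    others and also larger than 1 - lam.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must satisfy 0 < lam < 1")
    best = max(range(len(post)), key=post.__getitem__)
    strictly_best = all(post[best] > p for k, p in enumerate(post) if k != best)
    if strictly_best and post[best] > 1.0 - lam:
        return best
    return None
```

With λ = 0.3 the threshold is 0.7, so posteriors of [0.55, 0.45] are rejected while [0.9, 0.1] are assigned to class 0. A smaller λ makes rejection cheaper and therefore more frequent.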
Classification using Discriminant
Functions
$$\text{choose } C_i \text{ if } g_i(\mathbf{x}) = \max_k g_k(\mathbf{x})$$
Discriminant functions can be defined as
$$g_i(\mathbf{x}) = \begin{cases} -R(\alpha_i \mid \mathbf{x}) \\ P(C_i \mid \mathbf{x}) \\ p(\mathbf{x} \mid C_i)\, P(C_i) \end{cases}$$
The discriminants $g_i(\mathbf{x})$ divide the feature space into K decision regions $\mathcal{R}_1, \dots, \mathcal{R}_K$:
$$\mathcal{R}_i = \{\mathbf{x} \mid g_i(\mathbf{x}) = \max_k g_k(\mathbf{x})\}$$
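K=2 Classes
• Dichotomizer (K=2) vs. polychotomizer (K>2)
• A single discriminant function is often used for 2-class classification:
$$g(\mathbf{x}) = g_1(\mathbf{x}) - g_2(\mathbf{x})$$
$$\text{choose } \begin{cases} C_1 & \text{if } g(\mathbf{x}) > 0 \\ C_2 & \text{otherwise} \end{cases}$$
• Log odds:
$$\log \frac{P(C_1 \mid \mathbf{x})}{P(C_2 \mid \mathbf{x})}$$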
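As an illustration, the log odds can serve as the single discriminant g(x) for two classes. This Python sketch assumes the posterior P(C1|x) is strictly between 0 and 1; the function names are hypothetical.

```python
import math


def log_odds(p_c1_given_x):
    """log( P(C1 | x) / P(C2 | x) ), with P(C2 | x) = 1 - P(C1 | x)."""
    if not 0.0 < p_c1_given_x < 1.0:
        raise ValueError("posterior must be strictly between 0 and 1")
    return math.log(p_c1_given_x / (1.0 - p_c1_given_x))


def choose(p_c1_given_x):
    """Choose C1 if g(x) = log odds > 0, C2 otherwise."""
    return "C1" if log_odds(p_c1_given_x) > 0 else "C2"
```

The log odds are positive exactly when P(C1|x) exceeds P(C2|x), so this rule agrees with choosing the more probable class.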
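Utility Theory for making Rational
Decisions under uncertainty
• Probability of state k given evidence $\mathbf{x}$: $P(S_k \mid \mathbf{x})$
• Utility of action $\alpha_i$ when the state is k: $U_{ik}$
• Expected utility:
$$EU(\alpha_i \mid \mathbf{x}) = \sum_k U_{ik}\, P(S_k \mid \mathbf{x})$$
$$\text{choose } \alpha_i \text{ if } EU(\alpha_i \mid \mathbf{x}) = \max_j EU(\alpha_j \mid \mathbf{x})$$
• Maximizing expected utility is equivalent to minimizing expected risk when utility is taken as the negative of loss.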
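A Python sketch of choosing the action with maximum expected utility. The utility table is a made-up example, taken here as the negative of a hypothetical loss matrix so that the link to minimum risk is visible; `utility[i][k]` plays the role of U_ik with 0-based indices.

```python
def expected_utilities(utility, state_post):
    """EU(alpha_i | x) = sum_k utility[i][k] * P(S_k | x) for every action i."""
    return [sum(u * p for u, p in zip(row, state_post)) for row in utility]


def best_action(utility, state_post):
    """Index of the action with the largest expected utility."""
    eu = expected_utilities(utility, state_post)
    return max(range(len(eu)), key=eu.__getitem__)


# Hypothetical utilities: the negatives of the losses [[0, 10], [1, 0]].
utility = [[0, -10],
           [-1, 0]]
state_post = [0.8, 0.2]  # P(S_0 | x), P(S_1 | x)
eu = expected_utilities(utility, state_post)
action = best_action(utility, state_post)
```

The expected utilities are about -2.0 and -0.8, so action 1 is chosen. With utilities defined as negative losses, this is the same action that minimizes expected risk.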
Value of Information
• Observable features such as blood tests and MRI scans are costly and should not be requested unless they are needed for diagnosis.
• The value of information has to be assessed in such domains.
• Observed features: $\mathbf{x}$; newly added features: $\mathbf{z}$
• The expected utility of the best action before and after adding $\mathbf{z}$ is given by
$$EU(\mathbf{x}) = \max_j \sum_k U_{jk}\, P(S_k \mid \mathbf{x})$$
$$EU(\mathbf{x}, \mathbf{z}) = \max_j \sum_k U_{jk}\, P(S_k \mid \mathbf{x}, \mathbf{z})$$
• The value of the information provided by $\mathbf{z}$ is $EU(\mathbf{x}, \mathbf{z}) - EU(\mathbf{x})$; $\mathbf{z}$ is useful only if this is greater than 0.
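A minimal Python sketch of the value-of-information computation. The utilities and the two posterior vectors (before and after observing z) are made-up numbers, and the function names are hypothetical.

```python
def best_expected_utility(utility, state_post):
    """max_j sum_k utility[j][k] * P(S_k | evidence)."""
    return max(sum(u * p for u, p in zip(row, state_post)) for row in utility)


def value_of_information(utility, post_x, post_xz):
    """EU(x, z) - EU(x): gain in best expected utility from also observing z."""
    return (best_expected_utility(utility, post_xz)
            - best_expected_utility(utility, post_x))


# Hypothetical setting: utility 1 for the correct action, 0 otherwise.
utility = [[1, 0],
           [0, 1]]
post_x = [0.6, 0.4]    # P(S_k | x)
post_xz = [0.1, 0.9]   # P(S_k | x, z) after the extra feature is observed
voi = value_of_information(utility, post_x, post_xz)
```

Here the best expected utility rises from 0.6 to 0.9, so z has a value of about 0.3 and would be considered useful under the criterion above.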
Bayesian Belief Network
Bayesian Networks
• A directed acyclic graph (DAG) model that represents the interactions between random variables, which are denoted by nodes and connected by directed edges.
• The nodes in the DAG have conditional probabilities as parameters, which are learned from a set of known examples or set through domain knowledge.
• Bayesian networks represent conditional independence between certain nodes, which helps break down the problem of finding the joint distribution of many variables into local structures:
$$P(X_1, \dots, X_d) = \prod_{i=1}^{d} P(X_i \mid \text{parents}(X_i))$$
• Accordingly, for the Bayesian network shown in the diagram:
$$P(C, S, R, W, F) = P(C)\, P(S \mid C)\, P(R \mid C)\, P(W \mid S, R)\, P(F \mid R)$$
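The factorization above can be sketched in Python for binary variables C, S, R, W and F. All conditional probability values below are invented for illustration; the slides do not give any numbers.

```python
from itertools import product

# Hypothetical conditional probability tables: each entry is the
# probability that the variable is True given its parents' values.
P_C = 0.5
P_S_GIVEN_C = {True: 0.1, False: 0.5}
P_R_GIVEN_C = {True: 0.8, False: 0.1}
P_W_GIVEN_SR = {(True, True): 0.95, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.1}
P_F_GIVEN_R = {True: 0.7, False: 0.05}


def bern(p_true, value):
    """Probability of a binary value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(c, s, r, w, f):
    """P(C,S,R,W,F) = P(C) P(S|C) P(R|C) P(W|S,R) P(F|R)."""
    return (bern(P_C, c)
            * bern(P_S_GIVEN_C[c], s)
            * bern(P_R_GIVEN_C[c], r)
            * bern(P_W_GIVEN_SR[(s, r)], w)
            * bern(P_F_GIVEN_R[r], f))


# Sanity check: the joint sums to 1 over all 2**5 assignments.
total = sum(joint(*values) for values in product([True, False], repeat=5))
```

Only 1 + 2 + 2 + 4 + 2 = 11 parameters are needed here, compared with 31 for an unrestricted joint distribution over five binary variables.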
Bayesian Networks contd…
• In Bayesian networks the input and output variables are not explicitly designated. Based on the available evidence, belief propagates through the network to infer the probabilities of the other variables.
• Hidden variables may also be represented by some of the nodes; their conditional probabilities are estimated from the values of their parents, which represent related observed variables.
• Bayesian networks can handle both numeric and categorical variables.
• The structure should be created by a human expert after identifying the causal relationships among the variables and the local hierarchies.
Influence Diagrams: Graphical models for
generalization of Bayesian Networks for Decision Making
• An influence diagram contains
  • chance nodes representing the random variables in the Bayesian network (BN),
  • decision nodes representing the choice of action/classification, and
  • a utility node for utility estimation.
[Figure: Bayesian Network (BN) for Classification]
Association Rules
• Association rule: X → Y
• People who buy X are also likely to buy Y.
• A rule implies association, not necessarily causation.
• To find such associations, the frequent itemsets must first be found in the transaction database.
• The number of transactions that cover an itemset is referred to as its support.
• An itemset is considered frequent if its support meets a minimum support threshold.
Apriori Property
• All subsets of a frequent itemset are frequent. Hence, if a set is found to be infrequent, none of its supersets can be frequent, and they are pruned.
• For (X,Y,Z), a 3-item set, to be frequent (have enough support), (X,Y), (X,Z), and (Y,Z) should be frequent.
• If (X,Y) is not frequent, none of its supersets can be frequent.
• Once we find the frequent k-item sets, we convert them to rules: X, Y → Z, ... and X → Y, Z, ...
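A compact Python sketch of level-wise frequent itemset mining that uses the Apriori property for pruning. It is an illustrative simplification, not an optimized implementation; the transactions are made up, and support is measured as a transaction count.

```python
from itertools import combinations


def apriori(transactions, min_support_count):
    """Return {frequent itemset: support count} using level-wise search."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        # Number of transactions that cover the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = {i for t in transactions for i in t}
    level = {frozenset([i]) for i in items
             if count(frozenset([i])) >= min_support_count}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = count(itemset)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in level
                             for sub in combinations(c, k - 1))}
        level = {c for c in candidates if count(c) >= min_support_count}
    return frequent


# Hypothetical transaction database.
baskets = [{"milk", "bread"}, {"milk", "bread", "butter"},
           {"bread", "butter"}, {"milk", "butter"}]
frequent = apriori(baskets, min_support_count=2)
```

With a minimum support count of 2, all three single items and all three pairs are frequent, while the triple is covered by only one transaction and is dropped.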
Association measures
• Support (X → Y):
$$P(X, Y) = \frac{\#\{\text{transactions covering } X \text{ and } Y\}}{\#\{\text{all transactions}\}}$$
• Confidence (X → Y):
$$P(Y \mid X) = \frac{P(X, Y)}{P(X)} = \frac{\#\{\text{transactions covering } X \text{ and } Y\}}{\#\{\text{transactions covering } X\}}$$
• Lift (X → Y):
$$\frac{P(X, Y)}{P(X)\, P(Y)} = \frac{P(Y \mid X)}{P(Y)}$$
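The three measures can be computed directly from a transaction list. The following Python sketch uses a made-up set of four transactions; the function names are illustrative, and the sketch assumes X and Y each occur in at least one transaction.

```python
def support(transactions, items):
    """Fraction of transactions that cover all the given items."""
    items = set(items)
    covered = sum(1 for t in transactions if items <= set(t))
    return covered / len(transactions)


def confidence(transactions, x, y):
    """P(Y | X) = P(X, Y) / P(X)."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)


def lift(transactions, x, y):
    """P(X, Y) / (P(X) P(Y)) = P(Y | X) / P(Y)."""
    return confidence(transactions, x, y) / support(transactions, y)


# Hypothetical transaction database.
baskets = [{"milk", "bread"}, {"milk", "bread", "butter"},
           {"bread", "butter"}, {"milk", "butter"}]
s = support(baskets, {"milk", "bread"})
c = confidence(baskets, {"milk"}, {"bread"})
l = lift(baskets, {"milk"}, {"bread"})
```

For the rule milk → bread in this toy data, support is 2/4, confidence is 2/3 and lift is 8/9. A lift below 1 means that buying milk makes bread slightly less likely than its base rate here.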
Conclusion
• This chapter discussed the formalism for optimal decision making under uncertainty.
• Probability theory is useful for modelling uncertainty; the utility of making a choice or decision is estimated accordingly.
• The next chapters focus on how to estimate these probabilities from a given dataset. They are categorised as:
  • Parametric approaches
  • Semiparametric and nonparametric approaches