Probability for Machine Learning
Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya
Probabilistic Machine Learning
• Not all machine learning models are probabilistic
• … but most of them have probabilistic interpretations
• Predictions need to have associated confidence
• Confidence = probability
• Arguments for the probabilistic approach:
• Complete framework for Machine Learning
• Makes assumptions explicit
• Recovers most non-probabilistic models as special cases
• Modular: Easily extensible
Basics
• Random experiment, outcome, events, sample space
• Probability measure
• Axioms of probability, basic laws of probability
• Discrete sample space, discrete probability measure
• Continuous sample space, continuous probability measure
• Conditional probability, multiplication rule, theorem of total probability, Bayes' theorem
• Independence: pairwise, mutual, and conditional independence
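The theorem of total probability and Bayes' theorem can be illustrated with a small numeric sketch. The diagnostic-test scenario and all three input probabilities below are hypothetical values chosen only for illustration; they are not from the lecture.

```python
# Hypothetical inputs (assumptions for illustration only).
p_disease = 0.01             # prior P(D)
p_pos_given_disease = 0.95   # P(+ | D)
p_pos_given_healthy = 0.05   # P(+ | not D)

# Theorem of total probability: P(+) = P(+|D) P(D) + P(+|not D) P(not D)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(D | +) = P(+ | D) P(D) / P(+)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(f"P(+)     = {p_pos:.4f}")
print(f"P(D | +) = {p_disease_given_pos:.4f}")
```

Even with an accurate test, the posterior stays well below the test's sensitivity because the prior is small.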
Random Variables
• Example:
• Experiment: Tossing of two coins
• Random variable: sum of two outcomes
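The two-coin example can be sketched by enumerating the sample space and mapping each outcome to a number. Encoding heads as 1 and tails as 0, and assuming fair coins, are assumptions made here so that the "sum of two outcomes" is well defined.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of two tosses; heads = 1, tails = 0 (an assumed encoding).
sample_space = list(product([0, 1], repeat=2))

def X(outcome):
    """Random variable: maps an outcome (a pair of tosses) to their sum."""
    return sum(outcome)

# With fair coins every outcome has probability 1/4, so P(X = x) is a count ratio.
counts = Counter(X(outcome) for outcome in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

print(pmf)
```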
Discrete Random Variables
• Probability mass function: $p_X(x) = P(X = x)$, with $p_X(x) \ge 0$ and $\sum_x p_X(x) = 1$
Example distributions: Discrete
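• Bernoulli: $p(x) = \theta^{x} (1-\theta)^{1-x}$, $x \in \{0, 1\}$
• Binomial: $p(k) = \binom{n}{k} \theta^{k} (1-\theta)^{n-k}$, $k = 0, 1, \dots, n$
• Poisson: $p(k) = e^{-\lambda} \frac{\lambda^{k}}{k!}$, $k = 0, 1, 2, \dots$
• Geometric: $p(k) = (1-\theta)^{k-1} \theta$, $k = 1, 2, \dots$
• Empirical distribution: Given a data set $D = \{x_1, \dots, x_N\}$, $p_{\text{emp}}(A) = \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i}(A)$, where $\delta_x(A)$ is the Dirac delta measure ($\delta_x(A) = 1$ if $x \in A$, and 0 otherwise)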
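As a minimal sketch, the discrete distributions listed above can be written as plain functions using their standard textbook forms. The parameter names (`theta`, `n`, `lam`) and the convention that the geometric variable counts trials up to and including the first success are assumptions, since other conventions exist.

```python
import math

def bernoulli_pmf(x, theta):
    """P(X = x) for x in {0, 1} with success probability theta."""
    return theta if x == 1 else 1 - theta

def binomial_pmf(k, n, theta):
    """P(k successes in n independent Bernoulli(theta) trials)."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def geometric_pmf(k, theta):
    """P(first success occurs on trial k), k = 1, 2, ... (assumed convention)."""
    return (1 - theta)**(k - 1) * theta

def empirical_pmf(data):
    """Puts mass 1/N on each observed data point."""
    n = len(data)
    return {value: data.count(value) / n for value in sorted(set(data))}

print(sum(binomial_pmf(k, 10, 0.3) for k in range(11)))
print(empirical_pmf([1, 1, 2, 4]))
```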
Continuous Random Variables
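• Probability density function: $f_X(x) \ge 0$, $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$, and $P(a \le X \le b) = \int_a^b f_X(x)\,dx$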
Example density functions
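• Uniform: $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and 0 otherwise
• Exponential: $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$
• Standard Normal: $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
• Gaussian: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
• Laplace: $f(x) = \frac{1}{2b} \exp\left(-\frac{|x-\mu|}{b}\right)$
• Gamma: $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}$, $x > 0$
• Beta: $f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1}$, $0 \le x \le 1$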
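A quick sanity check for any density is that it integrates to 1. The sketch below does this numerically for three of the densities listed above, using their standard forms; the parameter names, the midpoint-rule integrator, and the truncated integration ranges are all assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

def exponential_pdf(x, lam=1.0):
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def laplace_pdf(x, mu=0.0, b=1.0):
    return math.exp(-abs(x - mu) / b) / (2 * b)

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# The tails outside these ranges carry negligible probability mass.
print(integrate(gaussian_pdf, -10, 10))
print(integrate(exponential_pdf, 0, 50))
print(integrate(laplace_pdf, -50, 50))
```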
Random Variables
• Cumulative distribution function: $F_X(x) = P(X \le x)$
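For a discrete random variable the cumulative distribution function is a running sum of the pmf. The sketch below assumes a small illustrative pmf (the sum of two fair coin tosses with heads = 1, tails = 0); the function name `F` is a choice made here.

```python
from fractions import Fraction
from itertools import accumulate

# Illustrative pmf (an assumption): sum of two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# CDF at the support points, as a running sum of the pmf.
values = sorted(pmf)
cdf = dict(zip(values, accumulate(pmf[v] for v in values)))

def F(x):
    """P(X <= x) for any real x; a right-continuous step function."""
    return sum((p for v, p in pmf.items() if v <= x), Fraction(0))

print(cdf)
print(F(1.5))
```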
Moments
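• Mean: $\mu = E[X] = \sum_x x\,p(x)$ (discrete) or $\int x\,f(x)\,dx$ (continuous)
• Variance: $\sigma^2 = \mathrm{Var}(X) = E[(X-\mu)^2] = E[X^2] - \mu^2$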
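As a small worked example, the mean and variance of a fair six-sided die can be computed directly from its pmf. The die is an assumed example, and exact fractions are used so the two standard variance formulas can be compared without rounding.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())

# Variance from the definition E[(X - mu)^2] ...
var_definition = sum((x - mean)**2 * p for x, p in pmf.items())
# ... and from the shortcut E[X^2] - mu^2.
var_shortcut = sum(x**2 * p for x, p in pmf.items()) - mean**2

print(mean, var_definition, var_shortcut)
```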
Random Vectors and Joint Distributions
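• Discrete Random Vector
• Joint pmf: $p(x_1, \dots, x_n) = P(X_1 = x_1, \dots, X_n = x_n)$
• Continuous Random Vector
• Joint pdf: $P((X_1, \dots, X_n) \in A) = \int_A f(x_1, \dots, x_n)\,dx_1 \cdots dx_n$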
Example multi-variate distributions
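• Multi-variate Gaussian: $f(x) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu)\right)$, $x \in \mathbb{R}^d$
• Multinomial: $p(x_1, \dots, x_K) = \frac{n!}{x_1! \cdots x_K!} \prod_{k=1}^{K} \theta_k^{x_k}$, $\sum_k x_k = n$
• Dirichlet: $f(\theta_1, \dots, \theta_K) = \frac{\Gamma(\sum_k \alpha_k)}{\prod_k \Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1}$, $\theta_k \ge 0$, $\sum_k \theta_k = 1$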
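Of the multivariate distributions listed above, the multinomial is the easiest to check by hand. The sketch below uses its standard pmf and confirms that the probabilities over all count vectors sum to 1; the function name and the example parameters are assumptions.

```python
import math
from itertools import product

def multinomial_pmf(counts, theta):
    """P(X_1 = counts[0], ..., X_K = counts[K-1]) for n = sum(counts) trials."""
    n = sum(counts)
    coefficient = math.factorial(n)
    for c in counts:
        coefficient //= math.factorial(c)   # exact at every step
    return coefficient * math.prod(t**c for t, c in zip(theta, counts))

theta = (0.5, 0.3, 0.2)   # illustrative category probabilities
n = 3

# Sum the pmf over every count vector whose entries add up to n.
total = sum(
    multinomial_pmf(counts, theta)
    for counts in product(range(n + 1), repeat=len(theta))
    if sum(counts) == n
)

print(multinomial_pmf((1, 1, 1), theta), total)
```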
Random Vectors and Joint Distributions
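• Given a joint pmf $p(x, y)$ or a joint pdf $f(x, y)$,
• Marginal distributions: $p(x) = \sum_y p(x, y)$, $f(x) = \int f(x, y)\,dy$
• Expectation: $E[g(X, Y)] = \sum_x \sum_y g(x, y)\,p(x, y)$ or $\int\!\!\int g(x, y)\,f(x, y)\,dx\,dy$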
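Marginalisation and expectation under a joint distribution can be sketched on a small table. The joint pmf below is hypothetical, with numbers chosen only so the sums are easy to verify; `marginal` is a helper name introduced here.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) over x, y in {0, 1} (illustrative numbers).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 gives p(x), axis 1 gives p(y))."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[axis]] = out.get(outcome[axis], Fraction(0)) + p
    return out

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)

# E[g(X, Y)] with g(x, y) = x + y, computed directly from the joint pmf.
expected_sum = sum((x + y) * p for (x, y), p in joint.items())

print(p_x, p_y, expected_sum)
```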
Conditional Probability
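• Conditional pmf: $p(x \mid y) = \frac{p(x, y)}{p(y)}$, for $p(y) > 0$
• Conditional pdf: $f(x \mid y) = \frac{f(x, y)}{f(y)}$, for $f(y) > 0$
• Given $p(x \mid y)$ and $p(y)$,
• Multiplication Rule: $p(x, y) = p(x \mid y)\,p(y)$
• Bayes rule: $p(y \mid x) = \frac{p(x \mid y)\,p(y)}{\sum_{y'} p(x \mid y')\,p(y')}$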
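The conditional pmf, the multiplication rule, and Bayes rule can be checked against each other on a small table. The joint pmf below is hypothetical and the helper name `bayes` is introduced only for this sketch.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) over x, y in {0, 1} (illustrative numbers).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

# Conditional pmf: p(x | y) = p(x, y) / p(y)
p_x_given_y = {(x, y): joint[(x, y)] / p_y[y] for (x, y) in joint}

# Multiplication rule: p(x, y) = p(x | y) p(y) recovers the joint.
rebuilt = {(x, y): p_x_given_y[(x, y)] * p_y[y] for (x, y) in joint}

def bayes(y, x):
    """p(y | x) computed from p(x | y) and p(y) only."""
    evidence = sum(p_x_given_y[(x, other)] * p_y[other] for other in p_y)
    return p_x_given_y[(x, y)] * p_y[y] / evidence

print(p_x_given_y[(0, 1)], bayes(1, 0))
```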
Conditional Probability
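• Given $p(x \mid y)$ and $p(y)$,
• Conditional Expectation: $E[X \mid Y = y] = \sum_x x\,p(x \mid y)$
• Law of Total Expectation: $E[X] = E_Y\big[E[X \mid Y]\big] = \sum_y E[X \mid Y = y]\,p(y)$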
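The law of total expectation can be verified numerically: averaging the conditional expectations over the conditioning variable should reproduce the plain expectation. The joint pmf below is hypothetical and `conditional_expectation_x` is a name chosen here.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) over x, y in {0, 1} (illustrative numbers).
joint = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

def conditional_expectation_x(y):
    """E[X | Y = y] = sum over x of x * p(x, y) / p(y)."""
    return sum(x * p / p_y[y] for (x, b), p in joint.items() if b == y)

# E[X] computed directly from the joint ...
e_x_direct = sum(x * p for (x, _), p in joint.items())
# ... and via the law of total expectation.
e_x_total = sum(conditional_expectation_x(y) * p_y[y] for y in p_y)

print(e_x_direct, e_x_total)
```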
Independence and Conditional Independence
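• Independence: $X \perp Y \iff p(x, y) = p(x)\,p(y)$ for all $x, y$
• Conditional Independence: $X \perp Y \mid Z \iff p(x, y \mid z) = p(x \mid z)\,p(y \mid z)$ for all $x, y$ and all $z$ with $p(z) > 0$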
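Independence of two discrete variables can be tested by comparing the joint pmf with the product of its marginals. The sketch below covers plain (not conditional) independence only; both example tables and the helper names are assumptions.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Return (p_x, p_y) for a joint pmf given as {(x, y): probability}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p
    return p_x, p_y

def is_independent(joint):
    """True if p(x, y) == p(x) p(y) for every pair (x, y)."""
    p_x, p_y = marginals(joint)
    return all(joint.get((x, y), 0) == p_x[x] * p_y[y]
               for x, y in product(p_x, p_y))

# Two fair coins tossed separately: independent by construction.
two_fair_coins = {(x, y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# A hypothetical table in which knowing x changes the distribution of y.
dependent = {
    (0, 0): Fraction(1, 10), (0, 1): Fraction(2, 10),
    (1, 0): Fraction(3, 10), (1, 1): Fraction(4, 10),
}

print(is_independent(two_fair_coins), is_independent(dependent))
```

Exact fractions are used so the equality test is not affected by floating-point rounding.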
Covariance
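• Covariance: $\mathrm{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]\,E[Y]$
• Correlation coefficient: $\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}$, with $-1 \le \rho \le 1$
• Covariance matrix for a random vector X: $\Sigma = E[(X - E[X])(X - E[X])^T]$, so that $\Sigma_{ij} = \mathrm{Cov}(X_i, X_j)$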
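Covariance, the correlation coefficient, and the covariance matrix can be estimated from data with a few lines of plain Python. The sketch below uses the population form (dividing by N, treating the data as the whole distribution); the data values and function names are made up for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of (x - mean_x) * (y - mean_y)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

def covariance_matrix(columns):
    """Entry (i, j) is the covariance between column i and column j."""
    return [[covariance(a, b) for b in columns] for a in columns]

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]   # exactly linear in xs
zs = [5.0, 3.0, 4.0, 1.0, 2.0]    # tends to decrease as xs increases

print(correlation(xs, ys), correlation(xs, zs))
print(covariance_matrix([xs, ys, zs]))
```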
Central Limit Theorem
• N i.i.d. random variables $X_1, \dots, X_N$ with mean $\mu$, variance $\sigma^2$
• $Z_N = \frac{\sum_{i=1}^{N} X_i - N\mu}{\sigma \sqrt{N}}$
• As N increases the distribution of $Z_N$ approaches the standard normal distribution
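The central limit theorem can be illustrated by simulation: standardised sums of non-Gaussian variables should have roughly zero mean, unit variance, and about 95% of their mass within ±1.96. The choice of Uniform(0, 1) summands, N = 30, the number of trials, and the fixed seed are all assumptions of this sketch.

```python
import math
import random

random.seed(0)   # fixed seed so the run is repeatable

N = 30                      # number of summands per standardised sum
TRIALS = 20_000             # number of simulated sums
MU = 0.5                    # mean of Uniform(0, 1)
SIGMA = math.sqrt(1 / 12)   # standard deviation of Uniform(0, 1)

def standardized_sum():
    """(sum of N uniforms - N * mu) / (sigma * sqrt(N))."""
    total = sum(random.random() for _ in range(N))
    return (total - N * MU) / (SIGMA * math.sqrt(N))

samples = [standardized_sum() for _ in range(TRIALS)]

sample_mean = sum(samples) / TRIALS
sample_var = sum((z - sample_mean)**2 for z in samples) / TRIALS
within_196 = sum(abs(z) <= 1.96 for z in samples) / TRIALS

print(f"mean ~ {sample_mean:.3f}, variance ~ {sample_var:.3f}, "
      f"P(|Z| <= 1.96) ~ {within_196:.3f}")
```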
Notions from Information Theory
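• Entropy: $H(X) = -\sum_x p(x) \log p(x)$
• KL divergence: $KL(p \,\|\, q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$
• Mutual Information: $I(X; Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} = KL\big(p(x, y) \,\|\, p(x)\,p(y)\big)$
• Point-wise Mutual Information: $PMI(x, y) = \log \frac{p(x, y)}{p(x)\,p(y)}$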
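The information-theoretic quantities listed above can be sketched for discrete distributions stored as dictionaries. Base-2 logarithms (bits) are an assumed choice, as are the function names and the small example distributions.

```python
import math

def entropy(p):
    """H(X) in bits; terms with p(x) = 0 contribute nothing."""
    return -sum(px * math.log2(px) for px in p.values() if px > 0)

def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)

def pmi(joint, p_x, p_y, x, y):
    """Point-wise mutual information of one outcome pair (x, y)."""
    return math.log2(joint[(x, y)] / (p_x[x] * p_y[y]))

def mutual_information(joint):
    """I(X; Y): the expectation of PMI under the joint distribution."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return sum(p * pmi(joint, p_x, p_y, x, y)
               for (x, y), p in joint.items() if p > 0)

fair_coin = {"H": 0.5, "T": 0.5}
biased_coin = {"H": 0.9, "T": 0.1}
copied_bit = {(0, 0): 0.5, (1, 1): 0.5}   # Y is always equal to X
independent_bits = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(entropy(fair_coin), entropy(biased_coin))
print(kl_divergence(biased_coin, fair_coin))
print(mutual_information(copied_bit), mutual_information(independent_bits))
```

Mutual information is zero for the independent bits and equals the full one bit of entropy when one variable is a copy of the other.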
Jensen’s Inequality
• For a convex function f() and a random variable X: $f(E[X]) \le E[f(X)]$
• Equality holds if f(x) is linear
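Jensen's inequality can be checked exactly on a small example. The sketch below assumes a fair six-sided die for X, the convex function f(x) = x², and an affine function for the equality case; all of these choices are illustrative.

```python
from fractions import Fraction

# Assumed example: X is a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expect(g):
    """E[g(X)] under the pmf above."""
    return sum(g(x) * p for x, p in pmf.items())

def identity(x):
    return x

def square(x):        # convex
    return x * x

def affine(x):        # linear (affine): equality case
    return 3 * x + 2

mean = expect(identity)

print(square(mean), "<=", expect(square))   # f(E[X]) <= E[f(X)]
print(affine(mean), "==", expect(affine))   # equality for a linear f
```

For f(x) = x² the gap E[f(X)] − f(E[X]) is exactly the variance of X.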
