Bayesian Methods in Machine Learning
Mohammad Shazri
mshazri@gmail.com
shazri.shahrir@extolcorp.com
Sources
• Papers from Michael Tipping
  - Bayesian Inference: An Introduction to Principles and Practice in Machine Learning
• YouTube lectures from mathematicalmonk
approximate P(B|A)

• P(B|A) = f(A; w), where A is the exemplars and w is
  the weights; read as “given A, what is B?”.
• B and A are usually related through the data set
  D = {A_n, B_n}, n = 1, …, N.
• From these statements alone, it is possible to over-fit.
Initial statement
• Given A, what is the likelihood of B?
• P(B|A)
• The goal is then to approximate P(B|A).
Setup and Model
• Setup: a supervised learning situation.
• (1) The model/basis shapes are normally distributed.
• (2) The confidence weights (tuneVar) are independent.
• (3) There are precision tuners on basis + weights: Var and alpha.
• (4) All are known except the weights (the implied model is sketched below).
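
A minimal sketch of the implied model, following Tipping's linear-basis setup (the basis functions phi_i and the symbols alpha, sigma^2 are taken from that paper, not from the slide):

$$t_n = y(x_n;\mathbf{w}) + \epsilon_n, \qquad y(x;\mathbf{w}) = \sum_{i=1}^{M} w_i\,\phi_i(x), \qquad \epsilon_n \sim \mathcal{N}(0,\sigma^2)$$

$$p(\mathbf{w}\mid\alpha) = \prod_{i=1}^{M} \mathcal{N}(w_i \mid 0, \alpha^{-1})$$

Here sigma^2 ("Var") tunes the noise precision and alpha the weight precision, matching (3); the weights w are the only unknowns, matching (4).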
Usual vs MAP
• Usual (maximum likelihood): over-fitting.
• MAP: solves over-fitting, but offers no interpretation of BR.
______________________________________
• Confidence-Level/Interpretation of the unknown (see the restatement below).
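
A hedged restatement of the two point estimators (standard forms, assumed rather than copied from the slide):

$$\mathbf{w}_{\mathrm{ML}} = \arg\max_{\mathbf{w}}\; p(\mathbf{t}\mid\mathbf{w},\sigma^2), \qquad \mathbf{w}_{\mathrm{MAP}} = \arg\max_{\mathbf{w}}\; p(\mathbf{t}\mid\mathbf{w},\sigma^2)\,p(\mathbf{w}\mid\alpha)$$

ML tends to over-fit; MAP adds the prior as a penalty but still returns only a point estimate, with no distribution over w.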
Complexity Control : Regularization
• E(w) = E_D(w) + lambda * E_W(w)
• E_D is the data error under the standard regression
  model (linear, sigmoid, etc.).
• E_W is the penalty term.
• Lambda is a hyperparameter (see the sketch below).
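
With a sum-of-squares data error and a quadratic penalty (the standard choice in Tipping's tutorial; a sketch assuming that form):

$$E(\mathbf{w}) = \underbrace{\frac{1}{2}\sum_{n=1}^{N}\{t_n - y(x_n;\mathbf{w})\}^2}_{E_D(\mathbf{w})} \;+\; \lambda\,\underbrace{\frac{1}{2}\|\mathbf{w}\|^2}_{E_W(\mathbf{w})}$$

Under the Gaussian model above, MAP estimation reproduces exactly this objective with lambda = alpha * sigma^2.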
Error model
-Note there is no x…
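
A plausible reconstruction of the equation this slide showed as an image, assuming the Gaussian error model of Tipping's paper:

$$p(\mathbf{t}\mid\mathbf{w},\sigma^2) = (2\pi\sigma^2)^{-N/2}\exp\left\{-\frac{1}{2\sigma^2}\sum_{n=1}^{N}\{t_n - y(x_n;\mathbf{w})\}^2\right\}$$

The inputs x_n enter only through y(x_n; w); no distribution over x is modelled, which is presumably the point of the note above.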
Log of error model
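
Under the same Gaussian assumption, taking logs gives:

$$\log p(\mathbf{t}\mid\mathbf{w},\sigma^2) = -\frac{N}{2}\log(2\pi\sigma^2) \;-\; \frac{1}{2\sigma^2}\sum_{n=1}^{N}\{t_n - y(x_n;\mathbf{w})\}^2$$

Maximizing this over w is exactly minimizing the sum-of-squares error E_D(w).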
Tune Shape
• Confidence on w + shape of error contribution (combined below).
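
A hedged reading of this slide: the posterior over w combines the prior ("confidence on w") with the likelihood ("shape of error contribution") via Bayes' rule:

$$p(\mathbf{w}\mid\mathbf{t},\alpha,\sigma^2) = \frac{p(\mathbf{t}\mid\mathbf{w},\sigma^2)\,p(\mathbf{w}\mid\alpha)}{p(\mathbf{t}\mid\alpha,\sigma^2)}$$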
Interpretation
Lastly
-Correspondence to NN
-How we honestly manage complexity in Extol
Questions..
• thanks
