Gaussian Processes: Applications in Machine
                 Learning

                      Abhishek Agarwal
                        (05329022)
        Under the Guidance of Prof. Sunita Sarawagi
                   KReSIT, IIT Bombay


                   Seminar Presentation
                     March 29, 2006




Outline




      Introduction to Gaussian Processes (GP)
      Prior & Posterior Distributions
      GP Models: Regression
      GP Models: Binary Classification
      Covariance Functions
      Conclusion




Introduction



      Supervised Learning
      Gaussian Processes
          Defines a distribution over functions.
          A collection of random variables, any finite number of which
          have a joint Gaussian distribution. [1] [2]

                                             f ∼ GP(m, k)

          Hyperparameters and covariance function.
          Predictions




Prior Distribution


      Represents our belief about the distribution over functions,
      which we express through the parameters m and k.
      Example: GP(m, k) with

          m(x) = \frac{1}{4} x^2, \qquad k(x, x') = \exp\left(-\frac{1}{2}(x - x')^2\right)
      To draw a sample from the distribution:
          Pick some data points.
          Find the distribution parameters at each point:

              \mu_i = m(x_i) \quad \& \quad \Sigma_{ij} = k(x_i, x_j), \qquad i, j = 1, \ldots, n

          Draw the function values from the resulting joint Gaussian
          (a sketch follows below).
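
      As a concrete illustration, here is a minimal NumPy sketch of this
      sampling procedure, using the example mean and covariance above (the
      100-point grid, the random seed, and the diagonal jitter term are
      implementation choices, not part of the slides):

```python
import numpy as np

def m(x):
    # Example mean function from the slide: m(x) = x^2 / 4
    return 0.25 * x**2

def k(x1, x2):
    # Example covariance from the slide: k(x, x') = exp(-(x - x')^2 / 2)
    return np.exp(-0.5 * (x1 - x2)**2)

# Pick some data points
xs = np.linspace(-5, 5, 100)

# Distribution parameters at each point: mu_i = m(x_i), Sigma_ij = k(x_i, x_j)
mu = m(xs)
Sigma = k(xs[:, None], xs[None, :])

# Draw sample functions from N(mu, Sigma); a small diagonal jitter
# keeps the covariance matrix numerically positive semi-definite
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, Sigma + 1e-8 * np.eye(len(xs)), size=5)
```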



Prior Distribution (contd.)

      Figure: Prior distribution over functions using a Gaussian Process
      (sample function values plotted against data points in [−5, 5]).
Posterior Distribution

       The distribution changes in the presence of training data D = (X, y).
       Functions that agree with D are given higher probability.

      Figure: Posterior distribution over functions using Gaussian Processes
      (sample function values plotted against data points in [−5, 5]).
Posterior Distribution (contd.)



      Prediction for unlabeled data x∗
          The GP outputs the function distribution at x∗ .
          Let f be the function values at the data points in D and f∗ the value at x∗ .
          f and f∗ have a joint Gaussian distribution, represented as:

              \begin{bmatrix} f \\ f_* \end{bmatrix} \sim N\left( \begin{bmatrix} \mu \\ \mu_* \end{bmatrix}, \begin{bmatrix} \Sigma & \Sigma_* \\ \Sigma_*^T & \Sigma_{**} \end{bmatrix} \right)

          The conditional distribution of f∗ given f can be expressed as:

              f_* \mid f \sim N\left( \mu_* + \Sigma_*^T \Sigma^{-1} (f - \mu), \; \Sigma_{**} - \Sigma_*^T \Sigma^{-1} \Sigma_* \right)                      (1)




Posterior Distribution (contd.)
       Parameters of the posterior in Eq. (1) are:

           f_* \mid D \sim GP(m_D, k_D), \quad where
               m_D(x)     = m(x) + \Sigma(X, x)^T \Sigma^{-1} (f - m)
               k_D(x, x') = k(x, x') - \Sigma(X, x)^T \Sigma^{-1} \Sigma(X, x')
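
       A minimal NumPy sketch of these posterior equations, assuming
       noise-free observations f at the training inputs X (the function
       name gp_posterior and the jitter term are my own, not from the
       slides):

```python
import numpy as np

def gp_posterior(X_train, f_train, X_test, m, k, jitter=1e-8):
    """Posterior mean m_D and covariance k_D from Eq. (1), given
    noise-free observations f_train at X_train. m and k are the
    prior mean and covariance functions."""
    K = k(X_train[:, None], X_train[None, :]) + jitter * np.eye(len(X_train))
    K_s = k(X_train[:, None], X_test[None, :])    # Sigma(X, x*)
    K_ss = k(X_test[:, None], X_test[None, :])    # Sigma(x*, x*)
    # m_D(x) = m(x) + Sigma(X,x)^T Sigma^-1 (f - m)
    mean = m(X_test) + K_s.T @ np.linalg.solve(K, f_train - m(X_train))
    # k_D(x,x') = k(x,x') - Sigma(X,x)^T Sigma^-1 Sigma(X,x')
    cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
    return mean, cov
```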
      Figure: Prediction from GP (function values plotted against data
      points in [−5, 5]).
GP Models: Regression



      GP can be applied directly to a Bayesian linear regression
      model such as:
          f(x) = \phi(x)^T w  with prior  w ∼ N(0, \Sigma_p)
          The parameters of this distribution are:

              E[f(x)]       = \phi(x)^T E[w] = 0,
              E[f(x) f(x')] = \phi(x)^T E[w w^T] \phi(x') = \phi(x)^T \Sigma_p \phi(x')

      So f(x) and f(x') are jointly Gaussian with zero mean and
      covariance \phi(x)^T \Sigma_p \phi(x').




GP Models: Regression (contd.)


      In regression, the posterior distribution over the weights is
      given as (9):

          posterior = \frac{\text{likelihood} \times \text{prior}}{\text{marginal likelihood}}

      Both the prior p(f|X) and the likelihood p(y|f, X) are Gaussian:

          prior:       f \mid X \sim N(0, K)   (5)
          likelihood:  y \mid f \sim N(f, \sigma_n^2 I)

      The marginal likelihood p(y|X) is defined as (6):

          p(y \mid X) = \int p(y \mid f, X)\, p(f \mid X)\, df                          (2)
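
      The closed form of this marginal likelihood appears as Eq. (6) in
      the Extra slides; below is a sketch of evaluating it in NumPy (the
      Cholesky factorization is a standard numerical-stability choice,
      not something the slides prescribe):

```python
import numpy as np

def log_marginal_likelihood(K, y, sigma_n):
    """Eq. (6): log p(y|X) for GP regression, where K is the prior
    covariance matrix at the training inputs and sigma_n is the
    observation noise standard deviation."""
    n = len(y)
    L = np.linalg.cholesky(K + sigma_n**2 * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log|K + sigma_n^2 I| = 2 * sum(log(diag(L)))
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2 * np.pi))
```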




GP Models: Classification


      Modeling a binary classifier:
          Squash the output of a regression model through a response
          function, such as the sigmoid.
          Ex: the linear logistic regression model:

              p(C_1 \mid x) = \lambda(x^T w), \qquad \lambda(z) = \frac{1}{1 + \exp(-z)}

          The likelihood is expressed as (7):

              p(y_i \mid x_i, w) = \sigma(y_i f_i), \qquad f_i = f(x_i) = x_i^T w

          and is therefore non-Gaussian.




GP Models: Classification (contd.)



       The distribution over the latent function, after seeing the test
       data, is given as:

              p(f_* \mid X, y, x_*) = \int p(f_* \mid X, x_*, f)\, p(f \mid X, y)\, df,           (3)

       where p(f \mid X, y) = p(y \mid f)\, p(f \mid X) / p(y \mid X) is the posterior
       over the latent variables.
       Computing the above integral is analytically intractable:
            Both the likelihood and the posterior are non-Gaussian.
            We need an analytic approximation of the integral.




GP Models: Laplace Approximations


      Gaussian approximation of p(f \mid X, y):
          Using a second-order Taylor expansion, we obtain

              q(f \mid X, y) = N(f \mid \hat{f}, A^{-1})

          where \hat{f} = \arg\max_f p(f \mid X, y) and
          A = -\nabla\nabla \log p(f \mid X, y) \big|_{f = \hat{f}}
          To find \hat{f}, we use Newton's method, because of the
          non-linearity of \nabla \log p(f \mid X, y) (9).
      Prediction is given as:

          \pi_* = p(y_* = +1 \mid X, y, x_*) = \int \sigma(f_*)\, p(f_* \mid X, y, x_*)\, df_*,             (4)
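
      As an illustrative sketch, here is a bare-bones Newton iteration for
      the mode \hat{f} under the logistic likelihood \sigma(y_i f_i) with
      y_i ∈ {−1, +1} (a simplified formulation of my own; Rasmussen &
      Williams [1], Alg. 3.1, give a numerically stabler variant that
      avoids forming K^{-1} explicitly):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def laplace_mode(K, y, n_iter=20, jitter=1e-8):
    """Newton's method for f_hat = argmax_f p(f | X, y) with the
    logistic likelihood; y has entries in {-1, +1}. Returns the
    mode f_hat and the precision A = -grad grad log p(f|X,y)."""
    n = len(y)
    K_inv = np.linalg.inv(K + jitter * np.eye(n))
    t = (y + 1) / 2                  # targets mapped to {0, 1}
    f = np.zeros(n)
    for _ in range(n_iter):
        pi = sigmoid(f)
        grad = t - pi                # gradient of log p(y|f)
        W = np.diag(pi * (1 - pi))   # negative Hessian of log p(y|f)
        # Newton update: f <- (K^-1 + W)^-1 (W f + grad)
        f = np.linalg.solve(K_inv + W, W @ f + grad)
    pi = sigmoid(f)
    A = K_inv + np.diag(pi * (1 - pi))
    return f, A
```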




Covariance Function



       Encodes our beliefs about the prior distribution over functions.
       Some properties:
           Stationary
           Isotropic
           Dot-product covariance
       Ex: the Squared Exponential (SE) covariance function:

           cov(f(x_p), f(x_q)) = \exp\left(-\frac{1}{2} |x_p - x_q|^2\right)

       Its parameters are learned along with the other hyperparameters
       (see the sketch below).
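
       For concreteness, here is the SE kernel written with the
       signal-variance and length-scale hyperparameters it typically
       carries (the parameter names are mine; the slide's form is the
       special case of unit signal variance and unit length-scale):

```python
import numpy as np

def squared_exponential(xp, xq, signal_var=1.0, length_scale=1.0):
    """SE covariance with two common hyperparameters:
    cov = signal_var * exp(-|xp - xq|^2 / (2 * length_scale^2)).
    With signal_var = length_scale = 1 this reduces to the
    formula on the slide."""
    diff = np.asarray(xp, dtype=float) - np.asarray(xq, dtype=float)
    return signal_var * np.exp(-0.5 * np.sum(diff**2) / length_scale**2)
```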




Summary and Future Work



      Current research:
          Fast sparse approximation algorithms for matrix inversion.
          Approximation algorithms for non-Gaussian likelihoods.
      The GP approach has outperformed traditional methods in many
      applications:
          Gaussian Process based Positioning System (GPPS) [6]
          Multiuser Detection (MUD) in CDMA [7]
      GP models are more powerful and flexible than simple
      linear parametric models, and less complex than
      other models such as multi-layer perceptrons. [1]




References

[1] Rasmussen and Williams. Gaussian Processes for Machine Learning. The MIT Press, 2006.
[2] Matthias Seeger. Gaussian Processes for Machine Learning. International Journal of Neural Systems, 14(2):69-106, 2004.
[3] Christopher Williams. Bayesian Classification with Gaussian Processes. IEEE Trans. Pattern Analysis and Machine Intelligence, 1998.
[4] Rasmussen and Williams. Gaussian Processes for Regression. In Proceedings of NIPS, 1996.
[5] Rasmussen. Evaluation of Gaussian Processes and Other Methods for Non-linear Regression. PhD thesis, Dept. of Computer Science, University of Toronto, 1996. Available from http://www.cs.utoronto.ca/~carl/
[6] Anton Schwaighofer et al. GPPS: A Gaussian Process Positioning System for Cellular Networks. In Proceedings of NIPS, 2003.
[7] Murillo-Fuentes et al. Gaussian Processes for Multiuser Detection in CDMA Receivers. Advances in Neural Information Processing Systems, 2005.
[8] David MacKay. Introduction to Gaussian Processes.
[9] C. Williams. Gaussian processes. In M. A. Arbib, editor, Handbook of Brain Theory and Neural Networks, pages 466-470. The MIT Press, second edition, 2002.




Thank You !!




                              Questions ??




Extra

        Prior:

            \log p(f \mid X) = -\frac{1}{2} f^T K^{-1} f - \frac{1}{2} \log |K| - \frac{n}{2} \log 2\pi                         (5)

        Marginal likelihood:

            \log p(y \mid X) = -\frac{1}{2} y^T (K + \sigma_n^2 I)^{-1} y - \frac{1}{2} \log |K + \sigma_n^2 I| - \frac{n}{2} \log 2\pi         (6)

        Likelihood:

            p(y = +1 \mid x, w) = \sigma(x^T w),                          (7)

        For a symmetric likelihood, \sigma(-z) = 1 - \sigma(z), so

            p(y_i \mid x_i, w) = \sigma(y_i\, x_i^T w),                   (8)


Extra (contd.)




       Setting the first derivative of the log posterior to zero gives:

           \hat{f} = K \left( \nabla \log p(f \mid X, y) \right)

       Posterior over the weights (Bayes' rule):

           p(w \mid y, X) = \frac{p(y \mid X, w)\, p(w)}{p(y \mid X)}



