Basic Concepts in Machine Learning
Sargur N. Srihari
Introduction to ML: Topics
1. Polynomial Curve Fitting
2. Probability Theory of Multiple Variables
3. Maximum Likelihood
4. Bayesian Approach
5. Model Selection
6. Curse of Dimensionality
Polynomial Curve Fitting
Sargur N. Srihari
Simple Regression Problem
•  Observe a real-valued input variable x
•  Use x to predict the value of a target variable t
•  Synthetic data generated from sin(2πx)
•  Random noise added to the target values

[Figure: training points plotted with the input variable x on the horizontal axis and the target variable t on the vertical axis]
Notation
•  N observations of x
   x = (x_1,..,x_N)^T
   t = (t_1,..,t_N)^T
•  Goal is to exploit the training set to predict the value of t from x
•  Inherently a difficult problem
•  Probability theory allows us to make a prediction

Data Generation:
N = 10
Inputs spaced uniformly in range [0,1]
Targets generated from sin(2πx) by adding small Gaussian noise
Such noise is typical of real data, arising from unobserved variables
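A minimal sketch of how such a synthetic data set could be generated. The noise standard deviation and random seed are assumptions for illustration, not values from the slides:

```python
import numpy as np

def generate_data(n=10, noise_std=0.3, seed=0):
    """Draw n targets from sin(2*pi*x) plus Gaussian noise.

    noise_std=0.3 is an assumed value for illustration.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)  # inputs spaced uniformly in [0, 1]
    t = np.sin(2 * np.pi * x) + rng.normal(0.0, noise_std, size=n)
    return x, t

x_train, t_train = generate_data(n=10)
```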
Polynomial Curve Fitting
•  Polynomial function

   y(x, \mathbf{w}) = w_0 + w_1 x + w_2 x^2 + \cdots + w_M x^M = \sum_{j=0}^{M} w_j x^j

•  where M is the order of the polynomial
•  Is a higher value of M better? We'll see shortly!
•  Coefficients w_0,...,w_M are denoted by the vector w
•  y is a nonlinear function of x, but a linear function of the coefficients w
•  Such functions are called Linear Models
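A direct translation of this function into code, as a sketch (the function name is illustrative):

```python
def polynomial(x, w):
    """Evaluate y(x, w) = sum_{j=0}^{M} w[j] * x**j for scalar or array x."""
    return sum(w_j * x**j for j, w_j in enumerate(w))
```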
Error Function
•  Sum of squares of the errors between the predictions y(x_n, w) for each data point x_n and the corresponding target value t_n:

   E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2

•  Factor ½ is included for later convenience
•  Solve by choosing the value of w for which E(w) is as small as possible

[Figure: the red line is the best polynomial fit to the data points]
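The error function translates directly into code; a minimal sketch building on the polynomial helper above:

```python
def sum_of_squares_error(w, x, t):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)**2."""
    residuals = polynomial(x, w) - t
    return 0.5 * np.sum(residuals**2)
```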
Minimization of Error Function
•  Error function is a quadratic in the coefficients w
•  Derivatives with respect to the coefficients are therefore linear in the elements of w
•  Thus the error function has a unique minimum, denoted w*
•  The resulting polynomial is y(x, w*)

Since y(x, \mathbf{w}) = \sum_{j=0}^{M} w_j x^j,

   \frac{\partial E(\mathbf{w})}{\partial w_i} = \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \} x_n^i = \sum_{n=1}^{N} \Big\{ \sum_{j=0}^{M} w_j x_n^j - t_n \Big\} x_n^i

Setting this equal to zero gives

   \sum_{n=1}^{N} \sum_{j=0}^{M} w_j x_n^{i+j} = \sum_{n=1}^{N} t_n x_n^i

This set of M+1 linear equations (i = 0,..,M) is solved to get the elements of w*.
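These normal equations form an (M+1)×(M+1) linear system A w = b with A_{ij} = \sum_n x_n^{i+j} and b_i = \sum_n t_n x_n^i. A hedged sketch of solving it directly (function names are illustrative, not from the slides):

```python
def fit_polynomial(x, t, M):
    """Solve the M+1 normal equations A w = b for the minimizer w*."""
    A = np.array([[np.sum(x**(i + j)) for j in range(M + 1)]
                  for i in range(M + 1)])
    b = np.array([np.sum(t * x**i) for i in range(M + 1)])
    # Note: for large M this system is ill-conditioned; in practice
    # np.linalg.lstsq on the Vandermonde design matrix is more stable.
    return np.linalg.solve(A, b)

w_star = fit_polynomial(x_train, t_train, M=3)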
Choosing the Order M
•  Model Comparison or Model Selection
•  Red lines are best fits with
   –  M = 0, 1, 3, 9 and N = 10

[Figure: four fits. M = 0 and M = 1 are poor representations of sin(2πx); M = 3 is the best fit to sin(2πx); M = 9 over-fits and is also a poor representation of sin(2πx)]
Generalization Performance
•  Consider a separate test set of 100 points
•  For each value of M evaluate

   E(\mathbf{w}^*) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}^*) - t_n \}^2,  where  y(x, \mathbf{w}^*) = \sum_{j=0}^{M} w_j^* x^j

   for the training data and the test data
•  Use the RMS error

   E_{RMS} = \sqrt{ 2 E(\mathbf{w}^*) / N }

   –  Division by N allows different sizes of N to be compared on an equal footing
   –  Square root ensures E_RMS is measured in the same units as t

[Figure: training and test E_RMS versus M. Test error is poor for small M, due to inflexible polynomials, while training error is small at M = 9. M = 9 means ten degrees of freedom, tuned exactly to the 10 training points, producing wild oscillations in the polynomial]
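A sketch of the corresponding evaluation loop, reusing the earlier helpers (the test-set seed is an assumption for illustration):

```python
def rms_error(w, x, t):
    """E_RMS = sqrt(2 * E(w) / N); same units as t, comparable across N."""
    return np.sqrt(2 * sum_of_squares_error(w, x, t) / len(x))

x_test, t_test = generate_data(n=100, seed=1)  # separate test set of 100 points
for M in range(10):
    w = fit_polynomial(x_train, t_train, M)
    print(M, rms_error(w, x_train, t_train), rms_error(w, x_test, t_test))
```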
Values of Coefficients w* for Different Polynomials of Order M

[Table of coefficient values for M = 0, 1, 3, 9, not reproduced here]

As M increases, the magnitude of the coefficients increases
At M = 9 the coefficients are finely tuned to the random noise in the target values
Increasing Size of Data Set
N = 15, 100
For a given model complexity, the over-fitting problem becomes less severe as the size of the data set increases
The larger the data set, the more complex the model we can afford to fit to the data
A rough heuristic: the number of data points should be no less than 5 to 10 times the number of adaptive parameters in the model
Least Squares is a Case of Maximum Likelihood
•  Unsatisfying to limit the number of parameters according to the size of the training set
•  More reasonable to choose model complexity according to problem complexity
•  The least squares approach is a specific case of maximum likelihood
   –  Over-fitting is a general property of maximum likelihood
•  A Bayesian approach avoids the over-fitting problem
   –  The number of parameters can greatly exceed the number of data points
   –  The effective number of parameters adapts automatically to the size of the data set
Regularization of Least Squares
•  We often wish to use relatively complex models with data sets of limited size
•  Add a penalty term to the error function to discourage the coefficients from reaching large values:

   \tilde{E}(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 + \frac{\lambda}{2} \| \mathbf{w} \|^2,  where  \| \mathbf{w} \|^2 = \mathbf{w}^T \mathbf{w} = w_0^2 + w_1^2 + \cdots + w_M^2

•  λ determines the relative importance of the regularization term compared to the error term
•  Can be minimized exactly in closed form
•  Known as shrinkage in statistics, and as weight decay in neural networks
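The quadratic penalty only changes the normal equations by adding λ to the diagonal, which is why the minimum is still available in closed form. A minimal sketch, assuming the same design-matrix setup as before:

```python
def fit_polynomial_regularized(x, t, M, lam):
    """Closed-form minimizer of the penalized error: solve (A + lam*I) w = b."""
    A = np.array([[np.sum(x**(i + j)) for j in range(M + 1)]
                  for i in range(M + 1)])
    b = np.array([np.sum(t * x**i) for i in range(M + 1)])
    return np.linalg.solve(A + lam * np.eye(M + 1), b)
```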
Effect of Regularizer
M = 9 polynomials using the regularized error function

[Figure: fits with an optimal regularizer, with a large regularizer (λ = 1), and with no regularizer (λ = 0)]
Impact of Regularization on Error
•  λ controls the effective complexity of the model and hence the degree of over-fitting
   –  Analogous to the choice of M
•  Suggested approach (sketched below):
   –  Training set: determine the coefficients w for different values of M or λ
   –  Validation set (holdout): optimize the model complexity (M or λ)

[Figure: training and test E_RMS versus λ for the M = 9 polynomial]
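A sketch of that train/validation protocol for choosing λ with M = 9 fixed; the candidate grid, validation-set size, and seed are illustrative assumptions:

```python
lambdas = [0.0, 1e-8, 1e-6, 1e-4, 1e-2, 1.0]  # illustrative grid
x_val, t_val = generate_data(n=100, seed=2)   # held-out validation set
val_error = {lam: rms_error(fit_polynomial_regularized(x_train, t_train, 9, lam),
                            x_val, t_val)
             for lam in lambdas}
best_lam = min(val_error, key=val_error.get)  # complexity chosen on validation data
```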
Concluding Remarks on Regression
•  The approach suggests partitioning the data into a training set used to determine the coefficients w
•  A separate validation set (or hold-out set) is used to optimize the model complexity M or λ
•  More sophisticated approaches are not as wasteful of training data
•  A more principled approach is based on probability theory
•  Classification is a special case of regression where the target variable takes discrete values
