SlideShare a Scribd company logo
1 of 23
Download to read offline
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Parametric versus Semi/nonparametric Regression
Models
Hamdy F. F. Mahmoud
Virginia Polytechnic Institute and State University
Department of Statistics
LISA short course series- July 23, 2014
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 1/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 2/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Outline
1 What is semi/nonparametric regression?
2 When should we use semi/nonparametric regression?
3 semi/nonparametric regression estimation methods:
a). Kernel Regression
b). Smoothing Spline
4 Other methods in nonparametric regression models
estimation.
5 Discussion and Recommendations
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 3/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
What is semi/nonparametric regression?
Nonparametric regression is a form of regression analysis in which
NONE of the predictors take predetermined forms
with the response but are constructed according to
information derived from the data.
Semiparametric regression is a form of regression analysis in which
a PART of the predictors do not take predetermined
forms and the other part takes known forms with the
response.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 4/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
What is semi/nonparametric regression?
Example:
Assume that we have a response variable Y and two explanatory
variables, x1 and x2. In general the regression model that describes
the relationship can be written as:
Y = f1(x1)+f2(x2)+
Some parametric regression models:
• Y = β0 +β1x1 +β2x2 + (Multiple linear regression model)
• Y = β0 +β10x1 +β11x2
1 +β20x2 +β21x2
2 + (Polynomial
regression model of second order)
• Y = β0 +β1x1 +β2e(β3x2) + (Nonlinear regression model)
• log(µ) = β0 +β1x1 +β2x2 (Poisson regression when Y is
count)
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 5/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
What is semi/nonparametric regression?
• If we do not know f1 and f2 functions, we need to use a
NONparametric regression model.
• If we do not know f1 and know f2, we need to use
SEMIparametric regression model.
Example:
Y = β0 +β1x1 +f (x2)+
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 6/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
When we should use semi/nonparametric regression?
The FIRST STEP in any analysis is GRAPHICAL ANALYSIS for
the response (dependent) variable and the explanatory
(independent) variables.
Examples:
• Boxplots
• Area plots
• Scatterplots
GO TO REAL DATA [Course R Code]
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 7/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
When should we use semi/nonparametric regression?
Four principal assumptions which justify the use of linear regression
models for purposes of fitting and inferences:
• Linearity
• Independence
• Constant variance
• Normality
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 8/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
When should we use semi/nonparametric regression?
• Violations of linearity are extremely serious, especially when
you extrapolate beyond the range of the sample data.
• How to detect:
• Nonlinearity is usually most evident in a plot of residuals versus
predicted values.
• Use Goodness of fit test.
• How to fix:
• Use a nonlinear transformation to the dependent and/or
independent variables such as a log transformation, square
root, or power transformation.
• Add another regressors which is a nonlinear function of one of
the other variables. For example, if you have regressed Y on X,
it may make sense to regress Y on both X and X2 (i.e.,
X-squared).
• Use semi(non)parametric regression model.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 9/
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
When should we use semi/nonparametric regression?
GO to R Code file to:
• Practice on identifying the relationship (linear or not linear)
using different data sets.
• For wage data set, regress log(wage) on age using linear
regression model.
• Check the linearity assumption.
• Try to fix the problem.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 10
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Estimation methods
Two of the most commonly used approaches to nonparametric
regression are:
1 Kernel Regression: estimates the conditional expectation of
Y at given value x using a weighted filter to the data.
2 Smoothing splines: minimize the sum of squared residuals
plus a term which penalizes the roughness of the fit.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 11
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
1. Nadaraya-Watson Kernel Regression [local constant]
Nadaraya and Watson 1964 proposed a method to estimate ˆf (x0)
at a given value x0 as a locally weighted average of all y s
associated to the values around x. The Nadaraya-Watson
estimator is:
fh(x) =
n
i=1
K(
x−xi
h
)yi
n
i=1
K(
x−xi
h
)
where K is a Kernel function (weight function) with a bandwidth h.
Remark: K function should give us weights decline as one moves
away from the target value.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 12
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 13
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Popular choices of weight function are:
• Epanechnikov: K(·) = 3
4 (1−d2), d2 < 1, 0 otherwise,
• Minimum var: K(·) = 3
8 (1−5d2), d2 < 1, 0 otherwise,
• Gaussian density: exp(−x−xi
h )
• Tricube function: W (z) = (1−|z|3)3 for |z| < 1 and 0
otherwise.
Problem: local constant has one difficulty is that a kernel
smoother still exhibits bias at the end points.
Solution: Use local linear kernel regression
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 14
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
How to choose the bandwidth?
• Rule of thumb: If we use Gaussian then it can be shown that
the optimal choice for h is
h = 4ˆσ5
3n
1
5
≈ 1.06ˆσn−1/5,
where ˆσ is the standard deviation of the samples.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 15
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
GO to R Code file to:
1 Apply Kernel regression on wage data
2 Study the effect of bandwidth h on estimation
3 Compare between Kernel regression (nonparametric) and
second order polynomial regression (parametric) in terms of
fitting and prediction.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 16
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
2. Spline Smoothing
• A spline is a piecewise polynomial with pieces defined by a
sequence of knots
θ1 < θ2 < ..... < θK
such that the pieces join smoothly at the knots.
• A spline of degree p can be represented as a power series:
f (x) = β0 +β1x +β2x2 +β3x3 +....+βpxp + K
k=1 β1k(x −θk)p
+,
where (x −θk)+ = x −θk,x > θk and 0 otherwise
Example: f (x) = β0 +β1x + K
k=1 β1k(x −θk)+ (linear spline)
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 17
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
• How many knots need to be used?
• Where those knots should be located?
• Number of parameters is 1+p +K that we need big number
of observations.
Possible solution: use penalized spline smoothing
Consider fitting a spline with knots of every data point, so it could
fit perfectly, but estimate its parameters by minimizing the usual
sum of squares plus a roughness penalty. A suitable penalty is to
integrate the squared second derivative, leading to penalized sum
of squares criterion:
n
i=1[yi −f (xi )]2 +λ [f (x)]2dx
where λ is a tunning parameter controls smoothness.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 18
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
GO to R Code file to:
1 Apply spline regression on prestige data, and
2 Study the effect of λ on smoothing
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 19
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Semi/nonparametric regression estimation methods.
What if we have more than one explanatory (independent)
variable?
1 Kernel Regression
2 Spline Regression
GO to R Code file
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 20
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Discussion and Recommendations
Pros:
• It is flexible.
• Better in fitting the data than parametric regression models.
Cons:
• Nonparametric regression requires larger sample sizes than
regression based on parametric models because the data must
supply the model structure as well as the model estimates.
limitations
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 21
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
Discussion and Recommendations
Steps of modeling:
• Graphical Analysis
• If you have a nonlinear and unknown relationship between a
response and an explanatory variable:
• use transformation
• add a new variable in the model to capture the relationship.
• If transformation does not work, use nonparametric regression.
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 22
What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and
References
1 D. Ruppert, M. P. Wand, and R. J. Carrol (2003),
“Semiparametric Regression”, Cambridge University Press,
NY.
2 J. S. Simonoff (1996) “Smoothing methods in statistics”,
New York : Springer.
3 Y. Wang (2011) “Smoothing splines : methods and
applications” Boca Raton, FL: CRC Press.
4 M.P. Wand and M.C. Jones (1995) “Kernel smoothing”,
London; New York: Chapman and Hall .
Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 23

More Related Content

What's hot

Linear regression with gradient descent
Linear regression with gradient descentLinear regression with gradient descent
Linear regression with gradient descentSuraj Parmar
 
Stochastic modelling and its applications
Stochastic modelling and its applicationsStochastic modelling and its applications
Stochastic modelling and its applicationsKartavya Jain
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learningParas Kohli
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANNMohamed Talaat
 
Semi-supervised Learning
Semi-supervised LearningSemi-supervised Learning
Semi-supervised Learningbutest
 
Binary Class and Multi Class Strategies for Machine Learning
Binary Class and Multi Class Strategies for Machine LearningBinary Class and Multi Class Strategies for Machine Learning
Binary Class and Multi Class Strategies for Machine LearningPaxcel Technologies
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksFrancesco Collova'
 
敵対的学習に対するラデマッハ複雑度
敵対的学習に対するラデマッハ複雑度敵対的学習に対するラデマッハ複雑度
敵対的学習に対するラデマッハ複雑度Masa Kato
 
Applications in Machine Learning
Applications in Machine LearningApplications in Machine Learning
Applications in Machine LearningJoel Graff
 
Simple Introduction to AutoEncoder
Simple Introduction to AutoEncoderSimple Introduction to AutoEncoder
Simple Introduction to AutoEncoderJun Lang
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learningbutest
 
Reinforcement learning
Reinforcement learning Reinforcement learning
Reinforcement learning Chandra Meena
 
Tips and tricks to win kaggle data science competitions
Tips and tricks to win kaggle data science competitionsTips and tricks to win kaggle data science competitions
Tips and tricks to win kaggle data science competitionsDarius Barušauskas
 
Back propagation
Back propagationBack propagation
Back propagationNagarajan
 
Random forest algorithm
Random forest algorithmRandom forest algorithm
Random forest algorithmRashid Ansari
 

What's hot (20)

Linear regression with gradient descent
Linear regression with gradient descentLinear regression with gradient descent
Linear regression with gradient descent
 
Stochastic modelling and its applications
Stochastic modelling and its applicationsStochastic modelling and its applications
Stochastic modelling and its applications
 
Supervised and unsupervised learning
Supervised and unsupervised learningSupervised and unsupervised learning
Supervised and unsupervised learning
 
Artificial Neural Networks - ANN
Artificial Neural Networks - ANNArtificial Neural Networks - ANN
Artificial Neural Networks - ANN
 
Machine Learning for Dummies
Machine Learning for DummiesMachine Learning for Dummies
Machine Learning for Dummies
 
Semi-supervised Learning
Semi-supervised LearningSemi-supervised Learning
Semi-supervised Learning
 
Binary Class and Multi Class Strategies for Machine Learning
Binary Class and Multi Class Strategies for Machine LearningBinary Class and Multi Class Strategies for Machine Learning
Binary Class and Multi Class Strategies for Machine Learning
 
Machine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural NetworksMachine Learning: Introduction to Neural Networks
Machine Learning: Introduction to Neural Networks
 
Ensemble methods
Ensemble methods Ensemble methods
Ensemble methods
 
敵対的学習に対するラデマッハ複雑度
敵対的学習に対するラデマッハ複雑度敵対的学習に対するラデマッハ複雑度
敵対的学習に対するラデマッハ複雑度
 
Applications in Machine Learning
Applications in Machine LearningApplications in Machine Learning
Applications in Machine Learning
 
Meta learning tutorial
Meta learning tutorialMeta learning tutorial
Meta learning tutorial
 
Simple Introduction to AutoEncoder
Simple Introduction to AutoEncoderSimple Introduction to AutoEncoder
Simple Introduction to AutoEncoder
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Reinforcement learning
Reinforcement learning Reinforcement learning
Reinforcement learning
 
Probit model
Probit modelProbit model
Probit model
 
supervised learning
supervised learningsupervised learning
supervised learning
 
Tips and tricks to win kaggle data science competitions
Tips and tricks to win kaggle data science competitionsTips and tricks to win kaggle data science competitions
Tips and tricks to win kaggle data science competitions
 
Back propagation
Back propagationBack propagation
Back propagation
 
Random forest algorithm
Random forest algorithmRandom forest algorithm
Random forest algorithm
 

Similar to SEMI/NONPARAMETRIC REGRESSION METHODS

Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...
Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...
Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...Maninda Edirisooriya
 
Mc0079 computer based optimization methods--phpapp02
Mc0079 computer based optimization methods--phpapp02Mc0079 computer based optimization methods--phpapp02
Mc0079 computer based optimization methods--phpapp02Rabby Bhatt
 
Master of Computer Application (MCA) – Semester 4 MC0079
Master of Computer Application (MCA) – Semester 4  MC0079Master of Computer Application (MCA) – Semester 4  MC0079
Master of Computer Application (MCA) – Semester 4 MC0079Aravind NC
 
Mixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete DataMixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete DataLoc Nguyen
 
Model Selection and Validation
Model Selection and ValidationModel Selection and Validation
Model Selection and Validationgmorishita
 
Regression vs Neural Net
Regression vs Neural NetRegression vs Neural Net
Regression vs Neural NetRatul Alahy
 
2.7 other classifiers
2.7 other classifiers2.7 other classifiers
2.7 other classifiersKrish_ver2
 
Thesis_NickyGrant_2013
Thesis_NickyGrant_2013Thesis_NickyGrant_2013
Thesis_NickyGrant_2013Nicky Grant
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
Machine Learning Interview Question and Answer
Machine Learning Interview Question and AnswerMachine Learning Interview Question and Answer
Machine Learning Interview Question and AnswerLearnbay Datascience
 
A comparison of three learning methods to predict N20 fluxes and N leaching
A comparison of three learning methods to predict N20 fluxes and N leachingA comparison of three learning methods to predict N20 fluxes and N leaching
A comparison of three learning methods to predict N20 fluxes and N leachingtuxette
 
Bias-Variance_relted_to_ML.pdf
Bias-Variance_relted_to_ML.pdfBias-Variance_relted_to_ML.pdf
Bias-Variance_relted_to_ML.pdfVGaneshKarthikeyan
 
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai UniversityMadhav Mishra
 

Similar to SEMI/NONPARAMETRIC REGRESSION METHODS (20)

Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...
Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...
Lecture 4 - Linear Regression, a lecture in subject module Statistical & Mach...
 
Mc0079 computer based optimization methods--phpapp02
Mc0079 computer based optimization methods--phpapp02Mc0079 computer based optimization methods--phpapp02
Mc0079 computer based optimization methods--phpapp02
 
Master of Computer Application (MCA) – Semester 4 MC0079
Master of Computer Application (MCA) – Semester 4  MC0079Master of Computer Application (MCA) – Semester 4  MC0079
Master of Computer Application (MCA) – Semester 4 MC0079
 
Mathematical modeling
Mathematical modelingMathematical modeling
Mathematical modeling
 
Matlab:Regression
Matlab:RegressionMatlab:Regression
Matlab:Regression
 
Matlab: Regression
Matlab: RegressionMatlab: Regression
Matlab: Regression
 
Mixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete DataMixture Regression Model for Incomplete Data
Mixture Regression Model for Incomplete Data
 
Model Selection and Validation
Model Selection and ValidationModel Selection and Validation
Model Selection and Validation
 
2. diagnostics, collinearity, transformation, and missing data
2. diagnostics, collinearity, transformation, and missing data 2. diagnostics, collinearity, transformation, and missing data
2. diagnostics, collinearity, transformation, and missing data
 
Regression vs Neural Net
Regression vs Neural NetRegression vs Neural Net
Regression vs Neural Net
 
2.7 other classifiers
2.7 other classifiers2.7 other classifiers
2.7 other classifiers
 
Thesis_NickyGrant_2013
Thesis_NickyGrant_2013Thesis_NickyGrant_2013
Thesis_NickyGrant_2013
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
Distributed ADMM
Distributed ADMMDistributed ADMM
Distributed ADMM
 
Machine Learning Interview Question and Answer
Machine Learning Interview Question and AnswerMachine Learning Interview Question and Answer
Machine Learning Interview Question and Answer
 
A comparison of three learning methods to predict N20 fluxes and N leaching
A comparison of three learning methods to predict N20 fluxes and N leachingA comparison of three learning methods to predict N20 fluxes and N leaching
A comparison of three learning methods to predict N20 fluxes and N leaching
 
Bias-Variance_relted_to_ML.pdf
Bias-Variance_relted_to_ML.pdfBias-Variance_relted_to_ML.pdf
Bias-Variance_relted_to_ML.pdf
 
Machine learning mathematicals.pdf
Machine learning mathematicals.pdfMachine learning mathematicals.pdf
Machine learning mathematicals.pdf
 
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
 
LINEAR REGRESSION.pptx
LINEAR REGRESSION.pptxLINEAR REGRESSION.pptx
LINEAR REGRESSION.pptx
 

Recently uploaded

CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 

Recently uploaded (20)

CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 

SEMI/NONPARAMETRIC REGRESSION METHODS

  • 1. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Parametric versus Semi/nonparametric Regression Models Hamdy F. F. Mahmoud Virginia Polytechnic Institute and State University Department of Statistics LISA short course series- July 23, 2014 Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 1/
  • 2. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 2/
  • 3. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Outline 1 What is semi/nonparametric regression? 2 When should we use semi/nonparametric regression? 3 semi/nonparametric regression estimation methods: a). Kernel Regression b). Smoothing Spline 4 Other methods in nonparametric regression models estimation. 5 Discussion and Recommendations Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 3/
  • 4. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and What is semi/nonparametric regression? Nonparametric regression is a form of regression analysis in which NONE of the predictors take predetermined forms with the response but are constructed according to information derived from the data. Semiparametric regression is a form of regression analysis in which a PART of the predictors do not take predetermined forms and the other part takes known forms with the response. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 4/
  • 5. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and What is semi/nonparametric regression? Example: Assume that we have a response variable Y and two explanatory variables, x1 and x2. In general the regression model that describes the relationship can be written as: Y = f1(x1)+f2(x2)+ Some parametric regression models: • Y = β0 +β1x1 +β2x2 + (Multiple linear regression model) • Y = β0 +β10x1 +β11x2 1 +β20x2 +β21x2 2 + (Polynomial regression model of second order) • Y = β0 +β1x1 +β2e(β3x2) + (Nonlinear regression model) • log(µ) = β0 +β1x1 +β2x2 (Poisson regression when Y is count) Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 5/
  • 6. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and What is semi/nonparametric regression? • If we do not know f1 and f2 functions, we need to use a NONparametric regression model. • If we do not know f1 and know f2, we need to use SEMIparametric regression model. Example: Y = β0 +β1x1 +f (x2)+ Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 6/
  • 7. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and When we should use semi/nonparametric regression? The FIRST STEP in any analysis is GRAPHICAL ANALYSIS for the response (dependent) variable and the explanatory (independent) variables. Examples: • Boxplots • Area plots • Scatterplots GO TO REAL DATA [Course R Code] Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 7/
  • 8. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and When should we use semi/nonparametric regression? Four principal assumptions which justify the use of linear regression models for purposes of fitting and inferences: • Linearity • Independence • Constant variance • Normality Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 8/
  • 9. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and When should we use semi/nonparametric regression? • Violations of linearity are extremely serious, especially when you extrapolate beyond the range of the sample data. • How to detect: • Nonlinearity is usually most evident in a plot of residuals versus predicted values. • Use Goodness of fit test. • How to fix: • Use a nonlinear transformation to the dependent and/or independent variables such as a log transformation, square root, or power transformation. • Add another regressors which is a nonlinear function of one of the other variables. For example, if you have regressed Y on X, it may make sense to regress Y on both X and X2 (i.e., X-squared). • Use semi(non)parametric regression model. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 9/
  • 10. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and When should we use semi/nonparametric regression? GO to R Code file to: • Practice on identifying the relationship (linear or not linear) using different data sets. • For wage data set, regress log(wage) on age using linear regression model. • Check the linearity assumption. • Try to fix the problem. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 10
  • 11. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Estimation methods Two of the most commonly used approaches to nonparametric regression are: 1 Kernel Regression: estimates the conditional expectation of Y at given value x using a weighted filter to the data. 2 Smoothing splines: minimize the sum of squared residuals plus a term which penalizes the roughness of the fit. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 11
  • 12. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. 1. Nadaraya-Watson Kernel Regression [local constant] Nadaraya and Watson 1964 proposed a method to estimate ˆf (x0) at a given value x0 as a locally weighted average of all y s associated to the values around x. The Nadaraya-Watson estimator is: fh(x) = n i=1 K( x−xi h )yi n i=1 K( x−xi h ) where K is a Kernel function (weight function) with a bandwidth h. Remark: K function should give us weights decline as one moves away from the target value. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 12
  • 13. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 13
  • 14. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Popular choices of weight function are: • Epanechnikov: K(·) = 3 4 (1−d2), d2 < 1, 0 otherwise, • Minimum var: K(·) = 3 8 (1−5d2), d2 < 1, 0 otherwise, • Gaussian density: exp(−x−xi h ) • Tricube function: W (z) = (1−|z|3)3 for |z| < 1 and 0 otherwise. Problem: local constant has one difficulty is that a kernel smoother still exhibits bias at the end points. Solution: Use local linear kernel regression Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 14
  • 15. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. How to choose the bandwidth? • Rule of thumb: If we use Gaussian then it can be shown that the optimal choice for h is h = 4ˆσ5 3n 1 5 ≈ 1.06ˆσn−1/5, where ˆσ is the standard deviation of the samples. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 15
  • 16. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. GO to R Code file to: 1 Apply Kernel regression on wage data 2 Study the effect of bandwidth h on estimation 3 Compare between Kernel regression (nonparametric) and second order polynomial regression (parametric) in terms of fitting and prediction. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 16
  • 17. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. 2. Spline Smoothing • A spline is a piecewise polynomial with pieces defined by a sequence of knots θ1 < θ2 < ..... < θK such that the pieces join smoothly at the knots. • A spline of degree p can be represented as a power series: f (x) = β0 +β1x +β2x2 +β3x3 +....+βpxp + K k=1 β1k(x −θk)p +, where (x −θk)+ = x −θk,x > θk and 0 otherwise Example: f (x) = β0 +β1x + K k=1 β1k(x −θk)+ (linear spline) Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 17
  • 18. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. • How many knots need to be used? • Where those knots should be located? • Number of parameters is 1+p +K that we need big number of observations. Possible solution: use penalized spline smoothing Consider fitting a spline with knots of every data point, so it could fit perfectly, but estimate its parameters by minimizing the usual sum of squares plus a roughness penalty. A suitable penalty is to integrate the squared second derivative, leading to penalized sum of squares criterion: n i=1[yi −f (xi )]2 +λ [f (x)]2dx where λ is a tunning parameter controls smoothness. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 18
  • 19. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. GO to R Code file to: 1 Apply spline regression on prestige data, and 2 Study the effect of λ on smoothing Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 19
  • 20. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Semi/nonparametric regression estimation methods. What if we have more than one explanatory (independent) variable? 1 Kernel Regression 2 Spline Regression GO to R Code file Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 20
  • 21. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Discussion and Recommendations Pros: • It is flexible. • Better in fitting the data than parametric regression models. Cons: • Nonparametric regression requires larger sample sizes than regression based on parametric models because the data must supply the model structure as well as the model estimates. limitations Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 21
  • 22. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and Discussion and Recommendations Steps of modeling: • Graphical Analysis • If you have a nonlinear and unknown relationship between a response and an explanatory variable: • use transformation • add a new variable in the model to capture the relationship. • If transformation does not work, use nonparametric regression. Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 22
  • 23. What is semi/nonparametric regression? When should we use semi/nonparametric regression? Estimation methods. Discussion and References 1 D. Ruppert, M. P. Wand, and R. J. Carrol (2003), “Semiparametric Regression”, Cambridge University Press, NY. 2 J. S. Simonoff (1996) “Smoothing methods in statistics”, New York : Springer. 3 Y. Wang (2011) “Smoothing splines : methods and applications” Boca Raton, FL: CRC Press. 4 M.P. Wand and M.C. Jones (1995) “Kernel smoothing”, London; New York: Chapman and Hall . Hamdy Mahmoud - Email: ehamdy@vt.edu Parametric versus Semi/nonparametric Regression Models 23