SlideShare a Scribd company logo
1 of 25
EXTENSION OF
LINEAR REGRESSION
MULTIPLE LINEAR
REGRESSION
TITLE AND
CONTENT
LAYOUT WITH
LIST
 Multiple Linear Regression
 Estimating the Regression Coefficients
 Some Important Questions
SIMPLE VS
MULTIPLE
LINEAR
REGRESSION
WHAT IS MULTIPLE LINEAR REGRESSION?
Multiple linear regression (MLR), also known simply as multiple
regression, is a statistical technique that uses several
explanatory variables to predict the outcome of a response variable.
goal of multiple linear regression (MLR) is to model the linear
relationship between the explanatory.
To find the best-fit line for each independent variable,
multiple linear regression calculates three things:

The regression coefficients that lead to the smallest overall
model error.
The t-statistic of the overall model.
The associated p-value (how likely it is that the t-statistic
would have occurred by chance if the null hypothesis of no
relationship between the independent and dependent
variables was true).
It then calculates the t-statistic and p-value for each
regression coefficient in the model.
VISUALIZING THE RESULTS IN A GRAPH
ESTIMATING THE REGRESSION COEFFICIENTS
ESTIMATING THE
REGRESSION
COEFFICIENTS
The parameters are estimated using the same least squares approach that
we saw in the context of simple linear regression. We choose β0, β1,...,βp to
minimize the sum of squared residuals
FIGURE
In a three-dimensional setting, with two
predictors and one response, the least squares
regression line becomes a plane. The plane is
chosen to minimize the sum of the squared
vertical distances between each observation
(shown in red) and the plane.
SOME
IMPORTANT
QUESTIONS
When we perform multiple linear regression, we usually are interested in
answering a few important questions.
1. Is at least one of the predictors X1, X2,...,Xp useful in predicting the
response?
2. Do all the predictors help to explain Y , or is only a subset of the
predictors useful?
3. How well does the model fit the data?
4. Given a set of predictor values, what response value should we predict,
and how accurate is our prediction?
TWO: DECIDING ON
IMPORTANT VARIABLES
FORWARD SELECTION. WE BEGIN WITH THE NULL MODEL—A
MODEL THAT CON- FORWARD SELECTION NULL MODEL TAINS AN
INTERCEPT BUT NO PREDICTORS. WE THEN FIT P SIMPLE LINEAR
REGRESSIONS AND ADD TO THE NULL MODEL THE VARIABLE THAT
RESULTS IN THE LOWEST RSS. WE THEN ADD TO THAT MODEL THE
VARIABLE THAT RESULTS IN THE LOWEST RSS FOR THE NEW TWO-
VARIABLE MODEL. THIS APPROACH IS CONTINUED UNTIL SOME
STOPPING RULE IS SATISFIED.
BACKWARD SELECTION. WE START WITH ALL VARIABLES
IN THE MODEL, AND BACKWARD REMOVE THE
VARIABLE WITH THE LARGEST P-VALUE—THAT IS, THE
VARIABLE SELECTION THAT IS THE LEAST STATISTICALLY
SIGNIFICANT. THE NEW (P − 1)-VARIABLE MODEL IS FIT,
AND THE VARIABLE WITH THE LARGEST P-VALUE IS
REMOVED. THIS PROCEDURE CONTINUES UNTIL A
STOPPING RULE IS REACHED. FOR INSTANCE, WE MAY
STOP WHEN ALL REMAINING VARIABLES HAVE A P-
VALUE BELOW SOME THRESHOLD.
MIXED SELECTION. THIS IS A COMBINATION OF FORWARD
AND BACKWARD SE- MIXED LECTION. WE START WITH NO
VARIABLES IN THE MODEL, AND AS WITH FORWARD
SELECTION SELECTION, WE ADD THE VARIABLE THAT
PROVIDES THE BEST FIT. WE CONTINUE TO ADD
VARIABLES ONE-BY-ONE. OF COURSE, AS WE NOTED
WITH THE ADVERTISING EXAMPLE, THE P-VALUES FOR
VARIABLES CAN BECOME LARGER AS NEW PREDICTORS
ARE ADDED TO THE MODEL. HENCE, IF AT ANY POINT THE
P-VALUE FOR ONE OF THE VARIABLES IN THE MODEL
RISES ABOVE A CERTAIN THRESHOLD, THEN WE REMOVE
THAT VARIABLE FROM THE MODEL. WE CONTINUE TO
PERFORM THESE FORWARD AND BACKWARD STEPS UNTIL
ALL VARIABLES IN THE MODEL HAVE A SUFFICIENTLY LOW
P-VALUE, AND ALL VARIABLES OUTSIDE THE MODEL
WOULD HAVE A LARGE P-VALUE IF ADDED TO THE
MODEL.
FIGURE
For the Advertising data, a linear
regression fit to sales using TV and
radio as predictors. From the pattern
of the residuals, we can see that
there is a pronounced non-linear
relationship in the data. The positive
residuals (those visible above the
surface), tend to lie along the 45-
degree line, where TV and Radio
budgets are split evenly. The
negative residuals (most not visible),
tend to lie away from this line, where
budgets are more lopsided.
FOUR: PREDICTIONS
2. OF COURSE, IN PRACTICE ASSUMING A LINEAR MODEL FOR F(X) IS
ALMOST ALWAYS AN APPROXIMATION OF REALITY, SO THERE IS AN
ADDITIONAL SOURCE OF POTENTIALLY REDUCIBLE ERROR WHICH
WE CALL MODEL BIAS. SO WHEN WE USE A LINEAR MODEL, WE ARE
IN FACT ESTIMATING THE BEST LINEAR APPROXIMATION TO THE
TRUE SURFACE. HOWEVER, HERE WE WILL IGNORE THIS
DISCREPANCY, AND OPERATE AS IF THE LINEAR MODEL WERE
CORRECT.
3. EVEN IF WE KNEW F(X)—THAT
IS, EVEN IF WE KNEW THE TRUE
VALUES FOR Β0, Β1,...,ΒP—THE
RESPONSE VALUE CANNOT BE
PREDICTED PERFECTLY BECAUSE
OF THE RANDOM ERROR IN THE
MODEL REFERRED TO THIS AS
THE IRREDUCIBLE ERROR
THANK YOU

More Related Content

What's hot

AT&T Revenue Regression Forecast
AT&T Revenue Regression ForecastAT&T Revenue Regression Forecast
AT&T Revenue Regression Forecast
Craig Jenkins, MBA
 
Bank loan purchase modeling
Bank loan purchase modelingBank loan purchase modeling
Bank loan purchase modeling
Saleesh Satheeshchandran
 
IBM401 Lecture 5
IBM401 Lecture 5IBM401 Lecture 5
IBM401 Lecture 5
saark
 
Lecture8 Applied Econometrics and Economic Modeling
Lecture8 Applied Econometrics and Economic ModelingLecture8 Applied Econometrics and Economic Modeling
Lecture8 Applied Econometrics and Economic Modeling
stone55
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate data
Ulster BOCES
 
Empirical Finance, Jordan Stone- Linkedin
Empirical Finance, Jordan Stone- LinkedinEmpirical Finance, Jordan Stone- Linkedin
Empirical Finance, Jordan Stone- Linkedin
Jordan Stone
 

What's hot (20)

Percents lab
Percents labPercents lab
Percents lab
 
AT&T Revenue Regression Forecast
AT&T Revenue Regression ForecastAT&T Revenue Regression Forecast
AT&T Revenue Regression Forecast
 
Factors affecting customer satisfaction
Factors affecting customer satisfactionFactors affecting customer satisfaction
Factors affecting customer satisfaction
 
Lisrel
LisrelLisrel
Lisrel
 
Bank loan purchase modeling
Bank loan purchase modelingBank loan purchase modeling
Bank loan purchase modeling
 
HackerOne_Report
HackerOne_ReportHackerOne_Report
HackerOne_Report
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
 
IBM401 Lecture 5
IBM401 Lecture 5IBM401 Lecture 5
IBM401 Lecture 5
 
Lecture8 Applied Econometrics and Economic Modeling
Lecture8 Applied Econometrics and Economic ModelingLecture8 Applied Econometrics and Economic Modeling
Lecture8 Applied Econometrics and Economic Modeling
 
Interpreting Regression Results - Machine Learning
Interpreting Regression Results - Machine LearningInterpreting Regression Results - Machine Learning
Interpreting Regression Results - Machine Learning
 
R - Multiple Regression
R - Multiple RegressionR - Multiple Regression
R - Multiple Regression
 
Exploring bivariate data
Exploring bivariate dataExploring bivariate data
Exploring bivariate data
 
Structural equation-models-introduction-kimmo-vehkalahti-2013
Structural equation-models-introduction-kimmo-vehkalahti-2013Structural equation-models-introduction-kimmo-vehkalahti-2013
Structural equation-models-introduction-kimmo-vehkalahti-2013
 
PRM project report
PRM project reportPRM project report
PRM project report
 
Empirical Finance, Jordan Stone- Linkedin
Empirical Finance, Jordan Stone- LinkedinEmpirical Finance, Jordan Stone- Linkedin
Empirical Finance, Jordan Stone- Linkedin
 
Introduction to Regression Analysis
Introduction to Regression AnalysisIntroduction to Regression Analysis
Introduction to Regression Analysis
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Logarithmic transformations
Logarithmic transformationsLogarithmic transformations
Logarithmic transformations
 
Can you Deep Learn the Stock Market?
Can you Deep Learn the Stock Market?Can you Deep Learn the Stock Market?
Can you Deep Learn the Stock Market?
 
Best crime predictor: Linear Regression
Best crime predictor: Linear RegressionBest crime predictor: Linear Regression
Best crime predictor: Linear Regression
 

Similar to statistical learning theory

Recep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managersRecep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managers
recepmaz
 
Recep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managersRecep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managers
recepmaz
 
My regression lecture mk3 (uploaded to web ct)
My regression lecture   mk3 (uploaded to web ct)My regression lecture   mk3 (uploaded to web ct)
My regression lecture mk3 (uploaded to web ct)
chrisstiff
 
Distribution of EstimatesLinear Regression ModelAssume (yt,.docx
Distribution of EstimatesLinear Regression ModelAssume (yt,.docxDistribution of EstimatesLinear Regression ModelAssume (yt,.docx
Distribution of EstimatesLinear Regression ModelAssume (yt,.docx
madlynplamondon
 
linearregression-1909240jhgg53948.pptx
linearregression-1909240jhgg53948.pptxlinearregression-1909240jhgg53948.pptx
linearregression-1909240jhgg53948.pptx
bishalnandi2
 

Similar to statistical learning theory (20)

Data Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVAData Science - Part IV - Regression Analysis & ANOVA
Data Science - Part IV - Regression Analysis & ANOVA
 
Recep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managersRecep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managers
 
Recep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managersRecep maz msb 701 quantitative analysis for managers
Recep maz msb 701 quantitative analysis for managers
 
Unit 03 - Consolidated.pptx
Unit 03 - Consolidated.pptxUnit 03 - Consolidated.pptx
Unit 03 - Consolidated.pptx
 
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
 
manecohuhuhuhubasicEstimation-1.pptx
manecohuhuhuhubasicEstimation-1.pptxmanecohuhuhuhubasicEstimation-1.pptx
manecohuhuhuhubasicEstimation-1.pptx
 
Machine Learning-Linear regression
Machine Learning-Linear regressionMachine Learning-Linear regression
Machine Learning-Linear regression
 
My regression lecture mk3 (uploaded to web ct)
My regression lecture   mk3 (uploaded to web ct)My regression lecture   mk3 (uploaded to web ct)
My regression lecture mk3 (uploaded to web ct)
 
Distribution of EstimatesLinear Regression ModelAssume (yt,.docx
Distribution of EstimatesLinear Regression ModelAssume (yt,.docxDistribution of EstimatesLinear Regression ModelAssume (yt,.docx
Distribution of EstimatesLinear Regression ModelAssume (yt,.docx
 
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai UniversityMachine Learning Unit 3 Semester 3  MSc IT Part 2 Mumbai University
Machine Learning Unit 3 Semester 3 MSc IT Part 2 Mumbai University
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
 
SEM
SEMSEM
SEM
 
linearregression-1909240jhgg53948.pptx
linearregression-1909240jhgg53948.pptxlinearregression-1909240jhgg53948.pptx
linearregression-1909240jhgg53948.pptx
 
PERFORMANCE_PREDICTION__PARAMETERS[1].pptx
PERFORMANCE_PREDICTION__PARAMETERS[1].pptxPERFORMANCE_PREDICTION__PARAMETERS[1].pptx
PERFORMANCE_PREDICTION__PARAMETERS[1].pptx
 
All PERFORMANCE PREDICTION PARAMETERS.pptx
All PERFORMANCE PREDICTION  PARAMETERS.pptxAll PERFORMANCE PREDICTION  PARAMETERS.pptx
All PERFORMANCE PREDICTION PARAMETERS.pptx
 
Multiple Linear Regression
Multiple Linear Regression Multiple Linear Regression
Multiple Linear Regression
 
Applications of regression analysis - Measurement of validity of relationship
Applications of regression analysis - Measurement of validity of relationshipApplications of regression analysis - Measurement of validity of relationship
Applications of regression analysis - Measurement of validity of relationship
 
MF Presentation.pptx
MF Presentation.pptxMF Presentation.pptx
MF Presentation.pptx
 
Use of Linear Regression in Machine Learning for Ranking
Use of Linear Regression in Machine Learning for RankingUse of Linear Regression in Machine Learning for Ranking
Use of Linear Regression in Machine Learning for Ranking
 
Statistical analysis in SPSS_
Statistical analysis in SPSS_ Statistical analysis in SPSS_
Statistical analysis in SPSS_
 

Recently uploaded

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 

Recently uploaded (20)

What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
WSO2CON 2024 - Not Just Microservices: Rightsize Your Services!
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration ToolingWSO2Con2024 - Low-Code Integration Tooling
WSO2Con2024 - Low-Code Integration Tooling
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
WSO2Con2024 - Organization Management: The Revolution in B2B CIAM
WSO2Con2024 - Organization Management: The Revolution in B2B CIAMWSO2Con2024 - Organization Management: The Revolution in B2B CIAM
WSO2Con2024 - Organization Management: The Revolution in B2B CIAM
 
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
WSO2Con2024 - Simplified Integration: Unveiling the Latest Features in WSO2 L...
 
Announcing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK SoftwareAnnouncing Codolex 2.0 from GDK Software
Announcing Codolex 2.0 from GDK Software
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
WSO2CON 2024 - Unlocking the Identity: Embracing CIAM 2.0 for a Competitive A...
 
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdfAzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
AzureNativeQumulo_HPC_Cloud_Native_Benchmarks.pdf
 
WSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - KeynoteWSO2Con204 - Hard Rock Presentation - Keynote
WSO2Con204 - Hard Rock Presentation - Keynote
 
WSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
WSO2CON2024 - Why Should You Consider Ballerina for Your Next IntegrationWSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
WSO2CON2024 - Why Should You Consider Ballerina for Your Next Integration
 
WSO2Con2024 - Unleashing the Financial Potential of 13 Million People
WSO2Con2024 - Unleashing the Financial Potential of 13 Million PeopleWSO2Con2024 - Unleashing the Financial Potential of 13 Million People
WSO2Con2024 - Unleashing the Financial Potential of 13 Million People
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
WSO2CON 2024 - Building a Digital Government in Uganda
WSO2CON 2024 - Building a Digital Government in UgandaWSO2CON 2024 - Building a Digital Government in Uganda
WSO2CON 2024 - Building a Digital Government in Uganda
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
WSO2CON 2024 - API Management Usage at La Poste and Its Impact on Business an...
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
WSO2Con2024 - GitOps in Action: Navigating Application Deployment in the Plat...
 

statistical learning theory

  • 2. TITLE AND CONTENT LAYOUT WITH LIST  Multiple Linear Regression  Estimating the Regression Coefficients  Some Important Questions
  • 4. WHAT IS MULTIPLE LINEAR REGRESSION? Multiple linear regression (MLR), also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. goal of multiple linear regression (MLR) is to model the linear relationship between the explanatory.
  • 5.
  • 6. To find the best-fit line for each independent variable, multiple linear regression calculates three things:  The regression coefficients that lead to the smallest overall model error. The t-statistic of the overall model. The associated p-value (how likely it is that the t-statistic would have occurred by chance if the null hypothesis of no relationship between the independent and dependent variables was true). It then calculates the t-statistic and p-value for each regression coefficient in the model.
  • 9. ESTIMATING THE REGRESSION COEFFICIENTS The parameters are estimated using the same least squares approach that we saw in the context of simple linear regression. We choose β0, β1,...,βp to minimize the sum of squared residuals
  • 10. FIGURE In a three-dimensional setting, with two predictors and one response, the least squares regression line becomes a plane. The plane is chosen to minimize the sum of the squared vertical distances between each observation (shown in red) and the plane.
  • 11. SOME IMPORTANT QUESTIONS When we perform multiple linear regression, we usually are interested in answering a few important questions. 1. Is at least one of the predictors X1, X2,...,Xp useful in predicting the response? 2. Do all the predictors help to explain Y , or is only a subset of the predictors useful? 3. How well does the model fit the data? 4. Given a set of predictor values, what response value should we predict, and how accurate is our prediction?
  • 12.
  • 13.
  • 14.
  • 16. FORWARD SELECTION. WE BEGIN WITH THE NULL MODEL—A MODEL THAT CON- FORWARD SELECTION NULL MODEL TAINS AN INTERCEPT BUT NO PREDICTORS. WE THEN FIT P SIMPLE LINEAR REGRESSIONS AND ADD TO THE NULL MODEL THE VARIABLE THAT RESULTS IN THE LOWEST RSS. WE THEN ADD TO THAT MODEL THE VARIABLE THAT RESULTS IN THE LOWEST RSS FOR THE NEW TWO- VARIABLE MODEL. THIS APPROACH IS CONTINUED UNTIL SOME STOPPING RULE IS SATISFIED.
  • 17. BACKWARD SELECTION. WE START WITH ALL VARIABLES IN THE MODEL, AND BACKWARD REMOVE THE VARIABLE WITH THE LARGEST P-VALUE—THAT IS, THE VARIABLE SELECTION THAT IS THE LEAST STATISTICALLY SIGNIFICANT. THE NEW (P − 1)-VARIABLE MODEL IS FIT, AND THE VARIABLE WITH THE LARGEST P-VALUE IS REMOVED. THIS PROCEDURE CONTINUES UNTIL A STOPPING RULE IS REACHED. FOR INSTANCE, WE MAY STOP WHEN ALL REMAINING VARIABLES HAVE A P- VALUE BELOW SOME THRESHOLD.
  • 18. MIXED SELECTION. THIS IS A COMBINATION OF FORWARD AND BACKWARD SE- MIXED LECTION. WE START WITH NO VARIABLES IN THE MODEL, AND AS WITH FORWARD SELECTION SELECTION, WE ADD THE VARIABLE THAT PROVIDES THE BEST FIT. WE CONTINUE TO ADD VARIABLES ONE-BY-ONE. OF COURSE, AS WE NOTED WITH THE ADVERTISING EXAMPLE, THE P-VALUES FOR VARIABLES CAN BECOME LARGER AS NEW PREDICTORS ARE ADDED TO THE MODEL. HENCE, IF AT ANY POINT THE P-VALUE FOR ONE OF THE VARIABLES IN THE MODEL RISES ABOVE A CERTAIN THRESHOLD, THEN WE REMOVE THAT VARIABLE FROM THE MODEL. WE CONTINUE TO PERFORM THESE FORWARD AND BACKWARD STEPS UNTIL ALL VARIABLES IN THE MODEL HAVE A SUFFICIENTLY LOW P-VALUE, AND ALL VARIABLES OUTSIDE THE MODEL WOULD HAVE A LARGE P-VALUE IF ADDED TO THE MODEL.
  • 19.
  • 20. FIGURE For the Advertising data, a linear regression fit to sales using TV and radio as predictors. From the pattern of the residuals, we can see that there is a pronounced non-linear relationship in the data. The positive residuals (those visible above the surface), tend to lie along the 45- degree line, where TV and Radio budgets are split evenly. The negative residuals (most not visible), tend to lie away from this line, where budgets are more lopsided.
  • 22.
  • 23. 2. OF COURSE, IN PRACTICE ASSUMING A LINEAR MODEL FOR F(X) IS ALMOST ALWAYS AN APPROXIMATION OF REALITY, SO THERE IS AN ADDITIONAL SOURCE OF POTENTIALLY REDUCIBLE ERROR WHICH WE CALL MODEL BIAS. SO WHEN WE USE A LINEAR MODEL, WE ARE IN FACT ESTIMATING THE BEST LINEAR APPROXIMATION TO THE TRUE SURFACE. HOWEVER, HERE WE WILL IGNORE THIS DISCREPANCY, AND OPERATE AS IF THE LINEAR MODEL WERE CORRECT.
  • 24. 3. EVEN IF WE KNEW F(X)—THAT IS, EVEN IF WE KNEW THE TRUE VALUES FOR Β0, Β1,...,ΒP—THE RESPONSE VALUE CANNOT BE PREDICTED PERFECTLY BECAUSE OF THE RANDOM ERROR IN THE MODEL REFERRED TO THIS AS THE IRREDUCIBLE ERROR