Multiple Regression
www.HelpWithAssignment.com
 The methods of simple linear regression, discussed in
Chapter 7, apply when we wish to fit a linear model
relating the value of a dependent variable y to the
value of a single independent variable x.
 There are many situations when a single independent
variable is not enough.
 In situations like this, there are several independent
variables, x1,x2,…,xp, that are related to a dependent
variable y.
 Assume that we have a sample of n items
and that on each item we have measured a
dependent variable y and p independent
variables, x1,x2,…,xp.
 The ith sampled item gives rise to the
ordered set (yi,x1i,…,xpi).
 We can then fit the multiple regression
model yi = β0 + β1x1i +…+ βpxpi + εi.
 Polynomial regression model (the independent variables are
all powers of a single variable)
 Quadratic model (a polynomial regression model of degree 2,
which may include powers and products of several variables)
 A variable that is the product of two other variables is called an
interaction.
 These models are considered linear models, even though they contain
nonlinear terms in the independent variables. The reason is that they
are linear in the coefficients, βi .
yi = β0 + β1x1i + β2x2i + β3x1ix2i + β4x1i^2 + β5x2i^2 + εi
 In any multiple regression model, the estimates β̂0, β̂1, …, β̂p
are computed by least-squares, just as in simple linear
regression. The equation
ŷ = β̂0 + β̂1x1 + β̂2x2 + … + β̂pxp
is called the least-squares equation or fitted
regression equation.
 The residuals are the quantities
ei = yi − ŷi
which are the differences between the observed y values
and the y values given by the equation.
 We want to compute β̂0, β̂1, …, β̂p so as to minimize the sum of
the squared residuals. This is complicated, and we rely on
computers to calculate them.
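The least-squares computation that the slides defer to software can be sketched in a few lines of NumPy. The data values below are made up for illustration; they are not from the text:

```python
import numpy as np

# Illustrative data: n = 6 observations, p = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Design matrix: a column of ones for beta_0, then one column per variable.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares estimates minimize the sum of squared residuals.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ beta_hat       # fitted values
e = y - y_hat              # residuals e_i = y_i - y_hat_i
```

Because an intercept column is included, the normal equations force the residuals to be orthogonal to every column of X, so in particular they sum to zero.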
 Much of the analysis in multiple regression is based on three
fundamental quantities.
 regression sum of squares (SSR),
 error sum of squares (SSE)
 total sum of squares (SST)
 these quantities are the same as defined in Chapter 7.
 The analysis of variance identity is
SST = SSR + SSE
Recall: Assumptions for Errors in Linear Models:
In the simplest situation, the following assumptions are
satisfied (notice that these are the same as for simple
linear regression.):
1. The errors ε1,…,εn are random and independent. In
particular, the magnitude of any error εi does not influence
the value of the next error εi+1.
2. The errors ε1,…,εn all have mean 0.
3. The errors ε1,…,εn all have the same variance, which we
denote by σ2
.
4. The errors ε1,…,εn are normally distributed.
 The three statistics most often used in multiple regression
are the estimated error variance s², the coefficient of
determination R², and the F statistic.
 The estimated error variance is
s² = Σi (yi − ŷi)² / (n − p − 1) = SSE / (n − p − 1).
We divide by n − p − 1 rather than n − 1 because we
are estimating p + 1 coefficients.
 The estimated variance of each least-squares coefficient is a
complicated calculation, and we can find them using a
computer.
 The value of R² is calculated in the same way as r² in simple
linear regression:
R² = 1 − SSE/SST = SSR/SST.
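The sums of squares and the two derived statistics can be computed directly from a fit. A minimal sketch, again on made-up illustrative data:

```python
import numpy as np

# Illustrative fit with p = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0])
y  = np.array([3.0, 4.1, 7.0, 8.2, 10.9, 12.1, 15.0])
n, p = len(y), 2

X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta_hat

SSE = np.sum((y - y_hat) ** 2)       # error sum of squares
SST = np.sum((y - y.mean()) ** 2)    # total sum of squares
SSR = SST - SSE                      # analysis of variance identity

s2 = SSE / (n - p - 1)               # estimated error variance
R2 = 1 - SSE / SST                   # coefficient of determination
```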
 When assumptions 1 through 4 are satisfied, the
quantity
(β̂i − βi) / sβ̂i
has a Student’s t distribution with n − p − 1
degrees of freedom, where sβ̂i is the estimated
standard deviation of β̂i.
 The number of degrees of freedom is equal to the
denominator used to compute the estimated error
variance.
 This statistic is used to compute confidence
intervals and to perform hypothesis tests, as we did
with simple linear regression.
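As an illustration, here is the 95% confidence interval for the coefficient of Speed in the goodput model fitted later in these slides (coefficient −1.8245, standard deviation 0.2376, 19 degrees of freedom). The critical value t0.025,19 = 2.093 is taken from a t table:

```python
# 95% CI for the coefficient of Speed, from the Minitab output later in the deck:
# beta_hat = -1.8245, s_beta = 0.2376, n - p - 1 = 19 degrees of freedom.
coef, se, df = -1.8245, 0.2376, 19
t_crit = 2.093                      # t_{0.025, 19} from a t table

lo = coef - t_crit * se
hi = coef + t_crit * se
print(round(lo, 4), round(hi, 4))   # approximately (-2.3218, -1.3272)
```

Since the interval does not contain 0, the Speed coefficient is significantly different from 0 at the 5% level, consistent with its small P-value in the output.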
 In simple linear regression, a test of the null hypothesis β1 = 0 is
almost always made. If this hypothesis is not rejected, then the
linear model may not be useful.
 The analogous test in multiple linear regression is
H0: β1 = β2 = … = βp = 0. This is a very strong hypothesis. It says that
none of the independent variables has any linear relationship
with the dependent variable.
 The test statistic for this hypothesis is
F = (SSR / p) / (SSE / (n − p − 1)).
 This is an F statistic and its null distribution is Fp,n−p−1. Note that the
denominator of the F statistic is s². The subscripts p and n − p − 1 are
the degrees of freedom for the F statistic.
 Slightly different versions of the F statistic can be used to test
milder null hypotheses.
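As a check of this formula, the ANOVA table in the goodput output that follows gives SSR = 2240.49 and SSE = 164.46 with p = 5 and n = 25:

```python
# F statistic from the goodput ANOVA table in the Minitab output below.
SSR, SSE, p, n = 2240.49, 164.46, 5, 25

F = (SSR / p) / (SSE / (n - p - 1))
print(round(F, 2))    # 51.77, matching the Minitab output
```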
The regression equation is
Goodput = 96.0 - 1.82 Speed + 0.565 Pause + 0.0247 Speed*Pause + 0.0140 Speed^2
- 0.0118 Pause^2
Predictor Coef StDev T P
Constant 96.024 3.946 24.34 0.000
Speed -1.8245 0.2376 -7.68 0.000
Pause 0.5652 0.2256 2.51 0.022
Speed*Pa 0.024731 0.003249 7.61 0.000
Speed^2 0.014020 0.004745 2.95 0.008
Pause^2 -0.011793 0.003516 -3.35 0.003
S = 2.942 R-Sq = 93.2% R-Sq(adj) = 91.4%
Analysis of Variance
Source DF SS MS F P
Regression 5 2240.49 448.10 51.77 0.000
Residual Error 19 164.46 8.66
Total 24 2404.95
Predicted Values for New Observations
New
Obs Fit SE Fit 95% CI 95% PI
1 74.272 1.175 (71.812, 76.732) (67.641, 80.903)
Values of Predictors for New Observations
New
Obs Speed Pause Speed*Pause Speed^2 Pause^2
1 25.0 15.0 375 625 225
Speed Pause Goodput
5 10 95.111
5 20 94.577
5 30 94.734
5 40 94.317
5 50 94.644
10 10 90.8
10 20 90.183
10 30 91.341
10 40 91.321
10 50 92.104
20 10 72.422
20 20 82.089
20 30 84.937
20 40 87.8
20 50 89.941
30 10 62.963
30 20 76.126
30 30 84.855
30 40 87.694
30 50 90.556
40 10 55.298
40 20 78.262
40 30 84.624
40 40 87.078
40 50 90.101
Use the multiple regression model to predict the
goodput for a network with speed 12 m/s and pause
time 25 s.
For the goodput data, find the residual for the point
Speed = 20, Pause = 30.
Find a 95% confidence interval for the coefficient of
Speed in the multiple regression model.
Test the null hypothesis that the coefficient of Pause
is less than or equal to 0.3.
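The first two exercises can be worked directly from the fitted equation in the Minitab output above (coefficients as printed there):

```python
# Fitted regression equation from the Minitab output above.
def predict(speed, pause):
    return (96.0 - 1.82 * speed + 0.565 * pause
            + 0.0247 * speed * pause
            + 0.0140 * speed ** 2 - 0.0118 * pause ** 2)

# Predicted goodput for speed 12 m/s, pause time 25 s:
print(round(predict(12, 25), 3))     # 90.336

# Residual for the point Speed = 20, Pause = 30 (observed goodput 84.937):
residual = 84.937 - predict(20, 30)
print(round(residual, 3))            # -1.413
```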
 It is important in multiple linear regression to test
the validity of the assumptions for errors in the
linear model.
•Errors are random and independent: Residuals vs. run order
•Errors all have mean of zero: Residuals vs. each independent
variable
•Errors all have the same variance: Residuals vs. fitted values
•Errors are normally distributed: Normal or Half-Normal plot of residuals
If the residual plots indicate a violation of assumptions, transformations can
be tried.
Fitting separate models to each variable is not
the same as fitting the multivariate model.
Consider the following example: There are 225
gas wells that received “fracture treatment” in
order to increase production. In this treatment,
fracture fluid, which consists of fluid mixed with
sand, is pumped into the well. The sand holds
open the cracks in the rock, thus increasing the
flow of gas.
We can use sand to predict production or fluid to predict production. If we
fit a separate simple linear regression model for each, both sand and fluid
show up as important predictors.
We might be tempted to conclude that increasing the volume of fluid or the
volume of sand would increase production.
There is confounding in this situation. If we increase the volume of fluid,
then we also increase the volume of sand.
If production depends only on the volume of sand, there will still be a
relationship in the data between production and fluid, and vice versa.
The following output presents results using only one independent variable (fluid or
sand) in the model. Note that log transformations have been done. Both fluid
and sand have a statistically significant effect.
The regression equation is
ln Prod = - 0.444 + 0.798 ln Fluid
Predictor Coef StDev T P
Constant -0.4442 0.5853 -0.76 0.449
ln Fluid 0.79833 0.08010 9.97 0.000
S = 0.7459 R-Sq = 28.2% R-Sq(adj) = 27.9%
The regression equation is
ln Prod = - 0.778 + 0.748 ln Sand
Predictor Coef StDev T P
Constant -0.7784 0.6912 -1.13 0.261
ln Sand 0.74751 0.08381 8.92 0.000
S = 0.7678 R-Sq = 23.9% R-Sq(adj) = 23.6%
This output presents results from multiple linear regression, in
which both fluid and sand are included in the model. In
contrast to the separate simple linear regression results,
only fluid has a statistically significant effect; sand does
not.
The regression equation is
ln Prod = - 0.729 + 0.670 ln Fluid + 0.148 ln Sand
Predictor Coef StDev T P
Constant -0.7288 0.6719 -1.08 0.279
ln Fluid 0.6701 0.1687 3.97 0.000
ln Sand 0.1481 0.1714 0.86 0.389
S = 0.7463 R-Sq = 28.4% R-Sq(adj) = 27.8%
 When two independent variables are
very strongly correlated, multiple
regression may not be able to
determine which is the important one.
 In this case, the variables are said to be
collinear.
 The word collinear means to lie on the
same line, and when two variables are
highly correlated, their scatterplot is
approximately a straight line.
•The word multicollinearity is sometimes used as well, meaning that
multiple variables are highly correlated with each other.
•When collinearity is present, the set of independent variables is
sometimes said to be ill-conditioned.
 There are many situations in which a large number
of independent variables have been measured, and
we need to decide which of them to include in the
model.
 This is the problem of model selection, and it is not
an easy one.
 Good model selection rests on this basic principle
known as Occam’s razor:
“The best scientific model is the simplest model
that explains the observed data.”
 In terms of linear models, Occam’s razor implies
the principle of parsimony:
“A model should contain the smallest number of
variables necessary to fit the data.”
1. A linear model should always contain an
intercept, unless physical theory dictates
otherwise.
2. If a power x^n of a variable is included in the model,
all lower powers x, x^2, …, x^(n−1) should be included as
well, unless physical theory dictates otherwise.
3. If a product xy of two variables is included in a
model, then the variables x and y should be
included separately as well, unless physical
theory dictates otherwise.
What is the effect of X on Y?
Draw a smooth curve through the data
of what you would expect a good model
to look like.
 First, check if an entire variable can be eliminated
(including linear, quadratic, and interaction terms)
 Ex: yi = β0 + β1x1 + β2x2 + β3x1^2 + β4x2^2 + β5x1x2 + εi
 Can all x1 terms (x1, x1^2, x1x2) be dropped as a group?
 Next, drop other insignificant terms one at a time,
starting with the term with the highest p value.
 Removing a term will change the coefficients and p-values
of the remaining terms.
 Often called “backward elimination”
 It often happens that one has formed a model
that contains a large number of independent
variables, and one wishes to determine whether
a given subset of them may be dropped from
the model without significantly reducing the
accuracy of the model.
 Assume that we know that the model
yi=β0 + β1x1i +…+βkxki+βk+1xk+1i +… βpxpi + εi is correct.
We will call this the full model.
 We wish to test the null hypothesis H0: βk+1 = … = βp = 0.
 If H0 is true, the model will remain correct if we
drop the variables xk+1,…,xp, so we can replace the
full model with the following reduced model:
yi = β0 + β1x1i + … + βkxki + εi.
 To develop a test statistic for H0, we begin by
computing the error sums of squares for
both the full and reduced models.
 We call this SSfull and SSreduced, respectively.
 The number of degrees of freedom:
 Full Model: n – p – 1
 Reduced Model: n – k – 1.
 If the full model is correct, then the error variance σ² is
well estimated by SSEfull/(n − p − 1).
 Under the null hypothesis, the reduced model is also
correct, and the variance can be estimated by
SSEreduced/(n − k − 1). In other words,
 SSEfull ≈ (n − p − 1)σ²
 SSEreduced ≈ (n − k − 1)σ²
 The difference between the above is:
 SSEreduced − SSEfull ≈ (p − k)σ²
 So, if the null hypothesis is true, σ² can also be
estimated by:
(SSEreduced − SSEfull) / (p − k)
The test statistic is
f = [(SSEreduced − SSEfull) / (p − k)] / [SSEfull / (n − p − 1)]
 If H0 is true, then f tends to be close to 1. If
H0 is false, then f tends to be larger.
 The test statistic can be thought of as the
variance explained by the dropped terms
divided by our best estimate of the variance.
 You fit the data from a Central-Composite design in 3 factors with a full
quadratic equation.
 The sum of the squared errors from the regression was:
SSEfull = 175 with νR = 10 degrees of freedom (s²R = 17.50)
 When factor X2 was dropped from the model (4 terms), the sum of squares
increased to:
SSEreduced = 323 with νR = 14 (s²R = 23.07)
 Is X2 needed in the model?
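Plugging the numbers above into the partial F statistic:

```python
# Partial F test for dropping factor X2 and its terms, using the numbers above.
SSE_full, df_full = 175.0, 10    # full quadratic model
SSE_red,  df_red  = 323.0, 14    # model with the X2 block (4 terms) removed
n_dropped = df_red - df_full     # 4 dropped terms

f = ((SSE_red - SSE_full) / n_dropped) / (SSE_full / df_full)
print(round(f, 2))               # 2.11
```

Since f ≈ 2.11 is below the 5% critical value F4,10 ≈ 3.48, we do not reject H0: the X2 block of terms can be dropped from the model.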
 This method is very useful for developing
parsimonious models by removing unnecessary
variables. However, the conditions under which it is
formally correct are rarely met.
 More often, a large model is fit, some of the variables
are seen to have fairly large P-values, and the F test is
used to decide whether to drop them from the model.
 It is often the case that there is no one “correct”
model. There are several models that fit equally well.
 When there is little or no physical theory to rely on,
many different models will fit the data about
equally well.
 The methods for choosing a model involve
statistics, whose values depend on the data.
Therefore, if the experiment is repeated, these
statistics will come out differently, and different
models may appear to be “best.”
 Some or all of the independent variables in a
selected model may not really be related to the
dependent variable. Whenever possible,
experiments should be repeated to test these
apparent relationships.
 Model selection is an art, not a science.
A = Temperature
B = Catalyst
concentration
C = Pressure
Y = Yield of ester
Blocks A B C AB AC BC A^2 B^2 C^2 Y
1 -1 -1 -1 1 1 1 1 1 1 17
1 1 -1 -1 -1 -1 1 1 1 1 44
1 -1 1 -1 -1 1 -1 1 1 1 30
1 1 1 -1 1 -1 -1 1 1 1 52
1 -1 -1 1 1 -1 -1 1 1 1 7
1 1 -1 1 -1 1 -1 1 1 1 55
1 -1 1 1 -1 -1 1 1 1 1 27
1 1 1 1 1 1 1 1 1 1 61
1 0 0 0 0 0 0 0 0 0 29
1 0 0 0 0 0 0 0 0 0 29
1 0 0 0 0 0 0 0 0 0 30
2 -1.68 0 0 0 0 0 2.83 0 0 18
2 1.682 0 0 0 0 0 2.83 0 0 80
2 0 -1.68 0 0 0 0 0 2.83 0 21
2 0 1.682 0 0 0 0 0 2.83 0 82
2 0 0 -1.68 0 0 0 0 0 2.83 35
2 0 0 1.682 0 0 0 0 0 2.83 31
2 0 0 0 0 0 0 0 0 0 28
2 0 0 0 0 0 0 0 0 0 27
2 0 0 0 0 0 0 0 0 0 29
  Model Res |Res| q Z
1 18.2 -0.23 0.23 0.51 0.03
2 31.0 -0.50 0.50 0.54 0.09
3 43.0 0.99 0.99 0.56 0.16
4 31.5 -2.15 2.15 0.59 0.22
5 26.0 2.50 2.50 0.61 0.29
6 26.0 3.00 3.00 0.64 0.35
7 30.4 -3.36 3.36 0.66 0.42
8 31.5 -3.45 3.45 0.69 0.49
9 76.0 3.59 3.59 0.71 0.56
10 31.1 3.86 3.86 0.74 0.64
11 26.0 4.00 4.00 0.76 0.71
12 2.6 4.42 4.42 0.79 0.80
13 31.5 -4.45 4.45 0.81 0.89
14 12.2 4.84 4.84 0.84 0.98
15 49.9 5.07 5.07 0.86 1.09
16 58.8 -6.80 6.80 0.89 1.21
17 68.2 -7.21 7.21 0.91 1.36
18 37.4 -7.44 7.44 0.94 1.53
19 31.3 -10.25 10.25 0.96 1.78
20 67.9 13.61 13.61 0.99 2.24
qi = ½(1 + (i − ½)/n)
where Zi is the corresponding standard normal quantile, Zi = Φ⁻¹(qi).
[Half-Normal Plot of Residuals: Z score vs. |Residuals|]
→ Remove factor C and all of its terms?
Step 3: Trim Model – Variable ‘C’ Removed
Next: Test individual terms. Is the AB interaction needed?
Test for significance of C block of terms:
→ Only significant terms left in model.
 Table 8.4
 Your book also discusses
 Best subsets regression
 Stepwise regression
 Includes forward selection and backward elimination
 We won’t cover these methods in detail in this
class
 This is the most widely used model selection technique.
 Its main advantage over best subsets regression is
that it is less computationally intensive, so it can be
used in situations where there are a very large
number of candidate independent variables and too
many possible subsets for every one of them to be
examined.
 The user chooses two threshold P-values, αin and αout,
with αin < αout.
 The stepwise regression procedure begins with a step
called a forward selection step, in which the
independent variable with the smallest P-value is
selected, provided that P < αin.
 This variable is entered in the model, creating a model
with a single independent variable.
 In the next step, the remaining variables are examined
one at a time as candidates for the second variable in
the model. The one with the smallest P-value is added
to the model, again provided that P < αin.
 Now, it is possible that adding the second variable to
the model increased the P-value of the first variable.
In the next step, called a backward elimination
step, the first variable is dropped from the model if its
P-value has grown to exceed the value αout.
 The algorithm continues by alternating forward
selection steps with backward elimination steps.
 The algorithm terminates when no variables meet the
criteria for being added to or dropped from the model.
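A simplified sketch of the procedure follows. To keep it self-contained, it thresholds the coefficient |t| statistics directly instead of P-values (for a fixed sample size the two orderings agree); the thresholds t_in > t_out play the roles of αin < αout. The data are synthetic:

```python
import numpy as np

def abs_t_stats(X, y):
    """Fit y on X (with intercept) by least squares; return |t| for each column of X."""
    n, m = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    s2 = resid @ resid / (n - m - 1)      # estimated error variance
    cov = s2 * np.linalg.inv(A.T @ A)     # covariance matrix of the estimates
    se = np.sqrt(np.diag(cov))[1:]        # standard errors, intercept skipped
    return np.abs(beta[1:] / se)

def stepwise(X, y, t_in=2.1, t_out=1.9):
    selected = []
    while True:
        changed = False
        # Forward selection step: add the candidate with the largest |t|,
        # provided it exceeds t_in.
        best, best_t = None, t_in
        for j in range(X.shape[1]):
            if j in selected:
                continue
            t = abs_t_stats(X[:, selected + [j]], y)[-1]
            if t > best_t:
                best, best_t = j, t
        if best is not None:
            selected.append(best)
            changed = True
        # Backward elimination step: drop the selected variable whose |t|
        # has fallen below t_out, if any.
        if selected:
            ts = abs_t_stats(X[:, selected], y)
            worst = int(np.argmin(ts))
            if ts[worst] < t_out:
                selected.pop(worst)
                changed = True
        if not changed:
            return sorted(selected)

# Synthetic example: y depends on variables 0 and 2 but not on variable 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=60)
sel = stepwise(X, y)
print(sel)    # variables 0 and 2 should be selected
```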
www.HelpWithAssignment.com is an online tutoring and
Live Assignment Help company. We provide seamless online
tuitions in sessions of 30 minutes, 60 minutes or 120 minutes
covering a variety of subjects. The specialties of HWA online
tuitions are:
•Conducted by experts in the subject taught
•Tutors selected after rigorous assessment and training
•Tutoring sessions follow a pre-decided structure based on
instructional design best practices
•State-of-the-art technology: with a whiteboard,
document sharing facility, video and audio conferencing as
well as chat support, HWA's one-on-one tuitions have a large
following. Several thousand hours of tuitions have already
been delivered to the satisfaction of customers.
In short, HWA's online tuitions are seamless, personalized and
convenient.
WWW.HELPWITHASSIGNMENT.COM
www.HelpWithAssignment.com
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 

Recently uploaded (20)

भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 

Get Multiple Regression Assignment Help

  • 2.  The methods of simple linear regression, discussed in Chapter 7, apply when we wish to fit a linear model relating the value of a dependent variable y to the value of a single independent variable x.  There are many situations in which a single independent variable is not enough.  In such situations, there are several independent variables, x1,x2,…,xp, that are related to a dependent variable y.
  • 3.  Assume that we have a sample of n items and that on each item we have measured a dependent variable y and p independent variables, x1,x2,…,xp.  The ith sampled item gives rise to the ordered set (yi,x1i,…,xpi).  We can then fit the multiple regression model yi = β0 + β1x1i +…+ βpxpi + εi.
  • 4.  Polynomial regression model (the independent variables are all powers of a single variable): yi = β0 + β1xi + β2xi^2 + … + βpxi^p + εi.  Quadratic model (a polynomial regression model of degree 2, possibly involving powers and products of several variables), for example yi = β0 + β1x1i + β2x2i + β3x1ix2i + β4x1i^2 + β5x2i^2 + εi.  A variable that is the product of two other variables is called an interaction.  These models are considered linear models, even though they contain nonlinear terms in the independent variables. The reason is that they are linear in the coefficients, βi.
  • 5.  In any multiple regression model, the estimates β̂0, β̂1, …, β̂p are computed by least-squares, just as in simple linear regression. The equation ŷ = β̂0 + β̂1x1 + … + β̂pxp is called the least-squares equation or fitted regression equation.  The residuals are the quantities ei = yi − ŷi, which are the differences between the observed y values and the y values given by the equation.  We want to compute β̂0, β̂1, …, β̂p so as to minimize the sum of the squared residuals. This is complicated and we rely on computers to calculate them.
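The least-squares computation that the slide leaves to the computer can be sketched in a few lines. This is an illustrative example only; the data values and variable names below are made up, not from the slides:

```python
import numpy as np

# Hypothetical data: n = 6 observations, p = 2 independent variables.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y  = np.array([5.1, 5.9, 9.2, 9.8, 13.1, 13.9])

# Design matrix with a leading column of ones for the intercept beta_0.
X = np.column_stack([np.ones_like(x1), x1, x2])

# The least-squares estimates minimize the sum of squared residuals.
beta_hat, _, _, _ = np.linalg.lstsq(X, y, rcond=None)

y_fit = X @ beta_hat      # fitted values y-hat_i
residuals = y - y_fit     # e_i = y_i - y-hat_i
print(beta_hat)
print(residuals)
```

A characteristic property of the least-squares solution is that the residual vector is orthogonal to every column of the design matrix, which is a convenient way to check a fit.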
  • 6.  Much of the analysis in multiple regression is based on three fundamental quantities.  regression sum of squares (SSR),  error sum of squares (SSE)  total sum of squares (SST)  these quantities are the same as defined in Chapter 7.  The analysis of variance identity is SST = SSR + SSE
  • 7. Recall: Assumptions for Errors in Linear Models: In the simplest situation, the following assumptions are satisfied (notice that these are the same as for simple linear regression.): 1. The errors ε1,…,εn are random and independent. In particular, the magnitude of any error εi does not influence the value of the next error εi+1. 2. The errors ε1,…,εn all have mean 0. 3. The errors ε1,…,εn all have the same variance, which we denote by σ2 . 4. The errors ε1,…,εn are normally distributed.
  • 8.  The three statistics most often used in multiple regression are the estimated error variance s2, the coefficient of determination R2, and the F statistic.  We have to adjust the estimated standard deviation since we are estimating p + 1 coefficients: s2 = Σi=1..n (yi − ŷi)^2 / (n − p − 1) = SSE / (n − p − 1).  The estimated variance of each least-squares coefficient is a complicated calculation and we can find them using a computer.  The value of R2 is calculated in the same way as r2 in simple linear regression: R2 = SSR / SST = 1 − SSE / SST.
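As a worked check of these two formulas, plugging in the numbers from the goodput regression output that appears later in this deck (SSE = 164.46, SST = 2404.95, n = 25, p = 5) reproduces the printed S and R-Sq values:

```python
import math

# Numbers from the goodput regression output in these slides:
# n = 25 observations, p = 5 independent variables.
n, p = 25, 5
SSE = 164.46
SST = 2404.95
SSR = SST - SSE           # analysis of variance identity: SST = SSR + SSE

s2 = SSE / (n - p - 1)    # estimated error variance, df = n - p - 1 = 19
s = math.sqrt(s2)
R2 = 1 - SSE / SST        # equivalently SSR / SST

print(round(s, 3), round(R2, 3))   # matches S = 2.942 and R-Sq = 93.2%
```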
  • 9.  When assumptions 1 through 4 are satisfied, the quantity (β̂i − βi) / sβ̂i has a Student's t distribution with n − p − 1 degrees of freedom.  The number of degrees of freedom is equal to the denominator used to compute the estimated error variance.  This statistic is used to compute confidence intervals and to perform hypothesis tests, as we did with simple linear regression.
  • 10.  In simple linear regression, a test of the null hypothesis β1 = 0 is almost always made. If this hypothesis is not rejected, then the linear model may not be useful.  The analogous test in multiple linear regression is of H0: β1 = β2 = … = βp = 0. This is a very strong hypothesis. It says that none of the independent variables has any linear relationship with the dependent variable.  The test statistic for this hypothesis is F = (SSR / p) / (SSE / (n − p − 1)).  This is an F statistic and its null distribution is Fp,n−p−1. Note that the denominator of the F statistic is s2. The subscripts p and n − p − 1 are the degrees of freedom for the F statistic.  Slightly different versions of the F statistic can be used to test milder null hypotheses.
  • 11. The regression equation is
Goodput = 96.0 - 1.82 Speed + 0.565 Pause + 0.0247 Speed*Pause + 0.0140 Speed^2 - 0.0118 Pause^2

Predictor     Coef       StDev     T      P
Constant      96.024     3.946     24.34  0.000
Speed         -1.8245    0.2376    -7.68  0.000
Pause         0.5652     0.2256    2.51   0.022
Speed*Pause   0.024731   0.003249  7.61   0.000
Speed^2       0.014020   0.004745  2.95   0.008
Pause^2       -0.011793  0.003516  -3.35  0.003

S = 2.942   R-Sq = 93.2%   R-Sq(adj) = 91.4%

Analysis of Variance
Source          DF  SS       MS      F      P
Regression      5   2240.49  448.10  51.77  0.000
Residual Error  19  164.46   8.66
Total           24  2404.95

Predicted Values for New Observations
New Obs  Fit     SE Fit  95% CI            95% PI
1        74.272  1.175   (71.812, 76.732)  (67.641, 80.903)

Values of Predictors for New Observations
New Obs  Speed  Pause  Speed*Pause  Speed^2  Pause^2
1        25.0   15.0   375          625      225

The data:
Speed  Pause  Goodput
5      10     95.111
5      20     94.577
5      30     94.734
5      40     94.317
5      50     94.644
10     10     90.8
10     20     90.183
10     30     91.341
10     40     91.321
10     50     92.104
20     10     72.422
20     20     82.089
20     30     84.937
20     40     87.8
20     50     89.941
30     10     62.963
30     20     76.126
30     30     84.855
30     40     87.694
30     50     90.556
40     10     55.298
40     20     78.262
40     30     84.624
40     40     87.078
40     50     90.101
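The F statistic in the analysis-of-variance output above can be verified by hand from SSR, SSE, and the degrees of freedom:

```python
# Check the F statistic in the ANOVA output: F = (SSR / p) / (SSE / (n - p - 1)).
n, p = 25, 5
SSR, SSE = 2240.49, 164.46

F = (SSR / p) / (SSE / (n - p - 1))
print(round(F, 2))   # 51.77, matching the printed output
```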
  • 12. Use the multiple regression model to predict the goodput for a network with speed 12 m/s and pause time 25 s. For the goodput data, find the residual for the point Speed = 20, Pause = 30. Find a 95% confidence interval for the coefficient of Speed in the multiple regression model. Test the null hypothesis that the coefficient of Pause is less than or equal to 0.3.
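The first two of these exercises can be sketched by plugging into the fitted equation. The sketch below uses the "Coef" column of the output rather than the rounded equation line; the exact third decimal of each answer depends on that choice:

```python
# Evaluate the fitted goodput equation using the Coef column of the output.
def goodput(speed, pause):
    return (96.024 - 1.8245 * speed + 0.5652 * pause
            + 0.024731 * speed * pause
            + 0.014020 * speed ** 2 - 0.011793 * pause ** 2)

# Sanity check against the printed fit for Speed = 25, Pause = 15:
print(round(goodput(25, 15), 3))   # close to the printed Fit of 74.272

# Prediction for Speed = 12 m/s, Pause = 25 s (roughly 90.33):
print(round(goodput(12, 25), 2))

# Residual at Speed = 20, Pause = 30, where the observed goodput is 84.937:
print(round(84.937 - goodput(20, 30), 3))
```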
  • 13.  It is important in multiple linear regression to test the validity of the assumptions for errors in the linear model. •Errors are random and independent: Residuals vs. run order •Errors all have mean of zero: Residuals vs. each independent variable •Errors all have the same variance: Residuals vs. fitted values •Errors are normally distributed: Normal or Half-Normal plot of residuals If the residual plots indicate a violation of assumptions, transformations can be tried.
  • 14. Fitting separate models to each variable is not the same as fitting the multivariate model. Consider the following example: There are 225 gas wells that received “fracture treatment” in order to increase production. In this treatment, fracture fluid, which consists of fluid mixed with sand, is pumped into the well. The sand holds open the cracks in the rock, thus increasing the flow of gas.
  • 15. We can use sand to predict production or fluid to predict production. If we fit separate simple linear models, both sand and fluid show up as important predictors. We might be tempted to conclude that increasing the volume of fluid or the volume of sand would increase production.
  • 16. There is confounding in this situation. If we increase the volume of fluid, then we also increase the volume of sand. If production depends only on the volume of sand, there will still be a relationship in the data between production and fluid, and vice versa.
  • 17. The following output presents results using only one independent variable (fluid or sand) in the model. Note that log transformations have been done. Both fluid and sand have a statistically significant effect.

The regression equation is
ln Prod = - 0.444 + 0.798 ln Fluid

Predictor  Coef      StDev    T      P
Constant   -0.4442   0.5853   -0.76  0.449
ln Fluid   0.79833   0.08010  9.97   0.000

S = 0.7459   R-Sq = 28.2%   R-Sq(adj) = 27.9%

The regression equation is
ln Prod = - 0.778 + 0.748 ln Sand

Predictor  Coef      StDev    T      P
Constant   -0.7784   0.6912   -1.13  0.261
ln Sand    0.74751   0.08381  8.92   0.000

S = 0.7678   R-Sq = 23.9%   R-Sq(adj) = 23.6%
  • 18. This output presents results from multiple linear regression, in which both fluid and sand are included in the model. In contrast to the separate simple linear regression results, only fluid has a statistically significant effect; sand does not.

The regression equation is
ln Prod = - 0.729 + 0.670 ln Fluid + 0.148 ln Sand

Predictor  Coef     StDev   T      P
Constant   -0.7288  0.6719  -1.08  0.279
ln Fluid   0.6701   0.1687  3.97   0.000
ln Sand    0.1481   0.1714  0.86   0.389

S = 0.7463   R-Sq = 28.4%   R-Sq(adj) = 27.8%
  • 19.  When two independent variables are very strongly correlated, multiple regression may not be able to determine which is the important one.  In this case, the variables are said to be collinear.  The word collinear means to lie on the same line, and when two variables are highly correlated, their scatterplot is approximately a straight line. •The word multicollinearity is sometimes used as well, meaning that multiple variables are highly correlated with each other. •When collinearity is present, the set of independent variables is sometimes said to be ill-conditioned.
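Collinearity can be illustrated with made-up numbers: when x2 is nearly a copy of x1, very different coefficient vectors fit the data almost equally well, so the data cannot tell us which variable "matters" (all values below are invented for illustration):

```python
import numpy as np

# Two nearly collinear predictors: x2 is almost a copy of x1.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = x1 + np.array([0.01, -0.02, 0.015, -0.01, 0.02, -0.015, 0.01, -0.005])
y = 3.0 + 2.0 * x1 + np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05])

print(round(np.corrcoef(x1, x2)[0, 1], 4))   # essentially 1

# Two very different coefficient vectors give nearly identical fitted values,
# which is exactly why the least-squares problem is ill-conditioned here.
fit_a = 3.0 + 2.0 * x1 + 0.0 * x2
fit_b = 3.0 + 0.0 * x1 + 2.0 * x2
print(float(np.max(np.abs(fit_a - fit_b))))  # tiny
```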
  • 20.  There are many situations in which a large number of independent variables have been measured, and we need to decide which of them to include in the model.  This is the problem of model selection, and it is not an easy one.  Good model selection rests on this basic principle known as Occam’s razor: “The best scientific model is the simplest model that explains the observed data.”  In terms of linear models, Occam’s razor implies the principle of parsimony: “A model should contain the smallest number of variables necessary to fit the data.”
  • 21. 1. A linear model should always contain an intercept, unless physical theory dictates otherwise. 2. If a power x^n of a variable is included in the model, all lower powers x, x^2, …, x^(n-1) should be included as well, unless physical theory dictates otherwise. 3. If a product xy of two variables is included in a model, then the variables x and y should be included separately as well, unless physical theory dictates otherwise.
  • 22. What is the effect of X on Y? Draw a smooth curve through the data of what you would expect a good model to look like.
  • 23.
  • 24.  First, check if an entire variable can be eliminated (including linear, quadratic, and interaction terms)  Ex: yi = β0 + β1x1 + β2x2 + β3x1^2 + β4x2^2 + β5x1x2  Can all x1 terms (x1, x1^2, x1x2) be dropped as a group?  Next, drop other insignificant terms one at a time, starting with the term with the highest p-value.  Removing a term will change the coefficients and p-values of the remaining terms.  Often called "backward elimination"
  • 25.  It often happens that one has formed a model that contains a large number of independent variables, and one wishes to determine whether a given subset of them may be dropped from the model without significantly reducing the accuracy of the model.  Assume that we know that the model yi = β0 + β1x1i + … + βkxki + βk+1xk+1,i + … + βpxpi + εi is correct. We will call this the full model.
  • 26.  We wish to test the null hypothesis H0: βk+1 = … = βp = 0.  If H0 is true, the model will remain correct if we drop the variables xk+1,…,xp, so we can replace the full model with the following reduced model: yi = β0 + β1x1i + … + βkxki + εi.
  • 27.  To develop a test statistic for H0, we begin by computing the error sums of squares for both the full and reduced models.  We call these SSEfull and SSEreduced, respectively.  The numbers of degrees of freedom are:  Full model: n – p – 1  Reduced model: n – k – 1.
  • 28.  If the full model is correct, then the error variance σ2 is well estimated by SSEfull/(n-p-1).  Under the null hypothesis the reduced model is also correct, so σ2 can also be estimated by SSEreduced/(n-k-1). In expectation, then,  SSEfull ≈ (n-p-1)σ2  SSEreduced ≈ (n-k-1)σ2  The difference between the two is:  SSEreduced – SSEfull ≈ (p-k)σ2  So, if the null hypothesis is true, σ2 can also be estimated by (SSEreduced – SSEfull)/(p – k).
  • 29. The test statistic is f = [(SSEreduced – SSEfull)/(p – k)] / [SSEfull/(n – p – 1)].  If H0 is true, then f tends to be close to 1. If H0 is false, then f tends to be larger.  The test statistic can be thought of as the variance explained by the dropped terms divided by our best estimate of the variance.
  • 30.  You fit the data from a central composite design in 3 factors with a full quadratic equation.  The sum of the squared errors from the regression was SSEfull = 175 with ν = 10 degrees of freedom (s2 = 17.50).  When factor X2 was dropped from the model (4 terms removed), the sum of squares increased to SSEreduced = 323 with ν = 14 degrees of freedom (s2 = 23.07).  Is X2 needed in the model?
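Carrying out the computation for this example with the formula from the previous slide (the critical value quoted in the comment is approximate):

```python
# Worked example: full quadratic model, SSE_full = 175 with 10 df;
# dropping the 4 terms involving X2 gives SSE_reduced = 323 with 14 df.
SSE_full, df_full = 175.0, 10
SSE_reduced, df_reduced = 323.0, 14

num = (SSE_reduced - SSE_full) / (df_reduced - df_full)  # (p - k) = 4 dropped terms
den = SSE_full / df_full                                 # s^2 from the full model
f = num / den
print(round(f, 3))   # about 2.114; the F(4, 10) 5% critical value is roughly 3.48
```

Since f is well below the approximate 5% critical value, the data do not demonstrate that the X2 terms are needed in the model.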
  • 31.  This method is very useful for developing parsimonious models by removing unnecessary variables. However, the conditions under which it is formally correct are rarely met.  More often, a large model is fit, some of the variables are seen to have fairly large P-values, and the F test is used to decide whether to drop them from the model.  It is often the case that there is no one “correct” model. There are several models that fit equally well.
  • 32.  When there is little or no physical theory to rely on, many different models will fit the data about equally well.  The methods for choosing a model involve statistics, whose values depend on the data. Therefore, if the experiment is repeated, these statistics will come out differently, and different models may appear to be “best.”  Some or all of the independent variables in a selected model may not really be related to the dependent variable. Whenever possible, experiments should be repeated to test these apparent relationships.  Model selection is an art, not a science.
  • 33. A = Temperature, B = Catalyst concentration, C = Pressure, Y = Yield of ester

Blocks  A       B       C       AB  AC  BC  A^2   B^2   C^2   Y
1       -1      -1      -1       1   1   1  1     1     1     17
1        1      -1      -1      -1  -1   1  1     1     1     44
1       -1       1      -1      -1   1  -1  1     1     1     30
1        1       1      -1       1  -1  -1  1     1     1     52
1       -1      -1       1       1  -1  -1  1     1     1      7
1        1      -1       1      -1   1  -1  1     1     1     55
1       -1       1       1      -1  -1   1  1     1     1     27
1        1       1       1       1   1   1  1     1     1     61
1        0       0       0       0   0   0  0     0     0     29
1        0       0       0       0   0   0  0     0     0     29
1        0       0       0       0   0   0  0     0     0     30
2       -1.68    0       0       0   0   0  2.83  0     0     18
2        1.682   0       0       0   0   0  2.83  0     0     80
2        0      -1.68    0       0   0   0  0     2.83  0     21
2        0       1.682   0       0   0   0  0     2.83  0     82
2        0       0      -1.68    0   0   0  0     0     2.83  35
2        0       0       1.682   0   0   0  0     0     2.83  31
2        0       0       0       0   0   0  0     0     0     28
2        0       0       0       0   0   0  0     0     0     27
2        0       0       0       0   0   0  0     0     0     29
  • 34.
  • 35. The residuals are sorted by |Res|; the q column gives the half-normal plotting position qi = (n + i - ½)/(2n) with n = 20, and Z is the corresponding standard normal quantile Zi = Φ^-1(qi).

 i  Model   Res    |Res|  q     Z
 1  18.2   -0.23    0.23  0.51  0.03
 2  31.0   -0.50    0.50  0.54  0.09
 3  43.0    0.99    0.99  0.56  0.16
 4  31.5   -2.15    2.15  0.59  0.22
 5  26.0    2.50    2.50  0.61  0.29
 6  26.0    3.00    3.00  0.64  0.35
 7  30.4   -3.36    3.36  0.66  0.42
 8  31.5   -3.45    3.45  0.69  0.49
 9  76.0    3.59    3.59  0.71  0.56
10  31.1    3.86    3.86  0.74  0.64
11  26.0    4.00    4.00  0.76  0.71
12   2.6    4.42    4.42  0.79  0.80
13  31.5   -4.45    4.45  0.81  0.89
14  12.2    4.84    4.84  0.84  0.98
15  49.9    5.07    5.07  0.86  1.09
16  58.8   -6.80    6.80  0.89  1.21
17  68.2   -7.21    7.21  0.91  1.36
18  37.4   -7.44    7.44  0.94  1.53
19  31.3  -10.25   10.25  0.96  1.78
20  67.9   13.61   13.61  0.99  2.24

[Figure: Half-Normal Plot of Residuals — Z score plotted against |Residuals|]
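The q and Z columns of the table can be reproduced with the standard library's NormalDist; the plotting-position formula below is reconstructed to match the tabulated values (shown here for a few rows only):

```python
from statistics import NormalDist

n = 20
# Half-normal plotting positions: q_i = (n + i - 1/2) / (2n), z_i = Phi^{-1}(q_i).
for i in (1, 10, 20):
    q = (n + i - 0.5) / (2 * n)
    z = NormalDist().inv_cdf(q)
    print(i, round(q, 2), round(z, 2))   # matches the q and Z columns
```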
  • 36. → Remove factor C and all of its terms?
  • 37.
  • 38. Step 3: Trim Model – Variable ‘C’ Removed Next: Test individual terms. Is the AB interaction needed? Test for significance of C block of terms:
  • 39.
  • 40. → Only significant terms left in model.
  • 42.  Your book also discusses  Best subsets regression  Stepwise regression  Includes forward selection and backward elimination  We won’t cover these methods in detail in this class
  • 43.  This is the most widely used model selection technique.  Its main advantage over best subsets regression is that it is less computationally intensive, so it can be used in situations where there are a very large number of candidate independent variables and too many possible subsets for every one of them to be examined.  The user chooses two threshold P-values, αin and αout, with αin < αout.  The stepwise regression procedure begins with a step called a forward selection step, in which the independent variable with the smallest P-value is selected, provided that P < αin.  This variable is entered into the model, creating a model with a single independent variable.
  • 44.  In the next step, the remaining variables are examined one at a time as candidates for the second variable in the model. The one with the smallest P-value is added to the model, again provided that P < αin.  Now, it is possible that adding the second variable to the model increased the P-value of the first variable. In the next step, called a backward elimination step, the first variable is dropped from the model if its P-value has grown to exceed the value αout.  The algorithm continues by alternating forward selection steps with backward elimination steps.  The algorithm terminates when no variables meet the criteria for being added to or dropped from the model.
  • 45. www.HelpWithAssignment.com is an online tutoring and live assignment help company. We provide seamless online tuitions in sessions of 30 minutes, 60 minutes or 120 minutes covering a variety of subjects. The specialties of HWA's online tuitions are: •Conducted by experts in the subject taught •Tutors selected after rigorous assessment and training •Tutoring sessions follow a pre-decided structure based on instructional design best practices •State-of-the-art technology used With a whiteboard, document sharing facility, video and audio conferencing as well as chat support, HWA's one-on-one tuitions have a large following. Several thousand hours of tuitions have already been delivered to the satisfaction of customers. In short, HWA's online tuitions are seamless, personalized and convenient. WWW.HELPWITHASSIGNMENT.COM