MAL1303: STATISTICAL
HYDROLOGY
Multiple Regression
Dr. Shamsuddin Shahid
Associate Professor
Department of Hydraulics and Hydrology
Faculty of Civil Engineering
Room No.: M46-332;
Phone: 07-5531624; Mobile: 0182051586
Email: sshahid@utm.my
11/23/2015 Shamsuddin Shahid, FKA, UTM
Simple Linear Regression
Simple Linear Regression (SLR) is a statistical
technique that is used to determine the
functional relationship between two variables.
Regression gives an equation that best describes
the relationship between two variables.
Multiple Linear Regression (MLR)
Multiple linear regression is a statistical technique in which a
dependent variable is predicted from a set of predictors.
It is used to identify the relationship between a dependent
variable and a combination of independent variables.
The relationship is valid when a few assumptions are fulfilled.
Failing to satisfy the assumptions does not mean that the
relationship is incorrect. It means that the relationship may
not be strong enough.
Linear Multiple Regression: Assumptions
• The variables should be measured on an interval/ratio scale.
• The dependent variable, Y, must be normally distributed (no
skewness or outliers).
• The predictors, X's, do not need to be normally distributed, but
if they are, it makes for a stronger interpretation.
• There should be a linear relationship between Y and each X.
• There should be no outliers among the X's predicting Y.
• The variance of Y is the same at all values of X
(homoscedastic).
Linear Multiple Regression: Outliers
• Outliers can distort the regression results in multiple regression just
as in simple linear regression. When an outlier is included in the
analysis, it pulls the regression line towards itself. This can result in a
solution that is more accurate for the outlier, but less accurate for all
of the other cases in the data set.
• It is necessary to check for outliers in the dependent variable and in
the independent variables.
• Removing an outlier may improve the distribution of a variable.
• Transforming a variable may reduce the likelihood that the value for a
case will be characterized as an outlier.
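The slide does not say which rule to use when checking for outliers. As one possible screening step, the sketch below flags values outside 1.5 times the interquartile range; the rule, the function name `iqr_outliers` and the sample values are assumptions for illustration, not part of the lecture.

```python
# Sketch: flag candidate outliers with the 1.5 * IQR rule.
# The rule and the data are assumptions, not taken from the lecture.
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical sample with one suspicious reading.
sample = [10.2, 11.5, 9.8, 10.9, 11.1, 10.4, 25.0]
flagged = iqr_outliers(sample)
print(flagged)
```

The same check can be applied to the dependent variable and to each independent variable in turn. A flagged value is a candidate for inspection, not automatically a value to delete.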
Linear Multiple Regression: Steps
1. Decide on the dependent and independent variables.
2. Test for normality, linearity, and homoscedasticity.
3. If necessary, remove the outliers.
4. If the data do not satisfy the criteria for normality, a transformation
is required. Decide which transformations should be used.
5. Substitute the transformed variables and run the regression, entering all
independent variables.
6. Do the multiple regression analysis with the variables specified in the
problem.
7. Test the significance of the regression equation.
Simple Linear Regression
In Simple Linear Regression (SLR), the functional relationship
between two variables, X and Y, is determined.
The regression equation is the equation of the straight line that best
describes the relationship between the two variables.
When the equation is used to calculate Y from an observed X, the
prediction carries an error ε. Therefore, Y equals the
predicted value plus the error.
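In symbols, the statement above can be written as follows. The coefficient names β0 (intercept) and β1 (slope) are notation assumed here, since the slide's own equation did not survive extraction.

```latex
\hat{y} = \beta_0 + \beta_1 x,
\qquad
y = \hat{y} + \varepsilon = \beta_0 + \beta_1 x + \varepsilon
```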
Multiple Linear Regression (MLR): Basics
A multiple linear regression model is called “linear” because it is
linear in the coefficients {β}. However, transforms of the
regressor variables are permitted in an MLR model, as in SLR.
In Multiple Linear Regression (MLR), the functional relationship of a
dependent variable Y with more than one independent variable is
determined.
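As a sketch of what "linear in the coefficients" means, the general model and one transformed variant are written out below. The symbols k (number of regressors) and x_1 … x_k, and the particular transforms chosen, are illustrative assumptions.

```latex
% General MLR model with k regressors
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon

% Still an MLR model: the regressors are transformed,
% but every coefficient enters linearly
y = \beta_0 + \beta_1 \ln x_1 + \beta_2 x_2^{2} + \varepsilon
```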
Multiple Linear Regression (MLR): Basics
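$$
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}}_{\text{data } (4 \times 1)}
=
\underbrace{\begin{bmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \\ x_{13} & x_{23} \\ x_{14} & x_{24} \end{bmatrix}}_{\text{design matrix } (4 \times 2)}
\;
\underbrace{\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}}_{\text{parameters } (2 \times 1)}
$$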
Multiple Linear Regression (MLR): Basics
Multiple Linear Regression: Basics
Create the design matrix X.
Calculate the parameters: $b = (X^T X)^{-1} X^T y$
where $X^T$ is the transpose of matrix $X$ and
the superscript $-1$ denotes the matrix inverse.
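The parameter calculation can be sketched with the usual least-squares solution b = (XᵀX)⁻¹Xᵀy, using only the Python standard library. The helper names, the leading column of ones for the intercept, and the data are all assumptions made for illustration; the data are generated exactly from y = 1 + 2·x1 + 3·x2 so the result is easy to check.

```python
# Minimal sketch of b = (X^T X)^{-1} X^T y with plain Python lists.
# Helper names and data are hypothetical, not taken from the lecture.

def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_mlr(X, y):
    """Return the coefficient list b for design matrix X and response list y."""
    Xt = transpose(X)
    XtX_inv = inverse(matmul(Xt, X))
    Xty = matmul(Xt, [[v] for v in y])
    return [row[0] for row in matmul(XtX_inv, Xty)]


# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
X = [[1.0, a, b] for a, b in zip(x1, x2)]  # leading 1s give the intercept
b = fit_mlr(X, y)
print([round(v, 6) for v in b])
```

Explicit inversion mirrors the spreadsheet workflow of transpose, multiply and invert. For larger or ill-conditioned problems a dedicated least-squares solver is usually the safer choice.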
The Goodness of Fit of the Regression Model
One measure of how well a statistical model explains the observed
data is the coefficient of determination, that is, the square of the
Pearson correlation coefficient, r², between y and x.
When x is replaced by the predicted value ŷ,
it gives the squared correlation between the actual and predicted values, R².
It can also be measured by
$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$
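The two routes to the coefficient of determination (squared correlation versus a ratio of sums of squares) can be checked against each other numerically. The sketch below assumes a least-squares straight-line fit with an intercept; the function names and data are hypothetical.

```python
# Sketch: R^2 as a squared Pearson correlation and as 1 - SSE/SST.
# Data and helper names are hypothetical, not taken from the lecture.
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def r_squared(y, y_hat):
    y_bar = mean(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Least-squares straight line through (x, y).
x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
y_hat = [intercept + slope * xi for xi in x]

r2_from_x = pearson_r(x, y) ** 2        # r^2 between y and x
r2_from_fit = pearson_r(y, y_hat) ** 2  # squared correlation of y with its prediction
r2_from_sums = r_squared(y, y_hat)      # 1 - SSE/SST
print(r2_from_x, r2_from_fit, r2_from_sums)
```

For a least-squares fit that includes an intercept, the three numbers agree up to rounding error.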
The Goodness of Fit of the Regression Model
The distinctions between r and R are:
• r is a measure of association between two random variables,
whereas R is a measure of association between a random variable y and its
prediction from a regression model.
• r lies in the interval −1 ≤ r ≤ 1, while the multiple correlation R
cannot be negative; that is, it lies in the interval 0 ≤ R ≤ 1.
• R is always well defined, regardless of whether the independent
variable is assumed to be random or fixed. In contrast, calculating
the correlation between a random variable, Y, and a fixed predictor
variable, X, that is, a variable that is not considered random, makes
no sense.
Multiple Linear Regression: Example
It is well known that groundwater recharge is directly related to
Rainfall and Soil Moisture Holding Capacity (SMHC). Instrumental
data on groundwater recharge, Rainfall and SMHC at six sites have
been collected. Find an empirical equation that relates groundwater
recharge to Rainfall and SMHC.
Multiple Linear Regression: Example
Multiple Linear Regression: Solution
Create the design matrix X.
Get the solution by: $b = (X^T X)^{-1} X^T y$
Multiple Linear Regression: Solution
Excel commands:
Matrix Inversion: MINVERSE(array)
Matrix Multiplication: MMULT(array1, array2)
Matrix Transpose: Copy Matrix -> Paste Special with the
Transpose option ticked.
Multiple Linear Regression: Example
Multiple Linear Regression: Example
Multiple Linear Regression: Example
Recharge = 1.38 + 0.12Rainfall – 0.01SMHC
Multiple Linear Regression (MLR): Assumptions
Basic assumptions about the errors:
1. The mean of the errors is zero.
2. The errors are normally distributed.
3. The variances of the errors for all observations are
constant.
4. The errors are independent of each other (uncorrelated).
Gross violations of these basic assumptions will yield a
poor or biased model. However, if the variances of the
errors are unequal and can be estimated, weighted
regression schemes can sometimes be used to obtain a
better model.
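Multiple Linear Regression: Confidence Interval
Recharge = 1.38 + 0.12Rainfall – 0.01SMHC
The parameter values have a range. We can find the range of a
parameter at a certain level of confidence by using the following
formula:
$$b_j - t_{\alpha/2,\,n-p}\sqrt{s^2 C_{jj}} \;\le\; \beta_j \;\le\; b_j + t_{\alpha/2,\,n-p}\sqrt{s^2 C_{jj}}$$
where $s^2$ is the variance of the residuals and
$C_{jj}$ is the corresponding diagonal value of the matrix
$(X^T X)^{-1}$.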
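A confidence interval of this kind is the estimate plus or minus a critical t value times the standard error, where the standard error is the square root of the residual variance times the matching diagonal entry of (XᵀX)⁻¹. The sketch below assumes that standard form; the function name and the input values are hypothetical and chosen only so the result is easy to verify by hand.

```python
# Sketch: confidence interval for one regression coefficient,
# b_j -/+ t_crit * sqrt(s2 * c_jj). All inputs are hypothetical.
import math


def coefficient_interval(b_j, t_crit, s2, c_jj):
    """Return (lower, upper) bounds for the coefficient estimate b_j."""
    half_width = t_crit * math.sqrt(s2 * c_jj)
    return b_j - half_width, b_j + half_width


# Illustrative values, not the lecture's: sqrt(0.25 * 4.0) = 1.0.
lower, upper = coefficient_interval(b_j=2.0, t_crit=2.0, s2=0.25, c_jj=4.0)
print(lower, upper)
```

An interval that contains zero suggests that the coefficient cannot be distinguished from zero at the chosen confidence level.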
Multiple Linear Regression: Confidence Interval
Recharge = 1.38 + 0.12Rainfall – 0.01SMHC
n = 6, p = 3
At α = 0.05, t(0.025, 3) = 4.18
s² = 0.084
-0.35 ≤ β0 ≤ 3.11
-0.10 ≤ β1 ≤ 0.35
-0.16 ≤ β2 ≤ 0.14
Multiple Linear Regression: Efficient Estimator
• An estimator with lower variance is more efficient, in the
sense that it is likely to be closer to the true value over
samples.
• The “best” estimator is the one with minimum variance among all
estimators.
Recharge = 1.38 + 0.12Rainfall – 0.01SMHC
-0.35 ≤ β0 ≤ 3.11
-0.10 ≤ β1 ≤ 0.35
-0.16 ≤ β2 ≤ 0.14
Multiple Linear Regression: Strength
SST = SSE + SSR
Sum of Squares Total (SST) = Total variability in the observed responses
Sum of Squares Error (SSE) = Total error of the model, or variability that is not
explained by the model
Sum of Squares Regression (SSR) = Systematic variability that is explained by the
regression model
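For reference, the three sums of squares can be written out as below. These are the standard definitions for a least-squares fit with an intercept, assumed here because the slide gives them only in words; $\hat{y}_i$ is the model prediction and $\bar{y}$ the mean of the observations.

```latex
SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \qquad
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \qquad
SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2
```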
Multiple Linear Regression: Strength
Mean variation in observations, MST = SST / (n − 1)
Mean error, MSE = SSE / (n − p)
Mean regression, MSR = SSR / k, where k = p − 1 is the number of regressors
(k = 1 in simple linear regression)
Higher values of R² indicate a better fit of the model to the sample
observations.
Disadvantage of R²: Adding any regressor variable to an MLR
model, even an irrelevant regressor, yields an SSE that is no larger and
an R² that is no smaller. For this reason, R² by itself is not a good measure of
the quality of fit.
Multiple Linear Regression: Strength
To overcome this deficiency in R², an adjusted value can be used.
The adjusted coefficient of multiple determination (adjusted R²) is defined
as
$$R^2_{adj} = 1 - \frac{SSE/(n-p)}{SST/(n-1)} = 1 - \frac{MSE}{MST}$$
Because the number of model coefficients (p) is used in
computing the adjusted R², its value will not necessarily increase with the
addition of any regressor. Hence, the adjusted R² is a more reliable indicator
of model quality.
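The adjustment can be sketched as a short calculation, assuming the usual form 1 − [SSE/(n − p)] / [SST/(n − 1)]. The function names are hypothetical; the inputs (SST = 1.27, SSE = 0.42, n = 6, p = 3) are the figures quoted in the lecture's recharge example.

```python
# Sketch: R^2 and adjusted R^2 from sums of squares.
# Function names are hypothetical; inputs come from the recharge example.

def r_squared(sst, sse):
    return 1.0 - sse / sst


def adjusted_r_squared(sst, sse, n, p):
    mse = sse / (n - p)   # mean error, n - p degrees of freedom
    mst = sst / (n - 1)   # mean variation in observations
    return 1.0 - mse / mst


r2 = r_squared(1.27, 0.42)
r2_adj = adjusted_r_squared(1.27, 0.42, n=6, p=3)
print(round(r2, 2), round(r2_adj, 2))
```

With only six observations and three parameters, the penalty for model size is large, which is why the adjusted value falls well below R².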
Multiple Linear Regression: Strength (Example)
SST = SSE + SSR
Mean variation in observations, MST = SST / (n − 1)
Mean error, MSE = SSE / (n − p)
Mean regression, MSR = SSR / k (k = number of regressors)
SST = 1.27; SSR = 0.85; SSE = 0.42
With n = 6, p = 3 and k = 2: MST = 0.26; MSR = 0.85 / 2 = 0.425; MSE = 0.14
R² = 0.67
Adjusted R² = 0.45
Multiple Linear Regression: F-statistics
• The F-test is used to assess the overall ability of a model.
• When testing for the significance of the goodness of fit, our null hypothesis is
that the coefficients of the explanatory variables are jointly equal to 0.
• If our F-statistic is below the critical value, we fail to reject the null hypothesis and
therefore say that the goodness of fit is not significant.
Multiple Linear Regression: F-statistics
• The F-test is useful for testing a number of hypotheses and is often
used to test the significance of a single variable, the global significance
of the model, and the joint significance of a group of variables.
• A joint test is often referred to as ‘testing a restriction’.
• This restriction is that the coefficients of a group of explanatory
variables are jointly equal to 0.
Multiple Linear Regression: F-statistics
The global F-test is used to assess the overall ability of a model to
explain at least some of the observed variability in the sample
responses. The global F-test is performed in the following steps:
Null hypothesis: β1 = β2 = … = βk = 0
The global F-statistic is calculated as
F0 = MSR/MSE
If F(calculated) > F(critical) (α, k, n-p),
where k = number of regressors, n = number of data points, and p = number of
parameters to be estimated,
reject the null hypothesis and conclude that at least one βj ≠ 0 and at
least one model regressor explains some of the response variation.
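The global F-test can be sketched as below. The sketch divides SSR by the number of regressors k, which matches the numerator degrees of freedom in F(critical) (α, k, n-p), and it assumes that the p parameters include exactly one intercept, so that k = p − 1. The function name is hypothetical; SSR, SSE, n, p and the critical value are the figures quoted for the lecture's recharge example.

```python
# Sketch: global F-statistic F0 = MSR / MSE with MSR = SSR / k.
# Assumes the p parameters include one intercept, so k = p - 1.

def global_f(ssr, sse, n, p):
    k = p - 1              # number of regressors
    msr = ssr / k          # mean regression
    mse = sse / (n - p)    # mean error
    return msr / mse


f0 = global_f(ssr=0.85, sse=0.42, n=6, p=3)
F_CRITICAL = 9.55          # F(0.05, 2, 3) as quoted in the lecture
reject_null = f0 > F_CRITICAL
print(round(f0, 2), reject_null)
```

Here F0 falls below the critical value, so the null hypothesis that all slope coefficients are zero is not rejected.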
Multiple Linear Regression: Example
Recharge = 1.38 + 0.12Rainfall – 0.01SMHC
SST = SSE + SSR
SST = 1.27; MST = 0.26
SSR = 0.85; MSR = SSR / k = 0.85 / 2 = 0.425
SSE = 0.42; MSE = 0.14
F0 = MSR/MSE = 0.425 / 0.14 = 3.04
F(critical) (α, k, n-p) = F(critical) (0.05, 2, 3) = 9.55
F(calculated) < F(critical) (α, k, n-p)
The null hypothesis cannot be rejected: there is no evidence that any
model regressor explains the response variation.
Multiple Linear Regression: Example
Multiple Linear Regression: Example
Multiple Linear Regression: Example
Discharge = 21.97 – 0.19ET + 1.55BF + 0.94R -1.05GWR
Multiple Linear Regression: Example
Discharge = 21.97 – 0.19ET + 1.55BF + 0.94R – 1.05GWR
Null hypothesis: β1 = β2 = β3 = β4 = 0
= 0.9865
F0 = MSR/MSE = 7.68
F(critical) (α, k, n-p) = F(critical) (0.05, 4, 7) = 4.12
F(calculated) > F(critical) (α, k, n-p)
Null hypothesis rejected.
Decision: At least one βj ≠ 0 and at least one model regressor
explains some of the response variation.
Multiple Linear Regression: Example
Discharge = 33.50 – 0.28ET + 1.53BF + 0.28R
Multiple Linear Regression: Example
Discharge = 33.50 – 0.28ET + 1.53BF + 0.28R
Null hypothesis: β1 = β2 = β3 = 0
F0 = MSR/MSE = 6.3
F(critical) (α, k, n-p) = F(critical) (0.05, 3, 8) = 4.07
F(calculated) > F(critical) (α, k, n-p)
Null hypothesis rejected.
Decision: Groundwater recharge has no significant impact on
Discharge.
Multiple Linear Regression: Example
Discharge = ? + ? ET + ? BF + ? GWR
Multiple Linear Regression: Joint Significance
• To carry out this test, you need to run two separate regressions:
one with all the explanatory variables included (the unrestricted equation),
and the other with the variables whose joint significance is being
tested removed (the restricted equation).
• Then collect the residual sum of squares (RSS) from both equations.
• Put the values in the formula.
• Find the critical value and compare it with the test statistic. The null
hypothesis is that the coefficients of the variables are jointly equal to 0.
Multiple Linear Regression: Joint Significance
The test for joint significance has its own formula, which takes
the following form:
$$F = \frac{(RSS_R - RSS_u)/m}{RSS_u/(n-k)}$$
where
m = number of restrictions
k = parameters in unrestricted equation
RSS_u = unrestricted RSS
RSS_R = restricted RSS
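The joint-significance statistic is the drop in RSS per restriction, divided by the unrestricted RSS per residual degree of freedom. A minimal sketch follows, with a hypothetical function name. The inputs (RSS_R = 1.5, RSS_u = 0.75, n = 60, m = 2, k = 4) and the critical value are those of the lecture's worked example.

```python
# Sketch: F-test for joint significance of a group of variables.
# F = ((RSS_R - RSS_u) / m) / (RSS_u / (n - k))

def joint_f(rss_restricted, rss_unrestricted, m, n, k):
    numerator = (rss_restricted - rss_unrestricted) / m
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator


f_stat = joint_f(1.5, 0.75, m=2, n=60, k=4)
F_CRITICAL = 3.16          # F(0.05, 2, 56) as quoted in the lecture
jointly_significant = f_stat > F_CRITICAL
print(round(f_stat, 2), jointly_significant)
```

The same function covers a partial F-test for a single variable by setting m = 1.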
Multiple Linear Regression: Joint Significance
Obs. No.   Y     X1    X2    X3
1          5.1   2.3   2.5   4.2
2          6.2   1.9   2.8   3.3
3          4.8   2.0   3.1   4.0
.          .     .     .     .
.          .     .     .     .
.          .     .     .     .
60         5.9   2.4   3.8   4.6
$$y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3$$
Multiple Linear Regression: Joint Significance
Suppose we have a model consisting of three explanatory variables and we wish to
test for the joint significance of two of the variables (x2 and x3). We need
to run the following unrestricted and restricted models:
$$y_t = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 \quad \text{(unrestricted)}$$
$$y_t = \alpha_0 + \alpha_1 x_1 \quad \text{(restricted)}$$
Multiple Linear Regression: Joint Significance
Given the following model, we wish to test the joint significance of x2
and x3. Having estimated them, we collect their respective RSSs (n = 60).
$$y_t = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 \quad \text{(unrestricted)}, \qquad RSS_u = 0.75$$
$$y_t = \beta_0 + \beta_1 x_1 \quad \text{(restricted)}, \qquad RSS_R = 1.5$$
Multiple Linear Regression: Joint Significance
$$F = \frac{(RSS_R - RSS_u)/m}{RSS_u/(n-k)}$$
where
m = number of restrictions
k = parameters in unrestricted equation
RSS_u = unrestricted RSS
RSS_R = restricted RSS
$$F = \frac{(1.5 - 0.75)/2}{0.75/(60 - 4)} = \frac{0.375}{0.0134} = 28$$
F(critical) (0.05, 2, 56) = 3.16
Multiple Linear Regression: Joint Significance
Null Hypothesis, H0: α2 = α3 = 0
Alternative Hypothesis, HA: α2 ≠ 0 and/or α3 ≠ 0
As the F statistic is greater than the critical value (28 > 3.16), we
reject the null hypothesis and conclude that the variables x2 and x3
are jointly significant and should remain in the model.
Choosing the Best MLR Model
• One of the major issues in multiple regression is the appropriate
approach to variable selection.
• To build an appropriate regression model, we need to
successively add variables to or delete variables from the model.
• The benefit of adding variables to a multiple
regression model is to account for or explain more of the
variance of the response variable. The cost of adding
variables is that the degrees of freedom decrease, making it
more difficult to find significance in hypothesis tests and
increasing the width of confidence intervals.
A good model will explain as much of the variance of y as
possible with a small number of explanatory variables.
Choosing the Best MLR Model
The choice of whether to add a variable is based on a "cost-benefit
analysis", and variables enter the model only if they make a
significant improvement in the model.
There are at least two types of approaches for evaluating whether
a new variable sufficiently improves the model. The first approach
uses partial F-tests, and when automated is often called a
"stepwise" procedure.
The second approach uses some overall measure of model
quality. The latter has many advantages.
Discharge = 21.97 – 0.19ET + 1.55BF + 0.94R – 1.05GWR
Choosing the Best MLR Model
Internship report on mechanical engineeringmalavadedarshan25
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...Call Girls in Nagpur High Profile
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escortsranjana rawat
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 

Recently uploaded (20)

HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 

Shahid Lecture-8- MKAG1273

  • 1. MAL1303: STATISTICAL HYDROLOGY Multiple Regression Dr. Shamsuddin Shahid Associate Professor Department of Hydraulics and Hydrology Faculty of Civil Engineering Room No.: M46-332; Phone: 07-5531624; Mobile: 0182051586 Email: sshahid@utm.my 11/23/2015 Shamsuddin Shahid, FKA, UTM
  • 2. Simple Linear Regression: Simple Linear Regression (SLR) is a statistical technique used to determine the functional relationship between two variables. Regression gives an equation that best describes the relationship between the two variables.
  • 3. Multiple Linear Regression (MLR): Multiple linear regression is a statistical technique in which a dependent variable is predicted from a set of predictors. It is used to identify the relationship between a dependent variable and a combination of independent variables. The relationship is valid when a few assumptions are fulfilled. Failing to satisfy the assumptions does not mean that the relationship is incorrect; it means that the relationship may not be strong enough.
  • 4. Linear Multiple Regression: Assumptions • The variables should be measured on an interval/ratio scale. • The dependent variable, Y, must be normally distributed (no skewness or outliers). • The predictors, X's, do not need to be normally distributed, but if they are, it makes for a stronger interpretation. • There should be a linear relationship between Y and all X. • There should be no outliers among the Xs predicting Y. • The variance of Y is the same at all values of X (homoscedastic).
  • 5. Linear Multiple Regression: Outliers • Outliers can distort the results of multiple regression just as they do in simple linear regression. When an outlier is included in the analysis, it pulls the regression line towards itself. This can result in a solution that is more accurate for the outlier but less accurate for all of the other cases in the data set. • It is necessary to check for outliers in the dependent variable and in the independent variables. • Removing an outlier may improve the distribution of a variable. • Transforming a variable may reduce the likelihood that the value for a case will be characterized as an outlier.
  • 6. Linear Multiple Regression: Steps 1. Decide on the dependent and independent variables. 2. Test for normality, linearity and homoscedasticity. 3. If necessary, remove the outliers. 4. If the data do not satisfy the criteria for normality, a transformation is required; decide which transformations should be used. 5. Substitute the transformed variables and run the regression, entering all independent variables. 6. Do the multiple regression analysis with the variables specified in the problem. 7. Test the significance of the regression equation.
  • 7. Simple Linear Regression: In Simple Linear Regression (SLR), the functional relationship between two variables X and Y is determined. The regression equation is the equation of the straight line that best describes the relationship between the two variables. When the equation is used to calculate Y from an observed X, the prediction carries an error ε. Therefore, Y equals the predicted value plus the error: $Y = \hat{Y} + \varepsilon$.
  • 8. Multiple Linear Regression (MLR): Basics A multiple linear regression model is called “linear” because it is linear in the coefficients {β}. However, as in SLR, transforms of the regressor variables are permitted in an MLR model. In Multiple Linear Regression (MLR), the functional relationship of a dependent variable Y with more than one independent variable is determined.
  • 9. Multiple Linear Regression (MLR): Basics $\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \\ x_{13} & x_{23} \\ x_{14} & x_{24} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, that is, data (4 × 1) = design matrix (4 × 2) × parameters (2 × 1).
  • 10. Multiple Linear Regression (MLR): Basics
  • 11. Multiple Linear Regression: Basics Create the design matrix X. Calculate the parameters: $b = (X^T X)^{-1} X^T y$, where $X^T$ is the transpose of matrix X and $(X^T X)^{-1}$ is the inverse of the matrix $X^T X$.
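The parameter calculation on this slide can be sketched in a few lines of NumPy. This is a minimal illustration only: the data are made up and noise-free (they are not the lecture's data), and the column of ones for the intercept is an assumption based on the later example, which reports a constant term.

```python
import numpy as np

# Hypothetical, noise-free data for six sites (illustration only, not the lecture's data).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # first predictor
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])   # second predictor
y = 1.0 + 2.0 * x1 - 0.5 * x2                   # response built from known coefficients

# Design matrix: a column of ones for the intercept, then one column per predictor.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Normal equations: b = (X^T X)^-1 X^T y
b = np.linalg.inv(X.T @ X) @ X.T @ y

print(np.round(b, 6))  # estimated intercept and slopes
```

The explicit inverse mirrors the slide's formula. For ill-conditioned data, `np.linalg.lstsq(X, y, rcond=None)` is usually the more stable route to the same estimates.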
  • 12. The Goodness of Fit of the Regression Model: One measure of how well a statistical model explains the observed data is the coefficient of determination, that is, the square of the Pearson correlation coefficient, $r^2$, between y and x. When x is replaced by the predicted value $\hat{y}$, it gives the squared correlation between the actual and predicted values, $R^2$. It can also be measured by $R^2 = SSR/SST = 1 - SSE/SST$, where SST is the total sum of squares, SSR the regression sum of squares and SSE the error sum of squares.
  • 13. The Goodness of Fit of the Regression Model: The distinctions between r and R are: • r is a measure of association between two random variables, whereas R is a measure of association between a random variable y and its prediction from a regression model. • r lies in the interval −1 ≤ r ≤ 1, while the multiple correlation R cannot be negative; that is, it lies in the interval 0 ≤ R ≤ 1. • R is always well defined, regardless of whether the independent variable is assumed to be random or fixed. In contrast, calculating the correlation between a random variable, Y, and a fixed predictor variable, X, that is, a variable that is not considered random, makes no sense.
  • 14. Multiple Linear Regression: Example It is well known that groundwater recharge is directly related to rainfall and Soil Moisture Holding Capacity (SMHC). Instrumental data on groundwater recharge, rainfall and SMHC have been collected at six sites. Find an empirical equation that relates groundwater recharge to rainfall and SMHC.
  • 15. Multiple Linear Regression: Example
  • 16. Multiple Linear Regression: Solution Create the design matrix X. Get the solution by: $b = (X^T X)^{-1} X^T y$.
  • 17. Multiple Linear Regression: Solution Excel commands: Matrix inversion: MINVERSE(array). Matrix multiplication: MMULT(array1, array2). Matrix transpose: copy the matrix -> Paste Special with the Transpose option ticked.
  • 18. Multiple Linear Regression: Example
  • 19. Multiple Linear Regression: Example
  • 20. Multiple Linear Regression: Example Recharge = 1.38 + 0.12 Rainfall – 0.01 SMHC
  • 21. Multiple Linear Regression: Example Recharge = 1.38 + 0.12 Rainfall – 0.01 SMHC
  • 22. Multiple Linear Regression (MLR): Assumptions Basic assumptions about the errors: 1. The mean of the errors is zero. 2. The errors are normally distributed. 3. The variance of the errors is constant for all observations. 4. The errors are independent of each other (uncorrelated). Gross violations of these basic assumptions will yield a poor or biased model. However, if the variances of the errors are unequal and can be estimated, weighted regression schemes can sometimes be used to obtain a better model.
  • 23. Multiple Linear Regression: Confidence Interval Recharge = 1.38 + 0.12 Rainfall – 0.01 SMHC. The parameter values have a range. We can find the range of a parameter at a certain level of confidence by using the following formula: $b_j - t_{\alpha/2,\,n-p}\sqrt{s^2 C_{jj}} \le \beta_j \le b_j + t_{\alpha/2,\,n-p}\sqrt{s^2 C_{jj}}$, where $s^2$ is the variance of the residuals and $C_{jj}$ is the corresponding diagonal value of the matrix $(X^T X)^{-1}$.
  • 24. Multiple Linear Regression: Confidence Interval Recharge = 1.38 + 0.12 Rainfall – 0.01 SMHC; n = 6, p = 3. At α = 0.05, t(0.025, 3) = 4.18; s² = 0.084. −0.35 ≤ β0 ≤ 3.11; −0.10 ≤ β1 ≤ 0.35; −0.16 ≤ β2 ≤ 0.14.
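As a rough sketch of how such a range could be computed, assuming the usual interval b_j ± t·sqrt(s²·C_jj) with C_jj the j-th diagonal value of (XᵀX)⁻¹: the function name and the input values below are hypothetical, chosen for easy arithmetic, and are not the lecture's numbers.

```python
import math

def coefficient_interval(b_j, s2, c_jj, t_crit):
    """Return (lower, upper) for b_j ± t_crit * sqrt(s2 * c_jj).

    b_j    : estimated coefficient
    s2     : variance of the residuals
    c_jj   : matching diagonal value of (X^T X)^-1
    t_crit : critical t value for the chosen confidence level
    """
    half_width = t_crit * math.sqrt(s2 * c_jj)
    return b_j - half_width, b_j + half_width

# Hypothetical inputs (illustration only, not taken from the lecture).
lower, upper = coefficient_interval(b_j=1.0, s2=4.0, c_jj=0.25, t_crit=2.0)
print(lower, upper)  # -1.0 3.0
```

An interval that contains zero, as all three on the slide do, means the coefficient cannot be distinguished from zero at that confidence level.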
  • 25. Multiple Linear Regression: Efficient Estimator • An estimator with lower variance is more efficient, in the sense that it is likely to be closer to the true value across samples. • The “best” estimator is the one with the minimum variance of all estimators. Recharge = 1.38 + 0.12 Rainfall – 0.01 SMHC; −0.35 ≤ β0 ≤ 3.11; −0.10 ≤ β1 ≤ 0.35; −0.16 ≤ β2 ≤ 0.14.
  • 26. Multiple Linear Regression: Strength SST = SSE + SSR. Sum of Squares Total (SST) = total variability in the observed responses. Sum of Squares Error (SSE) = total error of the model, or variability that is not explained by the model. Sum of Squares Regression (SSR) = systematic variability that is explained by the regression model.
  • 27. Multiple Linear Regression: Strength Mean variation in observations, MST = SST / (n − 1). Mean error, MSE = SSE / (n − p). Mean regression, MSR = SSR / 1. Higher values of R² indicate a better fit of the model to the sample observations. Disadvantage of R²: adding any regressor variable to an MLR model, even an irrelevant one, yields a smaller SSE and a greater R². For this reason, R² by itself is not a good measure of the quality of fit.
  • 28. Multiple Linear Regression: Strength To overcome this deficiency in R², an adjusted value can be used. The adjusted coefficient of multiple determination ($R^2_{adj}$) is defined as $R^2_{adj} = 1 - \frac{SSE/(n-p)}{SST/(n-1)}$. Because the number of model coefficients (p) is used in computing $R^2_{adj}$, its value will not necessarily increase with the addition of any regressor. Hence, $R^2_{adj}$ is a more reliable indicator of model quality.
  • 29. Multiple Linear Regression: Strength (Example) SST = SSE + SSR. Mean variation in observations, MST = SST / (n − 1); mean error, MSE = SSE / (n − p); mean regression, MSR = SSR / 1. SST = 1.27; SSR = 0.85; SSE = 0.42. MST = 0.26; MSR = 0.85; MSE = 0.14. R² = 0.67; adjusted R² = 0.45.
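The two fit measures in this example can be checked with a short sketch. It assumes R² = 1 − SSE/SST and the adjusted form 1 − [SSE/(n − p)] / [SST/(n − 1)], with n = 6 sites and p = 3 parameters as in the recharge example; the function names are illustrative.

```python
def r_squared(sst, sse):
    """Coefficient of determination: share of total variability explained."""
    return 1.0 - sse / sst

def adjusted_r_squared(sst, sse, n, p):
    """R-squared penalised for the number of estimated parameters p."""
    return 1.0 - (sse / (n - p)) / (sst / (n - 1))

# Sums of squares quoted on the slide; n and p from the recharge example.
r2 = r_squared(sst=1.27, sse=0.42)
r2_adj = adjusted_r_squared(sst=1.27, sse=0.42, n=6, p=3)

print(round(r2, 2), round(r2_adj, 2))  # 0.67 0.45
```

The gap between the two values shows how strongly the penalty bites when only six observations support three parameters.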
  • 30. Multiple Linear Regression: F-statistics • The F-test is used to assess the overall ability of a model. • When testing for the significance of the goodness of fit, the null hypothesis is that the coefficients of the explanatory variables are jointly equal to 0. • If the F-statistic is below the critical value, we fail to reject the null hypothesis and therefore say that the goodness of fit is not significant.
  • 31. Multiple Linear Regression: F-statistics • The F-test is useful for testing a number of hypotheses and is often used to test the significance of a single variable, the global significance of the model, and the joint significance of a group of variables. • A joint test is often referred to as ‘testing a restriction’. • This restriction is that the coefficients of a group of explanatory variables are jointly equal to 0.
  • 32. Multiple Linear Regression: F-statistics The global F-test is used to assess the overall ability of a model to explain at least some of the observed variability in the sample responses. The global F-test is performed in the following steps: Null hypothesis: β1 = β2 = … = βk = 0. The global F-statistic is calculated as F0 = MSR/MSE. If F(calculated) > F(critical)(α, k, n − p), where k = number of regressors, n = number of data points and p = number of parameters to be estimated, reject the null hypothesis and conclude that at least one βj ≠ 0 and at least one model regressor explains some of the response variation.
  • 33. Multiple Linear Regression: Example Recharge = 1.38 + 0.12 Rainfall – 0.01 SMHC. SST = SSE + SSR. SST = 1.27, MST = 0.26; SSR = 0.85, MSR = 0.85; SSE = 0.42, MSE = 0.14. F0 = MSR/MSE = 6.07. F(critical)(α, k, n − p) = F(critical)(0.05, 2, 3) = 9.55. F(calculated) < F(critical)(α, k, n − p), so the null hypothesis cannot be rejected: there is no evidence that any model regressor explains the response variation.
  • 34. Multiple Linear Regression: Example
  • 35. Multiple Linear Regression: Example
  • 36. Multiple Linear Regression: Example Discharge = 21.97 – 0.19 ET + 1.55 BF + 0.94 R – 1.05 GWR
  • 37. Multiple Linear Regression: Example Discharge = 21.97 – 0.19 ET + 1.55 BF + 0.94 R – 1.05 GWR. Null hypothesis: β1 = β2 = β3 = β4 = 0. = 0.9865. F0 = MSR/MSE = 7.68. F(critical)(α, k, n − p) = F(critical)(0.05, 4, 7) = 4.12. F(calculated) > F(critical)(α, k, n − p), so the null hypothesis is rejected. Decision: at least one βj ≠ 0 and at least one model regressor explains some of the response variation.
  • 38. Multiple Linear Regression: Example Discharge = 33.50 – 0.28 ET + 1.53 BF + 0.28 R
  • 39. Multiple Linear Regression: Example Discharge = 33.50 – 0.28 ET + 1.53 BF + 0.28 R. Null hypothesis: β1 = β2 = β3 = 0. F0 = MSR/MSE = 6.3. F(critical)(α, k, n − p) = F(critical)(0.05, 3, 8) = 4.07. F(calculated) > F(critical)(α, k, n − p), so the null hypothesis is rejected. Decision: Groundwater recharge has no significant impact on Discharge.
  • 40. Multiple Linear Regression: Example Discharge = ? + ? ET + ? BF + ? GWR
  • 41. Multiple Linear Regression: Joint Significance • To carry out this test you need to run two separate regressions: one with all the explanatory variables included (the unrestricted equation), the other with the variables whose joint significance is being tested removed (the restricted equation). • Then collect the residual sum of squares (RSS) from both equations. • Put the values in the formula. • Find the critical value and compare it with the test statistic. The null hypothesis is that the coefficients of the variables jointly equal 0.
  • 42. Multiple Linear Regression: Joint Significance The test for joint significance has its own formula, which takes the following form: $F = \frac{(RSS_R - RSS_u)/m}{RSS_u/(n-k)}$, where m = number of restrictions, k = number of parameters in the unrestricted equation, $RSS_u$ = unrestricted RSS and $RSS_R$ = restricted RSS.
  • 43. Multiple Linear Regression: Joint Significance Model: $y = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3$. Data:
      Obs. No.   Y     X1    X2    X3
      1          5.1   2.3   2.5   4.2
      2          6.2   1.9   2.8   3.3
      3          4.8   2.0   3.1   4.0
      ...        ...   ...   ...   ...
      60         5.9   2.4   3.8   4.6
  • 44. Multiple Linear Regression: Joint Significance Suppose we have a model consisting of three explanatory variables and wish to test the joint significance of two of the variables (x2 and x3). We need to run the following unrestricted and restricted models: unrestricted: $y_t = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3$; restricted: $y_t = \alpha_0 + \alpha_1 x_1$.
  • 45. Multiple Linear Regression: Joint Significance Given the following model, we wish to test the joint significance of x2 and x3. Having estimated both equations, we collect their respective RSSs (n = 60). Unrestricted: $y_t = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3$, $RSS_u = 0.75$; restricted: $y_t = \beta_0 + \beta_1 x_1$, $RSS_R = 1.5$.
  • 46. Multiple Linear Regression: Joint Significance $F = \frac{(RSS_R - RSS_u)/m}{RSS_u/(n-k)}$, where m = number of restrictions, k = number of parameters in the unrestricted equation, $RSS_u$ = unrestricted RSS and $RSS_R$ = restricted RSS. $F = \frac{(1.5 - 0.75)/2}{0.75/(60 - 4)} = \frac{0.375}{0.0134} = 28$. F(critical)(0.05, 2, 56) = 3.16.
  • 47. Multiple Linear Regression: Joint Significance Null hypothesis, $H_0$: $\alpha_2 = \alpha_3 = 0$. Alternative hypothesis, $H_A$: at least one of $\alpha_2$, $\alpha_3$ is not 0. As the F statistic is greater than the critical value (28 > 3.16), we reject the null hypothesis and conclude that the variables x2 and x3 are jointly significant and should remain in the model.
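The joint-significance calculation can be reproduced with a small sketch. It assumes F = [(RSS_R − RSS_u)/m] / [RSS_u/(n − k)] and uses the RSS values, n = 60, m = 2 and k = 4 from the example; the function name is illustrative, and the critical value is simply the figure quoted on the slide rather than one computed here.

```python
def joint_f_statistic(rss_restricted, rss_unrestricted, m, n, k):
    """F statistic for testing m restrictions.

    m : number of restrictions (coefficients set to zero)
    n : number of observations
    k : number of parameters in the unrestricted equation
    """
    numerator = (rss_restricted - rss_unrestricted) / m
    denominator = rss_unrestricted / (n - k)
    return numerator / denominator

# Values from the example: RSS_R = 1.5, RSS_u = 0.75, n = 60, two restrictions,
# four parameters (intercept plus three slopes) in the unrestricted model.
f_stat = joint_f_statistic(1.5, 0.75, m=2, n=60, k=4)
f_crit = 3.16  # F(0.05, 2, 56) as quoted on the slide

print(round(f_stat, 2), f_stat > f_crit)  # 28.0 True
```

Because the statistic exceeds the critical value, the sketch reaches the same decision as the slide: reject the null hypothesis that x2 and x3 are jointly insignificant.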
  • 48. Choosing the Best MLR Model • One of the major issues in multiple regression is the appropriate approach to variable selection. • To build an appropriate regression model, we need to successively add variables to or delete variables from the model. • The benefit of adding variables to a multiple regression model is to account for or explain more of the variance of the response variable. The cost of adding variables is that the degrees of freedom decrease, making it more difficult to find significance in hypothesis tests and increasing the width of confidence intervals. A good model will explain as much of the variance of y as possible with a small number of explanatory variables.
  • 49. Choosing the Best MLR Model The choice of whether to add a variable is based on a "cost-benefit analysis", and variables enter the model only if they make a significant improvement in the model. There are at least two types of approaches for evaluating whether a new variable sufficiently improves the model. The first approach uses partial F-tests and, when automated, is often called a "stepwise" procedure. The second approach uses some overall measure of model quality. The latter has many advantages. Discharge = 21.97 – 0.19 ET + 1.55 BF + 0.94 R – 1.05 GWR
  • 50. Choosing the Best MLR Model