MAL1303: Regression Analysis for Predicting Seashore Erosion
1. MAL1303: STATISTICAL HYDROLOGY
Regression Analysis
Dr. Shamsuddin Shahid
Associate Professor
Department of Hydraulics and Hydrology
Faculty of Civil Engineering
Room No.: M46-332;
Phone: 07-5531624; Mobile: 0182051586
Email: sshahid@utm.my
11/23/2015 Shamsuddin Shahid, FKA, UTM
2. Regression
Questions:
Two variables are associated with one another. If one
variable changes, by how much does the other one
change?
How can we mathematically formalize the functional
relationship between two variables?
Answer:
Regression Analysis
Definition: Regression is a statistical technique that is used to
determine the functional relationship between two variables.
Regression gives an equation that best describes the relationship
between two variables.
3. Research Questions: Are two variables related?
Example questions in hydrology:
“Is there any relation between rainfall and river discharge?”
“Is there any relation between low river flow and river water
quality?”
“Is there any relation between elevation and rainfall?”
“Is there any relation between rainfall intensity and landslides?”
To test the relationship: Correlation
If you change the questions from “Is” to “How” or “What”, e.g.
“How are rainfall and river discharge related?”
then you need to go for: Regression Analysis
4. Simple Regression
The dependent variable is the variable for which we
want to make a prediction and independent variable is
the variable that is used to predict.
Simple regression analysis is a statistical tool that gives
us the ability to estimate the mathematical relationship
between a dependent variable (usually called y) and an
independent variable (usually called x).
Regression can take linear or non-linear forms, but
simple linear regression models are the most common
in hydrology.
5. Regression: Main Goals
The goal is to find a functional relation between the response
variable y and the predictor variable x.
y = f (x)
Another primary goal of quantitative analysis is to use current
information about a phenomenon to predict its future behavior.
6. What is Regression?
Data on the height of sea waves and on seashore erosion are collected to find
how far the sea waves are responsible for beach erosion.
The calculated correlation coefficient between wave height and erosion
is 0.79.
Regression gives the functional relation between wave height and
erosion as: Erosion = 7.32 + 0.62 × Height
7. Pictorial Presentation of Linear Regression Model
8. Uses of Regression Analysis
Regression analysis serves three major purposes:
1. Description
2. Control
3. Prediction
9. Difference between Correlation and Regression
Correlation quantifies the degree to which two variables are related.
Correlation does not find a functional relation. We simply compute a
correlation coefficient that tells us how much one variable tends to
change when the other one does.
With correlation we don't have to think about cause and effect. We
simply quantify how well two variables relate to each other. With
regression, we do have to think about cause and effect, as the regression
line is determined as the best way to predict Y from X.
With correlation, it doesn't matter which of the two variables we call "X"
and which we call "Y". We get the same correlation coefficient if we
swap the two. With linear regression, the decision of which variable we
call "X" and which we call "Y" matters a lot, as we get a different
best-fit line if we swap the two. The line that best predicts Y from X is
not the same as the line that predicts X from Y.
10. Linear and Non-linear Regression
In linear regression, the model function is a linear combination of the
parameters, such as y = mx + c, i.e. the model can be represented by a
straight line.
In non-linear regression, the parameters appear in a non-linear
combination, such as y = x^3 + 5e^-3
11. Construction of Regression Models
Selection of independent variables
Functional form of regression relation
Scope of model
– Least square and correlation based
12. Linear Regression – General Principle
A linear relationship between two
variables x and y can be expressed
by the equation,
y = mx + c
Where,
y is the dependent variable
x is the independent variable
m and c are constants
In the general linear equation,
The value of m is called the slope. The slope determines how much the
y variable will change when x is increased or decreased by one unit.
The value of c in the general equation is called the Y-intercept. It
determines the value of y when x = 0.
13. Least Squares Regression Principle
14. The Least Squares Solution
For each value of x in the data, the equation y = mx + c will determine the point on the
line that gives the best prediction of y.
The problem is to find the specific values for m and c that will make this line
the best fitting.
Least squares estimate of m:
m = SP / SSx
Where:
SP is the sum of products of the X and Y deviations
SSx is the sum of squares for the X scores
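The slope estimate m = SP / SSx can be sketched in a few lines of Python. The function name and the data are hypothetical, chosen only for illustration. The intercept uses the standard least squares result c = mean(Y) - m × mean(X), which this slide does not state.

```python
def least_squares_line(x, y):
    """Return (m, c) for the line y = m*x + c fitted by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # SP: sum of products of the deviations of X and Y from their means.
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # SSx: sum of squared deviations of X from its mean.
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    m = sp / ss_x
    # Standard result: the fitted line passes through (mean_x, mean_y).
    c = mean_y - m * mean_x
    return m, c

# Hypothetical data lying exactly on y = 2x + 1.
heights = [1.0, 2.0, 3.0, 4.0]
erosion = [3.0, 5.0, 7.0, 9.0]
m, c = least_squares_line(heights, erosion)
print(f"m = {m:.3f}, c = {c:.3f}")
```

Because the hypothetical points lie exactly on a line, the fit recovers m = 2 and c = 1.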
15. Example of Regression Analysis
16. Standard Error of Estimate
A regression equation, by itself, allows you to make predictions, but it does not
provide any information about the accuracy of the predictions
The standard error of estimate gives a measure of the standard distance
between a regression line and the actual data points
17. Error Estimation Formula
To calculate the standard error of estimate, first find a sum of squared deviations
(SS) between the observed values Y and the predicted values Ŷ.
This sum of squares is commonly called SSerror:
SSerror = Σ(Y − Ŷ)²
The obtained SS value is then divided by its degrees of freedom to obtain a
measure of variance. The df for the standard error of estimate are
df = n – 2
The standard error of estimate provides a measure of how accurately the
regression equation predicts the y value:
Standard Error = √(SSerror / df)
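The calculation described above (sum the squared residuals, divide by df = n – 2, take the square root) can be sketched as follows. The function name, the data and the fitted coefficients are hypothetical and serve only to illustrate the steps.

```python
from math import sqrt

def standard_error_of_estimate(x, y, m, c):
    """Standard error of estimate for the fitted line y = m*x + c."""
    predicted = [m * xi + c for xi in x]
    # SSerror: sum of squared differences between observed and predicted Y.
    ss_error = sum((yi - y_hat) ** 2 for yi, y_hat in zip(y, predicted))
    df = len(x) - 2  # two parameters (m and c) were estimated
    return sqrt(ss_error / df)

# Hypothetical data; m = 1.6 and c = 0.5 is its least squares line.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 7.0]
s_err = standard_error_of_estimate(x, y, m=1.6, c=0.5)
print(f"Standard error of estimate = {s_err:.4f}")
```

Here the residuals are -0.1, 0.3, -0.3 and 0.1, so SSerror = 0.2, df = 2, and the standard error is √0.1.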
18. Error Estimation Example
19. Assumptions
• The relationship between the variables is linear.
• Both variables must be at least interval scale.
• The least squares criterion is used to determine the equation.
20. Example
It is anticipated that climate change will make the sea rougher than
ever before. This may affect erosion along the seashore. Data are collected
on average wave height (in metres) during cyclones and erosion of the
seashore (cm/cyclone event). Try to find a relation for future
prediction of seashore erosion due to a rougher sea.
21. Example: Solution
[Figure: scatter plot of erosion (vertical axis, 10.0 to 35.0) against wave height (horizontal axis, 1.5 to 3.5), with the fitted line Y = mX + c]
22. Example: Solution Y = mX + c
Calculate m and Calculate c
m = 9.585
c = -1.00
Y = 9.585X – 1.00
Erosion =9.585 x Height – 1.00
23. Example: Solution
Erosion = 9.585 x Height – 1.00
Standard error of estimate = 4.7778 cm
24. Example: Solution
Erosion =9.585 x Height – 1.00
With Error = 4.7778 cm
If Height is 4.0 m
Erosion =
9.585 x Height – 1.00
= 39.36 cm
=34.14 to 44.59 cm
25. Regression Analysis – Least Squares Principle
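The least squares principle is used to obtain a and b, where b is the slope
and a is the intercept of the regression line (m and c in Y = mX + c).
The equations to determine a and b are:
$$b = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{n\left(\sum X^2\right) - \left(\sum X\right)^2}$$
$$a = \frac{\sum Y}{n} - b\,\frac{\sum X}{n}$$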
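The raw-sum equations for a and b are the same estimates as the deviation form m = SP / SSx given earlier. A short derivation, using the standard expansions of SP and SSx (written here in terms of the means, which the slides do not show explicitly):

```latex
\begin{aligned}
SP   &= \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - \frac{(\sum X)(\sum Y)}{n} \\
SS_x &= \sum (X - \bar{X})^2 = \sum X^2 - \frac{(\sum X)^2}{n} \\
m    &= \frac{SP}{SS_x}
      = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2} = b \\
c    &= \bar{Y} - m\,\bar{X} = \frac{\sum Y}{n} - b\,\frac{\sum X}{n} = a
\end{aligned}
```

Multiplying the numerator and the denominator of SP / SSx by n gives the third line, so both forms give the same slope and intercept.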
26. Correlation Based Method: Computing the Slope
Y = mX + c
Calculate Slope m;
Calculate Intercept c
27. Computing the Y-Intercept
28. Illustration of the Least Squares Regression Principle
29. Regression Equation - Example
It is anticipated that climate change will make the sea rougher than
ever before. This may affect erosion along the seashore. Data are collected
on average wave height (in metres) and erosion of the seashore (cm/year).
Try to find a relation for future prediction of seashore erosion due to
a rougher sea.
30. Regression Equation - Example
Correlation Coefficient, r = 0.99257
Sx = 0.5243
Sy = 5.0652
m = r (Sy/Sx)
= 0.99257 x (5.0652/0.5243)
= 9.589
c = -1.01
Y = 9.589X - 1.01
Erosion = 9.589 x Height - 1.01
For comparison, the least squares method gave: Erosion = 9.585 x Height – 1.00
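The slope calculation on this slide can be reproduced directly from the quoted r, Sx and Sy. This is a minimal sketch with hypothetical function names. The intercept helper uses the standard relation c = mean(Y) - m × mean(X); it is defined but not called, because the slides do not state the sample means.

```python
def slope_from_correlation(r, s_x, s_y):
    """Correlation-based slope: m = r * (Sy / Sx)."""
    return r * (s_y / s_x)

def intercept_from_means(mean_x, mean_y, m):
    """Intercept from the means of X and Y: c = mean_y - m * mean_x."""
    return mean_y - m * mean_x

# Values quoted on the slide.
r = 0.99257
s_x = 0.5243
s_y = 5.0652

m = slope_from_correlation(r, s_x, s_y)
print(f"m = {m:.3f}")
```

The result rounds to 9.589, matching the slide and differing only slightly from the 9.585 obtained by least squares.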
31. Assumptions in Linear Regression Model
For each value of X, there is a group of Y values, and these
Y values are normally distributed. The means of these normal
distributions of Y values all lie on the straight line of regression.
The standard deviations of these normal distributions are equal.
The Y values are statistically independent. This means that in the
selection of a sample, the Y values chosen for a particular X value
do not depend on the Y values for any other X values.
32. Confidence Interval Estimates of Y
A confidence interval reports a range for the mean value of Y at a given X.
33. Confidence Interval Estimates of Y
Erosion = 9.585 x Height – 1.00
If Height is 4.0 m
Erosion = 9.585 x 4.0 – 1.00
= 39.36 cm
34. Confidence Interval Estimates of Y
Erosion = 9.585 x Height – 1.00
If Height is 2.5 m
Erosion = 9.585 x Height – 1.00
= 23.0 cm
Degree of Freedom, df = n-2 = 11-2 = 9
t(0.05; 9) = 2.262
Serr = 4.7778
Y(predicted) = 23.0
Confidence Interval = 23.0 ± 3.32
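The half-width of this interval can be sketched with the usual textbook formula for the confidence interval of the mean of Y at X = x0: t × Serr × √(1/n + (x0 − mean X)² / SSx). Two inputs are not given on this slide and are assumptions: SSx is taken as (n − 1) × Sx² using Sx = 0.5243 from the correlation-based example, and the mean wave height is assumed to be 2.4 m.

```python
from math import sqrt

def confidence_half_width(x0, n, mean_x, ss_x, s_err, t_crit):
    """Half-width of the confidence interval for the mean of Y at X = x0."""
    return t_crit * s_err * sqrt(1.0 / n + (x0 - mean_x) ** 2 / ss_x)

n = 11           # sample size, so df = n - 2 = 9
t_crit = 2.262   # t(0.05; 9) from the slide
s_err = 4.7778   # standard error of estimate from the slide
s_x = 0.5243     # Sx quoted in the correlation-based example
ss_x = (n - 1) * s_x ** 2  # ASSUMES Sx is the sample standard deviation
mean_x = 2.4     # ASSUMED mean wave height; not stated in the slides

ci_half_width = confidence_half_width(2.5, n, mean_x, ss_x, s_err, t_crit)
print(f"Confidence interval = 23.0 ± {ci_half_width:.2f}")
```

Under these assumptions the half-width comes out close to the ±3.32 quoted on the slide. A different mean wave height would change the result.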
35. Confidence Interval Estimates of Y
36. Confidence Interval of Y
37. Prediction Interval Estimates of Y
A prediction interval reports the range of values of Y for a
particular value of X.
38. Prediction Interval Estimates of Y
Erosion = 9.585 x Height – 1.00
If Height is 4.0 m
Erosion = 9.585 x Height – 1.00
= 39.365 cm
Degree of Freedom, df = n-2 = 11-2 = 9
t(0.05; 9) = 2.262
Serr = 4.7778
Y(predicted) at 4.0 m height = 22.26
Prediction Interval = 39.365 ± 15.37
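A prediction interval for a single new Y is wider than a confidence interval for the mean because of the extra "1 +" term under the square root: t × Serr × √(1 + 1/n + (x0 − mean X)² / SSx). The sketch below applies this textbook formula. As assumptions not stated on this slide, SSx is taken as (n − 1) × Sx² with Sx = 0.5243 from the correlation-based example, and the mean wave height is assumed to be 2.4 m.

```python
from math import sqrt

def prediction_half_width(x0, n, mean_x, ss_x, s_err, t_crit):
    """Half-width of the prediction interval for one new Y at X = x0."""
    return t_crit * s_err * sqrt(1.0 + 1.0 / n + (x0 - mean_x) ** 2 / ss_x)

n = 11           # sample size, so df = n - 2 = 9
t_crit = 2.262   # critical t value for df = 9 from the slide
s_err = 4.7778   # standard error of estimate from the slide
s_x = 0.5243     # Sx quoted in the correlation-based example
ss_x = (n - 1) * s_x ** 2  # ASSUMES Sx is the sample standard deviation
mean_x = 2.4     # ASSUMED mean wave height; not stated in the slides

pi_half_width = prediction_half_width(4.0, n, mean_x, ss_x, s_err, t_crit)
print(f"Prediction interval half-width = {pi_half_width:.2f}")
```

Under these assumptions the half-width is close to the ±15.37 quoted on the slide. The interval is wide partly because 4.0 m lies well outside the range of observed wave heights.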
39. Confidence Interval and Prediction Interval of Y
40. Transforming Data
The coefficient of correlation describes the strength of
the linear relationship between two variables. It could be
that two variables are closely related, but their
relationship may not be linear.
Be cautious when you are interpreting the coefficient of
correlation. A value of r close to zero may indicate there is no
linear relationship, but there could still be a relationship of
some other nonlinear or curvilinear form.
41. Non-linear Data
The correlation between
rainfall and river discharge is
0.782. This is a fairly strong
positive relationship.
However, when we plot the
data on a scatter diagram, the
relationship does not appear
to be linear; it does not seem
to follow a straight line.
42. Transforming Data
What can we do to explore other (nonlinear) relationships?
One possibility is to transform one of the variables. For
example, instead of using Y as the dependent variable, we
might use its log, reciprocal, square, or square root.
Another possibility is to transform both of the variables in the
same way.
There are many other transformations, but log, reciprocal,
square, or square root are the most common.
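A log transformation of the dependent variable can be sketched as follows: fit a straight line to log10(Y) against X, then take the antilog of the prediction to return to the original units. The rainfall and discharge values are hypothetical, constructed so that log10(discharge) is exactly linear in rainfall. They are not the data behind this lecture.

```python
from math import log10

def least_squares_line(x, y):
    """Return (m, c) for the line y = m*x + c fitted by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    m = sp / ss_x
    return m, mean_y - m * mean_x

# Hypothetical data: discharge grows exponentially with rainfall.
rainfall = [10.0, 20.0, 30.0, 40.0]             # mm
discharge = [10.0, 100.0, 1000.0, 10000.0]      # arbitrary units

# Transform the dependent variable, then fit a straight line.
log_discharge = [log10(q) for q in discharge]
m, c = least_squares_line(rainfall, log_discharge)

# Predict in log space, then back-transform with the antilog.
predicted_log = m * 25.0 + c
predicted_discharge = 10 ** predicted_log
print(f"log10(Q) = {m:.3f} x Rainfall + {c:.3f}")
print(f"Predicted discharge at 25 mm = {predicted_discharge:.1f}")
```

The regression equation is in log units, so a prediction must be converted with the antilog before it is reported as a discharge.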
43. Transforming Data
44. Transforming Data
After log transformation of the river discharge data, we
got the regression equation as:
45. Transforming Data
Prediction of River Discharge from Rainfall. What is the
discharge when rainfall is 70 mm?
• The value 6.4372 is the log to the base 10 of discharge.
• The antilog of 6.4372 is 2.736
• Therefore, when rainfall is 70 mm, discharge is 2.736 cumec.
46. Interpretation of Regression Equation
Y = mX + c
What does m mean?
What does c mean?
Suppose we obtained the regression equation:
Y = 10.2 X + 21.9
How will you interpret it?
47. Interpretation of Regression Equation
How will you interpret the following regression equations:
Y = 10.2 X + 21.9
Y = 10.2 X – 21.9
Y = 21.9 – 10.2 X
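One way to read these three equations is to treat each as a (slope, intercept) pair: the sign of the slope gives the direction of change in Y per unit increase in X, and the intercept is the value of Y at X = 0. The helper below is a hypothetical illustration of that reading, not part of the original lecture.

```python
def interpret(m, c):
    """Describe a fitted line Y = m*X + c in words."""
    direction = "increases" if m > 0 else "decreases"
    return (f"Y {direction} by {abs(m)} for each unit increase in X; "
            f"Y = {c} when X = 0")

# (slope m, intercept c) for the three equations on the slide.
equations = {
    "Y = 10.2 X + 21.9": (10.2, 21.9),
    "Y = 10.2 X - 21.9": (10.2, -21.9),
    "Y = 21.9 - 10.2 X": (-10.2, 21.9),
}

interpretations = {name: interpret(m, c) for name, (m, c) in equations.items()}
for name, text in interpretations.items():
    print(f"{name}: {text}")
```

The first two lines share the same slope and differ only in where they cross the Y axis. The third has a negative slope, so Y falls as X rises.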