4. Definition
Regression is the measure of the average
relationship between two or more variables in
terms of the original units of the data.
It is unquestionably the most widely used statistical
technique in the social sciences. It is also widely used
in the biological and physical sciences.
5. Prediction using Regression analysis
Interpolation
If we give a prediction within
the range of the given data
values
Extrapolation
If we give a prediction outside
the range of the given data
values
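The distinction above can be sketched in a few lines of Python. The function name `prediction_type` and the sample x values are illustrative assumptions, not part of the slides.

```python
def prediction_type(x_new, xs):
    """Classify a prediction at x_new relative to the observed x values."""
    lowest, highest = min(xs), max(xs)
    if lowest <= x_new <= highest:
        return "interpolation"  # inside the range of the given data
    return "extrapolation"      # outside the range of the given data


# Hypothetical observed x values, chosen only for illustration.
observed_x = [2, 4, 6, 8, 10]

print(prediction_type(5, observed_x))
print(prediction_type(12, observed_x))
```

Here a prediction at x = 5 falls inside the observed range 2 to 10, while a prediction at x = 12 falls outside it.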
7. Benefits of Regression analysis
Estimate
It provides estimates of
values of dependent
variables from values
of independent
variables
Extended
It can be extended
to 2 or more
independent variables,
which is known as
multiple regression
Relationship
It shows the nature
of the relationship
between two or
more variables
8. In the field of Business
The success of a
business depends on the
correctness of the
estimates in predicting
future production,
prices, profits, sales etc.
9. Uses of Regression
It's an important tool for modelling and analysing data. It's
used for many purposes like:
• Forecasting
• Predicting
• Finding the causal effect of one variable on another
For example, the effects of a price increase on the customer's
demand, or an increase in salary causing a change in
spending, etc.
11. Requirements
• The sample of paired data is a simple random
sample of quantitative data.
• The pairs of data (x, y) have a bivariate normal
distribution, which in practice is checked as follows:
- Visual examination of the scatter plot
confirms that the sample points follow an
approximately straight line.
- No outliers.
12. Methods of regression study
Graphically
By using a freehand
curve or
the least squares
method
Algebraically
By using the least
squares method or
the deviation
method from the
arithmetic or
assumed mean
13. The Linear Equation
Y = a + bX
Where
Y = dependent variable
X = independent variable
a = constant (value of Y when X = 0)
b = the slope of the regression line
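As a minimal sketch of how the equation is used, the function below evaluates Y = a + bX for a given X. The name `predict` and the values of a and b are hypothetical, chosen only to illustrate the roles of the intercept and the slope.

```python
def predict(x, a, b):
    """Return the predicted Y for a given X using Y = a + bX."""
    return a + b * x


# Hypothetical coefficients, not taken from the slides.
a = 2.0  # constant: value of Y when X = 0
b = 0.5  # slope: change in Y for a one-unit increase in X

print(predict(0, a, b))   # 2.0, the intercept
print(predict(10, a, b))  # 7.0
```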
14. Least Squares Method
Y = a + bX
Y = dependent variable
X = independent variable
The values of a and b are found with the
help of the least squares method's
normal equations:
ΣY = na + bΣX
ΣXY = aΣX + bΣX²
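The two normal equations of the least squares method, ΣY = na + bΣX and ΣXY = aΣX + bΣX², can be solved directly for a and b. The sketch below does this with plain Python; the function name `fit_line` and the five data points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Solve the normal equations
         sum(Y)  = n*a + b*sum(X)
         sum(XY) = a*sum(X) + b*sum(X^2)
    for the intercept a and the slope b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Eliminating a from the two equations gives b; a follows from the first.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical sample data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
print(f"Y = {a:.2f} + {b:.2f}X")
```

For this sample the sums are ΣX = 15, ΣY = 20, ΣXY = 66 and ΣX² = 55, which gives b = 30/50 = 0.6 and a = (20 - 9)/5 = 2.2. The function assumes the x values are not all identical; otherwise the denominator is zero.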
15. Equation parameters
"a"
• a is the point at which the
regression line crosses the
Y axis.
• can be positive or negative
• may be referred to as the
intercept.
"b"
• (the slope coefficient)
• can be positive or
negative.
• denotes a positive
or negative relationship.
21. To Use the Equation in Predictions
Use the regression equation for predictions only:
1. If the graph of the regression line on the scatter plot confirms
that the line fits the points reasonably well.
2. If the data used for prediction do not go much beyond the
scope of the available sample data.
3. If there is a significant linear correlation indicated between the
two variables, x and y.
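The strength of the linear correlation mentioned in the third condition is commonly summarised by the linear correlation coefficient r. The sketch below computes r with the standard Pearson formula; the slides do not say which measure they intend, so treat this choice, the function name `pearson_r`, and the sample data as assumptions. Whether a given r is significant also depends on the sample size and the chosen significance level, which this sketch does not test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the linear correlation coefficient r for paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Hypothetical sample data, not from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

Values of r close to +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.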
23. Residual - Error
Residual: for a pair of sample x and y values, the difference
between the observed sample value of y (a true value observed)
and the y-value that is predicted by using the regression equation
is the residual
• residual = observed y - predicted y
= y_observed - y_predicted
• A residual represents a type of inherent prediction error
• The regression equation does not, typically, pass through all the
observed data values that we have
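To make the definition concrete, the sketch below computes the residual, observed y minus predicted y, for each data point. The function name `residuals`, the sample data, and the coefficients a = 2.2 and b = 0.6 (the least squares line for this particular sample) are illustrative assumptions.

```python
def residuals(xs, ys, a, b):
    """Return observed y minus predicted y for each (x, y) pair,
    where the predicted y comes from Y = a + bX."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical sample data and its least squares line Y = 2.2 + 0.6X.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

res = residuals(xs, ys, a, b)
for x, r in zip(xs, res):
    print(f"x = {x}: residual = {r:+.1f}")
```

None of the residuals is zero here, which illustrates that the line does not pass through the observed points. For a least squares line the residuals sum to zero, up to floating-point rounding.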
25. Residual Plot
Residual Plot: a scatter plot of the (x, y) values after each
of the y-values has been replaced by the residual
value, y_observed - y_predicted
• That is, a residual plot is a graph of the points
(x, y_observed - y_predicted)
26. Residual Plot Analysis
When analysing a residual plot, look for a pattern
in the way the points are configured, and use
these criteria:
1. The residual plot should not have any obvious patterns
(not even a straight line pattern). This confirms that the
scatterplot of the sample data is a straight-line pattern.
2. The residual plot should not become thicker (or thinner)
when viewed from left to right. This confirms the
requirement that for different fixed values of x, the
distributions of the corresponding y values all have the
same standard deviation.
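The second criterion is normally judged by eye, but a rough numerical check can help. The sketch below compares the standard deviation of the residuals in the right half of the plot with that in the left half. This heuristic, the function name `spread_ratio`, and the sample residuals are assumptions for illustration, not a method given in the slides.

```python
from statistics import pstdev


def spread_ratio(xs, resid):
    """Ratio of residual spread in the right half of the plot to the
    spread in the left half. Values far from 1 suggest the plot gets
    thicker (ratio > 1) or thinner (ratio < 1) from left to right."""
    pairs = sorted(zip(xs, resid))            # order points by x
    half = len(pairs) // 2
    left = [r for _, r in pairs[:half]]
    right = [r for _, r in pairs[half:]]
    return pstdev(right) / pstdev(left)       # assumes left spread is not 0


# Hypothetical residuals that fan out as x increases.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
fanning = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 1.5, -1.5]

print(f"spread ratio = {spread_ratio(xs, fanning):.1f}")
```

A ratio well above 1, as in this example, matches a residual plot that becomes thicker from left to right. A ratio near 1 is consistent with equal standard deviations, though it does not replace looking at the plot.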
29. Regression model is a good model
[Figure: residual plot suggesting that the regression equation is a good model]
30. Distinct pattern: sample data may not follow a straight-line pattern.
[Figure: residual plot with an obvious pattern, suggesting that the regression equation isn't a good model]
31. Residual plot becoming thicker: equal standard deviations violated
[Figure: residual plot that becomes thicker, suggesting that the regression equation isn't a good model]