1. Application No. : 38464073e50d11e99aa6f572b452740e
Name: SAYANTAN SARKAR
Affiliation: Banaras Hindu University
Email: sayantansarkar530@gmail.com
CC BY-SA-NC
2. Acknowledgement
Course Name: Academic Writing
Every author owes a great deal to others, and I am no exception. First, I would
like to acknowledge Dr. Ajay Semalty of Garhwal University for teaching us
academic writing so well in the SWAYAM programme. Next, I
would like to acknowledge the help I received from books, Google Scholar and my
friend Bapi Biswas. Finally, I am deeply appreciative of my
supportive family, to whom I dedicate this work.
4. REGRESSION
The term regression was first used by
Francis Galton with reference to the
inheritance of stature. Galton found
that children of tall parents tend to be
less tall, and children of short parents
less short, than their parents. In other
words, the height of the offspring tends
to "move back" towards the mean height
of the general population. This tendency
to move back towards the mean is known as
REGRESSION.
5. What is Regression
Regression is a statistical method of prediction.
It predicts the most likely value of one variable (the criterion variable) on the
basis of given values of another variable or variables (the predictors).
The variable whose values are predicted is known as the dependent or
criterion variable, and the variable whose values form the basis of the prediction is
called the independent variable or predictor.
Regression gives useful predictions only if the dependent variable and the
independent variable are significantly correlated with each other.
It translates the relation between two or more variables into an
expression showing one of them as a function of the others (a summary
line).
6. What is Regression
Regression, like correlation, holds good only in the particular population to
which the sample belongs, and only for the limited range of scores of
the variable from which it has been derived; it cannot be extended
beyond this limit.
The regression line is called the "best-fitting" line because "on
average" it passes through the center of the criterion (Y) scores.
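One way to respect this range restriction in practice is to refuse predictions for predictor values outside the range observed in the sample. The following is a minimal sketch, not a method from the source: the helper name `predict_within_range` and the example slope, intercept and range are hypothetical.

```python
def predict_within_range(x, slope, intercept, x_min, x_max):
    """Return Y' = slope * x + intercept, but only inside the observed X range.

    x_min and x_max are the smallest and largest predictor scores in the
    sample from which the regression line was derived (assumed known).
    """
    if not (x_min <= x <= x_max):
        raise ValueError(
            f"X = {x} lies outside the observed range [{x_min}, {x_max}]; "
            "the regression should not be extended beyond it."
        )
    return slope * x + intercept


# Hypothetical line Y' = 2.0 X + 1.0 derived from X scores between 0 and 10.
print(predict_within_range(5, 2.0, 1.0, 0, 10))
```

Calling the helper with X = 11 in this example would raise a `ValueError` instead of returning an extrapolated score.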
7. Types of Regression
SIMPLE REGRESSION - The criterion or dependent variable is a
function of a single independent variable or predictor.
The scores of the criterion are predicted from the given scores of the
single predictor.
e.g., the regression of the examination marks of a candidate in
mathematics on his/her numerical aptitude test score.
MULTIPLE REGRESSION - The criterion is a function of two or more
predictors.
Its scores are predicted from the scores of more than one predictor.
e.g., the regression of the mathematics marks of an examinee on his/her
numerical aptitude and abstract reasoning test scores.
8. Models of Regression
MODEL 1 – It is the regression of a dependent variable or criterion (Y) on
an independent variable or predictor (X) that is a fixed treatment variable.
It gives the value of Y for a specified value of X when the predictor is varied
by the investigator in a precise and predetermined manner and rate.
The values of Y suffer from error due to random variation, but the values
of X are free from random error because they vary under the plan
and control of the investigator and are not random.
It can explore causation (a cause-and-effect relation).
e.g., the regression of blood sugar level on predetermined doses of
injected insulin.
9. Models of Regression
MODEL 2 – It is the regression of the criterion Y on a predictor X that
is a "classification variable", which is beyond the control of the investigator.
It predicts the most likely value of Y on the basis of the already existing value
of X in the individual. Here X is measured but not applied by the
investigator.
The values of both X and Y suffer from random errors.
It cannot explore the cause-and-effect relationship between the
variables.
e.g., the regression of examination marks in a language on the verbal ability test
score of an examinee.
10. Models of regression
MODEL 3 – This is always a multiple regression, predicting the value of a
dependent variable from given values of two or more predictors, where
each predictor variable can be either a fixed treatment variable or a
classification variable.
e.g., the regression of surface area on the height and weight of the individual.
11. Assumptions of simple linear regression
The variables involved in the regression are continuous measurement
variables.
Both variables have a unimodal and fairly symmetrical distribution in the
population.
The scores of the criterion variable are a linear function of the scores of the
predictor variable.
The Y scores of the criterion, measured in a large number of individuals at each
value of the predictor, are normally distributed and independent of each other.
The predictor variable is either a "fixed" experimental treatment or a
classification variable.
12. Computation of the simple linear regression equation
Simple linear regression expresses a single dependent variable Y as a
linear function of a single independent variable X.
First, we establish the relationship in a sample. Then we use the
regression line to determine Y' (the likely Y score for a given X
score) for each X.
We can then measure the X scores of individuals who are not in our sample,
and the corresponding Y' is the best prediction of their Y scores.
e.g., when SAT scores are used to predict a student's future college
grades, the SAT score is the predictor variable and the college grade is the criterion
variable.
13. Computation of the simple linear regression equation
Regression equation: the equation describes two characteristics of the regression
line.
1. The slope
2. The Y intercept
SLOPE: The slope is the number that indicates how slanted the regression line is
and the direction in which it slants.
Y INTERCEPT: It is the value of Y at the point where the regression line crosses the
Y axis. So, the intercept is the value of the Y score where X equals 0.
14. Computation of the simple linear regression equation
LINEAR REGRESSION EQUATION
Y' = bX + a
Here, b = the regression coefficient (slope of the line).
Regression coefficient: the regression coefficient is the average rate of
increase or decrease in the score of the criterion for a unit rise in the score of the
predictor.
a = the Y intercept (where the line crosses the Y axis, i.e. where X = 0).
X = value of the predictor variable.
Y' = predicted value of the criterion variable Y.
STEP 1 - Compute the Pearson product-moment r.
STEP 2 – Computation of the slope:
$$ b = \frac{N\sum XY - (\sum X)(\sum Y)}{N\sum X^2 - (\sum X)^2} $$
where N is the number of paired X and Y scores.
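Steps 1 and 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's worked example: the scores are made up, the function names are hypothetical, and the correlation uses the standard raw-score Pearson formula, which the slides name but do not write out.

```python
import math


def sums(xs, ys):
    """Return N and the five sums used by the raw-score formulas."""
    n = len(xs)
    return (
        n,
        sum(xs),
        sum(ys),
        sum(x * y for x, y in zip(xs, ys)),
        sum(x * x for x in xs),
        sum(y * y for y in ys),
    )


def pearson_r(xs, ys):
    """Step 1: Pearson product-moment correlation from raw scores."""
    n, sx, sy, sxy, sx2, sy2 = sums(xs, ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
    return numerator / denominator


def slope(xs, ys):
    """Step 2: regression coefficient b of Y on X."""
    n, sx, sy, sxy, sx2, _ = sums(xs, ys)
    return (n * sxy - sx * sy) / (n * sx2 - sx ** 2)


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
b = slope(xs, ys)
print(f"r = {r:.4f}, b = {b:.2f}")
```

The two formulas share the same numerator, which is why r and b always carry the same sign.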
15. Computation of the simple linear regression equation
The sign of the slope b agrees with the sign of the correlation coefficient r
computed in Step 1: when r is positive, b is positive.
STEP 3 – Compute the Y intercept: $a = \bar{Y} - b\bar{X}$.
Here, $\bar{Y}$ = the sum of the Y scores divided by the number of
cases N,
and $\bar{X}$ = the sum of the X scores divided by the number of cases N.
STEP 4 - After getting the slope (b) and the Y intercept (a), we can
compute the Y' score for each X score with the formula Y' = bX + a.
By placing each X score (the scores we already have) in the
linear regression equation, we get the Y' score for that value of X.
To graph the regression line, we plot the data points for these
X-Y' pairs and draw the line through them.
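The steps for the slope, the intercept and the predicted scores can be put together as follows. This is a minimal sketch on made-up scores; the function name `fit_line` is hypothetical, and the slope uses the standard raw-score least-squares formula.

```python
def fit_line(xs, ys):
    """Return (a, b) for the line Y' = bX + a, using raw-score formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: b = (N*SumXY - SumX*SumY) / (N*SumX^2 - (SumX)^2)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = mean(Y) - b * mean(X)
    a = sum_y / n - b * (sum_x / n)
    return a, b


# Made-up scores, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)

# Predicted score Y' for every X in the sample; these X-Y' pairs are the
# points one would plot to draw the regression line.
predicted = [b * x + a for x in xs]
for x, y_hat in zip(xs, predicted):
    print(x, round(y_hat, 2))
```

Because the intercept is built from the two means, the fitted line always passes through the point (mean of X, mean of Y).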
16. Graph of the regression line
From the graph, the regression line shows
that when the DAT score (X, the predictor) is 70,
the predicted maths score (Y', the criterion) is 43.
Again, when the
DAT score is 110, the predicted maths score is
61.8. The Y intercept is 10.11 and the slope is
+0.47.
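The two readings from the graph can be checked against the stated slope and intercept. The sketch below only plugs the reported values into Y' = bX + a; the function name `predict_maths` is hypothetical.

```python
INTERCEPT = 10.11  # a, the Y intercept reported for the graph
SLOPE = 0.47       # b, the slope reported for the graph


def predict_maths(dat_score):
    """Predicted maths score Y' for a given DAT score X."""
    return SLOPE * dat_score + INTERCEPT


for dat in (70, 110):
    print(dat, round(predict_maths(dat), 1))
```

Both predictions agree with the values read from the graph (about 43 and 61.8) after rounding.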
17. Multiple Regression
Multiple regression, a method of multivariate statistics, predicts the likely
value of a variable (the criterion or dependent variable) from the values of
two or more other variables (the predictors or independent variables).
It can be computed only if the variables possess a significant
correlation with each other.
It shows the criterion variable as a function of the predictor variables.
The predictor variables can be "fixed" treatment variables or
classification variables.
18. Multiple regression
Multiple regression, even when limited to two predictors, can be relatively
complex.
Usually, different predictor variables are related to each other, which
means they are often measuring and predicting the same thing.
Because the variables may overlap with each other, adding another
predictor variable does not always add to the accuracy of prediction.
19. Multiple regression and partial correlation
A direct procedure for controlling the influence of a third variable is
partial correlation.
It allows the researcher to measure the relationship between two
variables while eliminating, or holding constant, the effect of the third
variable.
With three variables X, Y and Z, it is possible to compute three
individual Pearson correlations:
rXY, the correlation between X and Y;
rXZ, the correlation between X and Z;
rYZ, the correlation between Y and Z.
20. Multiple regression and partial correlation
$$ r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}} $$
where $r_{XY \cdot Z}$ = the Pearson product-moment correlation between
X and Y when Z is held constant,
$r_{XY}$ = the correlation between X and Y (and likewise $r_{XZ}$ and $r_{YZ}$),
$r^2$ = the coefficient of determination.
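The partial correlation can be computed directly from the three pairwise Pearson correlations. The following is a small sketch of the standard first-order partial correlation formula; the function name `partial_corr` and the correlation values are made up for illustration.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z held constant (r_XY.Z)."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical correlations: X and Y both correlate strongly with Z.
print(round(partial_corr(r_xy=0.5, r_xz=0.6, r_yz=0.8), 4))

# If Z is unrelated to both X and Y, holding it constant changes nothing.
print(partial_corr(r_xy=0.5, r_xz=0.0, r_yz=0.0))
```

In the first call most of the X-Y relationship is accounted for by Z, so the partial correlation is far smaller than the original 0.5.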
21. Assumptions of multiple linear regression
All the variables involved should be continuous measurement
variables.
Their scores have a unimodal and fairly symmetrical distribution in
population.
The paired scores of each pair of variables in an individual are
independent of all other such paired scores in the sample.
There is a linear association between the scores of each pair of variables.
22. Computation of multiple linear regression
Computation of linear regression with three variables:
Regression predicts the most likely value Y' of the criterion Y from the given
values of two predictors, X1 and X2. The general regression equation
showing Y as a linear function of X1 and X2 is
Y' = a + b1X1 + b2X2
Here, Y' = predicted value of Y,
a = Y intercept,
b1 = slope of the regression line of Y on X1 when X2 is held constant
(a partial regression coefficient),
b2 = slope of the regression line of Y on X2 when X1 is held constant
(a partial regression coefficient),
X1 = predictor variable 1,
X2 = predictor variable 2.
23. Computation of multiple linear regression
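Here, b1 and b2 are computed using the SDs ($SD_Y$, $SD_1$ and $SD_2$) of the respective
variables and their beta coefficients (β1 and β2):
$$ \beta_1 = \frac{r_{Y1} - r_{Y2}\, r_{12}}{1 - r_{12}^2}, \qquad \beta_2 = \frac{r_{Y2} - r_{Y1}\, r_{12}}{1 - r_{12}^2} $$
where $r_{Y1}$, $r_{Y2}$ and $r_{12}$ are the Pearson correlations between Y and X1,
between Y and X2, and between X1 and X2.
So here,
$$ b_1 = \beta_1 \frac{SD_Y}{SD_1} $$
and
$$ b_2 = \beta_2 \frac{SD_Y}{SD_2} $$
So, $a = \bar{Y} - b_1\bar{X}_1 - b_2\bar{X}_2$.
Here, a = the Y intercept,
$\bar{Y}$ = the mean score of the variable Y,
$\bar{X}_1$ = the mean score of the variable X1,
$\bar{X}_2$ = the mean score of the variable X2.
Putting these values into the linear regression equation gives the predicted
Y' value with respect to both independent variables (predictors).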
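The two-predictor computation can be sketched from summary statistics alone: the three pairwise correlations, the three standard deviations and the three means. The function name `two_predictor_regression` and all input values below are hypothetical, chosen only to illustrate the standard two-predictor formulas for β1, β2, b1, b2 and a.

```python
def two_predictor_regression(r_y1, r_y2, r_12,
                             sd_y, sd_1, sd_2,
                             mean_y, mean_1, mean_2):
    """Return beta1, beta2, b1, b2 and a for Y' = a + b1*X1 + b2*X2."""
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom   # standardized coefficient for X1
    beta2 = (r_y2 - r_y1 * r_12) / denom   # standardized coefficient for X2
    b1 = beta1 * sd_y / sd_1               # raw-score slope for X1
    b2 = beta2 * sd_y / sd_2               # raw-score slope for X2
    a = mean_y - b1 * mean_1 - b2 * mean_2  # Y intercept
    return {"beta1": beta1, "beta2": beta2, "b1": b1, "b2": b2, "a": a}


# Made-up summary statistics, for illustration only.
fit = two_predictor_regression(
    r_y1=0.6, r_y2=0.5, r_12=0.4,
    sd_y=10.0, sd_1=5.0, sd_2=4.0,
    mean_y=50.0, mean_1=20.0, mean_2=30.0,
)


def predict(x1, x2):
    """Predicted Y' for given scores on the two predictors."""
    return fit["a"] + fit["b1"] * x1 + fit["b2"] * x2


print({name: round(value, 4) for name, value in fit.items()})
print(round(predict(25.0, 32.0), 2))
```

As a sanity check, predicting at the two predictor means returns the mean of Y, which follows from the way the intercept is defined.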
24. Computation of multiple regression
Here β is known as the standardized regression coefficient, which is expressed in
standard-score (normalized) units: it tells how many standard-score units the DV
changes for a one standard-score unit increase in the IV, with the other IV held
constant. The β values therefore indicate how strongly the variance of the
criterion variable is associated with the variance of the respective predictor variables.
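In standard-score form, the two-predictor regression equation can be written with the β coefficients alone. The z notation below is an assumption introduced here for illustration; it does not appear in the slides.

```latex
z_{Y'} = \beta_1 z_1 + \beta_2 z_2,
\qquad
z_{Y'} = \frac{Y' - \bar{Y}}{SD_Y},\quad
z_1 = \frac{X_1 - \bar{X}_1}{SD_1},\quad
z_2 = \frac{X_2 - \bar{X}_2}{SD_2}
```

Multiplying both sides by the SD of Y and rearranging gives back the raw-score equation Y' = a + b1X1 + b2X2, with each raw-score slope equal to the corresponding β multiplied by the ratio of the SD of Y to the SD of that predictor.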
25. Computation of multiple regression
Computation of multiple regression with more than three variables: if the
number of predictors involved is "n", then the regression
equation will be
Y' = a + b1X1 + b2X2 + ... + bnXn
where a = the Y intercept,
b1 = the regression slope of Y on X1 while all other predictor
variables are held constant,
bn = the regression slope of Y on Xn while all other predictor
variables are held constant.
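Once the intercept and the n slopes are known, a prediction is simply the intercept plus a weighted sum of the predictor scores. The sketch below shows only this prediction step, not how the coefficients are estimated; the function name `predict_multiple` and the coefficient values are hypothetical.

```python
def predict_multiple(intercept, slopes, scores):
    """Return Y' = a + b1*X1 + b2*X2 + ... + bn*Xn.

    slopes -- the coefficients b1..bn
    scores -- one individual's scores X1..Xn, in the same order
    """
    if len(slopes) != len(scores):
        raise ValueError("Need exactly one score per regression slope.")
    return intercept + sum(b * x for b, x in zip(slopes, scores))


# Hypothetical coefficients for three predictors.
a = 2.0
bs = [0.5, 1.5, -1.0]
print(predict_multiple(a, bs, [4, 2, 3]))
```

With a single predictor the same function reduces to the simple regression equation Y' = bX + a.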
26. Further Reading
Garrett, H.E. Statistics In Psychology And Education (6th ed.).
Gravetter, F.J., & Wallnau, L.B. Statistics For The Behavioral Sciences (10th ed.).
27. REFERENCES
Das, D., & Das, A. (2017). Statistics In Biology And Psychology (6th ed.), pp. 193-204.
Kolkata: Academic Publisher.
Tabachnick, B.G., & Fidell, L.S. (2012). Using Multivariate Statistics (6th ed.), pp.
117-118. London: Pearson.
Image source: Das, D., & Das, A. (2017). Statistics In Biology And Psychology (6th
ed.), p. 201. Kolkata: Academic Publisher.
28. FEEDBACK
The Academic Writing course was a great experience for me. It helped
me a lot to learn about the various aspects of academic writing, and it met my
expectations very well. I appreciate the regular lectures
and the weekly self-assessment quizzes, which resulted in a better
understanding of the topics. An additional gain was the week
spent studying OERs. I would like to thank the whole Academic
Writing team for organizing such an important course in such a
comprehensive manner.