Correlation and Regression
Correlation and regression are two statistical techniques that are used to examine the nature and strength of the relationship between two variables.
Correlation
Correlation analysis is concerned with measuring the strength of the relationship between variables.
When we compute measures of correlation from a set of data, we are interested in the degree of the correlation between variables.
Relationship Between Variables
Examples of two variables
Blood pressure and age
Height and weight
The concentration of an injected drug and heart rate
The consumption level of some nutrient and weight gain
Correlation coefficient
The correlation coefficient of variables X and Y shows how strongly the values of these variables are related to one another.
It is denoted by r, and $r \in [-1, 1]$.
If the correlation coefficient is positive, the two variables tend to increase together (or decrease together).
If the correlation coefficient is negative, one variable tends to decrease as the other increases, and vice versa.
Coefficient of Correlation Values
[Figure: scale of r from -1.0 to +1.0, with -0.5 and +0.5 marked. r = -1.0 is perfect negative correlation, r = 0 is no correlation, and r = +1.0 is perfect positive correlation. The degree of negative correlation increases from 0 toward -1.0; the degree of positive correlation increases from 0 toward +1.0.]
Although there is no fixed rule for interpreting the strength of a correlation, we will say that the correlation is
Strong if $|r| \ge 0.8$
Moderate if $0.5 \le |r| < 0.8$
Weak if $0 \le |r| < 0.5$
We will also add the words positive or negative to indicate the type of correlation.
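The labelling rule above can be sketched as a small Python helper. This is only an illustration: the function name `describe_correlation` is hypothetical, the thresholds are assumed to apply to the absolute value of r, and the boundary values 0.5 and 0.8 are assumed to belong to the stronger label.

```python
def describe_correlation(r: float) -> str:
    """Label r as strong/moderate/weak plus positive/negative.

    Assumption: thresholds 0.5 and 0.8 are applied to |r|, and a value
    exactly on a boundary takes the stronger label.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if r == 0:
        return "no correlation"
    size = abs(r)
    if size >= 0.8:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_correlation(0.99))   # strong positive
print(describe_correlation(-0.14))  # weak negative
```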
Correlation coefficient
Positive when large values of one variable are associated with large values of the other.
Negative when large values of one variable are associated with small values of the other.
Simple Correlation coefficient (r)
It is also called Pearson's correlation. It measures the nature and strength of the relationship between two variables of the quantitative type.
The simple correlation coefficient is obtained using the following formula:
$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} $$
where n is the sample size, x is the independent variable and y is the dependent variable.
Coefficient of Correlation
An alternative formula for computing the coefficient of correlation, r:
$$ r = \frac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{\left[n\sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n\sum y_i^2 - \left(\sum y_i\right)^2\right]}} $$
Computation Table
pH (x)   Optical density (y)   xy       x^2      y^2
3        0.1                   0.3      9        0.01
4        0.2                   0.8      16       0.04
4.5      0.25                  1.125    20.25    0.0625
5        0.32                  1.6      25       0.1024
5.5      0.33                  1.815    30.25    0.1089
6        0.35                  2.1      36       0.1225
6.5      0.47                  3.055    42.25    0.2209
7        0.49                  3.43     49       0.2401
7.5      0.53                  3.975    56.25    0.2809
Total    49       3.04         18.2     284      1.1882
Coefficient of Correlation
$$ r = \frac{9(18.2) - (49)(3.04)}{\sqrt{\left[9(284) - (49)^2\right]\left[9(1.1882) - (3.04)^2\right]}} = 0.9891 $$
r = 0.989
r ≈ 0.99
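The same calculation can be checked with a few lines of Python. This is a minimal sketch of the sums-based formula applied to the pH and optical density data; the function name `pearson_r` is an illustrative choice, not something taken from the slides.

```python
import math

# Data from the computation table: pH (x) and optical density (y)
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
optical_density = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def pearson_r(x, y):
    """Pearson's r from the column totals (n, sum x, sum y, sum xy, sum x^2, sum y^2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(ph, optical_density)
print(round(r, 4))  # 0.9891
```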
Warning
The correlation coefficient (r) measures the strength of the relationship between two variables.
Just because two variables are related does not imply that there is a cause-and-effect relationship between them.
Spearman’s Correlation Coefficient
It is a non-parametric measure of correlation used for ordinal variables, or for quantitative variables that are converted to ranks.
This procedure makes use of the two sets of ranks that may be assigned to the sample values of x and y.
Spearman’s Correlation Coefficient
The Spearman rank correlation coefficient can be computed in the following cases:
1. Both variables are quantitative.
2. Both variables are qualitative ordinal.
3. One variable is quantitative and the other is qualitative ordinal.
Spearman’s Correlation Coefficient
Procedures:
1. Rank the values of X from 1 to n, where n is the number of pairs of values of X and Y in the sample. Tied values share the average of the ranks they span.
2. Rank the values of Y from 1 to n.
3. Compute the value of $d_i$ for each pair of observations by subtracting the rank of $Y_i$ from the rank of $X_i$.
4. Square each $d_i$ and compute $\sum d_i^2$, which is the sum of the squared values.
5. Apply the following formula:
$$ r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} $$
The value of $r_s$ denotes the magnitude and nature of the association, with the same interpretation as simple r.
Spearman’s Correlation Coefficient
Example: In a study of the relationship between education level and health awareness, the following data were obtained. Find the relationship between them and comment.
No.   Education level (X)   Health awareness (Y)
1     Preparatory           25
2     Primary               10
3     University            8
4     Secondary             10
5     Secondary             15
6     Illiterate            50
7     University            60
Spearman’s Correlation Coefficient
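Solution:
No.   Education level (X)   Health awareness (Y)   Rank X   Rank Y   d_i     d_i^2
1     Preparatory           25                     5        3        2       4
2     Primary               10                     6        5.5      0.5     0.25
3     University            8                      1.5      7        -5.5    30.25
4     Secondary             10                     3.5      5.5      -2      4
5     Secondary             15                     3.5      4        -0.5    0.25
6     Illiterate            50                     7        2        5       25
7     University            60                     1.5      1        0.5     0.25
Total                                                                        64
$$ r_s = 1 - \frac{6(64)}{7(7^2 - 1)} = 1 - \frac{384}{336} \approx -0.14 $$
Comment: There is an indirect weak correlation between education level and health awareness.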
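The ranking and the rank-difference formula in this example can be sketched in Python as follows. The numeric coding of education level (`EDUCATION_CODE`) and the function names are assumptions made for illustration; the only requirement is that a higher code means more education. Both variables are ranked from largest to smallest, as in the solution table. With tied ranks, routines that apply Pearson's formula directly to the ranks can return a slightly different value than this formula.

```python
def average_ranks(values):
    """Rank from 1 (largest) to n (smallest); ties share the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first 1-based position held by v
        ties = ordered.count(v)        # how many observations share this value
        ranks.append(first + (ties - 1) / 2)
    return ranks


def spearman_rs(x, y):
    """r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), with d = rank(x) - rank(y)."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Assumed ordinal coding: higher number = higher education level
EDUCATION_CODE = {"illiterate": 0, "primary": 1, "preparatory": 2,
                  "secondary": 3, "university": 4}

education = ["preparatory", "primary", "university", "secondary",
             "secondary", "illiterate", "university"]
health_awareness = [25, 10, 8, 10, 15, 50, 60]

x = [EDUCATION_CODE[level] for level in education]
rs = spearman_rs(x, health_awareness)
print(round(rs, 2))  # -0.14
```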
Spearman’s Correlation Coefficient
How would you justify your findings in the previous example?
Non-Parametric Correlations
Spearman’s Correlation Coefficient:
First, it ranks the data and then applies Pearson’s correlation to these ranks.
$$ r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} $$
where $r_s$ is Spearman’s correlation coefficient, $d_i$ is the difference between the ranks of the i-th pair and n is the number of cases.
Regression analysis
Regression analysis is helpful in ascertaining the probable form of the relationship between variables.
The ultimate objective when this method of analysis is employed is usually to predict or estimate the value of one variable corresponding to a given value of another variable.
Simple linear regression
In simple linear regression we are interested in two variables, x and y.
The variable x is usually referred to as the independent variable, since frequently it is controlled by the investigator; that is, values of x may be selected by the investigator and, corresponding to each preselected value of x, one or more values of y are obtained.
The other variable, y, accordingly, is called the dependent variable, and we speak of the regression of y on x.
The regression equation
In simple linear regression the object of the researcher’s interest is the regression equation that describes the true relationship between the dependent variable y and the independent variable x.
Scatter diagram
A first step that is usually useful in studying the relationship between two variables is to prepare a scatter diagram of the data.
The points are plotted by assigning values of the independent variable x to the horizontal axis and values of the dependent variable y to the vertical axis.
The pattern made by the points plotted on the scatter diagram usually suggests the basic nature and the strength of the relationship between two variables.
Example
Relationship between pH and optical density
pH     Optical density
3      0.1
4      0.2
4.5    0.25
5      0.32
5.5    0.33
6      0.35
6.5    0.47
7      0.49
7.5    0.53
Scatter Diagram
[Figure: scatter diagram "Relationship between pH and optical density"; x-axis: pH, from 2 to 8; y-axis: optical density, from 0 to 0.6.]
Notes
The points in the figure seem to be scattered around an invisible straight line.
The scatter diagram also shows that, in general, a high pH goes with a high optical density reading.
[Figure: scatter diagram of optical density against pH.]
These impressions suggest that the relationship between the two variables may be described by a straight line crossing the y-axis near the origin and making approximately a 45 degree angle with the x-axis.
It looks as if it would be simple to draw, freehand, through the data points the line that describes the relationship between x and y.
[Figure: scatter diagram of optical density against pH.]
It is highly unlikely, however, that the lines drawn by any two people would be the same.
In other words, for every person drawing such a line by eye, or freehand, we would expect a slightly different line.
Thinking Challenge
[Figure: scatter diagram of optical density against pH, shown again on each slide of this sequence.]
For every person drawing such a line by eye, or freehand, we would expect a slightly different line.
Which line best describes the relationship between the variables?
What is needed for obtaining the desired line?
Answer
We need to employ a method known as the method of least squares for obtaining the desired line, and the resulting line is called the least-squares line.
The reason for calling the method by this name will be explained in the discussion that follows.
Equation for straight line
Now, recall from algebra that the general equation for a straight line is given by
y = a + bx
a = the y-intercept
b = the slope
Linear Equations
[Figure: the line Y = a + bX plotted against X, with the Y-intercept a marked on the vertical axis.]
a is the point where the line crosses the vertical axis, and is referred to as the y-intercept.
Linear Equations
[Figure: the line Y = a + bX plotted against X, with the Y-intercept a, the change in Y and the change in X marked; b = slope.]
b shows the amount by which y changes for each unit change in x, and is referred to as the slope of the line.
To draw a line based on the equation y = a + bx, we need the numerical values of the constants a and b.
Given these constants, we may substitute various values of x into the equation to obtain corresponding values of y.
The resulting points may be plotted.
Computation Table
X_i      Y_i      X_i^2      Y_i^2      X_i Y_i
X_1      Y_1      X_1^2      Y_1^2      X_1 Y_1
X_2      Y_2      X_2^2      Y_2^2      X_2 Y_2
:        :        :          :          :
X_n      Y_n      X_n^2      Y_n^2      X_n Y_n
ΣX_i     ΣY_i     ΣX_i^2     ΣY_i^2     ΣX_i Y_i

Computation Table
pH (x)      Optical density (y)   x^2          y^2             xy
3           0.1                   9            0.01            0.3
4           0.2                   16           0.04            0.8
4.5         0.25                  20.25        0.0625          1.125
5           0.32                  25           0.1024          1.6
5.5         0.33                  30.25        0.1089          1.815
6           0.35                  36           0.1225          2.1
6.5         0.47                  42.25        0.2209          3.055
7           0.49                  49           0.2401          3.43
7.5         0.53                  56.25        0.2809          3.975
Total       Σx = 49   Σy = 3.04   Σx^2 = 284   Σy^2 = 1.1882   Σxy = 18.2
Mean        x̄ = 5.444   ȳ = 0.3378
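Finding the b-value
$$ b = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} $$
$$ b = \frac{9(18.2) - (49)(3.04)}{9(284) - (49)^2} = \frac{14.84}{155} \approx 0.0957 $$
The calculations that follow carry this value forward as b = 0.0958.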
Finding the y-intercept
$$ a = \bar{y} - b\bar{x} $$
where $\bar{y}$ is the mean of the y values and $\bar{x}$ is the mean of the x values.
Alternatively,
$$ a = \frac{\sum y - b\sum x}{n} $$
Finding the y-intercept
$$ \bar{y} = \frac{3.04}{9} = 0.3378, \qquad \bar{x} = \frac{49}{9} = 5.444 $$
$$ a = \bar{y} - b\bar{x} = 0.3378 - (0.0958)(5.444) = -0.1837 $$
The equation for the least squares line is:
$$ \hat{y} = a + bx $$
$$ \hat{y} = -0.1837 + 0.0958x $$
Note that we use the symbol $\hat{y}$ because this value is computed from the equation and is not an observed value of y.
$$ \hat{y} = 0.0958x - 0.1837 $$
Now, we can substitute various values of x into the equation to obtain corresponding values of y.
The resulting points may be plotted.
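The slope and intercept formulas can be checked with a short Python sketch on the pH data. The function name `least_squares_line` is illustrative only. Without intermediate rounding the coefficients come out at about b = 0.0957 and a = -0.1835; the slide values 0.0958 and -0.1837 differ from these only in the fourth decimal place.

```python
# Data from the computation table: pH (x) and optical density (y)
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
optical_density = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def least_squares_line(x, y):
    """Return (a, b) for y_hat = a + b*x using the sums-based formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = (sum_y - b * sum_x) / n                                   # intercept
    return a, b


a, b = least_squares_line(ph, optical_density)
print(f"y_hat = {a:.4f} + {b:.4f} x")  # y_hat = -0.1835 + 0.0957 x
```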
Using the Regression Equation
Predicting y for a given x
Choose a value for x (within the range of x values).
Substitute the selected x in the regression equation.
Determine the corresponding value of y.
The regression equation:
$$ \hat{y} = 0.0958x - 0.1837 $$
Substitute x = 6.8:
$$ \hat{y} = 0.0958 \times 6.8 - 0.1837 = 0.4677 $$
According to the equation, a pH of 6.8 would have an optical density of 0.4677.
Interpolation
Using the regression equation to predict y values for x values that fall between the points in the scatter diagram
Extrapolation
Prediction beyond the range of observations
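The distinction can be made explicit in code. The sketch below evaluates the fitted line with the rounded coefficients from the slides and reports whether the chosen x lies inside the observed pH range (3 to 7.5). The function name `predict` and the two labels are illustrative assumptions.

```python
# Rounded coefficients of the least-squares line, as given on the slides
A, B = -0.1837, 0.0958

# Observed range of the independent variable (pH) in the example data
X_MIN, X_MAX = 3.0, 7.5


def predict(x):
    """Return (y_hat, kind) where kind says whether x is inside the observed range."""
    y_hat = A + B * x
    kind = "interpolation" if X_MIN <= x <= X_MAX else "extrapolation"
    return y_hat, kind


y_hat, kind = predict(6.8)
print(f"{y_hat:.4f} ({kind})")  # 0.4677 (interpolation)

# A pH of 9 lies outside the data, so this prediction should be treated with caution
print(predict(9.0)[1])          # extrapolation
```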
Since any two such coordinates determine a straight line, we may select any two values in the range of x, compute the two corresponding y values, locate them on a graph, and connect them with a straight line to obtain the line corresponding to the equation.
The least-squares line
[Figure: data points plotted against X and Y with a fitted line; the vertical deviation of each observed point y_i from the line is marked.]
The line that we have drawn is best in this sense:
The sum of the squared vertical deviations of the observed data points (y_i) from the least-squares line is smaller than the sum of the squared vertical deviations of the observed data points from any other line.
The least-squares line
In other words, if we square the vertical distance from each observed point (y_i) to the least-squares line and add these squared values for all points, the resulting total will be smaller than the similarly computed total for any other line that can be drawn through the points.
For this reason the line we have drawn is called the least-squares line.
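This property can be illustrated numerically. The sketch below fits the line to the pH data and compares its sum of squared vertical deviations with that of a few other lines. The alternative lines in `other_lines` are hypothetical "freehand" guesses chosen for illustration, not lines from the slides.

```python
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
optical_density = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def least_squares_line(x, y):
    """Return (a, b) for y_hat = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def sum_squared_deviations(a, b, x, y):
    """Sum of squared vertical distances from the points to the line a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = least_squares_line(ph, optical_density)
sse_fit = sum_squared_deviations(a, b, ph, optical_density)

# Hypothetical freehand lines, given as (intercept, slope)
other_lines = [(-0.20, 0.100), (-0.15, 0.090), (-0.25, 0.110)]
sse_others = [sum_squared_deviations(a2, b2, ph, optical_density)
              for a2, b2 in other_lines]

print(f"least-squares line: {sse_fit:.5f}")
for (a2, b2), sse in zip(other_lines, sse_others):
    print(f"y = {a2} + {b2}x: {sse:.5f}")
```

Each alternative line gives a larger total than the least-squares line, as the definition requires.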
The coefficient of determination $r^2$
One way to evaluate the strength of the regression equation is to compare the scatter of the points about the regression line with the scatter about $\bar{y}$, the mean of the values of y.
[Figure: scatter diagram of optical density against pH with the fitted line y = 0.0957x - 0.1835 and the horizontal line $\bar{y}$ = 0.3378.]
The coefficient of determination $r^2$
If we draw through the points a line that intersects the y-axis at $\bar{y}$ and is parallel to the x-axis, we may obtain a visual impression of the relative magnitudes of the scatter of the points about this line and about the regression line.
[Figure: scatter diagram of optical density against pH with the fitted line y = 0.0957x - 0.1835 and the horizontal line $\bar{y}$ = 0.3378.]
Interpretation of $r^2$
Thus, the coefficient of determination measures the closeness of fit of the regression equation to the observed values of y.
Interpretation of $r^2$
• If $r^2$ = 0.978
• Approximately 98 percent of the variation in optical density (y) is explained by the linear relationship with x, the pH.
• The remaining 2 percent or so is attributed to other causes.
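The value of the coefficient of determination for the pH example can be reproduced by comparing the two kinds of scatter directly. In this sketch the scatter about the regression line is the sum of squared deviations from the fitted values, and the scatter about the mean is the sum of squared deviations from the mean of y; the variable names are illustrative.

```python
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
optical_density = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]

n = len(ph)
mean_x = sum(ph) / n
mean_y = sum(optical_density) / n

# Least-squares slope and intercept
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ph, optical_density))
     / sum((x - mean_x) ** 2 for x in ph))
a = mean_y - b * mean_x

# Scatter about the regression line and scatter about the mean of y
scatter_about_line = sum((y - (a + b * x)) ** 2 for x, y in zip(ph, optical_density))
scatter_about_mean = sum((y - mean_y) ** 2 for y in optical_density)

r_squared = 1 - scatter_about_line / scatter_about_mean
print(round(r_squared, 3))  # 0.978
```

The result agrees with squaring the correlation coefficient found earlier: 0.9891 squared is about 0.978.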
Coefficient of Determination Examples
[Figure: four scatter plots of Y against X, labelled $r^2$ = 1, $r^2$ = 1, $r^2$ = 0.8 and $r^2$ = 0.]
Limitations of Correlation and Regression
linearity:
– can’t describe non-linear relationships, e.g., the relation between anxiety and performance
truncation of range:
– underestimates the strength of the relationship if you can’t see the full range of x values
no proof of causation:
– third variable problem: a third variable could be causing the change in both variables
– directionality: can’t be sure which way causality runs
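The linearity limitation can be seen in a small made-up example. In the sketch below, y depends exactly on x through an inverted-U curve, yet Pearson's r is zero. The data are hypothetical and chosen only to illustrate the point; they are not measurements of anxiety or performance.

```python
import math

# Hypothetical data: y is completely determined by x, but not linearly
x = [-3, -2, -1, 0, 1, 2, 3]
y = [9 - xi ** 2 for xi in x]   # inverted-U: 0, 5, 8, 9, 8, 5, 0


def pearson_r(x, y):
    """Pearson's r from the column totals."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
print(r)  # 0.0
```

A value of r near zero therefore rules out a linear relationship only, not a relationship of some other form.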

More Related Content

What's hot

Linear correlation
Linear correlationLinear correlation
Linear correlation
Tech_MX
 

What's hot (17)

Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Regression and Co-Relation
Regression and Co-RelationRegression and Co-Relation
Regression and Co-Relation
 
Correlation analysis
Correlation analysis  Correlation analysis
Correlation analysis
 
Using Spss Correlations
Using Spss   CorrelationsUsing Spss   Correlations
Using Spss Correlations
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec doms
 
Correlation
CorrelationCorrelation
Correlation
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Linear correlation
Linear correlationLinear correlation
Linear correlation
 
Linear Correlation
Linear Correlation Linear Correlation
Linear Correlation
 
Pearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionPearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear Regression
 
Simple correlation
Simple correlationSimple correlation
Simple correlation
 
Research Methodology Module-06
Research Methodology Module-06Research Methodology Module-06
Research Methodology Module-06
 
Correlation ppt...
Correlation ppt...Correlation ppt...
Correlation ppt...
 
correlation and regression
correlation and regressioncorrelation and regression
correlation and regression
 
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
What is Karl Pearson Correlation Analysis and How Can it be Used for Enterpri...
 
Ch 7 correlation_and_linear_regression
Ch 7 correlation_and_linear_regressionCh 7 correlation_and_linear_regression
Ch 7 correlation_and_linear_regression
 
Pearson's correlation
Pearson's  correlationPearson's  correlation
Pearson's correlation
 

Similar to 5 regressionand correlation

Stats For Life Module7 Oc
Stats For Life Module7 OcStats For Life Module7 Oc
Stats For Life Module7 Oc
N Rabe
 
Hph7310week2winter2009narr
Hph7310week2winter2009narrHph7310week2winter2009narr
Hph7310week2winter2009narr
Sarah
 

Similar to 5 regressionand correlation (20)

Measure of Association
Measure of AssociationMeasure of Association
Measure of Association
 
Class 9 Covariance & Correlation Concepts.pptx
Class 9 Covariance & Correlation Concepts.pptxClass 9 Covariance & Correlation Concepts.pptx
Class 9 Covariance & Correlation Concepts.pptx
 
Correlation analysis notes
Correlation analysis notesCorrelation analysis notes
Correlation analysis notes
 
CORRELATION-CMC.PPTX
CORRELATION-CMC.PPTXCORRELATION-CMC.PPTX
CORRELATION-CMC.PPTX
 
Correlation and regression impt
Correlation and regression imptCorrelation and regression impt
Correlation and regression impt
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regression
 
Simple correlation & Regression analysis
Simple correlation & Regression analysisSimple correlation & Regression analysis
Simple correlation & Regression analysis
 
ch 13 Correlation and regression.doc
ch 13 Correlation  and regression.docch 13 Correlation  and regression.doc
ch 13 Correlation and regression.doc
 
Correlation Analysis for MSc in Development Finance .pdf
Correlation Analysis for MSc in Development Finance .pdfCorrelation Analysis for MSc in Development Finance .pdf
Correlation Analysis for MSc in Development Finance .pdf
 
Stats For Life Module7 Oc
Stats For Life Module7 OcStats For Life Module7 Oc
Stats For Life Module7 Oc
 
Hph7310week2winter2009narr
Hph7310week2winter2009narrHph7310week2winter2009narr
Hph7310week2winter2009narr
 
Correlation.pptx
Correlation.pptxCorrelation.pptx
Correlation.pptx
 
Unit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfUnit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdf
 
Correlation Analysis
Correlation AnalysisCorrelation Analysis
Correlation Analysis
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Chapter 10
Chapter 10Chapter 10
Chapter 10
 
Correlation
CorrelationCorrelation
Correlation
 
Data Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationData Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and Correlation
 
correlation ;.pptx
correlation ;.pptxcorrelation ;.pptx
correlation ;.pptx
 
correlation.pptx
correlation.pptxcorrelation.pptx
correlation.pptx
 

More from Lama K Banna

More from Lama K Banna (20)

The TikTok Masterclass Deck.pdf
The TikTok Masterclass Deck.pdfThe TikTok Masterclass Deck.pdf
The TikTok Masterclass Deck.pdf
 
دليل كتابة المشاريع.pdf
دليل كتابة المشاريع.pdfدليل كتابة المشاريع.pdf
دليل كتابة المشاريع.pdf
 
Investment proposal
Investment proposalInvestment proposal
Investment proposal
 
Funding proposal
Funding proposalFunding proposal
Funding proposal
 
5 incisions
5 incisions5 incisions
5 incisions
 
Lecture 3 facial cosmetic surgery
Lecture 3 facial cosmetic surgery Lecture 3 facial cosmetic surgery
Lecture 3 facial cosmetic surgery
 
lecture 1 facial cosmatic surgery
lecture 1 facial cosmatic surgery lecture 1 facial cosmatic surgery
lecture 1 facial cosmatic surgery
 
Facial neuropathology Maxillofacial Surgery
Facial neuropathology Maxillofacial SurgeryFacial neuropathology Maxillofacial Surgery
Facial neuropathology Maxillofacial Surgery
 
Lecture 2 Facial cosmatic surgery
Lecture 2 Facial cosmatic surgery Lecture 2 Facial cosmatic surgery
Lecture 2 Facial cosmatic surgery
 
Lecture 12 general considerations in treatment of tmd
Lecture 12 general considerations in treatment of tmdLecture 12 general considerations in treatment of tmd
Lecture 12 general considerations in treatment of tmd
 
Lecture 10 temporomandibular joint
Lecture 10 temporomandibular jointLecture 10 temporomandibular joint
Lecture 10 temporomandibular joint
 
Lecture 11 temporomandibular joint Part 3
Lecture 11 temporomandibular joint Part 3Lecture 11 temporomandibular joint Part 3
Lecture 11 temporomandibular joint Part 3
 
Lecture 9 TMJ anatomy examination
Lecture 9 TMJ anatomy examinationLecture 9 TMJ anatomy examination
Lecture 9 TMJ anatomy examination
 
Lecture 7 correction of dentofacial deformities Part 2
Lecture 7 correction of dentofacial deformities Part 2Lecture 7 correction of dentofacial deformities Part 2
Lecture 7 correction of dentofacial deformities Part 2
 
Lecture 8 management of patients with orofacial clefts
Lecture 8 management of patients with orofacial cleftsLecture 8 management of patients with orofacial clefts
Lecture 8 management of patients with orofacial clefts
 
Lecture 5 Diagnosis and management of salivary gland disorders Part 2
Lecture 5 Diagnosis and management of salivary gland disorders Part 2Lecture 5 Diagnosis and management of salivary gland disorders Part 2
Lecture 5 Diagnosis and management of salivary gland disorders Part 2
 
Lecture 6 correction of dentofacial deformities
Lecture 6 correction of dentofacial deformitiesLecture 6 correction of dentofacial deformities
Lecture 6 correction of dentofacial deformities
 
lecture 4 Diagnosis and management of salivary gland disorders
lecture 4 Diagnosis and management of salivary gland disorderslecture 4 Diagnosis and management of salivary gland disorders
lecture 4 Diagnosis and management of salivary gland disorders
 
Lecture 3 maxillofacial trauma part 3
5 regression and correlation

  • 2. Correlation and Regression: two statistical techniques that are used to examine the nature and strength of the relationship between two variables.
  • 3. Correlation: Correlation analysis is concerned with measuring the strength of the relationship between variables. When we compute measures of correlation from a set of data, we are interested in the degree of the correlation between variables.
  • 4. Relationship Between Variables. Examples of two variables: blood pressure and age; height and weight; the concentration of an injected drug and heart rate; the consumption level of some nutrient and weight gain.
  • 5. Correlation coefficient: The correlation coefficient of variables X and Y shows how strongly the values of these variables are related to one another. It is denoted by r, and r ∈ [-1, 1]. If the correlation coefficient is positive, the two variables tend to increase together (or decrease together). If the correlation coefficient is negative, one variable tends to decrease as the other increases, and vice versa.
  • 6. Coefficient of Correlation Values (figure: a scale running from -1.0, perfect negative correlation, through 0, no correlation, to +1.0, perfect positive correlation, with -0.5 and +0.5 marked; the degree of negative correlation increases toward -1.0 and the degree of positive correlation increases toward +1.0).
  • 7. Although there is no fixed rule for interpreting the strength of a correlation, we will say that the correlation is strong if |r| ≥ 0.8, moderate if 0.5 ≤ |r| ≤ 0.8, and weak if 0 ≤ |r| ≤ 0.5. We will also add the words positive or negative to indicate the type of correlation.
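The verbal labels can be sketched as a small helper. This is only an illustration: the function name `describe_correlation` is hypothetical, the cut-offs are applied to the magnitude of r, and a value that falls exactly on a boundary (0.5 or 0.8) is assigned to the stronger class, which is an assumption because the slide's ranges overlap at those points.

```python
def describe_correlation(r: float) -> str:
    """Return a verbal label for r using the 0.8 and 0.5 cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    if r == 0:
        return "no correlation"
    size = abs(r)
    if size >= 0.8:          # boundary assigned to "strong" (assumption)
        strength = "strong"
    elif size >= 0.5:        # boundary assigned to "moderate" (assumption)
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(describe_correlation(0.99))   # strong positive
print(describe_correlation(-0.14))  # weak negative
```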
  • 8. Correlation coefficient: positive when large values of one variable are associated with large values of the other.
  • 9. Correlation coefficient: negative when large values of one variable are associated with small values of the other.
  • 11. Simple correlation coefficient (r): It is also called Pearson's correlation; it measures the nature and strength of the relationship between two quantitative variables. The simple correlation coefficient is obtained using the following formula: $r = \dfrac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}$, where n is the sample size, x is the independent variable and y is the dependent variable.
  • 12. Coefficient of Correlation: An alternative formula for computing the coefficient of correlation r is $r = \dfrac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$.
  • 13. Computation Table
      pH (x)   Optical density (y)   xy       x²       y²
      3        0.10                  0.3      9        0.01
      4        0.20                  0.8      16       0.04
      4.5      0.25                  1.125    20.25    0.0625
      5        0.32                  1.6      25       0.1024
      5.5      0.33                  1.815    30.25    0.1089
      6        0.35                  2.1      36       0.1225
      6.5      0.47                  3.055    42.25    0.2209
      7        0.49                  3.43     49       0.2401
      7.5      0.53                  3.975    56.25    0.2809
      Total    49      3.04          18.2     284      1.1882
  • 14. Coefficient of Correlation: $r = \dfrac{n\sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$. With the totals from the computation table, $r = \dfrac{9(18.2) - (49)(3.04)}{\sqrt{9(284) - (49)^2}\,\sqrt{9(1.1882) - (3.04)^2}} = 0.9891$, so r = 0.989 ≈ 0.99.
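The hand calculation can be checked with a short script. This is a minimal sketch using only the standard library; the function name `pearson_r` and the variable names are illustrative, and the data are the nine pH and optical density pairs from the computation table.

```python
import math

# pH (x) and optical density (y) from the computation table
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
od = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def pearson_r(x, y):
    """Pearson's r via the sums-based formula used in the slides."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(ph, od)
print(round(r, 4))  # 0.9891
```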
  • 15. Warning: The correlation coefficient (r) measures the strength of the relationship between two variables. Just because two variables are related does not imply that there is a cause-and-effect relationship between them.
  • 16. Spearman’s Correlation Coefficient: It is a non-parametric measure of correlation used in the case of ordinal or qualitative (ratio or relative) variables. This procedure makes use of the two sets of ranks that may be assigned to the sample values of x and y.
  • 17. Spearman’s Correlation Coefficient: The Spearman rank correlation coefficient can be computed in the following cases: 1. Both variables are quantitative. 2. Both variables are qualitative ordinal. 3. One variable is quantitative and the other is qualitative ordinal.
  • 18. Spearman’s Correlation Coefficient. Procedure: 1. Rank the values of X from 1 to n, where n is the number of pairs of values of X and Y in the sample. 2. Rank the values of Y from 1 to n. 3. Compute the value of $d_i$ for each pair of observations by subtracting the rank of $y_i$ from the rank of $x_i$. 4. Square each $d_i$ and compute $\sum d_i^2$, which is the sum of the squared values. 5. Apply the formula $r_s = 1 - \dfrac{6\sum d_i^2}{n(n^2 - 1)}$. The value of $r_s$ denotes the magnitude and nature of the association and is interpreted in the same way as the simple r.
  • 19. Spearman’s Correlation Coefficient. Example: In a study of the relationship between education level and health awareness, the following data were obtained. Find the relationship between them and comment.
      No.   Education level (x)   Health awareness (y)
      1     preparatory           25
      2     primary               10
      3     university            8
      4     secondary             10
      5     secondary             15
      6     illiterate            50
      7     university            60
  • 20. Spearman’s Correlation Coefficient. Solution:
      No.   x             y     Rank (x)   Rank (y)   d      d²
      1     preparatory   25    5          3          2      4
      2     primary       10    6          5.5        0.5    0.25
      3     university    8     1.5        7          -5.5   30.25
      4     secondary     10    3.5        5.5        -2     4
      5     secondary     15    3.5        4          -0.5   0.25
      6     illiterate    50    7          2          5      25
      7     university    60    1.5        1          0.5    0.25
      Total                                                  64
    Comment: There is an indirect (negative) weak correlation between education level and health awareness.
  • 21. Spearman’s Correlation Coefficient Would you justify your findings in the previous example?!
  • 22. Non-Parametric Correlations. Spearman’s Correlation Coefficient: First, it ranks the data, and then it applies Pearson’s correlation to these ranks: $r_s = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$, where $r_s$ is Spearman’s correlation coefficient, d is the difference between the ranks of a pair and n is the number of cases.
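The education and health awareness example can be reproduced with a short sketch. Two things here are assumptions rather than part of the slides: the numeric coding of the education levels (illiterate = 0 up to university = 4, chosen only to preserve their order) and the helper name `average_ranks`. Ranks are assigned from the largest value downwards and tied values share the average of their positions, which matches the ranks in the solution table. The sketch uses the $\sum d^2$ formula directly; when ties are present, this can differ slightly from Pearson's r computed on the ranks.

```python
# Ordinal coding of education level (an assumption made for illustration)
LEVELS = {"illiterate": 0, "primary": 1, "preparatory": 2, "secondary": 3, "university": 4}

education = ["preparatory", "primary", "university", "secondary",
             "secondary", "illiterate", "university"]
awareness = [25, 10, 8, 10, 15, 50, 60]


def average_ranks(values):
    """Rank from largest (rank 1) to smallest; ties get the average rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for v in values:
        first_position = ordered.index(v) + 1   # 1-based position of first tie
        tie_count = ordered.count(v)
        ranks.append(first_position + (tie_count - 1) / 2)
    return ranks


rank_x = average_ranks([LEVELS[e] for e in education])
rank_y = average_ranks(awareness)
d_squared = [(rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y)]

n = len(education)
r_s = 1 - 6 * sum(d_squared) / (n * (n ** 2 - 1))

print(rank_x)            # [5.0, 6.0, 1.5, 3.5, 3.5, 7.0, 1.5]
print(rank_y)            # [3.0, 5.5, 7.0, 5.5, 4.0, 2.0, 1.0]
print(sum(d_squared))    # 64.0
print(round(r_s, 2))     # -0.14
```

A value of about -0.14 is consistent with the slide's comment of a weak indirect correlation.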
  • 23. Regression analysis: Regression analysis is helpful in ascertaining the probable form of the relationship between variables. The ultimate objective when this method of analysis is employed is usually to predict or estimate the value of one variable corresponding to a given value of another variable.
  • 24. Simple linear regression
  • 25. Simple linear regression: In simple linear regression we are interested in two variables, x and y. The variable x is usually referred to as the independent variable, since frequently it is controlled by the investigator; that is, values of x may be selected by the investigator and, corresponding to each preselected value of x, one or more values of y are obtained. The other variable, y, accordingly, is called the dependent variable, and we speak of the regression of y on x.
  • 26. The regression equation: In simple linear regression the object of the researcher’s interest is the regression equation that describes the true relationship between the dependent variable y and the independent variable x.
  • 27. Scatter diagram: A first step that is usually useful in studying the relationship between two variables is to prepare a scatter diagram of the data. The points are plotted by assigning values of the independent variable x to the horizontal axis and values of the dependent variable y to the vertical axis. The pattern made by the points plotted on the scatter diagram usually suggests the basic nature and the strength of the relationship between two variables.
  • 28. Example: Relationship between pH and optical density
      pH     Optical density
      3      0.10
      4      0.20
      4.5    0.25
      5      0.32
      5.5    0.33
      6      0.35
      6.5    0.47
      7      0.49
      7.5    0.53
  • 29. Scatter Diagram (figure: relationship between pH and optical density; x-axis pH, y-axis optical density).
  • 30. Notes: The points in the figure seem to be scattered around an invisible straight line. The scatter diagram also shows that, in general, a high pH goes with a high optical density reading.
  • 31. These impressions suggest that the relationship between points in the two variables may be described by a straight line crossing the y-axis near the origin and making approximately a 45 degree angle with the x-axis. It looks as if it would be simple to draw, freehand, through the data points the line that describes the relationship between x and y.
  • 32. Thinking Challenge: It is highly unlikely, however, that the lines drawn by any two people would be the same. In other words, for every person drawing such a line by eye, or freehand, we would expect a slightly different line.
  • 33. Thinking Challenge (figure: the pH and optical density scatter diagram with a freehand line drawn through the points): For every person drawing such a line by eye, or freehand, we would expect a slightly different line.
  • 37. Thinking Challenge (figure: the pH and optical density scatter diagram with several candidate lines): Which line best describes the relationship between the variables? What is needed for obtaining the desired line?
  • 38. Answer: We need to employ a method known as the method of least squares for obtaining the desired line, and the resulting line is called the least-squares line. The reason for calling the method by this name will be explained in the discussion that follows.
  • 39. Equation for a straight line: Now, recall from algebra that the general equation for a straight line is given by y = a + bx, where a = the y-intercept and b = the slope.
  • 40. Linear Equations (figure: the line Y = a + bx on X-Y axes with the Y-intercept a marked): a is the point where the line crosses the vertical axis, and is referred to as the y-intercept.
  • 41. Linear Equations (figure: the line Y = a + bx on X-Y axes with the Y-intercept a marked and the slope shown as b = change in Y / change in X): b shows the amount by which y changes for each unit change in x, and is referred to as the slope of the line.
  • 42. Linear Equations, y = a + bx: To draw a line based on the equation, we need the numerical values of the constants a and b. Given these constants, we may substitute various values of x into the equation to obtain corresponding values of y. The resulting points may be plotted.
  • 43. Computation Table
      Xi     Yi     Xi²     Yi²     XiYi
      X1     Y1     X1²     Y1²     X1Y1
      X2     Y2     X2²     Y2²     X2Y2
      :      :      :       :       :
      Xn     Yn     Xn²     Yn²     XnYn
      ΣXi    ΣYi    ΣXi²    ΣYi²    ΣXiYi
  • 44. Computation Table
      pH (x)   Optical density (y)   x²       y²       xy
      3        0.10                  9        0.01     0.3
      4        0.20                  16       0.04     0.8
      4.5      0.25                  20.25    0.0625   1.125
      5        0.32                  25       0.1024   1.6
      5.5      0.33                  30.25    0.1089   1.815
      6        0.35                  36       0.1225   2.1
      6.5      0.47                  42.25    0.2209   3.055
      7        0.49                  49       0.2401   3.43
      7.5      0.53                  56.25    0.2809   3.975
      Total    Σx = 49   Σy = 3.04   Σx² = 284   Σy² = 1.1882   Σxy = 18.2
      Mean     $\bar{x}$ = 5.444     $\bar{y}$ = 0.3378
  • 45. Finding the b-value: $b = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2} = \dfrac{9(18.2) - (49)(3.04)}{9(284) - (49)^2} = \dfrac{14.84}{155} = 0.09574$, which the slides carry forward as 0.0958.
  • 46. Finding the y-intercept: $a = \bar{y} - b\bar{x}$, where $\bar{y}$ is the mean of the y values and $\bar{x}$ is the mean of the x values. Alternatively, $a = \dfrac{\sum y - b\sum x}{n}$.
  • 47. Finding the y-intercept: $\bar{y} = \dfrac{3.04}{9} = 0.3378$ and $\bar{x} = \dfrac{49}{9} = 5.444$, so $a = \bar{y} - b\bar{x} = 0.3378 - (0.0958)(5.444) = -0.1837$.
  • 48. The equation for the least-squares line is $\hat{y} = a + bx$, that is, $\hat{y} = -0.1837 + 0.0958x$. Note that we use the symbol $\hat{y}$ because this value is computed from the equation and is not an observed value of y.
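The slope and intercept can be recomputed from the raw data as a check. This is a minimal sketch with illustrative names (`least_squares` is not from the slides). Without intermediate rounding the result is b ≈ 0.0957 and a ≈ -0.1835, which agrees with the trendline y = 0.0957x - 0.1835 shown on the coefficient of determination slides; the slides' 0.0958 and -0.1837 differ only in the fourth decimal place because of hand rounding.

```python
# pH (x) and optical density (y) from the computation table
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
od = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def least_squares(x, y):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = (sum_y - b * sum_x) / n                                   # intercept
    return a, b


a, b = least_squares(ph, od)
print(f"b = {b:.4f}, a = {a:.4f}")  # b = 0.0957, a = -0.1835
```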
  • 49. Now, we can substitute various values of x into the equation $\hat{y} = 0.0958x - 0.1837$ to obtain corresponding values of y. The resulting points may be plotted.
  • 50. Using the Regression Equation. Predicting y for a given x: Choose a value for x (within the range of x values). Substitute the selected x in the regression equation. Determine the corresponding value of y.
  • 51. The regression equation: $\hat{y} = 0.0958x - 0.1837$. Substitute x = 6.8: $\hat{y} = 0.0958 \times 6.8 - 0.1837 = 0.4677$. According to the equation, a pH of 6.8 would have an optical density of 0.4677.
  • 52. Interpolation: Using the regression equation to predict y values for x values that fall between the points in the scatter diagram.
  • 53. Extrapolation: Prediction beyond the range of observations.
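The distinction between the two kinds of prediction can be made explicit in code. The sketch below uses the slides' rounded coefficients (a = -0.1837, b = 0.0958) and the observed pH range of 3 to 7.5; the function name `predict` and the choice to return a label alongside the prediction are illustrative assumptions.

```python
A, B = -0.1837, 0.0958        # intercept and slope as rounded in the slides
PH_MIN, PH_MAX = 3.0, 7.5     # range of the observed pH values


def predict(x_new, a=A, b=B, x_min=PH_MIN, x_max=PH_MAX):
    """Return (y_hat, kind), where kind flags predictions outside the data range."""
    kind = "interpolation" if x_min <= x_new <= x_max else "extrapolation"
    return a + b * x_new, kind


for ph_value in (6.8, 9.0):
    y_hat, kind = predict(ph_value)
    print(f"pH {ph_value}: predicted optical density {y_hat:.4f} ({kind})")
# pH 6.8: predicted optical density 0.4677 (interpolation)
# pH 9.0: predicted optical density 0.6785 (extrapolation)
```

The second prediction is computed just as easily as the first, but it lies outside the observed range, so nothing in the data supports the assumption that the straight-line relationship still holds there.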
  • 54. The least-squares line: Since any two such coordinates determine a straight line, we may select any two values in the range of x, compute the two corresponding y values, locate them on a graph, and connect them with a straight line to obtain the line corresponding to the equation.
  • 55. The least-squares line (figure: observed points $y_i$ on X-Y axes with the vertical deviation of each point from the fitted line marked): The line that we have drawn is best in this sense: the sum of the squared vertical deviations of the observed data points ($y_i$) from the least-squares line is smaller than the sum of the squared vertical deviations of the observed data points from any other line.
  • 56. The least-squares line: In other words, if we square the vertical distance from each observed point ($y_i$) to the least-squares line and add these squared values for all points, the resulting total will be smaller than the similarly computed total for any other line that can be drawn through the points. For this reason the line we have drawn is called the least-squares line.
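The "smaller than for any other line" property can be illustrated numerically. The sketch below fits the line to the pH data and compares its sum of squared vertical deviations with that of a few alternative lines; the alternatives, and the helper names `fit_line` and `sum_squared_deviations`, are arbitrary choices made for illustration. A handful of comparisons does not prove the property, it only shows it at work.

```python
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
od = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def fit_line(x, y):
    """Least-squares intercept a and slope b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def sum_squared_deviations(a, b, x, y):
    """Sum of squared vertical distances from the points to y = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


a, b = fit_line(ph, od)
best = sum_squared_deviations(a, b, ph, od)

# Alternative lines chosen arbitrarily for comparison
alternatives = {
    "intercept raised by 0.01": (a + 0.01, b),
    "slope raised by 0.005": (a, b + 0.005),
    "slides' rounded line": (-0.1837, 0.0958),
}

print(f"least-squares line: {best:.6f}")
for label, (a_alt, b_alt) in alternatives.items():
    total = sum_squared_deviations(a_alt, b_alt, ph, od)
    print(f"{label}: {total:.6f}")
```

Every alternative gives a larger total than the fitted line, although for the slides' rounded line the difference is tiny.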
  • 57. The coefficient of determination $r^2$
  • 58. The coefficient of determination $r^2$: One way to evaluate the strength of the regression equation is to compare the scatter of the points about the regression line with the scatter about $\bar{y}$, the mean of the values of y. (Figure: scatter plot of optical density against pH with the fitted line y = 0.0957x - 0.1835 and a horizontal line at $\bar{y}$ = 0.3378.)
  • 59. The coefficient of determination $r^2$: If we draw through the points a line that intersects the y-axis at $\bar{y}$ and is parallel to the x-axis, we may obtain a visual impression of the relative magnitudes of the scatter of the points about this line and about the regression line. (Figure: scatter plot of optical density against pH with the fitted line y = 0.0957x - 0.1835 and a horizontal line at $\bar{y}$ = 0.3378.)
  • 60. Interpretation of $r^2$: Thus, the coefficient of determination measures the closeness of fit of the regression equation to the observed values of y.
  • 61. Interpretation of $r^2$: If $r^2$ = 0.978, approximately 98 percent of the variation in optical density (y) is explained by the linear relationship with x, the change in pH. The remaining variation, about 2 percent, is explained by other causes.
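The comparison of the two kinds of scatter can be written as $r^2 = 1 - \text{SSE}/\text{SST}$, where SSE is the sum of squared deviations about the regression line and SST is the sum of squared deviations about $\bar{y}$. The slides do not state this formula explicitly, so the sketch below is an illustration under that standard definition, with hypothetical helper names; for simple linear regression the result equals the square of Pearson's r.

```python
ph = [3, 4, 4.5, 5, 5.5, 6, 6.5, 7, 7.5]
od = [0.1, 0.2, 0.25, 0.32, 0.33, 0.35, 0.47, 0.49, 0.53]


def fit_line(x, y):
    """Least-squares intercept a and slope b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def r_squared(x, y):
    """Coefficient of determination as 1 - SSE/SST."""
    a, b = fit_line(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # scatter about the line
    sst = sum((yi - mean_y) ** 2 for yi in y)                    # scatter about the mean
    return 1 - sse / sst


r2 = r_squared(ph, od)
print(round(r2, 3))  # 0.978
```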
  • 62. Coefficient of Determination Examples (figure: four scatter plots of Y against X, labelled $r^2$ = 1, $r^2$ = 1, $r^2$ = 0.8 and $r^2$ = 0).
  • 63. Limitations of Correlation and Regression. Linearity: they cannot describe non-linear relationships, e.g., the relation between anxiety and performance. Truncation of range: the strength of the relationship is underestimated if you cannot see the full range of x values. No proof of causation: the third-variable problem (a third variable could be causing the change in both variables) and directionality (we cannot be sure which way causality runs).
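The linearity limitation can be demonstrated with made-up data. In the sketch below y is completely determined by x (y = x²), yet Pearson's r is zero because the relationship is not linear. The data set and the function name `pearson_r` are purely illustrative and are not taken from the slides.

```python
import math


def pearson_r(x, y):
    """Pearson's r via the sums-based formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Illustrative data: a perfect but non-linear (U-shaped) relationship
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]

r = pearson_r(x, y)
print(r)  # 0.0
```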