Correlation analysis is a process for establishing the relationship
between two variables.
The measure is best used with variables that demonstrate a linear
relationship with each other.
The correlation coefficient (r), also called Pearson's product-moment
correlation coefficient in honor of its developer, Karl Pearson,
is a numerical measure of the linear relationship between two
variables, usually labeled x and y.
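The slides do not show the formula itself. As a sketch, one standard computational form of Pearson's r, written with the sums ∑x, ∑y, ∑xy, ∑x² and ∑y² over N pairs of values, is given here on the assumption that it is the form the worked table is built for:

```latex
r = \frac{N\sum xy - \sum x \sum y}
         {\sqrt{\left[N\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[N\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```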
A scattergram is composed of points plotted in the rectangular
coordinate system, where x and y are respectively the values of the
independent and dependent variables. It is useful when there are a
large number of data points. It provides the following information
about the relationship between two variables:
• Strength
• Shape (linear, curved, etc.)
• Direction
• Presence of outliers
Interpretation of "r", or correlation coefficients:
Between ±0.80 to ±0.99   High correlation
Between ±0.60 to ±0.79   Moderately high correlation
Between ±0.40 to ±0.59   Moderate correlation
Between ±0.20 to ±0.39   Low correlation
Between ±0.01 to ±0.19   Negligible correlation
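The interpretation table can be sketched as a small lookup function. The function name `describe_correlation` is hypothetical. Rounding |r| to two decimals, and the labels used for r = 0 and |r| = 1 (values the table does not list), are assumptions made for illustration.

```python
def describe_correlation(r: float) -> str:
    """Return the word descriptor from the interpretation table for r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    # The table is stated to two decimals, so round |r| first (an assumption
    # that keeps values from falling into the gaps between two bands).
    size = round(abs(r), 2)

    if size == 1.0:
        return "Perfect correlation"   # not listed in the table
    if size >= 0.80:
        return "High correlation"
    if size >= 0.60:
        return "Moderately high correlation"
    if size >= 0.40:
        return "Moderate correlation"
    if size >= 0.20:
        return "Low correlation"
    if size >= 0.01:
        return "Negligible correlation"
    return "No correlation"            # assumed label for r = 0


print(describe_correlation(0.85))
print(describe_correlation(-0.45))
```

The sign of r is ignored here because the table gives the same descriptor for positive and negative values; the direction would be reported separately.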
The correlation coefficient, r, has a
specific range of values: -1 ≤ r ≤ +1
Note that:
• r never lies outside this range, so r = 2 is a nonsense answer
whose only explanation can be "I made an arithmetic error".
• r = 1 is perfect positive correlation: all the data points lie
exactly on a straight line with positive gradient.
• r = -1 likewise is perfect negative correlation: all the data
points lie exactly on a straight line with negative gradient.
• r is often expressed as a percentage. In this case, the range is
from -100% to 100% (remembering that 100% is the same as 1).
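The two extreme cases in the notes above can be checked with a minimal sketch. The helper `pearson_r` is hypothetical and uses the standard computational formula for Pearson's r, which the slides do not spell out. The data points are made up so that they lie exactly on a straight line.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from the raw sums (standard computational formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Illustrative data only: every point lies exactly on a straight line.
xs = [1, 2, 3, 4, 5]
rising = [2 * x + 1 for x in xs]     # positive gradient
falling = [10 - 3 * x for x in xs]   # negative gradient

r_up = pearson_r(xs, rising)
r_down = pearson_r(xs, falling)

# Print both coefficients, then r_up formatted as a percentage.
print(r_up, r_down, f"{r_up:.0%}")
```

The sketch does not handle the case where all x values or all y values are identical; the denominator is then zero and r is undefined.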
Steps you need to follow:
1. Draw the scatterplot.
2. Draw the trend line that describes the direction of the data.
3. Evaluate how closely the cloud of data points clusters around
the line.
4. Determine which r value and which word descriptor best suit the
data cloud.
The following diagram has a number line of r values to help you
assign the numbers and the word descriptors.
Student   Scores in     Scores in        xy      x²      y²
          English (x)   Statistics (y)
1         36            21
2         42            18
3         37            15
4         31            11
5         25            15
6         28            9
7         33            10
8         28            20
9         42            16
10        39            11
11        38            21
12        40            14
N = 12    ∑x =          ∑y =             ∑xy =   ∑x² =   ∑y² =
Student   Scores in     Scores in        xy      x²      y²
          English (x)   Statistics (y)
1         36            21               756     1296    441
2         42            18               756     1764    324
3         37            15               555     1369    225
4         31            11               341     961     121
5         25            15               375     625     225
6         28            9                252     784     81
7         33            10               330     1089    100
8         28            20               560     784     400
9         42            16               672     1764    256
10        39            11               429     1521    121
11        38            21               798     1444    441
12        40            14               560     1600    196
SUM:      419           181              6384    15001   2931
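As a sketch of how the column sums lead to r, the snippet below recomputes the totals from the twelve score pairs and plugs them into the standard computational formula for Pearson's r. The slides stop at the sums, so the formula and the variable names are assumptions rather than the author's stated method.

```python
import math

# Scores of the 12 students, copied from the table.
english = [36, 42, 37, 31, 25, 28, 33, 28, 42, 39, 38, 40]            # x
statistics_scores = [21, 18, 15, 11, 15, 9, 10, 20, 16, 11, 21, 14]   # y

n = len(english)
sum_x = sum(english)
sum_y = sum(statistics_scores)
sum_xy = sum(x * y for x, y in zip(english, statistics_scores))
sum_x2 = sum(x * x for x in english)
sum_y2 = sum(y * y for y in statistics_scores)

# Assumed formula:
# r = (N*Sxy - Sx*Sy) / sqrt((N*Sx2 - Sx^2) * (N*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 2))
```

With these scores the totals match the SUM row of the table, and r comes out at about 0.23, which falls in the ±0.20 to ±0.39 band ("Low correlation").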
correlation-analysis.pptx