The document discusses correlation, regression, and hypothesis testing involving two variables. It defines correlation and the correlation coefficient r, which measures the strength of a linear relationship between two variables. Regression analyzes the relationship between variables to determine if it is positive/negative and linear/nonlinear. Hypothesis tests using r evaluate whether a linear correlation exists between two variables in a population. Confidence intervals and predictions can be made from significant relationships.
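The correlation coefficient r and its hypothesis test can be sketched in pure Python. This is a minimal illustration, not code from the document; the paired data values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient r for paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_test_statistic(r, n):
    """t statistic for H0: rho = 0, compared against t with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical paired sample
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)              # about 0.775, a fairly strong positive linear relationship
t = r_test_statistic(r, len(x))  # compare to the t critical value with n - 2 = 3 df
```

With a two-tailed critical value of 3.182 (df = 3, alpha = 0.05), a t of about 2.12 would not be significant, so no prediction from the regression line would be justified for this toy sample.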
The document provides an overview of the chi-squared test and examples of its applications. It introduces the chi-squared test as a method to assess how well observed data fits expected theoretical results. Several examples are given demonstrating chi-squared tests of goodness of fit for binomial, Poisson, normal and contingency table distributions. Practice questions are also provided involving a range of chi-squared test applications.
This document discusses statistical methods for comparing means, including t-tests and analysis of variance (ANOVA). It explains how t-tests can be used to compare two means or paired samples, and how ANOVA can compare two or more means. Key assumptions and procedures are outlined for one-sample t-tests, paired t-tests, independent t-tests with equal and unequal variances, and one-way between-subjects ANOVAs.
Chapter 10: Correlation and Regression
10.1: Correlation
The document provides an overview of the chi-square test, including its formula, steps to calculate it, degrees of freedom, and uses. The chi-square test is a statistical test used to compare observed data to expected data. Its formula adds up the squared differences between observed and expected frequencies, divided by the expected frequencies. Degrees of freedom depend on whether the data is in a row/column or contingency table. The chi-square test can test goodness of fit, independence of attributes, and homogeneity.
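The formula described above, summing squared differences between observed and expected frequencies divided by the expected frequencies, is short enough to write directly. A minimal sketch with hypothetical die-roll counts:

```python
def chi_square_statistic(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Goodness of fit for a fair die: 60 hypothetical rolls, 10 expected per face
observed = [8, 9, 12, 11, 6, 14]
expected = [10] * 6
chi2 = chi_square_statistic(observed, expected)  # 4.2, with df = 6 - 1 = 5
```

Since 4.2 is below the critical value of 11.070 (df = 5, alpha = 0.05), we would fail to reject the hypothesis that the die is fair.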
This document provides information about statistical tests that can be used to make inferences when comparing two samples or populations. Specifically, it discusses:
- Tests for comparing two proportions, means, variances or standard deviations from independent and dependent samples using z-tests, t-tests and F-tests.
- The assumptions and procedures for each test, including how to determine critical values and calculate test statistics.
- Examples of how to perform hypothesis tests and construct confidence intervals for various statistical comparisons between two samples or populations using a TI calculator.
The chi-square test is used to determine if there is a relationship between two categorical variables in two or more independent groups. It can be used when data is arranged in a contingency table with observed and expected frequencies. A sample problem demonstrates how to calculate chi-square by finding the difference between observed and expected counts, squaring these differences, dividing by the expected counts, and summing across all cells. Degrees of freedom and critical values from tables determine whether to reject or fail to reject the null hypothesis of independence. Larger tables can be partitioned into subtables to identify where differences lie. Guidelines are provided for when chi-square or Fisher's exact test should be used based on sample size and expected cell counts.
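The contingency-table procedure described above (expected count = row total × column total / grand total, then sum (O − E)²/E over all cells) can be sketched in a few lines; the 2×2 counts here are hypothetical.

```python
def chi_square_independence(table):
    """Chi-square test of independence for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # row total x column total / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2x2 table of observed counts
chi2, df = chi_square_independence([[30, 20], [10, 40]])
```

Here chi-square is about 16.67 with df = 1; since that exceeds the critical value 3.841 (alpha = 0.05), the null hypothesis of independence would be rejected.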
Class 24: Chi-Square Test of Independence and Post Hoc Tests — Betynatha Kb
This document provides an overview of the chi-square test of independence through 18 slides. It defines independence, demonstrates it, discusses expected frequencies, and outlines the 5 steps for conducting a chi-square test of independence: 1) checking assumptions, 2) stating hypotheses and significance level, 3) identifying the sampling distribution and test statistic, 4) computing the test statistic, and 5) making a decision and interpreting results. It also covers examining standardized residuals to identify which cells are contributing most to a significant result.
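The post hoc step in slide decks like this one is usually the simple standardized residual, (O − E)/√E, computed per cell; cells with absolute residuals above roughly 2 contribute most to a significant result. A minimal sketch with hypothetical counts (note some texts use the adjusted residual, which also divides by row and column leverage factors):

```python
import math

def standardized_residuals(table):
    """Simple standardized residual (O - E) / sqrt(E) for each cell."""
    row_t = [sum(r) for r in table]
    col_t = [sum(c) for c in zip(*table)]
    n = sum(row_t)
    return [[(table[i][j] - row_t[i] * col_t[j] / n)
             / math.sqrt(row_t[i] * col_t[j] / n)
             for j in range(len(table[0]))]
            for i in range(len(table))]

# Hypothetical 2x2 table; residuals near +/-2.24 flag the contributing cells
res = standardized_residuals([[30, 20], [10, 40]])
```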
The document provides instructions for conducting an independent samples t-test in SPSS. It explains how to specify the grouping and test variables, define the groups being compared, and set options. It also demonstrates running a t-test to compare mile times between athletes and non-athletes, checking assumptions, and interpreting the output, including Levene's test for equal variances and the t-test results.
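Outside SPSS, the equal-variances branch of that output can be reproduced by hand. The sketch below assumes equal variances and uses the pooled variance estimate (the case Levene's test is meant to justify); the mile times, in minutes, are hypothetical.

```python
import math

def independent_t(sample1, sample2):
    """Pooled-variance independent-samples t statistic (equal variances assumed)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    v1 = sum((x - m1) ** 2 for x in sample1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample2) / (n2 - 1)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical mile times: athletes vs. non-athletes
t, df = independent_t([6, 7, 8], [9, 10, 11])
```

If Levene's test rejected equal variances, SPSS's "equal variances not assumed" (Welch) row would be the one to read instead.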
Small sample theory deals with statistical inference when sample sizes are small (n ≤ 30). It involves t and F distributions which are defined in terms of degrees of freedom. The t-distribution was developed by William Gosset and is used when sample sizes are small. It has a bell shape but is more spread out than the normal distribution. The F-distribution is used to test if two variances are equal and is defined as the ratio of two chi-square variables. Both distributions depend on degrees of freedom.
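The claim that the t-distribution is bell-shaped but more spread out than the normal can be checked numerically from the density formulas (this comparison is my illustration, not part of the summarized document):

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Student's t density with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

# "More spread out": lower peak at 0, heavier tails at x = 3 (df = 5)
peak_lower   = t_pdf(0, 5) < normal_pdf(0)
tails_heavier = t_pdf(3, 5) > normal_pdf(3)
```

As df grows, the t density converges to the normal, which is why the distinction matters mainly for small samples.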
This document discusses Chi Square and related procedures for analyzing categorical data. It explains that Chi Square can be used for goodness of fit tests to check if a sample follows a particular distribution, and for tests of association to check if two categorical variables are related. It provides examples of how to conduct and interpret Chi Square goodness of fit and association tests using SPSS. Other related procedures discussed include Fisher's Exact Test for small sample sizes and McNemar's Test for analyzing changes in paired categorical data.
The chi-square test is used to compare observed data to expected data. It determines if differences between the observed and expected numbers are due to chance or something more significant. The chi-square test has several key steps: stating the null and alternative hypotheses, choosing a significance level, finding the critical value, calculating the test statistic by summing the squared differences between observed and expected values divided by the expected value, and making a conclusion by comparing the test statistic to the critical value. The chi-square test has assumptions of adequate sample sizes and independence of data. It is useful for testing goodness of fit, independence of attributes, and homogeneity.
The document discusses the Chi-square (χ2) test, which is a non-parametric test used to test hypotheses about distributions of frequencies across categories of data. It can be used to test for comparing variance and to test for independence between two variables. The summary provides steps for applying the Chi-square test, including calculating expected frequencies, observed vs expected values, the Chi-square statistic, and comparing it to critical values. An example application to test the effectiveness of vaccination in preventing smallpox is shown.
The document discusses Chi-Square tests, which are used when assumptions of normality are violated. It provides requirements for Chi-Square tests, including that variables must be independent and samples sufficiently large. The key steps are outlined: determine appropriate test, establish significance level, formulate hypotheses, calculate test statistic using frequencies, determine degrees of freedom, and compare to critical value. An example compares party membership to opinions on gun control to demonstrate a Chi-Square test of independence.
The chi-square test is used to determine if an observed distribution of data differs from the theoretical distribution. It compares observed frequencies to expected frequencies based on a hypothesis. The chi-square value is calculated by summing the squared differences between observed and expected frequencies divided by the expected frequency. The chi-square value is then compared to a critical value from the chi-square distribution table based on the degrees of freedom. If the chi-square value is greater than the critical value, the null hypothesis that the distributions are the same can be rejected.
Binary logistic regression analysis is used to predict a dichotomous dependent variable from continuous and/or categorical independent variables. SPSS is used to conduct binary logistic regression by entering the dependent variable as 1/0 and independent variables as predictors, and the output provides coefficients, odds ratios, classification tables, and goodness of fit tests. Factors like multicollinearity between predictors and sample size need to be considered to develop the best fitting and most predictive logistic regression model.
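Away from SPSS, the maximum-likelihood fit behind that output can be sketched with a bare-bones gradient ascent on the log-likelihood. Everything here is a hypothetical single-predictor example (hours studied vs. pass/fail), not the document's procedure; real work would use a statistics package.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit intercept and one coefficient by stochastic gradient ascent
    on the log-likelihood (a toy maximum-likelihood estimator)."""
    w0, w1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w0 + w1 * x)
            w0 += lr * (y - p)          # gradient of log-likelihood wrt intercept
            w1 += lr * (y - p) * x      # gradient wrt slope
    return w0, w1

# Hypothetical data: hours studied (x) vs. pass = 1 / fail = 0 (y)
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [0, 0, 1, 0, 1, 1]
w0, w1 = fit_logistic(xs, ys)
odds_ratio = math.exp(w1)  # multiplicative change in odds per extra hour
```

The exponentiated coefficient is the odds ratio SPSS reports as Exp(B); predicted probabilities above 0.5 would populate one column of the classification table.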
This document covers biostatistics and the various tests performed from the laboratory to the field, such as the F-test and the chi-square test. On the basis of these tests, probability and variability can be assessed. It provides complete information about the chi-square test.
Generalized Linear Models for Between-Subjects Designs — smackinnon
This document provides an overview of generalized linear models (GLiM) for analyzing between-subjects designs. It discusses key assumptions of between-subjects ANOVA such as normality and homogeneity of variance. It then explains how GLiM in SPSS can be used as an alternative approach that describes the distribution of the outcome variable, specifies a link function, and uses maximum likelihood estimation rather than ordinary least squares. The document walks through an example comparing models with different distributions and link functions, and demonstrates interpreting output including parameter estimates, tests of effects, and estimated marginal means.
This document discusses Chi-Square tests. It begins with an overview of Chi-Square, noting that it is a non-parametric test where the test statistic follows a Chi-Square distribution. It then discusses the characteristics of Chi-Square tests, including that they are distribution free and easy to calculate. Several common uses of Chi-Square tests are provided, such as testing goodness of fit and independence. The document then separates into two phases - the first discussing theory and the second providing examples. Phase one delves further into the equation, level of significance, and degrees of freedom. Phase two demonstrates steps for performing a Chi-Square test using observed and expected values.
Statistical Inference Part II: Types of Sampling Distribution — Dexlab Analytics
This is an in-depth analysis of how different types of sampling distributions work, focusing on their specific functions and interrelations, as part of a discussion of the theory of sampling.
The Chi-Square test of independence is used to determine if two categorical variables are independent or dependent. It examines if understanding one variable depends on the other. The test calculates an observed versus expected frequency for each cell. If the Chi-Square value exceeds the critical value, the null hypothesis of independence is rejected, indicating a dependent relationship. The document provides an example comparing education level and news source, finding the variables are dependent based on a significant Chi-Square value.
1) There are two main types of data in NLP: continuous data (like temperatures) analyzed with regression and t-tests, and categorical data (like part-of-speech tags) analyzed with tests like Wilcoxon signed-rank, Fisher's exact, Pearson's chi-square, and McNemar's.
2) Common significance tests for NLP include one sample t-test to compare a sample mean to a known population mean, paired two sample t-test to compare two samples tested twice, and Wilcoxon signed-rank test as a non-parametric alternative to the paired t-test.
3) Other tests mentioned are Fisher's exact for comparing binary classifications and Pearson's chi-square.
The chi-square test is used to determine if there is a significant relationship between two nominal variables by comparing observed and expected frequencies. It can be used to examine relationships between categorical variables such as education and income level or brand preference and gender. The null hypothesis states there is no relationship, while the alternative or research hypothesis states there is a relationship. Expected frequencies are calculated and chi-square is obtained using a formula. This value is then compared to a critical value based on the level of significance and degrees of freedom to determine whether to reject or fail to reject the null hypothesis.
The General Linear Model is an ANOVA procedure in which the calculations are performed using the least-squares regression approach to describe the statistical relationship between one or more predictors and a continuous response variable. Predictors can be factors and covariates. More information on the General Linear Model is available at: http://www.transtutors.com/homework-help/statistics/general-linear-model.aspx
Overview of Advance Marketing Research — Enamul Islam
This document provides information on frequency distributions, cross-tabulation, hypothesis testing, and analysis of variance. It defines key terms like frequency distribution, measures of location and variability, cross-tabulation, chi-square test, and one-way ANOVA. It also outlines the general procedures for hypothesis testing and conducting one-way ANOVA, including decomposing total variation, measuring effects, and interpreting results.
The document discusses the one-sample t-test and how it addresses limitations of the z-test. The t-test can be used to compare a sample mean to a population mean or hypothetical mean when the population standard deviation is unknown. It uses the sample standard deviation and degrees of freedom to calculate the t-statistic, which follows a t-distribution. The t-test procedure involves stating hypotheses, determining critical values, calculating statistics, and making conclusions, similar to the z-test. Effect sizes can also be measured.
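The one-sample t statistic and its effect size can be sketched directly from the definitions above; the sample values and the hypothesized mean of 5.0 are hypothetical.

```python
import math

def one_sample_t(sample, mu0):
    """One-sample t-test: t = (xbar - mu0) / (s / sqrt(n)), df = n - 1.
    Also returns Cohen's d as an effect size."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # sample sd
    t = (xbar - mu0) / (s / math.sqrt(n))
    d = (xbar - mu0) / s
    return t, n - 1, d

# Hypothetical measurements tested against mu0 = 5.0
t, df, d = one_sample_t([5.1, 4.9, 5.6, 5.2, 5.0, 5.4], 5.0)
```

With t about 1.88 and a two-tailed critical value of 2.571 (df = 5, alpha = 0.05), this toy sample would not reject the null, even though Cohen's d (about 0.77) suggests a moderately large effect in a small sample.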
The Chi Square Test is used to determine if observed data fits a hypothesized distribution. It involves calculating the Chi Square statistic by comparing observed and expected values and interpreting the result using a Chi Square table. The document provides an example using Drosophila genetics to test if two traits are independently assorting. The null hypothesis is that the traits are independently assorting. Expected values are calculated based on this. The Chi Square value is found to be not statistically significant, so the null hypothesis that the traits are independently assorting is not rejected.
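A goodness-of-fit calculation of the kind described (independent assortment predicting a 9:3:3:1 phenotype ratio in a dihybrid cross) looks like this; the counts are illustrative, not the document's Drosophila data.

```python
# Dihybrid cross: 9:3:3:1 expected ratio under independent assortment
observed = [315, 108, 101, 32]      # illustrative phenotype counts
n = sum(observed)
expected = [n * 9 / 16, n * 3 / 16, n * 3 / 16, n * 1 / 16]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1              # 3; critical value at alpha = 0.05 is 7.815
```

Here chi-square is about 0.47, far below 7.815, so the null hypothesis of independent assortment would not be rejected, matching the pattern of the document's example.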
Quantitative Data Analysis: How to Do a T-Test on MS-Excel and SPSS — ICFAI Business School
This document provides instructions for performing a t-test in Microsoft Excel and SPSS. It explains that a t-test is used to test the null hypothesis that the means of two populations are equal. It then outlines the 7 step process to run a t-test in Excel, including selecting the data ranges, hypothesized mean difference, and output range. For SPSS, it lists the 4 step process of selecting the grouping and test variables, defining the groups, and running the independent samples t-test.
This document discusses testing differences between two dependent samples using matched pairs. It provides examples of how to:
1) Calculate the differences between matched pairs and find the mean and standard deviation of the differences.
2) Use a t-test to determine if the mean difference is statistically significant and construct a 90% confidence interval for the true mean difference between two dependent samples.
3) Apply these methods to an example comparing cholesterol levels before and after a mineral supplement, testing the claim that the supplement changes cholesterol levels.
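The three steps above can be sketched in pure Python. The before/after values are hypothetical, and the critical value 2.015 (two-tailed, df = 5, 90% confidence) is passed in rather than computed, since the t table lookup is outside the stdlib.

```python
import math

def paired_t_and_ci(before, after, t_crit):
    """Matched pairs: mean difference, t statistic, and confidence interval.
    t_crit is the two-tailed t critical value for the chosen confidence level."""
    d = [b - a for b, a in zip(before, after)]      # step 1: pair differences
    n = len(d)
    dbar = sum(d) / n
    sd = math.sqrt(sum((x - dbar) ** 2 for x in d) / (n - 1))
    se = sd / math.sqrt(n)
    t = dbar / se                                   # step 2: t statistic
    ci = (dbar - t_crit * se, dbar + t_crit * se)   # confidence interval
    return dbar, t, ci

# Hypothetical cholesterol levels before and after a supplement
before = [210, 235, 208, 190, 172, 244]
after  = [190, 170, 210, 188, 173, 228]
dbar, t, ci = paired_t_and_ci(before, after, 2.015)  # 90% CI, df = 5
```

Because the interval here straddles 0, these toy data would not support the claim that the supplement changes cholesterol levels.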
This document discusses hypothesis testing and constructing confidence intervals for comparing two means from independent populations. It provides:
1. Requirements for using a z-test or t-test to compare two means, including that the samples must be independent and randomly selected, and meet certain size or normality criteria.
2. Formulas and steps for conducting a z-test when population variances are known, and a t-test when they are unknown, to test claims about differences in population means.
3. Instructions for using a calculator to perform two-sample z-tests, t-tests, and to construct confidence intervals for the difference between two means.
4. An example comparing hotel room rates using
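A confidence interval for the difference of two independent means (item 3 above) can be sketched without a calculator. This version does not pool the variances and uses the conservative textbook choice df = min(n1, n2) − 1; the room rates and the critical value 2.776 (df = 4, 95%) are hypothetical, not the document's example.

```python
import math

def two_sample_t_ci(x1, x2, t_crit):
    """CI for mu1 - mu2 with unknown, unequal population variances (no pooling).
    t_crit is the two-tailed critical value for df = min(n1, n2) - 1."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    v1 = sum((x - m1) ** 2 for x in x1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in x2) / (n2 - 1)
    se = math.sqrt(v1 / n1 + v2 / n2)
    return (m1 - m2) - t_crit * se, (m1 - m2) + t_crit * se

# Hypothetical nightly room rates at two hotels
ci = two_sample_t_ci([120, 135, 150, 140, 125], [100, 110, 105, 120, 95], 2.776)
```

Since the interval here lies entirely above 0, these toy samples would support a difference between the two mean rates.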
The document discusses various statistical methods for analyzing relationships between variables, including chi-square tests, measures of association like lambda and gamma, and rank correlation. Chi-square tests can be used to test for independence and goodness of fit between nominal or ordinal variables. Measures like lambda and gamma range from 0 to 1 and indicate the strength of association while controlling for errors. Rank correlation assesses relationships between variables when only ordinal data is available by analyzing the agreement between ranks. Cross tabulation allows investigating patterns of bivariate association through distribution analysis.
The document discusses chi-square test and its properties. It defines chi-square as a non-parametric statistical test used for discrete data to test for independence and goodness of fit between observed and expected frequencies. The chi-square test has some key assumptions including independent random samples, nominal or ordinal level data, and no expected cell counts below 5. It is calculated by subtracting expected from observed frequencies, squaring the differences, and dividing by expected counts. The chi-square test can identify if there is a significant association between variables but does not measure the strength of the association.
Marketing research & applications on SPSS (ANSHU TIWARI)
The document discusses various statistical techniques used in marketing research to analyze survey data, including frequency distributions, measures of central tendency and variability, hypothesis testing, and cross-tabulation. Frequency distributions are used to determine the mean, mode, median and answer questions about single variables. Hypothesis testing involves forming hypotheses, selecting a test, determining significance levels, collecting data, and making statistical decisions. Cross-tabulation examines relationships between two or more variables using techniques like chi-square tests. Both parametric and non-parametric tests are used depending on variable scales.
An independent t-test is used to compare the means of two independent groups on a continuous dependent variable. It tests if there is a statistically significant difference between the population means of the two groups. The test assumes the groups are independent, the dependent variable is normally distributed for each group, and the groups have equal variances. To perform the test, the researcher states the hypotheses, sets an alpha level, calculates the t-statistic and degrees of freedom, and determines whether to reject or fail to reject the null hypothesis by comparing the t-statistic to the critical value.
- A sample is a small group selected from a population to represent that population. Sampling provides benefits like being less time-consuming, less expensive, and allowing results to be repeated.
- There are two main types of samples: probability and non-probability. Probability samples include simple random, systematic, stratified, and cluster samples. Sample size is determined based on factors like the type of study, expected results, costs, and available resources.
- Inferential statistics allow generalization from a sample to a population through hypothesis testing and significance tests. Tests include t-tests, F-tests, chi-squared tests, and correlation/regression to analyze relationships between variables. Significant results suggest differences are likely not due to chance.
The document discusses various statistical concepts including:
- The functions of statistics such as expressing facts numerically and establishing relationships between facts.
- The importance of statistics to fields like administration, economics, research, and education.
- Common measures of central tendency including the mean, median, and mode.
- The difference between theoretical and empirical probabilities.
- Types of correlation like positive, negative, simple, and multiple correlation.
- Key statistical tests including t-tests, chi-square, F-tests, and measures of accuracy, precision, and confidence intervals.
Basic Statistical Descriptions of Data.pptx (Anusuya123)
This document provides an overview of 7 basic statistical concepts for data science: 1) descriptive statistics such as mean, mode, median, and standard deviation, 2) measures of variability like variance and range, 3) correlation, 4) probability distributions, 5) regression, 6) normal distribution, and 7) types of bias. Descriptive statistics are used to summarize data, variability measures dispersion, correlation measures relationships between variables, and probability distributions specify likelihoods of events. Regression models relationships, normal distribution is often assumed, and biases can influence analyses.
This document discusses correlation coefficient and path coefficient analysis. It defines correlation as a statistical method to analyze the relationship between two or more variables. Correlation determines the degree of relationship but not causation. The document then discusses different types of correlation including positive, negative, linear, non-linear, simple, multiple and partial correlation. It also discusses methods to measure correlation including scatter diagrams, Karl Pearson's coefficient, Spearman's coefficient and concurrent deviation method. Finally, it explains path analysis which can be used to partition correlations into direct and indirect effects when studying causal relationships between variables.
This document provides an overview of key concepts in descriptive statistics including measures of central tendency (mode, median, mean), measures of dispersion (range, variance, standard deviation), the normal distribution, z-scores, hypothesis testing, and the t-distribution. It defines each concept and provides examples of calculating and interpreting common statistics.
Basic of Statistical Inference Part-III: The Theory of Estimation from Dexlab... (Dexlab Analytics)
In this 3rd segment of the basic of statistical inference series, the estimation theory, its elements, methods and characteristics have been discussed.
1. The document discusses parameter estimation, effect size, bivariate statistics including correlation and regression, and chi-square analysis.
2. Parameter estimation refers to using sample data to estimate population parameters, and sample statistics are estimations of population parameters.
3. Effect size measures the strength of the relationship between two variables and can be measured by eta square, partial eta square, and omega square, among others.
4. Correlation measures the association between variables, while regression predicts one variable from another. Chi-square analysis examines relationships between discrete variables.
1. The document discusses linear correlation and regression between plasma amphetamine levels and amphetamine-induced psychosis scores using data from 10 patients.
2. A positive correlation was found between the two variables, and a linear regression equation was established to predict psychosis scores from amphetamine levels.
3. However, further statistical tests were needed to determine if the correlation and regression model could be generalized to the overall patient population.
For more classes visit
www.snaptutorial.com
1
To make tests of hypotheses about more than two population means, we use the:
t distribution
normal distribution
chi-square distribution
analysis of variance distribution
This document provides an overview of simple linear regression and correlation analysis. It defines regression as estimating the relationship between two variables and correlation as measuring the strength and direction of that relationship. The key points covered include:
- Regression finds an estimating equation to relate known and unknown variables. Correlation determines how well that equation fits the data.
- Pearson's correlation coefficient r measures the linear relationship between two variables on a scale from -1 to 1.
- The coefficient of determination r2 indicates what percentage of variation in the dependent variable is explained by the independent variable.
- Statistical tests can evaluate whether a correlation is statistically significant or could be due to chance.
This document summarizes statistical tests for comparing two samples, including paired and independent samples t-tests, confidence intervals, and effect sizes. For paired samples from within-subject designs, a paired t-test is used to test for differences between means. For independent samples from between-subject designs, an independent samples t-test is used. Both tests calculate a t-statistic based on the mean difference and standard error. Confidence intervals and effect sizes can also be calculated for paired and independent sample designs. Examples are provided to demonstrate how to perform the statistical tests and calculations.
Data Processing and Statistical Treatment: Spreads and Correlation (Janet Penilla)
A hyperlinked presentation. The objectives of the topic were written. The presentation starts with the variance and then the standard deviation, provided with examples. It also explains when to use the sample standard deviation versus the population standard deviation when calculating a standard deviation. The presentation also includes correlations and other correlation techniques (Pearson product-moment correlation; Spearman rank-order correlation coefficient; t-test for correlation).
This document provides an overview of linear regression analysis. It discusses (1) why regression is used, including for description, adjustment for covariates, identifying predictors, and prediction; (2) the basics of linear regression in predicting an interval outcome variable based on predictor variables; and (3) how to conduct univariate linear regression in SPSS, including interpreting results and ensuring assumptions are met. Key assumptions include no outliers, independent data points, normally distributed residuals with constant variance.
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 4
Chapter 9: Inferences about Two Samples
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 4
Chapter 8: Hypothesis Testing
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova (Long Beach City College)
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 5
Module 5
Chapter 10: Correlation and Regression
Chapter 11: Goodness of Fit and Contingency Tables
Chapter 12: Analysis of Variance
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 4
Module 4:
Chapter 8, Hypothesis Testing
Chapter 9: Two Populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populations (Long Beach City College)
Solution to the Practice Test 3A, Chapter 6 Normal Probability Distribution (Long Beach City College)
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 3
Practice Test Chapter 6 (Normal Probability Distributions)
Chapter 6: Normal Probability Distributions
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 2 Solutions
Chapter 4: Probability
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 2
Chapter 4: Probability
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Elementary Statistics Practice Test 1
Module 1: Chapters 1-3
Chapter 1: Introduction to Statistics.
Chapter 2: Exploring Data with Tables and Graphs.
Chapter 3: Describing, Exploring, and Comparing Data.
This document summarizes the solutions to three one-way ANOVA problems testing claims about population means.
The first problem analyzes readability scores of three books and finds sufficient evidence to reject the claim that the means are all the same.
The second problem examines tree weights under different treatments and fails to support the claim that all treatment means are equal.
The third problem also looks at tree weights but in a different region, and finds sufficient evidence to fail to reject the claim that all treatment means are the same.
1. Analysis of variance (ANOVA) is a statistical technique used to test whether the means of three or more groups are equal. It analyzes the variations between and within groups.
2. ANOVA requires assumptions of normality, equal variances, independence, and random sampling. It uses sum of squares, mean squares and the F-test statistic to determine if group means are significantly different.
3. If the p-value is less than the significance level (often 0.05), the null hypothesis of equal group means is rejected, indicating at least one group mean is significantly different from the others.
The document provides an overview of goodness-of-fit tests for multinomial experiments and contingency tables, which are used to test if observed frequency distributions fit expected distributions. It defines multinomial experiments, goodness-of-fit tests, and contingency tables, and explains how to perform tests of independence and homogeneity using chi-square tests on contingency tables. Sample problems are provided to test claims about categories of outcomes and the independence of variables in contingency tables.
1. The document discusses correlation and regression analysis. It defines the linear correlation coefficient r and how it measures the strength of a linear relationship between two variables.
2. It presents the formula for calculating r and describes how to test for a linear correlation between two variables.
3. It also defines the regression equation y=mx+b, where m is the slope and b is the y-intercept. It describes how to use a regression equation to predict values of the dependent variable y given values of the independent variable x.
This document provides an overview of two-way analysis of variance (ANOVA). It explains that two-way ANOVA involves two categorical independent variables and one continuous dependent variable. The document outlines the objectives of two-way ANOVA, which are to analyze interactions between the two factors, and evaluate the effects of each factor. It then provides examples of how to set up and perform two-way ANOVA calculations and interpretations.
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Chapter 12: Analysis of Variance
12.1: One-Way ANOVA
Please Subscribe to this Channel for more solutions and lectures
http://www.youtube.com/onlineteaching
Chapter 11: Goodness-of-Fit and Contingency Tables
11.2: Contingency Tables
The document provides information about goodness-of-fit tests and contingency tables. It defines a goodness-of-fit test as testing whether an observed frequency distribution fits a claimed distribution. It also provides the notation, requirements, and steps to conduct a goodness-of-fit test including: defining the null and alternative hypotheses, calculating the test statistic as a chi-square value, finding the critical value, and making a decision to reject or fail to reject the null hypothesis. Several examples demonstrate how to perform goodness-of-fit tests to determine if sample data fits a claimed distribution.
2. Correlation and Regression
Correlation
Objectives:
• Draw a scatter plot for a set of ordered pairs.
• Compute the correlation coefficient.
• Test the hypothesis H₀: ρ = 0.
• Compute the equation of the regression line & the coefficient of determination.
• Compute the standard error of the estimate & a prediction interval.
3. Recall: Inferences about Two Proportions
For populations 1 & 2:
The pooled sample proportion combines the two sample proportions into one:
p̄ = (x₁ + x₂) / (n₁ + n₂),  q̄ = 1 − p̄
Test Statistic:
z = [(p̂₁ − p̂₂) − (p₁ − p₂)] / √(p̄q̄/n₁ + p̄q̄/n₂)
Confidence Interval Estimate of p1 − p2
(p̂₁ − p̂₂) − E < p₁ − p₂ < (p̂₁ − p̂₂) + E,  where E = z_α/2 · √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂)
The P-value method and the critical value method are equivalent, but the confidence
interval method is not equivalent to the P-value method or the critical value method.
Here p̂₁ = x₁/n₁ with q̂₁ = 1 − p̂₁, and p̂₂ = x₂/n₂ with q̂₂ = 1 − p̂₂.
TI Calculator:
Confidence Interval: 2 proportions
1. Stat
2. Tests
3. 2-PropZInt
4. Enter: n₁, n₂, x₁, x₂ & CL
TI Calculator:
2-Proportion Z-Test
1. Stat
2. Tests
3. 2-PropZTest
4. Enter Data or Stats: n₁, n₂, x₁, x₂
5. Choose RTT, LTT, or 2TT
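As a rough illustration of the formulas above, here is a minimal Python sketch of the two-proportion z test. The function name, the sample counts, and the erf-based normal CDF are my own; they are not from the slides.

```python
# Hypothetical sketch of the two-proportion z test (two-tailed, H0: p1 = p2).
from math import sqrt, erf

def two_prop_z(x1, n1, x2, n2):
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled sample proportion
    q_bar = 1 - p_bar
    se = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    z = (p1_hat - p2_hat) / se             # (p1 - p2) = 0 under H0
    # Normal CDF via erf; p-value for a two-tailed test
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

z, p = two_prop_z(60, 100, 45, 100)
```

A calculator's 2-PropZTest with the same inputs should agree with z and the p-value to rounding.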
4. Recall: Two Means, Independent Samples. Requirements:
1. The two samples are independent.
2. Both samples are simple random samples.
3. Either or both of these conditions are satisfied: the two sample sizes are both large (n₁ > 30 and n₂ > 30), or both samples come from populations having normal distributions.
σ1 and σ2 are known: Use the z test for comparing two means from independent populations
Test Statistic:
z = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(σ₁²/n₁ + σ₂²/n₂)
Confidence Interval: (x̄₁ − x̄₂) ± E,  where E = z_α/2 · √(σ₁²/n₁ + σ₂²/n₂)
Unequal Variances (σ₁ ≠ σ₂):
E = t_α/2 · √(s₁²/n₁ + s₂²/n₂),  df = smaller of n₁ − 1 or n₂ − 1
Equal Variances (σ₁ = σ₂): pool the sample variances;  df = (n₁ − 1) + (n₂ − 1)
Test Statistic (unequal variances):
t = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(s₁²/n₁ + s₂²/n₂)
Test Statistic (pooled variances):
t = [(x̄₁ − x̄₂) − (μ₁ − μ₂)] / √(s_p²/n₁ + s_p²/n₂)
σ1 and σ2 are unknown: Use the t test for comparing two means from independent populations
s_p² = [(n₁ − 1)s₁² + (n₂ − 1)s₂²] / [(n₁ − 1) + (n₂ − 1)]
E = t_α/2 · √(s_p²/n₁ + s_p²/n₂)
TI Calculator:
2-Sample Z-Test
1. Stat
2. Tests
3. 2-SampZTest
4. Enter Data or Stats: σ₁, σ₂, x̄₁, n₁, x̄₂, n₂
5. Choose RTT, LTT, or 2TT
6. Calculate
TI Calculator:
2-Sample Z-Interval
1. Stat
2. Tests
3. 2-SampZInt
4. Enter Data or Stats: σ₁, σ₂, x̄₁, n₁, x̄₂, n₂
5. Enter C-Level
6. Calculate
TI Calculator:
2-Sample T-Test
1. Stat
2. Tests
3. 2-SampTTest
4. Enter Data or Stats: x̄₁, s₁, n₁, x̄₂, s₂, n₂
5. Choose RTT, LTT, or 2TT
6. Pooled: No / Yes
7. Calculate
TI Calculator:
2-Sample T-Interval
1. Stat
2. Tests
3. 2-SampTInt
4. Enter Data or Stats: x̄₁, s₁, n₁, x̄₂, s₂, n₂
5. Enter C-Level
6. Pooled: No / Yes
7. Calculate
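The two-sample t formulas above can be sketched in Python from summary statistics. This hypothetical helper shows both the unequal-variance form (with the conservative "smaller of n₁ − 1 or n₂ − 1" degrees of freedom used on the slide) and the pooled form; the function name and example numbers are my own.

```python
# Minimal sketch of the two-sample t test from summary statistics.
from math import sqrt

def two_sample_t(x1, s1, n1, x2, s2, n2, pooled=False):
    if pooled:
        # Pooled sample variance, as on the slide
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))
        se = sqrt(sp2 / n1 + sp2 / n2)
        df = (n1 - 1) + (n2 - 1)
    else:
        se = sqrt(s1**2 / n1 + s2**2 / n2)
        df = min(n1, n2) - 1        # conservative df from the slide
    t = (x1 - x2) / se              # assumes (mu1 - mu2) = 0 under H0
    return t, df

t, df = two_sample_t(98.1, 0.7, 11, 98.4, 0.6, 59)
```

Software such as the TI 2-SampTTest typically uses a more exact (Welch) df formula for the unequal-variance case, so its df will differ from this conservative choice.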
5. Key Concept: Testing hypotheses and constructing confidence intervals involving the
mean of the differences of the values from two populations that are dependent in the
sense that the data consist of matched pairs. The pairs must be matched according to
some relationship, such as before/after measurements from the same subjects .
Good Experimental Design: When designing an experiment or planning an
observational study, using dependent samples with matched pairs is generally better than
using two independent samples.
1. Hypothesis Test: Use the differences from two dependent samples (matched pairs)
to test a claim about the mean of the population of all such differences.
2. Confidence Interval: Use the differences from two dependent samples (matched
pairs) to construct a confidence interval estimate of the mean of the population of
all such differences.
• d = individual difference between the two values in a single matched pair
• µd = mean value of the differences d for the population of all matched pairs of data
• d̄ = mean value of the differences d for the paired sample data
• sd = standard deviation of the differences d for the paired sample data
• n = number of pairs of sample data
When the values are dependent, do a t test on the differences.
Denote the differences with the symbol d or D, the mean of the population differences
with μd or μD, and the sample standard deviation of the differences with sd or sD.
Recall: Two Means: Two Dependent Samples (Matched Pairs)
Use the differences d (or D):
d̄ = Σd / n,  df = n − 1
Test Statistic:
t = (d̄ − μ_d) / (s_d / √n)
Confidence Interval: d̄ − E < μ_d < d̄ + E,  where E = t_α/2 · s_d / √n
s_d = √[ Σ(d − d̄)² / (n − 1) ] = √[ (nΣd² − (Σd)²) / (n(n − 1)) ]
TI Calculator:
How to enter data:
1. Stat
2. Edit
3. ClrList L₁ & L₂
4. Type in your data in L₁ & L₂
5. L₁ − L₂
6. Store in L₃
7. Enter
Mean, SD, 5-number summary:
1. Stat
2. Calc
3. Select 1 for 1 variable
4. Type: L₃ (second 3)
5. Calculate
TI Calculator:
T-Interval
1. Tests
2. T-Interval
3. Data
4. Enter List: L₃, Freq: 1, C-Level
5. Calculate
TI Calculator:
Matched pair: T-Test
1. Tests
2. T-Test
3. Data
4. Enter μ₀ = 0, List: L₃, Freq: 1
5. Choose RTT, LTT, or 2TT
6. Calculate
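The matched-pairs steps above (difference each pair, find d̄ and s_d, form the t statistic) can be sketched in a few lines of Python. The before/after numbers here are hypothetical, and the function name is my own.

```python
# Sketch of the matched-pairs t test on the differences d = before - after.
from math import sqrt

def paired_t(before, after):
    d = [b - a for b, a in zip(before, after)]
    n = len(d)
    d_bar = sum(d) / n                                   # mean difference
    s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))  # SD of differences
    t = d_bar / (s_d / sqrt(n))                          # tests H0: mu_d = 0
    return t, n - 1                                      # (t, df)

t, df = paired_t([210, 235, 208, 190, 172], [190, 170, 210, 188, 173])
```

Compare t against the critical t value with n − 1 degrees of freedom, exactly as the T-Test calculator steps above do with L₃ holding the differences.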
6. For the comparison of two variances or standard deviations, an F test is used.
• The F test should not be confused with the chi-square test, which compares a single
sample variance to a specific population variance.
Characteristics:
1. The values of F cannot be negative, because variances are always positive or zero.
2. The distribution is positively skewed.
3. The mean value of F is approximately equal to 1.
4. The F distribution is a family of curves based on the degrees of freedom of the
variance of the numerator and the degrees of freedom of the variance of the
denominator.
• The larger of the two variances is placed in the numerator regardless of the
subscripts.
• The F test has two terms for the degrees of freedom: that of the numerator, n1 – 1, and
that of the denominator, n2 – 1, where n1 is the sample size from which the larger
variance was obtained.
Recall: Two Variances or Standard Deviations
F = s₁² / s₂²  (where s₁² is the larger of the two sample variances)
TI Calculator:
2-Sample F-Test
1. Stat
2. Tests
3. 2-SampFTest
4. Enter Data or Stats: s₁, n₁, s₂, n₂
5. Choose RTT, LTT, or 2TT
6. Calculate
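The F statistic above takes a few lines to compute; this hypothetical helper places the larger sample variance in the numerator, as the slide requires, and returns the corresponding numerator and denominator degrees of freedom.

```python
# Sketch of the F statistic for comparing two variances (larger variance on top).
def f_statistic(s1, n1, s2, n2):
    v1, v2 = s1**2, s2**2
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1   # (F, numerator df, denominator df)
    return v2 / v1, n2 - 1, n1 - 1

F, df_num, df_den = f_statistic(2.0, 10, 4.0, 16)
```

Because the larger variance is always on top, F ≥ 1, matching the slide's note that F cannot be negative.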
7. Key Concept: In addition to hypothesis testing and confidence intervals, inferential statistics determines if a
relationship between 2 or more quantitative variables exists.
A correlation exists between two variables when the values of one variable are somehow associated with the
values of the other variable.
A linear correlation exists between two variables when there is a correlation and the plotted points look like a
straight line.
This section considers only linear relationships, meaning that when graphed in a scatterplot the points approximate a straight-line pattern. It also presents a formal hypothesis test for deciding whether there is a linear correlation between all population values for the two variables.
The linear correlation coefficient r is a number that measures how well paired sample data fit a straight-line pattern when graphed (it measures the strength of the linear association between the two variables). Use the sample of paired data (sometimes called bivariate data) to find the value of r, and then use r to decide whether there is a linear correlation between the two variables.
Correlation
Regression is a statistical method used to describe the nature of the relationship between variables—that is,
positive or negative, linear or nonlinear.
Questions: 1. Are two or more variables related?
2. If so, what is the strength of the relationship?
3. What type of relationship exists?
4. What kind of predictions can be made from the relationship?
8. Scatterplots & the Strength of the Linear Correlation: r
A scatter plot is a graph of the ordered pairs (x, y), where x is the independent variable and y is the dependent variable.
a. Distinct straight-line, or linear, pattern. We say that there is a
positive linear correlation between x and y, since as the x values
increase, the corresponding y values also increase.
b. Distinct straight-line, or linear pattern. We say that there is a
negative linear correlation between x and y, since as the x values
increase, the corresponding y values decrease.
c. No distinct pattern, which suggests that there is no correlation
between x and y.
d. Distinct pattern suggesting a correlation between x and y, but the
pattern is not that of a straight line.
9. Linear Correlation Coefficient r
1. Are two or more variables related?
2. If so, what is the strength of the relationship?
To answer these two questions, statisticians use the correlation coefficient, a numerical measure to determine
whether two or more variables are related and to determine the strength of the relationship between or among the
variables.
• Linear Correlation Coefficient r
The linear correlation coefficient r measures the strength of the linear correlation between the paired quantitative x values and y
values in a sample. It is used to determine whether there is a linear correlation between two variables.
3. What type of relationship exists?
There are two types of relationships: simple and multiple.
In a simple relationship, there are two variables: an independent variable (predictor variable) and a dependent
variable (response variable).
In a multiple relationship, there are two or more independent variables that are used to predict one dependent
variable.
4. What kind of predictions can be made from the relationship?
Predictions are made daily in all areas. Examples include weather forecasting, stock market analyses, sales
predictions, crop predictions, gasoline price predictions, and sports predictions. Some predictions are more accurate
than others, due to the strength of the relationship. That is, the stronger the relationship is between variables, the
more accurate the prediction is.
10. Example 1
Construct a scatter plot for the data shown for car rental companies in the
United States for a recent year.
Step 1: Draw and label the x and y axes.
Step 2: Plot each point on the graph.
TI Calculator:
How to enter data:
1. Stat
2. Edit
3. ClrList L1 & L2
4. Type in your data in L1 & L2
TI Calculator:
Scatter Plot:
1. Press Y= & clear
2. 2nd y, Enter
3. On, Enter
4. Select X1-list: L1
5. Select Y1-list: L2
6. Mark: Select Character
7. Press Zoom & 9 to get ZoomStat
11. Calculating and Interpreting the Linear Correlation Coefficient denoted by r
The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two
variables.
There are several types of correlation coefficients. The one explained in this section is called the Pearson product moment correlation
coefficient (PPMC).
The symbol for the sample correlation coefficient is r. The symbol for the population correlation coefficient is ρ.
The range of the correlation coefficient is from −1 to +1.
If there is a strong positive linear relationship between the variables, the value of r will be close to +1.
If there is a strong negative linear relationship between the variables, the value of r will be close to −1.
n: number of pairs of sample data
∑x: sum of all x values
∑x²: sum of the squared x values
(∑x)²: sum the x values, then square the total. Avoid confusing ∑x² and (∑x)².
∑xy: multiply each x value by its corresponding y value, then find the sum of the products
r: linear correlation coefficient for sample data
ρ (rho): linear correlation coefficient for a population of paired data
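Using the notation above, the shortcut formula for r can be sketched in a few lines of Python. This is a hypothetical helper (not part of the text); the chocolate/Nobel data from the later examples are used to check it:

```python
from math import sqrt

def pearson_r(xs, ys):
    # r = [n(Σxy) − (Σx)(Σy)] / (√[n(Σx²) − (Σx)²] · √[n(Σy²) − (Σy)²])
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

# Chocolate/Nobel data used in Examples 2 and 3
print(round(pearson_r([5, 6, 4, 4, 5], [6, 9, 3, 2, 11]), 3))  # 0.795
```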
12. Given any collection of sample paired quantitative data, the linear correlation coefficient r can
always be computed, but the following requirements should be satisfied when using the sample
paired data to make a conclusion about linear correlation in the corresponding population of
paired data.
1. The sample of paired (x, y) data is a simple random sample of quantitative data.
2. Visual examination of the scatterplot must confirm that the points approximate a straight-line
pattern.
3. Because results can be strongly affected by the presence of outliers, any outliers must be removed if
they are known to be errors. The effects of any other outliers should be considered by calculating r
with and without the outliers included.
Note: Requirements 2 and 3 are simplified attempts at checking that the pairs of (x, y) data
have a bivariate normal distribution.
r = [n(∑xy) − (∑x)(∑y)] / (√[n(∑x²) − (∑x)²] · √[n(∑y²) − (∑y)²])
Or, equivalently:
r = ∑(zx · zy) / (n − 1)
zx denotes the z score for an individual sample value x
zy denotes the z score for the corresponding sample value y.
Calculating and Interpreting the Linear Correlation Coefficient denoted by r
13. Example 2: Finding r Using Technology
The table lists five paired data values. Use technology to find the value of the
correlation coefficient r for the data.
Chocolate 5 6 4 4 5
Nobel 6 9 3 2 11
Solution:
The value of r will be automatically calculated with
software or a calculator: r = 0.795
14. Example 3 a: Finding r Using the following Formula
Use this Formula to find the value of the linear correlation coefficient r for the five
pairs of chocolate/Nobel data listed in the table.
Chocolate 5 6 4 4 5
Nobel 6 9 3 2 11
x (Chocolate) y (Nobel) x² y² xy
5 6 25 36 30
6 9 36 81 54
4 3 16 9 12
4 2 16 4 8
5 11 25 121 55
∑x = 24 ∑y = 31 ∑x² = 118 ∑y² = 251 ∑xy = 159
r = [5(159) − (24)(31)] / (√[5(118) − (24)²] · √[5(251) − (31)²])
= 51 / (√14 · √294)
= 0.795
TI Calculator:
Linear Regression - test
1. Stat
2. Tests
3. LinRegTTest
4. Enter L1 & L2
5. Freq = 1
6. Choose ≠
7. Calculate
15. Use Formula to find the value of the linear correlation coefficient r for the five pairs of
chocolate/Nobel data listed in the table.
Solution: The z scores for all of the chocolate values (see the third column) and the z scores
for all of the Nobel values (see the fourth column) are below. The last column lists the
products zx · zy.
x (Chocolate) y (Nobel)
5 6
6 9
4 3
4 2
5 11
Example 3b: Finding r Using the following Formula
r = ∑(zx · zy) / (n − 1) = 3.179746 / (5 − 1) = 0.795
This formula has the advantage of making it easier to understand how r works. The variable x is used for the chocolate values, and the variable y is used for the Nobel values. Each sample value is replaced by its corresponding z score.
zx zy zx · zy
0.239046 −0.052164 −0.012470
1.434274 0.730297 1.047446
−0.956183 −0.834625 0.798054
−0.956183 −1.095445 1.047446
0.239046 1.251937 0.299270
∑(zx · zy) = 3.179746
For example, for the chocolate values:
x̄ = ∑x / n = 4.8
sx = √[∑(x − x̄)² / (n − 1)] = 0.836660
x = 5 → zx = (x − x̄) / sx = (5 − 4.8) / 0.83666 = 0.23905
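The z-score form of the formula can be sketched directly (a hypothetical helper, not part of the text), using the sample mean and sample standard deviation as above:

```python
from statistics import mean, stdev

def pearson_r_z(xs, ys):
    # z-score form: r = Σ(zx · zy) / (n − 1), with sample means and standard deviations
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    zz = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return zz / (len(xs) - 1)

# Chocolate/Nobel data from Example 3b
print(round(pearson_r_z([5, 6, 4, 4, 5], [6, 9, 3, 2, 11]), 3))  # 0.795
```

Both forms give the same value of r; the z-score form makes it visible that r is an average of products of standardized values.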
16. Example 4: Finding r Using the Formula (Skip)
Find the correlation coefficient for the given data.
Company | Cars x (in 10,000s) | Income y (in billions) | xy | x² | y²
A | 63.0 | 7.0 | 441.00 | 3969.00 | 49.00
B | 29.0 | 3.9 | 113.10 | 841.00 | 15.21
C | 20.8 | 2.1 | 43.68 | 432.64 | 4.41
D | 19.1 | 2.8 | 53.48 | 364.81 | 7.84
E | 13.4 | 1.4 | 18.76 | 179.56 | 1.96
F | 8.5 | 1.5 | 12.75 | 72.25 | 2.25
Σx = 153.8, Σy = 18.7, Σxy = 682.77, Σx² = 5859.26, Σy² = 80.67, n = 6
r = [6(682.77) − (153.8)(18.7)] / (√[6(5859.26) − (153.8)²] · √[6(80.67) − (18.7)²])
r = 0.982 (strong positive relationship)
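The same computation for the car rental data can be checked with a short script (an illustration, not part of the text):

```python
from math import sqrt

# Example 4 data: x = cars (in 10,000s), y = income (in billions)
cars = [63.0, 29.0, 20.8, 19.1, 13.4, 8.5]
income = [7.0, 3.9, 2.1, 2.8, 1.4, 1.5]

n = len(cars)
sx, sy = sum(cars), sum(income)
sxy = sum(x * y for x, y in zip(cars, income))
sxx = sum(x * x for x in cars)
syy = sum(y * y for y in income)

r = (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))
print(round(r, 3))  # 0.982, a strong positive relationship
```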
17. Null Hypothesis: H0: ρ = 0 (No correlation) Alternative Hypothesis: H1: ρ ≠ 0 (Correlation)
Using P-Value from Technology to Interpret r:
P-value ≤ α: Reject 𝐻0 → Supports the claim of a linear correlation.
P-value > α: Fail to reject 𝐻0 → Does not support the claim of a linear correlation.
Using Pearson Correlation coefficient table to Interpret r: Consider critical values from
this Table or technology as being both positive and negative:
• Correlation If |r| ≥ critical value ⇾ There is sufficient evidence to support the claim of a linear
correlation.
• No Correlation If |r| < critical value ⇾ There is not sufficient evidence to support the claim of a
linear correlation.
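The critical-value rule can be written as a small predicate. The table value 0.878 for n = 5 at α = 0.05 is an assumed illustration from a standard Pearson table; the text itself gives 0.811 for n = 6:

```python
def supports_linear_correlation(r, critical_value):
    # Reject H0 (support the claim of a linear correlation) when |r| ≥ critical value
    return abs(r) >= critical_value

# Chocolate/Nobel sample: r = 0.795 with n = 5; assumed table critical value 0.878
print(supports_linear_correlation(0.795, 0.878))  # False: not significant
# Car rental sample: r = 0.982 with n = 6; the text's critical value 0.811
print(supports_linear_correlation(0.982, 0.811))  # True: significant
```

Note that a fairly large |r| (0.795) can still be non-significant when the sample is very small.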
Properties of the Linear Correlation Coefficient r
1. −1 ≤ r ≤ 1.
2. If all values of either variable are converted to a different scale, the value of r does not change.
3. The value of r is not affected by the choice of x or y. Interchange all x values and y values, and
the value of r will not change.
4. r measures the strength of a linear relationship. It is not designed to measure the strength of a
relationship that is not linear.
5. r is very sensitive to outliers in the sense that a single outlier could dramatically affect its value.
Calculating and Interpreting the Linear Correlation Coefficient denoted by r
18. Correlation
A correlation exists between two variables when the values of one variable are
somehow associated with the values of the other variable. A linear correlation exists
between two variables when there is a correlation and the plotted points look like a
straight line. The linear correlation coefficient r, is a number that measures how well
paired sample data fit a straight-line pattern when graphed (measures the strength of the
linear association between paired data called bivariate data). The value of r² is the
proportion of the variation in y that is explained by the linear relationship between x
and y.
Properties of the Linear Correlation Coefficient r: −1 ≤ r ≤ 1
zx denotes the z score for an individual sample value x; zy is the z score for the corresponding sample value y.
TS: t = r√[(n − 2)/(1 − r²)], df = n − 2; Or: r
r = [n(∑xy) − (∑x)(∑y)] / (√[n(∑x²) − (∑x)²] · √[n(∑y²) − (∑y)²]), Or: r = ∑(zx · zy)/(n − 1)
Step 1: H0: ρ = 0, H1: ρ ≠ 0, claim & Tails
Step 2: TS: t = r√[(n − 2)/(1 − r²)], OR: r
Step 3: CV using α from the t-table or r-table
Step 4: Make the decision to
a. Reject or not H0
b. The claim is true or false
c. Restate this decision: There is / is not sufficient evidence to support the claim that…
There is a linear correlation if |r| ≥ critical value.
There is no correlation if |r| < critical value.
19. Example 5
Test the significance of the given correlation coefficient
using α = 0.05, n = 6 and r = 0.982.
Decision:
a. Reject H0
b. The claim is True
c. There is a significant relationship between
the 2 variables.
Step 1: H0 , H1, claim & Tails
Step 2: TS Calculate (TS)
Step 3: CV using α
Step 4: Make the decision to
a. Reject or not H0
b. The claim is true or false
c. Restate this decision: There
is / is not sufficient evidence to
support the claim that…
H0: ρ = 0, H1: ρ ≠ 0, claim, 2TT
TS (t-distribution method):
t = r√[(n − 2)/(1 − r²)] = 0.982√[(6 − 2)/(1 − 0.982²)] = 10.3981
CV: α = 0.05, df = n − 2 = 6 − 2 = 4 → t = ±2.776
2nd method (Pearson correlation table): TS: r = 0.982
CV: from the Pearson correlation coefficient table with n = 6, α = 0.05 → r = ±0.811
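The test statistic for Example 5 can be checked with a short helper (an illustration, not part of the text):

```python
from math import sqrt

def correlation_t(r, n):
    # Test statistic for H0: ρ = 0 — t = r · √[(n − 2) / (1 − r²)], with df = n − 2
    return r * sqrt((n - 2) / (1 - r ** 2))

t = correlation_t(0.982, 6)  # Example 5: n = 6, r = 0.982
print(round(t, 4))           # ≈ 10.3981; |t| exceeds the CV of 2.776, so H0 is rejected
```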
20. Example 6
Given the value of r = 0.801 for 23 pairs of data regarding chocolate consumption and numbers of Nobel
Laureates, and using a significance level of 0.05, is there sufficient evidence to support a claim that there
is a linear correlation between chocolate consumption and numbers of Nobel Laureates?
Decision:
a. Reject H0
b. The claim is True
c. There is sufficient evidence to
support the conclusion that for
countries, there is a linear
correlation between chocolate
consumption and numbers of
Nobel Laureates.
TS: t = r√[(n − 2)/(1 − r²)] = 0.801√[(23 − 2)/(1 − 0.801²)] = 6.1314
CV: α = 0.05, df = n − 2 = 23 − 2 = 21
2nd method: TS: r = 0.801
CV: from the Pearson correlation coefficient table with n = 23, α = 0.05
Interpretation: Although we have found a linear
correlation, it would be absurd to think that eating more
chocolate would help win a Nobel Prize.
Step 1: H0 , H1, claim & Tails
Step 2: TS Calculate (TS)
Step 3: CV using α
Step 4: Make the decision to
a. Reject or not H0
b. The claim is true or false
c. Restate this decision: There
is / is not sufficient evidence to
support the claim that…
→ t = ±2.080; Table: r = 0.396 < CV < r = 0.444; Technology: r = 0.413 → r = ±0.413
H0: 𝜌 = 0, H1: 𝜌 ≠ 0, claim, 2TT
21. Interpreting r: Explained Variation
The value of r² is the proportion of the variation in y that is explained by the linear
relationship between x and y.
Using the 23 pairs of chocolate/Nobel data, we get r = 0.801. What proportion of the
variation in numbers of Nobel Laureates can be explained by the variation in the
consumption of chocolate?
Solution
With r = 0.801 we get r² = 0.642.
Interpretation
We conclude that 0.642 (or about 64%) of the variation in numbers of Nobel
Laureates can be explained by the linear relationship between chocolate consumption
and numbers of Nobel Laureates.
This implies that about 36% of the variation in numbers of Nobel Laureates cannot be
explained by rates of chocolate consumption.
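The explained-variation arithmetic is a one-liner:

```python
r = 0.801                     # correlation for the 23 chocolate/Nobel pairs
r_squared = r ** 2            # proportion of variation in y explained by the linear relationship
print(round(r_squared, 3))    # 0.642: about 64% explained, about 36% unexplained
```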
22. When the null hypothesis has been rejected for a specific value, any of the following five possibilities can exist.
1. There is a direct cause-and-effect relationship between the variables. That is, x causes y.
water causes plants to grow; poison causes death; heat causes ice to melt
2. There is a reverse cause-and-effect relationship between the variables. That is, y causes x.
Suppose a researcher believes excessive coffee consumption causes nervousness, but the researcher fails to consider that the
reverse situation may occur. That is, it may be that an extremely nervous person craves coffee to calm his or her nerves.
3. The relationship between the variables may be caused by a third variable.
If a statistician correlated the number of deaths due to drowning and the number of cans of soft drink consumed daily during
the summer, he or she would probably find a significant relationship. However, the soft drink is not necessarily responsible
for the deaths, since both variables may be related to heat and humidity.
4. There may be a complexity of interrelationships among many variables.
A researcher may find a significant relationship between students’ high school grades and college grades. But there probably
are many other variables involved, such as IQ, hours of study, influence of parents, motivation, age, and instructors.
5. The relationship may be coincidental.
A researcher may be able to find a significant relationship between the increase in the number of people who are exercising
and the increase in the number of people who are committing crimes. But common sense dictates that any relationship
between these two values must be due to coincidence.
Correlation, Possible Relationships Between Variables
23. Interpreting r with Causation:
Correlation does not imply causality!
We noted previously that we should use common sense when interpreting results.
Clearly, it would be absurd to think that eating more chocolate would help win a
Nobel Prize.
Common Errors Involving Correlation:
1. Assuming that correlation implies causality
2. Using data based on averages
3. Ignoring the possibility of a nonlinear relationship
Hypotheses If conducting a formal hypothesis test to determine whether there is a significant
linear correlation between two variables, use the following null and alternative hypotheses that
use ρ to represent the linear correlation coefficient of the population:
Null Hypothesis H0: ρ = 0 (No correlation)
Alternative Hypothesis H1: ρ ≠ 0 (Correlation)