Elementary Statistics
Correlation and Regression
Correlation
Objectives:
• Draw a scatter plot for a set of ordered pairs.
• Compute the correlation coefficient.
• Test the hypothesis 𝐻0: ρ = 0.
• Compute the equation of the regression line & the coefficient of determination.
• Compute the standard error of the estimate & a prediction interval.
Recall: Inferences about Two Proportions
For populations 1 & 2:
$\hat{p}_1 = \dfrac{x_1}{n_1}, \quad \hat{q}_1 = 1 - \hat{p}_1, \qquad \hat{p}_2 = \dfrac{x_2}{n_2}, \quad \hat{q}_2 = 1 - \hat{p}_2$
The pooled sample proportion combines the two sample proportions into one proportion:
$\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}, \quad \bar{q} = 1 - \bar{p}$
Test Statistic:
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$ or $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
Confidence Interval Estimate of p1 − p2:
$(\hat{p}_1 - \hat{p}_2) - E < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + E$, that is, $(\hat{p}_1 - \hat{p}_2) \pm E$, where
$E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$
The P-value method and the critical value method are equivalent, but the confidence
interval method is not equivalent to the P-value method or the critical value method.
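As a rough check on the recalled formulas, here is a minimal Python sketch of the pooled z statistic and the unpooled confidence interval. The function names and the counts (60 of 200 versus 40 of 200) are hypothetical illustrations, not values from the slides, and the critical value 1.96 assumes a 95% confidence level.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled sample proportion
    q_bar = 1 - p_bar
    return (p1_hat - p2_hat) / sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))

def two_proportion_ci(x1, n1, x2, n2, z_crit):
    """Confidence interval for p1 - p2 (unpooled standard error)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    margin = z_crit * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

# Hypothetical counts, chosen only to exercise the formulas.
z = two_proportion_z(x1=60, n1=200, x2=40, n2=200)
low, high = two_proportion_ci(60, 200, 40, 200, z_crit=1.96)
print(round(z, 3), round(low, 4), round(high, 4))
```

Note that the test uses the pooled proportion while the interval does not, which is one reason the two methods are not equivalent.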
TI Calculator: 2-Proportion Confidence Interval
1. Stat
2. Tests
3. 2-PropZInt
4. Enter: n1, n2, x1, x2 & CL
TI Calculator: 2-Proportion Z-Test
1. Stat
2. Tests
3. 2-PropZTest
4. Enter Data or Stats: n1, n2, x1, x2
5. Choose RTT, LTT, or 2TT
Recall: Two Means: Independent Samples
1. The two samples are independent.
2. Both samples are simple random samples.
3. Either or both of these conditions are satisfied: the two sample sizes are both large (with n1 > 30 and n2 > 30) or both samples come from populations having normal distributions.
σ1 and σ2 are known: Use the z test for comparing two means from independent populations
TS: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$ or, when H0 states μ1 − μ2 = 0, $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Confidence Interval: $(\bar{x}_1 - \bar{x}_2) \pm E$, where $E = z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
σ1 and σ2 are unknown: Use the t test for comparing two means from independent populations
Unequal Variances: σ1 ≠ σ2
df = smaller of n1 – 1 or n2 – 1
TS: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
$E = t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Equal Variances: σ1 = σ2 (Pool the Sample Variances)
df = n1 – 1 + n2 – 1
$s_p = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}}$
TS: $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$
$E = t_{\alpha/2}\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$
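The two t statistics above differ only in the standard error and the degrees of freedom. The sketch below is one way to compute both; the function names and the summary statistics passed in are hypothetical, and the unpooled case uses the conservative "smaller of n1 − 1 or n2 − 1" rule stated in the slides rather than any software-specific df formula.

```python
from math import sqrt

def t_unpooled(xbar1, s1, n1, xbar2, s2, n2):
    """t statistic when sigma1 != sigma2; df = smaller of n1 - 1 and n2 - 1."""
    t = (xbar1 - xbar2) / sqrt(s1**2 / n1 + s2**2 / n2)
    return t, min(n1 - 1, n2 - 1)

def t_pooled(xbar1, s1, n1, xbar2, s2, n2):
    """t statistic when sigma1 = sigma2; the sample variances are pooled."""
    df = (n1 - 1) + (n2 - 1)
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance s_p^2
    t = (xbar1 - xbar2) / sqrt(sp2 / n1 + sp2 / n2)
    return t, df

# Hypothetical summary statistics, not taken from the slides.
t_u, df_u = t_unpooled(10.0, 2.0, 16, 8.0, 2.0, 16)
t_p, df_p = t_pooled(10.0, 2.0, 16, 8.0, 2.0, 16)
print(round(t_u, 3), df_u, round(t_p, 3), df_p)
```

With equal sample sizes and equal sample standard deviations the two statistics coincide, and only the degrees of freedom (and therefore the critical value) differ.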
TI Calculator: 2-Sample Z-Test
1. Stat
2. Tests
3. 2-SampZTest
4. Enter Data or Stats: σ1, σ2, x̄1, n1, x̄2, n2
5. Choose RTT, LTT, or 2TT
6. Calculate
TI Calculator: 2-Sample Z-Interval
1. Stat
2. Tests
3. 2-SampZInt
4. Enter Data or Stats: σ1, σ2, x̄1, n1, x̄2, n2
5. Enter CL
6. Calculate
TI Calculator: 2-Sample T-Test
1. Stat
2. Tests
3. 2-SampTTest
4. Enter Data or Stats: x̄1, s1, n1, x̄2, s2, n2
5. Choose RTT, LTT, or 2TT
6. Pooled: No / Yes
7. Calculate
TI Calculator: 2-Sample T-Interval
1. Stat
2. Tests
3. 2-SampTInt
4. Enter Data or Stats: x̄1, s1, n1, x̄2, s2, n2
5. Enter CL
6. Pooled: No / Yes
7. Calculate
Key Concept: Testing hypotheses and constructing confidence intervals involving the
mean of the differences of the values from two populations that are dependent in the
sense that the data consist of matched pairs. The pairs must be matched according to
some relationship, such as before/after measurements from the same subjects.
Good Experimental Design: When designing an experiment or planning an
observational study, using dependent samples with matched pairs is generally better than
using two independent samples.
1. Hypothesis Test: Use the differences from two dependent samples (matched pairs)
to test a claim about the mean of the population of all such differences.
2. Confidence Interval: Use the differences from two dependent samples (matched
pairs) to construct a confidence interval estimate of the mean of the population of
all such differences.
• d = individual difference between the two values in a single matched pair
• µd = mean value of the differences d for the population of all matched pairs of data
• d̄ = mean value of the differences d for the paired sample data
• sd = standard deviation of the differences d for the paired sample data
• n = number of pairs of sample data
When the values are dependent, do a t test on the differences.
Denote the differences with the symbol d or D, the mean of the population differences
with μd or μD, and the sample standard deviation of the differences with sd or sD.
Recall: Two Means: Two Dependent Samples (Matched Pairs)
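Use either d or D:
$\bar{d} = \dfrac{\sum d}{n}$ (or $\bar{D} = \dfrac{\sum D}{n}$), d.f. = n − 1
TS: $t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
Confidence Interval: $\bar{d} \pm E$ (or $\bar{D} \pm E$), where $E = t_{\alpha/2}\dfrac{s_d}{\sqrt{n}}$
$s_d = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{n\sum d^2 - (\sum d)^2}{n(n - 1)}}$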
TI Calculator: How to enter data
1. Stat
2. Edit
3. ClrList L1 & L2
4. Type in your data in L1 & L2
5. L1 − L2
6. Store in L3
7. Enter
TI Calculator: Mean, SD, 5-number summary
1. Stat
2. Calc
3. Select 1 for 1-Var Stats
4. Type: L3 (2nd 3)
5. Calculate
TI Calculator: T-Interval
1. Tests
2. TInterval
3. Data
4. Enter List: L3, Freq: 1 & CL
5. Calculate
TI Calculator: Matched pair: T-Test
1. Tests
2. T-Test
3. Data
4. Enter μ0 = 0, List: L3, Freq: 1
5. Choose RTT, LTT, or 2TT
6. Calculate
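The matched-pairs procedure (store L1 − L2 in a list, then run a one-sample t test on the differences) can be sketched in Python as follows. The function name and the before/after values are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second, mu_d=0.0):
    """t statistic and df for matched pairs, computed on d = first - second."""
    d = [a - b for a, b in zip(first, second)]   # individual differences
    n = len(d)
    d_bar = mean(d)                              # mean of the differences
    s_d = stdev(d)                               # sample SD of the differences
    t = (d_bar - mu_d) / (s_d / sqrt(n))
    return t, n - 1

# Hypothetical before/after measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [8, 11, 9, 8, 10]
t, df = paired_t(before, after)
print(round(t, 3), df)  # 3.087 4
```

The resulting t would then be compared with the critical value for df = n − 1 from the t table.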
For the comparison of two variances or standard deviations, an F test is used.
• The F test should not be confused with the chi-square test, which compares a single
sample variance to a specific population variance.
Characteristics:
1. The values of F cannot be negative, because variances are always positive or zero.
2. The distribution is positively skewed.
3. The mean value of F is approximately equal to 1.
4. The F distribution is a family of curves based on the degrees of freedom of the
variance of the numerator and the degrees of freedom of the variance of the
denominator.
• The larger of the two variances is placed in the numerator regardless of the
subscripts.
• The F test has two terms for the degrees of freedom: that of the numerator, n1 – 1, and
that of the denominator, n2 – 1, where n1 is the sample size from which the larger
variance was obtained.
Recall: Two Variances or Standard Deviations
$F = \dfrac{s_1^2}{s_2^2}$
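A small sketch of the rule "the larger variance goes in the numerator regardless of the subscripts" is given below. The function name and the sample values are hypothetical; the function takes sample standard deviations and sizes, as the calculator entry does.

```python
def f_statistic(s1, n1, s2, n2):
    """Return F with the larger variance on top, plus (df_numerator, df_denominator)."""
    v1, v2 = s1**2, s2**2
    if v1 >= v2:
        return v1 / v2, (n1 - 1, n2 - 1)
    # Swap so that the larger variance (and its sample size) is in the numerator.
    return v2 / v1, (n2 - 1, n1 - 1)

# Hypothetical samples: s = 2 with n = 10, and s = 3 with n = 15.
F, dfs = f_statistic(2, 10, 3, 15)
print(F, dfs)  # 2.25 (14, 9)
```

Because of the swap, F computed this way is never less than 1.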
TI Calculator: 2-Sample F-Test
1. Stat
2. Tests
3. 2-SampFTest
4. Enter Data or Stats: s1, n1, s2, n2
5. Choose RTT, LTT, or 2TT
6. Calculate
Key Concept: In addition to hypothesis testing and confidence intervals, inferential statistics determines if a
relationship between 2 or more quantitative variables exists.
A correlation exists between two variables when the values of one variable are somehow associated with the
values of the other variable.
A linear correlation exists between two variables when there is a correlation and the plotted points look like a
straight line.
This section considers only linear relationships, which means that when graphed in a scatterplot, the points
approximate a straight-line pattern. It also covers methods of conducting a formal hypothesis test that can be used
to decide whether there is a linear correlation between all population values for the two variables.
The linear correlation coefficient r is a number that measures how well paired sample data fit a straight-line
pattern when graphed; that is, it measures the strength of the linear association between the two variables. Use the
sample of paired data (sometimes called bivariate data) to find the value of r, and then use r to decide whether
there is a linear correlation between the two variables.
Regression is a statistical method used to describe the nature of the relationship between variables—that is,
positive or negative, linear or nonlinear.
Questions: 1. Are two or more variables related?
2. If so, what is the strength of the relationship?
3. What type of relationship exists?
4. What kind of predictions can be made from the relationship?
Scatterplots & the Strength of the Linear Correlation: r
A scatter plot is a graph of the ordered pairs (x, y), where x is the independent
variable and y is the dependent variable.
a. Distinct straight-line, or linear, pattern. We say that there is a
positive linear correlation between x and y, since as the x values
increase, the corresponding y values also increase.
b. Distinct straight-line, or linear pattern. We say that there is a
negative linear correlation between x and y, since as the x values
increase, the corresponding y values decrease.
c. No distinct pattern, which suggests that there is no correlation
between x and y.
d. Distinct pattern suggesting a correlation between x and y, but the
pattern is not that of a straight line.
Linear Correlation Coefficient r
1. Are two or more variables related?
2. If so, what is the strength of the relationship?
To answer these two questions, statisticians use the correlation coefficient, a numerical measure to determine
whether two or more variables are related and to determine the strength of the relationship between or among the
variables.
• Linear Correlation Coefficient r
The linear correlation coefficient r measures the strength of the linear correlation between the paired quantitative x values and y
values in a sample. It is used to determine whether there is a linear correlation between two variables.
3. What type of relationship exists?
There are two types of relationships: simple and multiple.
In a simple relationship, there are two variables: an independent variable (predictor variable) and a dependent
variable (response variable).
In a multiple relationship, there are two or more independent variables that are used to predict one dependent
variable.
4. What kind of predictions can be made from the relationship?
Predictions are made daily in all areas. Examples include weather forecasting, stock market analyses, sales
predictions, crop predictions, gasoline price predictions, and sports predictions. Some predictions are more accurate
than others, due to the strength of the relationship. That is, the stronger the relationship is between variables, the
more accurate the prediction is.
Example 1
Construct a scatter plot for the data shown for car rental companies in the
United States for a recent year.
Step 1: Draw and label the x and y axes.
Step 2: Plot each point on the graph.
TI Calculator: How to enter data
1. Stat
2. Edit
3. ClrList L1 & L2
4. Type in your data in L1 & L2
TI Calculator: Scatter Plot
1. Press Y= & clear
2. 2nd Y= (Stat Plot), Enter
3. On, Enter
4. Select Xlist: L1
5. Select Ylist: L2
6. Mark: Select Character
7. Press Zoom & 9 to get ZoomStat
Calculating and Interpreting the Linear Correlation Coefficient denoted by r
The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two
variables.
There are several types of correlation coefficients. The one explained in this section is called the Pearson product moment correlation
coefficient (PPMC).
The symbol for the sample correlation coefficient is r. The symbol for the population correlation coefficient is ρ.
The range of the correlation coefficient is from −1 to +1.
If there is a strong positive linear relationship between the variables, the value of r will be close to +1.
If there is a strong negative linear relationship between the variables, the value of r will be close to −1.
n = number of pairs of sample data
∑x = sum of all x values
∑x² = sum of the squared x values (square each x value, then add)
(∑x)² = square of the sum of the x values (add the x values, then square the total). Avoid confusing ∑x² and (∑x)².
∑xy = sum of the products: multiply each x value by its corresponding y value, then add all such products.
r = linear correlation coefficient for sample data
ρ (rho) = linear correlation coefficient for a population of paired data
Given any collection of sample paired quantitative data, the linear correlation coefficient r can
always be computed, but the following requirements should be satisfied when using the sample
paired data to make a conclusion about linear correlation in the corresponding population of
paired data.
1. The sample of paired (x, y) data is a simple random sample of quantitative data.
2. Visual examination of the scatterplot must confirm that the points approximate a straight-line
pattern.
3. Because results can be strongly affected by the presence of outliers, any outliers must be removed if
they are known to be errors. The effects of any other outliers should be considered by calculating r
with and without the outliers included.
Note: Requirements 2 and 3 are simplified attempts at checking that the pairs of (x, y) data
have a bivariate normal distribution.
$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
$r = \dfrac{\sum (z_x z_y)}{n - 1}$
z_x denotes the z score for an individual sample value x.
z_y denotes the z score for the corresponding sample value y.
Example 2: Finding r Using Technology
The table lists five paired data values. Use technology to find the value of the
correlation coefficient r for the data.
Chocolate 5 6 4 4 5
Nobel 6 9 3 2 11
Solution:
The value of r will be automatically calculated with
software or a calculator: r = 0.795
Example 3a: Finding r Using the Following Formula
Use this formula to find the value of the linear correlation coefficient r for the five
pairs of chocolate/Nobel data listed in the table.
Chocolate 5 6 4 4 5
Nobel 6 9 3 2 11
x (Chocolate) y (Nobel) x² y² xy
5 6 25 36 30
6 9 36 81 54
4 3 16 9 12
4 2 16 4 8
5 11 25 121 55
∑x = 24 ∑y = 31 ∑x² = 118 ∑y² = 251 ∑xy = 159
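$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} = \dfrac{5(159) - (24)(31)}{\sqrt{5(118) - 24^2}\,\sqrt{5(251) - 31^2}} = \dfrac{51}{\sqrt{14(294)}} = 0.795$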
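The hand computation can be cross-checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `pearson_r` is an illustrative choice, and the data are the five chocolate/Nobel pairs from the table.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient from the raw sums (computational formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)              # sum of squared x values
    sum_y2 = sum(y * y for y in ys)              # sum of squared y values
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # sum of the products
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x**2) * sqrt(n * sum_y2 - sum_y**2)
    return numerator / denominator

chocolate = [5, 6, 4, 4, 5]
nobel = [6, 9, 3, 2, 11]
r = pearson_r(chocolate, nobel)
print(round(r, 3))  # 0.795
```

Swapping the two lists gives the same value, since the formula is symmetric in x and y.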
TI Calculator: Linear Regression t-test
1. Stat
2. Tests
3. LinRegTTest
4. Enter L1 & L2
5. Freq = 1
6. Choose ≠
7. Calculate
Example 3b: Finding r Using the Following Formula
Use the formula
$r = \dfrac{\sum (z_x z_y)}{n - 1}$
to find the value of the linear correlation coefficient r for the five pairs of
chocolate/Nobel data listed in the table.
This formula has the advantage of making it easier to understand how r works. The
variable x is used for the chocolate values, and the variable y is used for the Nobel
values. Each sample value is replaced by its corresponding z score.
Solution: The z scores for all of the chocolate values (see the third column) and the z scores
for all of the Nobel values (see the fourth column) are below. The last column lists the
products zx · zy.
x (Chocolate)  y (Nobel)  zx         zy         zx · zy
5              6          0.239046   −0.052164  −0.012470
6              9          1.434274   0.730297   1.047446
4              3          −0.956183  −0.834625  0.798054
4              2          −0.956183  −1.095445  1.047446
5              11         0.239046   1.251937   0.299270
                                                ∑ (zx · zy) = 3.179746
For example, for the chocolate values: $\bar{x} = \dfrac{\sum x}{n} = 4.8$ and $s_x = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = 0.836660$, so
$x = 5 \rightarrow z_x = \dfrac{x - \bar{x}}{s_x} = \dfrac{5 - 4.8}{0.83666} = 0.23905$
$r = \dfrac{\sum (z_x z_y)}{n - 1} = \dfrac{3.179746}{5 - 1} = 0.795$
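The z-score route can be sketched the same way. The code below assumes sample standard deviations (divisor n − 1), which is what `statistics.stdev` computes and what the worked value s_x = 0.836660 implies; the function name is illustrative.

```python
from statistics import mean, stdev

def pearson_r_from_z(xs, ys):
    """r as the sum of z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)              # sample SDs (n - 1 divisor)
    products = [
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    ]
    return sum(products) / (len(xs) - 1), sum(products)

chocolate = [5, 6, 4, 4, 5]
nobel = [6, 9, 3, 2, 11]
r_z, sum_products = pearson_r_from_z(chocolate, nobel)
print(round(sum_products, 4), round(r_z, 3))  # 3.1797 0.795
```

Pairs whose x and y values sit on the same side of their means contribute positive products, which is why a positive trend produces a positive r.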
Example 4: Finding r Using the Following Formula (Skip)
Find the correlation coefficient for the given data.
Company  Cars x (in 10,000s)  Income y (in billions)  xy      x²       y²
A        63.0                 7.0                     441.00  3969.00  49.00
B        29.0                 3.9                     113.10  841.00   15.21
C        20.8                 2.1                     43.68   432.64   4.41
D        19.1                 2.8                     53.48   364.81   7.84
E        13.4                 1.4                     18.76   179.56   1.96
F        8.5                  1.5                     12.75   72.25    2.25
Σx = 153.8, Σy = 18.7, Σxy = 682.77, Σx² = 5859.26, Σy² = 80.67, n = 6
$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} = \dfrac{6(682.77) - (153.8)(18.7)}{\sqrt{6(5859.26) - 153.8^2}\,\sqrt{6(80.67) - 18.7^2}} = 0.982$ (strong positive relationship)
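The column sums in the table are the error-prone part of this calculation, so a short script that rebuilds them is a useful check. This sketch uses the six data pairs from the table; the variable names are illustrative.

```python
from math import sqrt

cars = [63.0, 29.0, 20.8, 19.1, 13.4, 8.5]    # x, in 10,000s
income = [7.0, 3.9, 2.1, 2.8, 1.4, 1.5]       # y, in billions

n = len(cars)
sum_x = sum(cars)
sum_y = sum(income)
sum_xy = sum(x * y for x, y in zip(cars, income))   # xy column total
sum_x2 = sum(x * x for x in cars)                   # x^2 column total
sum_y2 = sum(y * y for y in income)                 # y^2 column total

r_cars = (n * sum_xy - sum_x * sum_y) / (
    sqrt(n * sum_x2 - sum_x**2) * sqrt(n * sum_y2 - sum_y**2)
)
print(round(sum_xy, 2), round(sum_x2, 2), round(sum_y2, 2))  # 682.77 5859.26 80.67
print(round(r_cars, 3))  # 0.982
```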
Null Hypothesis: H0: ρ = 0 (No correlation)
Alternative Hypothesis: H1: ρ ≠ 0 (Correlation)
Using P-Value from Technology to Interpret r:
P-value ≤ α: Reject H0 → Supports the claim of a linear correlation.
P-value > α: Fail to reject H0 → Does not support the claim of a linear correlation.
Using the Pearson correlation coefficient table to interpret r: Consider critical values from
this table or technology as being both positive and negative:
• Correlation: If |r| ≥ critical value → There is sufficient evidence to support the claim of a linear
correlation.
• No Correlation: If |r| < critical value → There is not sufficient evidence to support the claim of a
linear correlation.
Properties of the Linear Correlation Coefficient r
1. −1 ≤ r ≤ 1.
2. If all values of either variable are converted to a different scale, the value of r does not change.
3. The value of r is not affected by the choice of x or y. Interchange all x values and y values, and
the value of r will not change.
4. r measures the strength of a linear relationship. It is not designed to measure the strength of a
relationship that is not linear.
5. r is very sensitive to outliers in the sense that a single outlier could dramatically affect its value.
Correlation
A correlation exists between two variables when the values of one variable are
somehow associated with the values of the other variable. A linear correlation exists
between two variables when there is a correlation and the plotted points look like a
straight line. The linear correlation coefficient r is a number that measures how well
paired sample data fit a straight-line pattern when graphed (it measures the strength of the
linear association between paired data, called bivariate data). The value of r² is the
proportion of the variation in y that is explained by the linear relationship between x
and y.
Properties of the Linear Correlation Coefficient r: −1 ≤ r ≤ 1
$r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$ or $r = \dfrac{\sum (z_x z_y)}{n - 1}$
z_x denotes the z score for an individual sample value x; z_y is the z score for the corresponding sample value y.
TS: $t = r\sqrt{\dfrac{n - 2}{1 - r^2}}$, df = n − 2; or use r itself.
Step 1: H0: ρ = 0, H1: ρ ≠ 0, claim & Tails
Step 2: TS: $t = r\sqrt{\dfrac{n - 2}{1 - r^2}}$, or r
Step 3: CV using α from the t-table or r-table
Step 4: Make the decision to
a. Reject or not H0
b. The claim is true or false
c. Restate this decision: There is / is not sufficient evidence to support the claim that…
There is a linear correlation if |r| ≥ critical value.
There is no correlation if |r| < critical value.
Example 5
Test the significance of the given correlation coefficient
using α = 0.05, n = 6 and r = 0.982.
Step 1: H0, H1, claim & Tails
Step 2: TS Calculate (TS)
Step 3: CV using α
Step 4: Make the decision to
a. Reject or not H0
b. The claim is true or false
c. Restate this decision: There is / is not sufficient evidence to support the claim that…
Solution:
H0: ρ = 0, H1: ρ ≠ 0 (claim), 2TT
1st Method: t-distribution
TS: $t = r\sqrt{\dfrac{n - 2}{1 - r^2}} = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$, df = n − 2
$t = 0.982\sqrt{\dfrac{6 - 2}{1 - 0.982^2}} = 10.3981$
CV: α = 0.05, df = n − 2 = 6 − 2 = 4 → t = ±2.776
2nd Method: Pearson Correlation
TS: r = 0.982
CV: From Pearson Correlation coefficient table: n = 6, α = 0.05 → r = ±0.811
Decision: Since |t| = 10.3981 > 2.776 (equivalently, |r| = 0.982 > 0.811):
a. Reject H0
b. The claim is True
c. There is a significant relationship between the 2 variables.
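The test statistic in Example 5 is easy to reproduce in code. The sketch below is standard-library Python with illustrative function names; the critical t value 2.776 is taken from the t table as in the example, because the standard library has no t distribution. The helper `critical_r` is not from the slides: it is the algebraic inversion of the t formula, r = t / √(t² + df), included to show how the two critical values relate.

```python
from math import sqrt

def t_from_r(r, n):
    """Test statistic t = r * sqrt((n - 2) / (1 - r^2)), with df = n - 2."""
    return r * sqrt((n - 2) / (1 - r**2))

def critical_r(t_crit, n):
    """Critical r implied by a critical t (inverts the formula above)."""
    df = n - 2
    return t_crit / sqrt(t_crit**2 + df)

t_stat = t_from_r(0.982, 6)
r_crit = critical_r(2.776, 6)      # 2.776: table value for df = 4, alpha = 0.05, two tails
reject_h0 = abs(t_stat) > 2.776
print(round(t_stat, 3), round(r_crit, 3), reject_h0)  # 10.398 0.811 True
```

Both methods necessarily give the same decision, since |t| exceeds the critical t exactly when |r| exceeds the corresponding critical r.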
Example 6
Given the value of r = 0.801 for 23 pairs of data regarding chocolate consumption and numbers of Nobel
Laureates, and using a significance level of 0.05, is there sufficient evidence to support a claim that there
is a linear correlation between chocolate consumption and numbers of Nobel Laureates?
Solution:
H0: ρ = 0, H1: ρ ≠ 0 (claim), 2TT
1st Method: t-distribution
TS: $t = r\sqrt{\dfrac{n - 2}{1 - r^2}}$, df = n − 2
$t = 0.801\sqrt{\dfrac{23 - 2}{1 - 0.801^2}} = 6.1314$
CV: α = 0.05, df = n − 2 = 23 − 2 = 21 → t = ±2.080
2nd Method: Pearson Correlation
TS: r = 0.801
CV: From Pearson Correlation coefficient table: n = 23, α = 0.05
Table: the critical value lies between 0.396 and 0.444
Technology: r = ±0.413
Decision: Since |t| = 6.1314 > 2.080 (equivalently, |r| = 0.801 > 0.413):
a. Reject H0
b. The claim is True
c. There is sufficient evidence to support the conclusion that for countries, there is a linear
correlation between chocolate consumption and numbers of Nobel Laureates.
Interpretation: Although we have found a linear correlation, it would be absurd to think that eating more
chocolate would help win a Nobel Prize.
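The critical value r = ±0.413 reported by technology can be recovered from the critical t value. The derivation below is a sketch that assumes the test statistic $t = r\sqrt{(n-2)/(1-r^2)}$ used in this example and simply solves it for r; it is not shown in the slides.

```latex
t = r\sqrt{\frac{n - 2}{1 - r^2}}
\;\Longrightarrow\;
t^2\,(1 - r^2) = r^2\,(n - 2)
\;\Longrightarrow\;
r = \pm\frac{t}{\sqrt{t^2 + (n - 2)}}

r_{\text{crit}} = \pm\frac{2.080}{\sqrt{2.080^2 + 21}}
               = \pm\frac{2.080}{\sqrt{25.3264}}
               \approx \pm 0.413
```

This agrees with the table, which brackets the critical value between 0.396 and 0.444.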
Interpreting r: Explained Variation
The value of r² is the proportion of the variation in y that is explained by the linear
relationship between x and y.
Using the 23 pairs of chocolate/Nobel data, we get r = 0.801. What proportion of the
variation in numbers of Nobel Laureates can be explained by the variation in the
consumption of chocolate?
Solution
With r = 0.801 we get r² = 0.642.
Interpretation
We conclude that 0.642 (or about 64%) of the variation in numbers of Nobel
Laureates can be explained by the linear relationship between chocolate consumption
and numbers of Nobel Laureates.
This implies that about 36% of the variation in numbers of Nobel Laureates cannot be
explained by rates of chocolate consumption.
When the null hypothesis has been rejected for a specific α value, any of the following five possibilities can exist.
1. There is a direct cause-and-effect relationship between the variables. That is, x causes y.
Examples: water causes plants to grow, poison causes death, heat causes ice to melt.
2. There is a reverse cause-and-effect relationship between the variables. That is, y causes x.
Suppose a researcher believes excessive coffee consumption causes nervousness, but the researcher fails to consider that the
reverse situation may occur. That is, it may be that an extremely nervous person craves coffee to calm his or her nerves.
3. The relationship between the variables may be caused by a third variable.
If a statistician correlated the number of deaths due to drowning and the number of cans of soft drink consumed daily during
the summer, he or she would probably find a significant relationship. However, the soft drink is not necessarily responsible
for the deaths, since both variables may be related to heat and humidity.
4. There may be a complexity of interrelationships among many variables.
A researcher may find a significant relationship between students’ high school grades and college grades. But there probably
are many other variables involved, such as IQ, hours of study, influence of parents, motivation, age, and instructors.
5. The relationship may be coincidental.
A researcher may be able to find a significant relationship between the increase in the number of people who are exercising
and the increase in the number of people who are committing crimes. But common sense dictates that any relationship
between these two values must be due to coincidence.
Correlation, Possible Relationships Between Variables
Interpreting r with Causation:
Correlation does not imply causality!
We noted previously that we should use common sense when interpreting results.
Clearly, it would be absurd to think that eating more chocolate would help win a
Nobel Prize.
Common Errors Involving Correlation:
1. Assuming that correlation implies causality
2. Using data based on averages
3. Ignoring the possibility of a nonlinear relationship
Hypotheses: If conducting a formal hypothesis test to determine whether there is a significant
linear correlation between two variables, use the following null and alternative hypotheses that
use ρ to represent the linear correlation coefficient of the population:
Null Hypothesis H0: ρ = 0 (No correlation)
Alternative Hypothesis H1: ρ ≠ 0 (Correlation)

Basic of Statistical Inference Part-III: The Theory of Estimation from Dexlab...
 
parameter Estimation and effect size
parameter Estimation and effect size parameter Estimation and effect size
parameter Estimation and effect size
 
Lesson 8 Linear Correlation And Regression
Lesson 8 Linear Correlation And RegressionLesson 8 Linear Correlation And Regression
Lesson 8 Linear Correlation And Regression
 
QNT 275 Exceptional Education - snaptutorial.com
QNT 275   Exceptional Education - snaptutorial.comQNT 275   Exceptional Education - snaptutorial.com
QNT 275 Exceptional Education - snaptutorial.com
 
Simple linear regressionn and Correlation
Simple linear regressionn and CorrelationSimple linear regressionn and Correlation
Simple linear regressionn and Correlation
 
Two Sample Tests
Two Sample TestsTwo Sample Tests
Two Sample Tests
 
Data Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationData Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and Correlation
 
Linear regression analysis
Linear regression analysisLinear regression analysis
Linear regression analysis
 

More from Long Beach City College

Practice test ch 9 inferences from two samples
Practice test ch 9 inferences from two samplesPractice test ch 9 inferences from two samples
Practice test ch 9 inferences from two samples
Long Beach City College
 
Practice Test Ch 8 Hypothesis Testing
Practice Test Ch 8 Hypothesis TestingPractice Test Ch 8 Hypothesis Testing
Practice Test Ch 8 Hypothesis Testing
Long Beach City College
 
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anovaSolution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Long Beach City College
 
Practice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anovaPractice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anova
Long Beach City College
 
Practice test ch 8 hypothesis testing ch 9 two populations
Practice test ch 8 hypothesis testing ch 9 two populationsPractice test ch 8 hypothesis testing ch 9 two populations
Practice test ch 8 hypothesis testing ch 9 two populations
Long Beach City College
 
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populationsSolution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
Long Beach City College
 
Solution to the Practice Test 3A, Chapter 6 Normal Probability Distribution
Solution to the Practice Test 3A, Chapter 6 Normal Probability DistributionSolution to the Practice Test 3A, Chapter 6 Normal Probability Distribution
Solution to the Practice Test 3A, Chapter 6 Normal Probability Distribution
Long Beach City College
 
Practice Test Chapter 6 (Normal Probability Distributions)
Practice Test Chapter 6 (Normal Probability Distributions)Practice Test Chapter 6 (Normal Probability Distributions)
Practice Test Chapter 6 (Normal Probability Distributions)
Long Beach City College
 
Practice Test 2 Solutions
Practice Test 2  SolutionsPractice Test 2  Solutions
Practice Test 2 Solutions
Long Beach City College
 
Practice Test 2 Probability
Practice Test 2 ProbabilityPractice Test 2 Probability
Practice Test 2 Probability
Long Beach City College
 
Practice Test 1 solutions
Practice Test 1 solutions  Practice Test 1 solutions
Practice Test 1 solutions
Long Beach City College
 
Practice Test 1
Practice Test 1Practice Test 1
Practice Test 1
Long Beach City College
 
Stat sample test ch 12 solution
Stat sample test ch 12 solutionStat sample test ch 12 solution
Stat sample test ch 12 solution
Long Beach City College
 
Stat sample test ch 12
Stat sample test ch 12Stat sample test ch 12
Stat sample test ch 12
Long Beach City College
 
Stat sample test ch 11
Stat sample test ch 11Stat sample test ch 11
Stat sample test ch 11
Long Beach City College
 
Stat sample test ch 10
Stat sample test ch 10Stat sample test ch 10
Stat sample test ch 10
Long Beach City College
 
Two-Way ANOVA
Two-Way ANOVATwo-Way ANOVA
One-Way ANOVA
One-Way ANOVAOne-Way ANOVA
Contingency Tables
Contingency TablesContingency Tables
Contingency Tables
Long Beach City College
 
Goodness of Fit Notation
Goodness of Fit NotationGoodness of Fit Notation
Goodness of Fit Notation
Long Beach City College
 

More from Long Beach City College (20)

Practice test ch 9 inferences from two samples
Practice test ch 9 inferences from two samplesPractice test ch 9 inferences from two samples
Practice test ch 9 inferences from two samples
 
Practice Test Ch 8 Hypothesis Testing
Practice Test Ch 8 Hypothesis TestingPractice Test Ch 8 Hypothesis Testing
Practice Test Ch 8 Hypothesis Testing
 
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anovaSolution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
Solution to the practice test ch 10 correlation reg ch 11 gof ch12 anova
 
Practice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anovaPractice test ch 10 correlation reg ch 11 gof ch12 anova
Practice test ch 10 correlation reg ch 11 gof ch12 anova
 
Practice test ch 8 hypothesis testing ch 9 two populations
Practice test ch 8 hypothesis testing ch 9 two populationsPractice test ch 8 hypothesis testing ch 9 two populations
Practice test ch 8 hypothesis testing ch 9 two populations
 
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populationsSolution to the practice test ch 8 hypothesis testing ch 9 two populations
Solution to the practice test ch 8 hypothesis testing ch 9 two populations
 
Solution to the Practice Test 3A, Chapter 6 Normal Probability Distribution
Solution to the Practice Test 3A, Chapter 6 Normal Probability DistributionSolution to the Practice Test 3A, Chapter 6 Normal Probability Distribution
Solution to the Practice Test 3A, Chapter 6 Normal Probability Distribution
 
Practice Test Chapter 6 (Normal Probability Distributions)
Practice Test Chapter 6 (Normal Probability Distributions)Practice Test Chapter 6 (Normal Probability Distributions)
Practice Test Chapter 6 (Normal Probability Distributions)
 
Practice Test 2 Solutions
Practice Test 2  SolutionsPractice Test 2  Solutions
Practice Test 2 Solutions
 
Practice Test 2 Probability
Practice Test 2 ProbabilityPractice Test 2 Probability
Practice Test 2 Probability
 
Practice Test 1 solutions
Practice Test 1 solutions  Practice Test 1 solutions
Practice Test 1 solutions
 
Practice Test 1
Practice Test 1Practice Test 1
Practice Test 1
 
Stat sample test ch 12 solution
Stat sample test ch 12 solutionStat sample test ch 12 solution
Stat sample test ch 12 solution
 
Stat sample test ch 12
Stat sample test ch 12Stat sample test ch 12
Stat sample test ch 12
 
Stat sample test ch 11
Stat sample test ch 11Stat sample test ch 11
Stat sample test ch 11
 
Stat sample test ch 10
Stat sample test ch 10Stat sample test ch 10
Stat sample test ch 10
 
Two-Way ANOVA
Two-Way ANOVATwo-Way ANOVA
Two-Way ANOVA
 
One-Way ANOVA
One-Way ANOVAOne-Way ANOVA
One-Way ANOVA
 
Contingency Tables
Contingency TablesContingency Tables
Contingency Tables
 
Goodness of Fit Notation
Goodness of Fit NotationGoodness of Fit Notation
Goodness of Fit Notation
 

Recently uploaded

A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
Jean Carlos Nunes Paixão
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
TechSoup
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
Academy of Science of South Africa
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
amberjdewit93
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
GeorgeMilliken2
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
heathfieldcps1
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
Priyankaranawat4
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
Colégio Santa Teresinha
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
Priyankaranawat4
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
RAHUL
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
WaniBasim
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
taiba qazi
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
Celine George
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
Nicholas Montgomery
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
IreneSebastianRueco1
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
Dr. Mulla Adam Ali
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
simonomuemu
 
S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
tarandeep35
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
Celine George
 

Recently uploaded (20)

A Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdfA Independência da América Espanhola LAPBOOK.pdf
A Independência da América Espanhola LAPBOOK.pdf
 
Walmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdfWalmart Business+ and Spark Good for Nonprofits.pdf
Walmart Business+ and Spark Good for Nonprofits.pdf
 
South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)South African Journal of Science: Writing with integrity workshop (2024)
South African Journal of Science: Writing with integrity workshop (2024)
 
Digital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental DesignDigital Artefact 1 - Tiny Home Environmental Design
Digital Artefact 1 - Tiny Home Environmental Design
 
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
What is Digital Literacy? A guest blog from Andy McLaughlin, University of Ab...
 
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptxChapter 4 - Islamic Financial Institutions in Malaysia.pptx
Chapter 4 - Islamic Financial Institutions in Malaysia.pptx
 
The basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptxThe basics of sentences session 6pptx.pptx
The basics of sentences session 6pptx.pptx
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
MARY JANE WILSON, A “BOA MÃE” .
MARY JANE WILSON, A “BOA MÃE”           .MARY JANE WILSON, A “BOA MÃE”           .
MARY JANE WILSON, A “BOA MÃE” .
 
clinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdfclinical examination of hip joint (1).pdf
clinical examination of hip joint (1).pdf
 
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPLAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UP
 
Liberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdfLiberal Approach to the Study of Indian Politics.pdf
Liberal Approach to the Study of Indian Politics.pdf
 
DRUGS AND ITS classification slide share
DRUGS AND ITS classification slide shareDRUGS AND ITS classification slide share
DRUGS AND ITS classification slide share
 
How to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP ModuleHow to Add Chatter in the odoo 17 ERP Module
How to Add Chatter in the odoo 17 ERP Module
 
writing about opinions about Australia the movie
writing about opinions about Australia the moviewriting about opinions about Australia the movie
writing about opinions about Australia the movie
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
Hindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdfHindi varnamala | hindi alphabet PPT.pdf
Hindi varnamala | hindi alphabet PPT.pdf
 
Smart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICTSmart-Money for SMC traders good time and ICT
Smart-Money for SMC traders good time and ICT
 
S1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptxS1-Introduction-Biopesticides in ICM.pptx
S1-Introduction-Biopesticides in ICM.pptx
 
How to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRMHow to Manage Your Lost Opportunities in Odoo 17 CRM
How to Manage Your Lost Opportunities in Odoo 17 CRM
 

Correlation

  • 2. Correlation and Regression: Correlation
    Objectives:
    • Draw a scatter plot for a set of ordered pairs.
    • Compute the correlation coefficient.
    • Test the hypothesis $H_0: \rho = 0$.
    • Compute the equation of the regression line and the coefficient of determination.
    • Compute the standard error of the estimate and a prediction interval.
  • 3. Recall: Inferences about Two Proportions
    For populations 1 and 2: $\hat{p}_1 = \dfrac{x_1}{n_1}$, $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{p}_2 = \dfrac{x_2}{n_2}$, $\hat{q}_2 = 1 - \hat{p}_2$.
    The pooled sample proportion combines the two sample proportions into one proportion: $\bar{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$, $\bar{q} = 1 - \bar{p}$.
    Test statistic: $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}}$, or equivalently $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\bar{p}\bar{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$.
    Confidence interval estimate of $p_1 - p_2$: $(\hat{p}_1 - \hat{p}_2) - E < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + E$, where $E = z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$.
    The P-value method and the critical value method are equivalent, but the confidence interval method is not equivalent to the P-value method or the critical value method.
    TI Calculator: Confidence Interval, 2 Proportions: 1. Stat 2. Tests 3. 2-PropZInt 4. Enter $n_1$, $n_2$, $x_1$, $x_2$ and the confidence level
    TI Calculator: 2-Proportion Z-Test: 1. Stat 2. Tests 3. 2-PropZTest 4. Enter Data or Stats: $n_1$, $n_2$, $x_1$, $x_2$ 5. Choose RTT, LTT, or 2TT
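The pooled test statistic and the unpooled confidence interval on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `two_proportion_z` and the counts passed to it are made up for the example, and the two-tailed P-value assumes the null hypothesis $p_1 = p_2$.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2, conf_level=0.95):
    """Pooled z test of H0: p1 = p2, plus an unpooled CI for p1 - p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled proportion, used only in the test statistic.
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    z = (p1_hat - p2_hat) / sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    # Two-tailed P-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Margin of error uses the separate (unpooled) sample proportions.
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    margin = z_crit * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return z, p_value, (diff - margin, diff + margin)


# Hypothetical counts, chosen only to exercise the formulas.
z, p_value, ci = two_proportion_z(x1=60, n1=200, x2=45, n2=200)
print(round(z, 3), round(p_value, 3), ci)
```

Because the test pools the proportions while the interval does not, the two methods can occasionally disagree, which is the non-equivalence the slide points out.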
  • 4. Recall: Two Means: Independent Samples
    Requirements: 1. The two samples are independent. 2. Both samples are simple random samples. 3. Either or both of these conditions are satisfied: the two sample sizes are both large (with $n_1 > 30$ and $n_2 > 30$), or both samples come from populations having normal distributions.
    $\sigma_1$ and $\sigma_2$ are known: use the z test for comparing two means from independent populations.
    Test statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$, or $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$ when $H_0$ states $\mu_1 = \mu_2$.
    Confidence interval: $(\bar{x}_1 - \bar{x}_2) \pm E$, where $E = z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.
    $\sigma_1$ and $\sigma_2$ are unknown: use the t test for comparing two means from independent populations.
    Unequal variances ($\sigma_1 \ne \sigma_2$): df = smaller of $n_1 - 1$ or $n_2 - 1$; test statistic $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$; $E = t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.
    Equal variances ($\sigma_1 = \sigma_2$): pool the sample variances, $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$; df = $n_1 - 1 + n_2 - 1$; test statistic $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$; $E = t_{\alpha/2}\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$.
    TI Calculator: 2-Sample Z-Test: 1. Stat 2. Tests 3. 2-SampZTest 4. Enter Data or Stats: $\sigma_1$, $\sigma_2$, $\bar{x}_1$, $n_1$, $\bar{x}_2$, $n_2$ 5. Choose RTT, LTT, or 2TT 6. Calculate
    TI Calculator: 2-Sample Z-Interval: 1. Stat 2. Tests 3. 2-SampZInt 4. Enter Data or Stats: $\sigma_1$, $\sigma_2$, $\bar{x}_1$, $n_1$, $\bar{x}_2$, $n_2$ 5. Enter the confidence level (C-Level) 6. Calculate
    TI Calculator: 2-Sample T-Test: 1. Stat 2. Tests 3. 2-SampTTest 4. Enter Data or Stats: $\bar{x}_1$, $s_1$, $n_1$, $\bar{x}_2$, $s_2$, $n_2$ 5. Choose RTT, LTT, or 2TT 6. Pooled: No / Yes 7. Calculate
    TI Calculator: 2-Sample T-Interval: 1. Stat 2. Tests 3. 2-SampTInt 4. Enter Data or Stats: $\bar{x}_1$, $s_1$, $n_1$, $\bar{x}_2$, $s_2$, $n_2$ 5. Enter the confidence level (C-Level) 6. Pooled: No / Yes 7. Calculate
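The two t-test variants on this slide differ only in the standard error and the degrees of freedom. The sketch below shows both side by side; the function name `two_sample_t` and the summary statistics are hypothetical, and the statistic assumes $H_0$: $\mu_1 = \mu_2$.

```python
from math import sqrt


def two_sample_t(mean1, s1, n1, mean2, s2, n2, pooled=False):
    """Return (t, df) for H0: mu1 = mu2 from summary statistics."""
    if pooled:
        # Equal variances assumed: pool the two sample variances.
        sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / ((n1 - 1) + (n2 - 1))
        se = sqrt(sp2 / n1 + sp2 / n2)
        df = n1 - 1 + n2 - 1
    else:
        # Unequal variances: conservative df = smaller of n1 - 1 and n2 - 1.
        se = sqrt(s1**2 / n1 + s2**2 / n2)
        df = min(n1 - 1, n2 - 1)
    return (mean1 - mean2) / se, df


# Hypothetical summary statistics, for illustration only.
t_unpooled, df_unpooled = two_sample_t(75.0, 8.0, 25, 70.0, 10.0, 30)
t_pooled, df_pooled = two_sample_t(75.0, 8.0, 25, 70.0, 10.0, 30, pooled=True)
print(round(t_unpooled, 3), df_unpooled)
print(round(t_pooled, 3), df_pooled)
```

The `pooled` flag plays the same role as the "Pooled: No / Yes" prompt on the calculator.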
  • 5. Recall: Two Means: Two Dependent Samples (Matched Pairs)
    Key Concept: Testing hypotheses and constructing confidence intervals involving the mean of the differences of the values from two populations that are dependent in the sense that the data consist of matched pairs. The pairs must be matched according to some relationship, such as before/after measurements from the same subjects.
    Good Experimental Design: When designing an experiment or planning an observational study, using dependent samples with matched pairs is generally better than using two independent samples.
    1. Hypothesis Test: Use the differences from two dependent samples (matched pairs) to test a claim about the mean of the population of all such differences.
    2. Confidence Interval: Use the differences from two dependent samples (matched pairs) to construct a confidence interval estimate of the mean of the population of all such differences.
    • $d$ = individual difference between the two values in a single matched pair
    • $\mu_d$ = mean value of the differences $d$ for the population of all matched pairs of data
    • $\bar{d}$ = mean value of the differences $d$ for the paired sample data
    • $s_d$ = standard deviation of the differences $d$ for the paired sample data
    • $n$ = number of pairs of sample data
    When the values are dependent, do a t test on the differences. Denote the differences with the symbol $d$ or $D$, the mean of the population differences with $\mu_d$ or $\mu_D$, and the sample standard deviation of the differences with $s_d$ or $s_D$.
    Use either $d$ or $D$: $\bar{d} = \dfrac{\sum d}{n}$ (or $\bar{D} = \dfrac{\sum D}{n}$), with d.f. = $n - 1$.
    Test statistic: $t = \dfrac{\bar{d} - \mu_d}{s_d / \sqrt{n}}$
    Confidence interval: $\bar{d} \pm E$ (or $\bar{D} \pm E$), where $E = t_{\alpha/2}\dfrac{s_d}{\sqrt{n}}$
    Standard deviation of the differences: $s_d = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{n\sum d^2 - (\sum d)^2}{n(n - 1)}}$
    TI Calculator: How to enter data: 1. Stat 2. Edit 3. ClrList L1 & L2 4. Type in your data in L1 & L2 5. Compute L1 − L2 6. Store in L3 7. Enter
    TI Calculator: Mean, SD, 5-number summary: 1. Stat 2. Calc 3. Select 1 for 1-variable statistics 4. Type: L3 (2nd, 3) 5. Calculate
    TI Calculator: T-Interval: 1. Tests 2. T-Interval 3. Data 4. Enter List: L3, Freq: 1, and the confidence level (C-Level) 5. Calculate
    TI Calculator: Matched pair T-Test: 1. Tests 2. T-Test 3. Data 4. Enter $\mu_0 = 0$, List: L3, Freq: 1 5. Choose RTT, LTT, or 2TT 6. Calculate
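The matched-pairs procedure reduces to a one-sample t test on the differences, which mirrors the calculator steps (L1 − L2 stored in L3). The following sketch uses made-up before/after values; the critical value 2.776 is the two-tailed 95% t value for 4 degrees of freedom, read from a standard t table rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements on the same five subjects.
before = [12, 15, 11, 14, 13]
after = [10, 14, 12, 11, 12]

d = [b - a for b, a in zip(before, after)]  # individual differences
n = len(d)
d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation (divides by n - 1)

# Shortcut formula for s_d; it should agree with stdev().
s_d_shortcut = sqrt((n * sum(v * v for v in d) - sum(d) ** 2) / (n * (n - 1)))

# Test statistic for H0: mu_d = 0, with n - 1 degrees of freedom.
t = (d_bar - 0) / (s_d / sqrt(n))
df = n - 1

# Margin of error for a 95% confidence interval (t table value for df = 4).
t_crit = 2.776
margin = t_crit * s_d / sqrt(n)
print(round(t, 3), df, (round(d_bar - margin, 3), round(d_bar + margin, 3)))
```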
  • 6. Recall: Two Variances or Standard Deviations
    • For the comparison of two variances or standard deviations, an F test is used.
    • The F test should not be confused with the chi-square test, which compares a single sample variance to a specific population variance.
    Characteristics: 1. The values of F cannot be negative, because variances are always positive or zero. 2. The distribution is positively skewed. 3. The mean value of F is approximately equal to 1. 4. The F distribution is a family of curves based on the degrees of freedom of the variance of the numerator and the degrees of freedom of the variance of the denominator.
    • The larger of the two variances is placed in the numerator regardless of the subscripts.
    • The F test has two terms for the degrees of freedom: that of the numerator, $n_1 - 1$, and that of the denominator, $n_2 - 1$, where $n_1$ is the sample size from which the larger variance was obtained.
    Test statistic: $F = \dfrac{s_1^2}{s_2^2}$
    TI Calculator: 2-Sample F-Test: 1. Stat 2. Tests 3. 2-SampFTest 4. Enter Data or Stats: $s_1$, $n_1$, $s_2$, $n_2$ 5. Choose RTT, LTT, or 2TT 6. Calculate
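The rule that the larger variance goes in the numerator, together with the matching degrees of freedom, can be written as a short helper. The name `f_statistic` and the sample values are assumptions made for this illustration.

```python
def f_statistic(s1, n1, s2, n2):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var1, var2 = s1**2, s2**2
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    # Swap roles so that F >= 1 regardless of the subscripts.
    return var2 / var1, n2 - 1, n1 - 1


# Hypothetical sample standard deviations and sizes.
F, df_num, df_den = f_statistic(s1=3.0, n1=16, s2=5.0, n2=21)
print(round(F, 4), df_num, df_den)
```

Here the second sample has the larger variance, so its size supplies the numerator degrees of freedom.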
  • 7. Correlation
    Key Concept: In addition to hypothesis testing and confidence intervals, inferential statistics determines whether a relationship exists between two or more quantitative variables. A correlation exists between two variables when the values of one variable are somehow associated with the values of the other variable. A linear correlation exists between two variables when there is a correlation and the plotted points look like a straight line.
    This section considers only linear relationships, meaning that when graphed in a scatterplot, the points approximate a straight-line pattern. It also covers methods of conducting a formal hypothesis test to decide whether there is a linear correlation between all population values for the two variables.
    The linear correlation coefficient r is a number that measures how well paired sample data fit a straight-line pattern when graphed (it measures the strength of the linear association between the two variables). Use the sample of paired data (sometimes called bivariate data) to find the value of r, and then use r to decide whether there is a linear correlation between the two variables.
    Regression is a statistical method used to describe the nature of the relationship between variables, that is, positive or negative, linear or nonlinear.
    Questions: 1. Are two or more variables related? 2. If so, what is the strength of the relationship? 3. What type of relationship exists? 4. What kind of predictions can be made from the relationship?
  • 8. Scatterplots & the Strength of the Linear Correlation: r
    A scatter plot is a graph of the ordered pairs (x, y), where x is the independent variable and y is the dependent variable.
    a. Distinct straight-line, or linear, pattern. We say that there is a positive linear correlation between x and y, since as the x values increase, the corresponding y values also increase.
    b. Distinct straight-line, or linear, pattern. We say that there is a negative linear correlation between x and y, since as the x values increase, the corresponding y values decrease.
    c. No distinct pattern, which suggests that there is no correlation between x and y.
    d. Distinct pattern suggesting a correlation between x and y, but the pattern is not that of a straight line.
  • 9. Linear Correlation Coefficient r
    1. Are two or more variables related? 2. If so, what is the strength of the relationship?
    To answer these two questions, statisticians use the correlation coefficient, a numerical measure that determines whether two or more variables are related and how strong the relationship between or among the variables is.
    • Linear Correlation Coefficient r: The linear correlation coefficient r measures the strength of the linear correlation between the paired quantitative x values and y values in a sample. It is used to determine whether there is a linear correlation between two variables.
    3. What type of relationship exists?
    There are two types of relationships: simple and multiple. In a simple relationship, there are two variables: an independent variable (predictor variable) and a dependent variable (response variable). In a multiple relationship, there are two or more independent variables that are used to predict one dependent variable.
    4. What kind of predictions can be made from the relationship?
    Predictions are made daily in all areas. Examples include weather forecasting, stock market analyses, sales predictions, crop predictions, gasoline price predictions, and sports predictions. Some predictions are more accurate than others because of the strength of the relationship: the stronger the relationship between the variables, the more accurate the prediction.
  • 10. Example 1
    Construct a scatter plot for the data shown for car rental companies in the United States for a recent year.
    Step 1: Draw and label the x and y axes. Step 2: Plot each point on the graph.
    TI Calculator: How to enter data: 1. Stat 2. Edit 3. ClrList L1 & L2 4. Type in your data in L1 & L2
    TI Calculator: Scatter Plot: 1. Press Y= and clear any equations 2. Press 2nd, Y= (STAT PLOT), Enter 3. Select On, Enter 4. Select Xlist: L1 5. Select Ylist: L2 6. Mark: select a character 7. Press Zoom, then 9 (ZoomStat)
  • 11. Calculating and Interpreting the Linear Correlation Coefficient, Denoted by r
    The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables. There are several types of correlation coefficients. The one explained in this section is called the Pearson product moment correlation coefficient (PPMC). The symbol for the sample correlation coefficient is r. The symbol for the population correlation coefficient is ρ.
    The range of the correlation coefficient is from −1 to +1. If there is a strong positive linear relationship between the variables, the value of r will be close to +1. If there is a strong negative linear relationship between the variables, the value of r will be close to −1.
    Notation:
    • n = number of pairs of sample data
    • ∑x = sum of all x values
    • ∑x² = sum of the squared x values (square each x value, then add)
    • (∑x)² = square of the sum (add the x values, then square the total). Avoid confusing ∑x² and (∑x)².
    • ∑xy = sum of the products: multiply each x value by its corresponding y value, then add all such products.
    • r = linear correlation coefficient for sample data
    • ρ (rho) = linear correlation coefficient for a population of paired data
  • 12. Calculating and Interpreting the Linear Correlation Coefficient, Denoted by r
    Given any collection of sample paired quantitative data, the linear correlation coefficient r can always be computed, but the following requirements should be satisfied when using the sample paired data to make a conclusion about linear correlation in the corresponding population of paired data.
    1. The sample of paired (x, y) data is a simple random sample of quantitative data.
    2. Visual examination of the scatterplot must confirm that the points approximate a straight-line pattern.
    3. Because results can be strongly affected by the presence of outliers, any outliers must be removed if they are known to be errors. The effects of any other outliers should be considered by calculating r with and without the outliers included.
    Note: Requirements 2 and 3 are simplified attempts at checking that the pairs of (x, y) data have a bivariate normal distribution.
    Formula 1: $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
    Formula 2: $r = \dfrac{\sum (z_x z_y)}{n - 1}$, where $z_x$ denotes the z score for an individual sample value x and $z_y$ denotes the z score for the corresponding sample value y.
  • 13. Example 2: Finding r Using Technology
    The table lists five paired data values. Use technology to find the value of the correlation coefficient r for the data.
    Chocolate   5   6   4   4   5
    Nobel       6   9   3   2   11
    Solution: The value of r will be automatically calculated with software or a calculator: r = 0.795
  • 14. Example 3a: Finding r Using Formula 1
    Use the formula $r = \dfrac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$ to find the value of the linear correlation coefficient r for the five pairs of chocolate/Nobel data listed in the table.
    x (Chocolate)   y (Nobel)   x²         y²         xy
    5               6           25         36         30
    6               9           36         81         54
    4               3           16         9          12
    4               2           16         4          8
    5               11          25         121        55
    ∑x = 24         ∑y = 31     ∑x² = 118  ∑y² = 251  ∑xy = 159
    $r = \dfrac{5(159) - (24)(31)}{\sqrt{5(118) - 24^2}\,\sqrt{5(251) - 31^2}} = \dfrac{51}{\sqrt{14(294)}} \approx 0.795$
    TI Calculator: How to enter data: 1. Stat 2. Edit 3. ClrList L1 & L2 4. Type in your data in L1 & L2
    TI Calculator: Linear Regression Test: 1. Stat 2. Tests 3. LinRegTTest 4. Enter L1 & L2 5. Freq = 1 6. Choose ≠ 7. Calculate
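The column totals and the final arithmetic in this example can be checked with a short script. It is a sketch that follows the sums formula step by step; the variable names are chosen for this illustration.

```python
from math import sqrt

x = [5, 6, 4, 4, 5]   # chocolate
y = [6, 9, 3, 2, 11]  # Nobel

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(v * v for v in x)             # sum of squared x values
sum_y2 = sum(v * v for v in y)             # sum of squared y values
sum_xy = sum(a * b for a, b in zip(x, y))  # sum of the products

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt(n * sum_x2 - sum_x**2) * sqrt(n * sum_y2 - sum_y**2)
r = numerator / denominator

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 24 31 118 251 159
print(numerator, round(r, 3))                # 51 0.795
```

Note the difference between `sum_x2` (118) and `sum_x**2` (576), which is the ∑x² versus (∑x)² distinction the notation slide warns about.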
  • 15. Example 3b: Finding r Using the z Score Formula
    Use the formula $r = \frac{\sum (z_x z_y)}{n - 1}$ to find the value of the linear correlation coefficient r for the five pairs of chocolate/Nobel data listed in the table. This formula has the advantage of making it easier to understand how r works. The variable x is used for the chocolate values, and the variable y is used for the Nobel values. Each sample value is replaced by its corresponding z score.
    Solution: The z scores for the chocolate values (third column) and for the Nobel values (fourth column) are listed below. The last column lists the products zx · zy.
    x (Chocolate)   y (Nobel)   zx           zy           zx · zy
    5               6           0.239046     −0.052164    −0.012470
    6               9           1.434274     0.730297     1.047446
    4               3           −0.956183    −0.834625    0.798054
    4               2           −0.956183    −1.095445    1.047446
    5               11          0.239046     1.251937     0.299270
                                                          ∑(zx · zy) = 3.179746
    $r = \frac{\sum (z_x z_y)}{n - 1} = \frac{3.179746}{5 - 1} = 0.795$
    For example, for the chocolate values: $\bar{x} = \frac{\sum x}{n} = 4.8$, $s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} = 0.836660$, so for x = 5: $z_x = \frac{x - \bar{x}}{s_x} = \frac{5 - 4.8}{0.83666} = 0.23905$.
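The z score route of Example 3b can be sketched the same way. The helper names `z_scores` and `correlation_from_z` are assumptions made for this illustration. Each value is converted to a z score using the sample mean and the sample standard deviation (divisor n − 1), and the products are then summed and divided by n − 1.

```python
import math

chocolate = [5, 6, 4, 4, 5]   # x values
nobel = [6, 9, 3, 2, 11]      # y values


def z_scores(values):
    """Convert each value to a z score using the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / s for v in values]


def correlation_from_z(x, y):
    """Return r as the sum of z score products divided by n - 1."""
    zx = z_scores(x)
    zy = z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


print(round(z_scores(chocolate)[0], 5))               # 0.23905
print(round(correlation_from_z(chocolate, nobel), 3)) # 0.795
```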
  • 16. Example 4: Finding r Using the Sums Formula (Skip)
    Find the correlation coefficient for the given data.
    Company   Cars x (in 10,000s)   Income y (in billions)   xy        x²         y²
    A         63.0                  7.0                      441.00    3969.00    49.00
    B         29.0                  3.9                      113.10    841.00     15.21
    C         20.8                  2.1                      43.68     432.64     4.41
    D         19.1                  2.8                      53.48     364.81     7.84
    E         13.4                  1.4                      18.76     179.56     1.96
    F         8.5                   1.5                      12.75     72.25      2.25
    Totals    Σx = 153.8            Σy = 18.7                Σxy = 682.77   Σx² = 5859.26   Σy² = 80.67
    With n = 6: $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} = \frac{6(682.77) - (153.8)(18.7)}{\sqrt{6(5859.26) - 153.8^2}\,\sqrt{6(80.67) - 18.7^2}} = 0.982$ (strong positive relationship)
    TI Calculator: How to enter data: 1. Stat 2. Edit 3. ClrList L1 & L2 4. Type in your data in L1 & L2
    TI Calculator: Linear Regression t-test: 1. Stat 2. Tests 3. LinRegTTest 4. Enter L1 & L2 5. Freq = 1 6. Choose ≠ 7. Calculate
  • 17. Calculating and Interpreting the Linear Correlation Coefficient r
    Null Hypothesis: H0: ρ = 0 (No correlation). Alternative Hypothesis: H1: ρ ≠ 0 (Correlation).
    Using the P-value from technology to interpret r:
    P-value ≤ α: Reject H0 → There is sufficient evidence to support the claim of a linear correlation.
    P-value > α: Fail to reject H0 → There is not sufficient evidence to support the claim of a linear correlation.
    Using the Pearson correlation coefficient table to interpret r: Consider critical values from the table or technology as being both positive and negative:
    • Correlation: If |r| ≥ critical value → There is sufficient evidence to support the claim of a linear correlation.
    • No correlation: If |r| < critical value → There is not sufficient evidence to support the claim of a linear correlation.
    Properties of the Linear Correlation Coefficient r
    1. −1 ≤ r ≤ 1.
    2. If all values of either variable are converted to a different scale, the value of r does not change.
    3. The value of r is not affected by the choice of x or y. Interchange all x values and y values, and the value of r will not change.
    4. r measures the strength of a linear relationship. It is not designed to measure the strength of a relationship that is not linear.
    5. r is very sensitive to outliers in the sense that a single outlier could dramatically affect its value.
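The scale, symmetry, and outlier properties listed above can be checked numerically. The sketch below reuses the five chocolate/Nobel pairs. The helper `pearson_r`, the rescaling factors, and the extra point (20, 0) are hypothetical choices made only for illustration and do not come from the slides.

```python
import math


def pearson_r(x, y):
    """Sums formula for r (illustrative helper)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    syy = sum(v * v for v in y)
    sxy = sum(a * b for a, b in zip(x, y))
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2))


chocolate = [5, 6, 4, 4, 5]
nobel = [6, 9, 3, 2, 11]
r = pearson_r(chocolate, nobel)

# Converting either variable to a different scale (hypothetical factors)
# leaves r unchanged.
r_rescaled = pearson_r([1000 * v for v in chocolate],
                       [0.5 * v + 3 for v in nobel])

# Interchanging x and y leaves r unchanged.
r_swapped = pearson_r(nobel, chocolate)

# Sensitivity to outliers: one made-up extra pair (20, 0) turns the
# positive r into a negative one.
r_outlier = pearson_r(chocolate + [20], nobel + [0])

print(-1 <= r <= 1)
print(math.isclose(r, r_rescaled), math.isclose(r, r_swapped))
print(round(r, 3), round(r_outlier, 3))
```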
  • 18. Correlation
    A correlation exists between two variables when the values of one variable are somehow associated with the values of the other variable. A linear correlation exists between two variables when there is a correlation and the plotted points look like a straight line.
    The linear correlation coefficient r is a number that measures how well paired sample data fit a straight-line pattern when graphed (it measures the strength of the linear association between paired data, called bivariate data). The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.
    Properties of the linear correlation coefficient r: −1 ≤ r ≤ 1.
    Formulas: $r = \frac{\sum (z_x z_y)}{n - 1}$, or $r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}}$, where $z_x$ denotes the z score for an individual sample value x and $z_y$ is the z score for the corresponding sample value y.
    Step 1: H0: ρ = 0, H1: ρ ≠ 0, claim & tails
    Step 2: TS: $t = r\sqrt{\frac{n-2}{1-r^2}}$ with $df = n - 2$, or TS: r
    Step 3: CV using α, from the t-table or the r-table
    Step 4: Make the decision to a. Reject or not reject H0 b. The claim is true or false c. Restate this decision: There is / is not sufficient evidence to support the claim that…
    There is a linear correlation if |r| ≥ critical value. There is no correlation if |r| < critical value.
    TI Calculator: How to enter data: 1. Stat 2. Edit 3. ClrList L1 & L2 4. Type in your data in L1 & L2
    TI Calculator: Scatter Plot: 1. Press on Y & clear 2. 2nd y, Enter 3. On, Enter 4. Select X1-list: L1 5. Select Y1-list: L2 6. Mark: Select Character 7. Press Zoom & 9 to get ZoomStat
    TI Calculator: Linear Regression t-test: 1. Stat 2. Tests 3. LinRegTTest 4. Enter L1 & L2 5. Freq = 1 6. Choose ≠ 7. Calculate
  • 19. Example 5
    Test the significance of the given correlation coefficient using α = 0.05, n = 6 and r = 0.982.
    Step 1: H0: ρ = 0, H1: ρ ≠ 0 (claim), two-tailed test (2TT).
    Step 2: TS (t-distribution): $t = r\sqrt{\frac{n-2}{1-r^2}} = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}}$ with $df = n - 2$, so $t = 0.982\sqrt{\frac{6-2}{1-0.982^2}} = 10.3981$. 2nd method (Pearson correlation): TS: r = 0.982.
    Step 3: CV: α = 0.05, df = n − 2 = 6 − 2 = 4 → t = ±2.776. From the Pearson correlation coefficient table with n = 6 and α = 0.05: r = ±0.811.
    Step 4: Decision: Because t = 10.3981 > 2.776 (equivalently, r = 0.982 > 0.811): a. Reject H0. b. The claim is true. c. There is sufficient evidence to support the claim that there is a significant linear relationship between the two variables.
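The test statistic in Example 5 can be reproduced with a short sketch. The function name `t_statistic` is an assumption made here, and the critical value 2.776 is simply copied from the slide (t-table, df = 4, α = 0.05, two-tailed) rather than computed.

```python
import math


def t_statistic(r, n):
    """Test statistic t = r * sqrt((n - 2) / (1 - r^2)) with df = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


r, n = 0.982, 6
t = t_statistic(r, n)

critical_t = 2.776                 # value taken from the slide's t-table lookup
reject_h0 = abs(t) >= critical_t   # two-tailed decision rule

print(round(t, 3))   # 10.398
print(reject_h0)     # True
```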
  • 20. Example 6
    Given the value of r = 0.801 for 23 pairs of data regarding chocolate consumption and numbers of Nobel Laureates, and using a significance level of 0.05, is there sufficient evidence to support a claim that there is a linear correlation between chocolate consumption and numbers of Nobel Laureates?
    Step 1: H0: ρ = 0, H1: ρ ≠ 0 (claim), two-tailed test (2TT).
    Step 2: TS: $t = r\sqrt{\frac{n-2}{1-r^2}}$ with $df = n - 2$, so $t = 0.801\sqrt{\frac{23-2}{1-0.801^2}} = 6.1314$. 2nd method (Pearson correlation): TS: r = 0.801.
    Step 3: CV: α = 0.05, df = n − 2 = 23 − 2 = 21 → t = ±2.080. From the Pearson correlation coefficient table with n = 23 and α = 0.05, the critical value lies between the tabled values r = 0.396 and r = 0.444; technology gives r = ±0.413.
    Step 4: Decision: Because t = 6.1314 > 2.080 (equivalently, r = 0.801 > 0.413): a. Reject H0. b. The claim is true. c. There is sufficient evidence to support the conclusion that, for countries, there is a linear correlation between chocolate consumption and numbers of Nobel Laureates.
    Interpretation: Although we have found a linear correlation, it would be absurd to think that eating more chocolate would help win a Nobel Prize.
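Examples 5 and 6 quote both a critical t and a critical r. The two are linked: solving t = r·√(df / (1 − r²)) for r gives r = t / √(t² + df). The sketch below applies that rearrangement to the critical t values quoted on the slides. The function name `critical_r_from_t` is an assumption for illustration, and the critical t values are copied from the slides rather than computed.

```python
import math


def critical_r_from_t(critical_t, df):
    """Convert a critical t value into the equivalent critical r.

    Obtained by solving t = r * sqrt(df / (1 - r^2)) for r.
    """
    return critical_t / math.sqrt(critical_t ** 2 + df)


# Example 5: n = 6, df = 4, critical t = 2.776 (from the slide).
print(round(critical_r_from_t(2.776, 4), 3))    # 0.811

# Example 6: n = 23, df = 21, critical t = 2.080 (from the slide).
print(round(critical_r_from_t(2.080, 21), 3))   # 0.413
```

Both results agree with the critical r values on the slides, which is why the t method and the r-table method always lead to the same decision.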
  • 21. Interpreting r: Explained Variation
    The value of r² is the proportion of the variation in y that is explained by the linear relationship between x and y.
    Using the 23 pairs of chocolate/Nobel data, we get r = 0.801. What proportion of the variation in numbers of Nobel Laureates can be explained by the variation in the consumption of chocolate?
    Solution: With r = 0.801 we get r² = 0.642.
    Interpretation: We conclude that 0.642 (or about 64%) of the variation in numbers of Nobel Laureates can be explained by the linear relationship between chocolate consumption and numbers of Nobel Laureates. This implies that about 36% of the variation in numbers of Nobel Laureates cannot be explained by rates of chocolate consumption.
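The arithmetic behind the explained and unexplained proportions is small enough to sketch directly. The variable names are illustrative only.

```python
r = 0.801

explained = r ** 2           # proportion of variation in y explained by the linear relationship
unexplained = 1 - explained  # proportion left unexplained

print(round(explained, 3))    # 0.642
print(round(unexplained, 3))  # 0.358
```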
  • 22. Correlation: Possible Relationships Between Variables
    When the null hypothesis has been rejected for a specific α value, any of the following five possibilities can exist.
    1. There is a direct cause-and-effect relationship between the variables. That is, x causes y. Examples: water causes plants to grow, poison causes death, and heat causes ice to melt.
    2. There is a reverse cause-and-effect relationship between the variables. That is, y causes x. Suppose a researcher believes excessive coffee consumption causes nervousness, but the researcher fails to consider that the reverse situation may occur. That is, it may be that an extremely nervous person craves coffee to calm his or her nerves.
    3. The relationship between the variables may be caused by a third variable. If a statistician correlated the number of deaths due to drowning and the number of cans of soft drink consumed daily during the summer, he or she would probably find a significant relationship. However, the soft drink is not necessarily responsible for the deaths, since both variables may be related to heat and humidity.
    4. There may be a complexity of interrelationships among many variables. A researcher may find a significant relationship between students' high school grades and college grades. But there probably are many other variables involved, such as IQ, hours of study, influence of parents, motivation, age, and instructors.
    5. The relationship may be coincidental. A researcher may be able to find a significant relationship between the increase in the number of people who are exercising and the increase in the number of people who are committing crimes. But common sense dictates that any relationship between these two values must be due to coincidence.
  • 23. Interpreting r with Causation: Correlation does not imply causality!
    We noted previously that we should use common sense when interpreting results. Clearly, it would be absurd to think that eating more chocolate would help win a Nobel Prize.
    Common Errors Involving Correlation:
    1. Assuming that correlation implies causality
    2. Using data based on averages
    3. Ignoring the possibility of a nonlinear relationship
    Hypotheses: When conducting a formal hypothesis test to determine whether there is a significant linear correlation between two variables, use the following null and alternative hypotheses, in which ρ represents the linear correlation coefficient of the population:
    Null Hypothesis H0: ρ = 0 (No correlation)
    Alternative Hypothesis H1: ρ ≠ 0 (Correlation)