4. We can connect these marks to form a line of averages that is apparently
(close to being) a straight line.
5. Questions to answer
• How much dispersion of INCOME is there
among people with the same level of
EDUCATION?
• How much does the average level of INCOME
change among people with different levels of
EDUCATION?
6. Dispersion
• In scattergram A, there is almost no dispersion within each
vertical strip (and almost no dispersion around the line of
averages as a whole).
• In scattergram B, there is a lot of dispersion within each
vertical strip (and around the line of averages as a whole).
• In other words:
• People in society A can be more certain about the reward
they would gain by improving their education.
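The idea of vertical strips can be sketched in a few lines of Python. This is only an illustration: the (education, income) pairs below are invented, and `strip_summary` is a hypothetical helper name. For each education level (one vertical strip) it reports the average income, which is a point on the line of averages, and the standard deviation of income, which is the dispersion within that strip.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical (education in years, income) pairs, invented for illustration only.
people = [
    (10, 20), (10, 22), (10, 18),
    (12, 30), (12, 29), (12, 31),
    (16, 45), (16, 50), (16, 40),
]

def strip_summary(pairs):
    """Group incomes by education level (one vertical strip per level) and
    return {education: (average income, standard deviation of income)}."""
    strips = defaultdict(list)
    for education, income in pairs:
        strips[education].append(income)
    return {e: (mean(v), pstdev(v)) for e, v in sorted(strips.items())}

summary = strip_summary(people)
for education, (avg, sd) in summary.items():
    print(f"education={education}: average income={avg:.1f}, dispersion (SD)={sd:.2f}")
```

A small SD in every strip corresponds to scattergram A; large SDs correspond to scattergram B.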
7. • Correlation measures the strength and
direction of the linear relationship between
two variables. It assesses whether and to
what extent two variables are related but
does not imply causation. Correlation is
used to determine the degree of
association between variables (in a scattergram, the
dispersion around the line of averages).
9. Correlation Coefficient
• We can quantify the correlation between two variables using a measure called the
correlation coefficient, denoted by r.
• The correlation coefficient between X and Y is:
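• $r = \dfrac{\mathrm{cov}(X, Y)}{\sqrt{\mathrm{Var}(X) \times \mathrm{Var}(Y)}}$
• $\mathrm{cov}(X, Y) = \dfrac{1}{n} \sum_{i} (X_i - \bar{X})(Y_i - \bar{Y})$
• $\mathrm{Var}(X) = \dfrac{1}{n} \sum_{i} (X_i - \bar{X})^2$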
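As a minimal sketch, the definitions can be translated directly into Python using the 1/n (population) form of covariance and variance. The function names and the small data set are assumptions made for illustration, not part of the slides.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    """Population covariance: (1/n) * sum((X_i - mean_X) * (Y_i - mean_Y))."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def var(xs):
    """Population variance, i.e. the covariance of X with itself."""
    return cov(xs, xs)

def corr(xs, ys):
    """Correlation coefficient r = cov(X, Y) / sqrt(Var(X) * Var(Y))."""
    return cov(xs, ys) / sqrt(var(xs) * var(ys))

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = corr(x, y)
print(round(r, 4))  # 0.7746
```

Here cov(X, Y) = 1.2, Var(X) = 2 and Var(Y) = 1.2, so r = 1.2 / sqrt(2.4), which is about 0.77.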
10. Correlation Coefficient
• r is a number between -1 and 1
• When 𝑟 < 0 we have a negative correlation between X and Y
• When 𝑟 > 0 we have a positive correlation between X and Y
• When 𝑟 = 0 there is no linear correlation between X and Y
• When r is very close to 1 or -1, a straight line is a nearly perfect fit to
the relationship between X and Y, that is, there is a strong correlation between X
and Y (the fit is exactly a straight line only when r equals 1 or -1)
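These cases can be checked on tiny made-up data sets. The sketch below is illustrative only (the helper `corr` and the data are assumptions): it uses an exactly increasing line, an exactly decreasing line, and a curve, Y = X squared, where Y depends entirely on X and yet r is 0 because the relationship is not linear.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation coefficient; the 1/n factors cancel, so plain sums suffice."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]

r_up = corr(x, [2 * v + 1 for v in x])      # points on an increasing straight line
r_down = corr(x, [-3 * v + 4 for v in x])   # points on a decreasing straight line
r_curve = corr(x, [v * v for v in x])       # exact but non-linear relationship

print(r_up, r_down, r_curve)
```

The first two values are 1 and -1, and the third is 0, which is why r = 0 should be read as "no linear correlation" and not as "no relationship".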
13. Analysis ToolPak
1. Click the File tab, select Options, and then click the Add-Ins category.
2. In the Manage box, select Excel Add-ins, and then click Go.
3. In the Add-Ins dialog box, select the Analysis ToolPak check box, and then click OK.
14. Correlation by Excel
• Using the Analysis ToolPak: Correlation option
• Using the CORREL function
              Bowl Price   Bowls      Soda       Beer
Bowl Price     1
Bowls         -0.71186     1
Soda          -0.58095     0.831008   1
Beer          -0.19367     0.338691   0.246803   1
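For readers working outside Excel, a lower-triangular correlation matrix of this kind can be sketched in plain Python. The figures below are invented for illustration and are not the data behind the table, so the printed values will differ from it. `corr` and `correlation_matrix` are hypothetical helper names.

```python
from math import sqrt

def corr(xs, ys):
    """Correlation coefficient between two equally long lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical observations, invented for illustration; NOT the slide's data.
data = {
    "Bowl Price": [2.0, 2.5, 3.0, 3.5, 4.0],
    "Bowls": [100, 90, 85, 70, 60],
    "Soda": [80, 75, 70, 65, 50],
}

def correlation_matrix(columns):
    """Return the lower triangle, including the diagonal, as nested dicts."""
    names = list(columns)
    return {
        row: {col: corr(columns[row], columns[col]) for col in names[: i + 1]}
        for i, row in enumerate(names)
    }

matrix = correlation_matrix(data)
for row, cells in matrix.items():
    print(f"{row:<12}" + "".join(f"{value:>10.4f}" for value in cells.values()))
```

Each cell corresponds to one call of Excel's CORREL function on the two columns involved; the diagonal is always 1 because every variable is perfectly correlated with itself.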
15. Interpretation
1. Bowls and Soda: The value of 0.83 is relatively close to 1, which indicates a strong positive association between
the number of "bowls" and "soda sales". The positive sign of the correlation coefficient means that as the
number of "bowls" increases, "soda sales" tend to increase as well. The value 0.83 measures the strength of the
linear association; it is not a probability. The result is consistent with complementary goods being positively
correlated.
2. Bowl price and bowl sales: A correlation coefficient of -0.71 between "bowl price" and "bowl sales"
indicates a strong negative relationship: as the price of "bowls" rises, the quantity of "bowls" sold
tends to decline, and conversely, as the price decreases, sales tend to increase. The conclusion is that price
and sales are negatively correlated here.