# Chapter 08 Correlation

### Transcript

• 1. Chapter 8 Simple Linear Correlation
• 2.
• Up to now, the statistical methods you have learned concern a single variable only
• Such as
• To estimate the average height of high school students
• To compare the average height of high school students between the city and the countryside
• However, in practice we are often interested in the relationship between two variables:
• Example: For high school students,
• Height and Age – linear relation?
• Height and Weight – linear relation?
• 3.
• In this chapter, we are going to study:
• two variables,
• linear relationship between two variables
• Two types of questions:
• Is there a linear relationship?
• -- Linear correlation
• How can one variable be predicted from another variable?
• -- Linear regression
• 4.
• Example 7.1 To explore the correlation between systolic pressure and diastolic pressure (mmHg), 665 girls aged 6 to 10 years were measured.
• Two random variables X and Y
• Sample: 665 girls
• The individuals in the sample should be independent of each other.
• 5.
• Example 7.2 To explore the correlation between the heights of fathers and sons, 20 male graduates were randomly selected from the list of graduates of a high school. The heights (cm) of the fathers and sons were measured.
• Table 7.1 Heights (cm) of 20 pairs of father and son
• 6. 7.1 Linear correlation
• 7. Scatter diagram: Fig. 7.1 Scatter diagram of systolic and diastolic blood pressures (mmHg) of 665 girls aged 6 to 10 years
• 8.
• 9.
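• 10. 7.2 Correlation Coefficient
• 7.2.1 Population correlation coefficient
• Pearson’s product-moment linear correlation coefficient (the simple correlation coefficient):
• The mean of the product of the two standardized variables: $\rho = E\left[\dfrac{X-\mu_X}{\sigma_X}\cdot\dfrac{Y-\mu_Y}{\sigma_Y}\right] = \dfrac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}$
• where $\operatorname{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$ is the covariance between X and Y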
• 11. 7.2.2 Sample correlation coefficient
• Pearson’s product-moment sample correlation coefficient r
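The formula on this slide did not survive in the transcript. As a minimal sketch, assuming the usual definition of the sample coefficient (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations), r could be computed as below. The function name `pearson_r` and the height values are hypothetical illustrations; they are not the data of Table 7.1.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products of deviations from the means
    l_xx = sum((x - mean_x) ** 2 for x in xs)
    l_yy = sum((y - mean_y) ** 2 for y in ys)
    l_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if l_xx == 0 or l_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return l_xy / math.sqrt(l_xx * l_yy)


# Hypothetical heights (cm), NOT the data of Table 7.1
fathers = [165.0, 170.0, 172.0, 168.0, 175.0]
sons = [168.0, 172.0, 171.0, 170.0, 178.0]
r = pearson_r(fathers, sons)
print(f"r = {r:.3f}")
```

The result always lies between -1 and +1, and its sign is the sign of the cross-product sum, which is what gives the direction of the correlation.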
• 12. A measure of linear relationship:
• 1) Is there a correlation?
• If the correlation coefficient is 0 or not large enough -- no correlation
• 2) If the correlation coefficient is large enough:
• The direction of correlation? -- positive or negative
• The strength of correlation? -- Is the absolute value large enough?
• Complete correlation: +1 or -1
• 13. Example 7.3 Calculate the correlation coefficient between the heights of father and son.
• 14. 7.3 Inference on Correlation Coefficient 7.3.1 Hypothesis test
• r is the sample correlation coefficient; it changes from sample to sample
• There is a population correlation coefficient, denoted by ρ
• Question: Is ρ = 0 or not?
• Assumption:
• X and Y follow a bivariate normal distribution
• 15.
• $H_0: \rho = 0$, $H_1: \rho \neq 0$, $\alpha = 0.05$
• (1) Checking a special table (Table 8 in Appendix 2)
• $H_0$ is rejected
• ---- positive correlation between the heights of father and son.
• Question:
• Since the P value is very small, can we say that the correlation is very strong?
• Does a small P value mean that the correlation is strong?
• 16. Question: If r = 0.90, can you claim that the two variables are correlated with each other?
• Table 8 Critical values for r
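• 17. (2) t test (assuming a normal distribution)
• $H_0: \rho = 0$, $H_1: \rho \neq 0$
• Test statistic: $t = \dfrac{r}{\sqrt{(1-r^2)/(n-2)}}$, with $\nu = n-2$ degrees of freedom
• If the P value < α, then reject $H_0$ and conclude that the population correlation coefficient is significantly different from 0.
• For the 20 pairs of father and son: $\nu = 20-2 = 18$
• The population correlation coefficient might not be 0.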
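The t test for ρ = 0 can be sketched as follows, assuming the standard statistic t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The value r = 0.5 is hypothetical (the value computed in the example is not preserved in this transcript), the function name is an illustration, and the critical values are ordinary two-sided 5% points taken from a t table.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("need at least 3 pairs")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


# Hypothetical r for a sample of 20 pairs (df = 20 - 2 = 18)
t_stat, df = correlation_t_statistic(0.5, 20)
T_CRIT_18 = 2.101  # two-sided 5% critical value of t with 18 df (t table)
reject_h0 = abs(t_stat) > T_CRIT_18
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")

# A weak correlation in a large sample can also be "significant":
# hypothetical r = 0.1 with n = 665 (critical value is about 1.96 for 663 df)
t_small_r, df_large = correlation_t_statistic(0.1, 665)
print(f"t = {t_small_r:.3f}, df = {df_large}")
```

The second call bears on the question raised earlier: the P value reflects the sample size as well as r, so a small P value says that ρ is unlikely to be 0, not that the correlation is strong.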
• 18. 7.3.2 Interval estimation
• Assumption: X and Y follow a bivariate normal distribution.
• Background:
• (1) Hyperbolic tangent and its inverse
• Hyperbolic tangent: $\tanh z = \dfrac{e^{2z}-1}{e^{2z}+1}$
• Inverse hyperbolic tangent: $\tanh^{-1} r = \dfrac{1}{2}\ln\dfrac{1+r}{1-r}$
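• 19.
• Taking the transformation $z = \tanh^{-1} r$,
• $z$ approximately follows a normal distribution with mean $\tanh^{-1}\rho$ and variance $1/(n-3)$
• Confidence interval of $\tanh^{-1}\rho$: $z \pm z_{\alpha/2}/\sqrt{n-3}$
• or $\left(z - z_{\alpha/2}/\sqrt{n-3},\; z + z_{\alpha/2}/\sqrt{n-3}\right)$
• Applying $\tanh$ to both limits gives the confidence interval of $\rho$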
• 20. Example 7.5 After obtaining the sample correlation coefficient r, find a 95% confidence interval for the population correlation coefficient ρ.
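The interval estimate based on the inverse hyperbolic tangent can be sketched as follows. It assumes the usual approximation that the transformed value atanh(r) is roughly normal with standard error 1/sqrt(n − 3). The function name `fisher_ci` and the inputs r = 0.5, n = 20 are hypothetical; the value of r used in Example 7.5 is not preserved in this transcript.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for rho via the atanh transformation."""
    if n <= 3:
        raise ValueError("need more than 3 pairs")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    # Build the interval on the z scale, then transform back with tanh
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical inputs: r = 0.5 from n = 20 pairs
lower, upper = fisher_ci(0.5, 20)
print(f"95% CI for rho: ({lower:.3f}, {upper:.3f})")
```

Because the interval is built on the transformed scale and mapped back, it is not symmetric about r and always stays inside (-1, 1).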
• 21. 7.4 Rank Correlation 7.4.1 Spearman’s rank correlation coefficient
• It is useful for:
• ranked data
• as well as measurement data that
• ---- do not follow a normal distribution;
• or whose distribution is uncertain;
• or are not precisely measured
• or when X or Y is an ordinal variable
• 22. Spearman’s rank correlation coefficient
• n pairs of observations, $(x_1, y_1), \ldots, (x_n, y_n)$
• Sort $(x_1, x_2, \ldots, x_n)$ to get the rank $p_i$ of $x_i$
• Sort $(y_1, y_2, \ldots, y_n)$ to get the rank $q_i$ of $y_i$
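The ranking step and the coefficient can be sketched as follows. This assumes the common definition of Spearman's coefficient as Pearson's correlation computed on the ranks, with tied values sharing the average of their ranks. The slide's own formula is not preserved in this transcript, so this may differ in form from it. The function names and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rs(xs, ys):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    p, q = average_ranks(xs), average_ranks(ys)
    n = len(p)
    mean_p, mean_q = sum(p) / n, sum(q) / n
    l_pp = sum((a - mean_p) ** 2 for a in p)
    l_qq = sum((b - mean_q) ** 2 for b in q)
    l_pq = sum((a - mean_p) * (b - mean_q) for a, b in zip(p, q))
    if l_pp == 0 or l_qq == 0:
        raise ValueError("r_s is undefined when one variable is constant")
    return l_pq / math.sqrt(l_pp * l_qq)


# Hypothetical paired observations (not the data of any example in the slides)
x = [0.7, 1.0, 1.7, 3.7, 4.0]
y = [21.5, 18.9, 14.4, 46.5, 27.3]
rs = spearman_rs(x, y)
print(f"r_s = {rs:.3f}")
```

When there are no ties, this agrees with the shortcut 1 − 6Σd²/(n(n² − 1)), where d is the difference between the two ranks of a pair.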
• 23. Example 7.6 An etiology study on liver cancer collected data on the liver-cancer-specific death rate and the relative content of aflatoxin in certain foods for 10 counties.
• The ranks are then substituted into the formula for Spearman’s rank correlation coefficient.
• 24. 7.4.2 Hypothesis test for the population rank correlation coefficient
• (1) Checking a special table (Table 9, Critical values for Spearman’s rank correlation coefficient)
• P < 0.02, so the correlation is significant
• 25.
• (2) t test
• Same as the t test for Pearson’s correlation coefficient
• If P is small, then reject $H_0$
• 26.