Chapter 08correlation
Published in: Business
Transcript

  • 1. Chapter 8 Simple Linear Correlation
  • 2. Up to now, the statistical methods you have learned concern a single variable only, such as:
      • estimating the average height of high school students;
      • comparing the average height of high school students between the city and the countryside.
    In practice, however, the relationship between two variables is often of interest. Example: for high school students,
      • height and age: is the relation linear?
      • height and weight: is the relation linear?
  • 3. In this chapter we study two variables and the linear relationship between them. Two types of questions:
      • Is there a linear relationship? (linear correlation)
      • How can one variable be predicted from another? (linear regression)
  • 4. Example 7.1 To explore the correlation between systolic pressure and diastolic pressure (mmHg), 665 girls aged 6 to 10 years were measured.
      • Two random variables: X and Y
      • Sample: 665 girls
      • The individuals in the sample should be independent of each other.
  • 5. Example 7.2 To explore the correlation between the heights of fathers and sons, 20 male graduates were randomly selected from the list of graduates of a high school, and the heights (cm) of the fathers and sons were measured.
    Table 7.1 Heights (cm) of 20 pairs of fathers and sons
  • 6. 7.1 Linear correlation
  • 7. Scatter diagram: Fig. 7.1 Scatter diagram of systolic and diastolic blood pressures (mmHg) of 665 girls aged 6 to 10 years
  • 8.  
  • 9.  
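  • 10. 7.2 Correlation Coefficient
    7.2.1 Population correlation coefficient
      • Pearson's product-moment linear correlation coefficient (the simple correlation coefficient) is the mean of the product of the two standardized variables:
        $$\rho = E\left[\left(\frac{X-\mu_X}{\sigma_X}\right)\left(\frac{Y-\mu_Y}{\sigma_Y}\right)\right] = \frac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}$$
      • $\operatorname{Cov}(X,Y) = E\left[(X-\mu_X)(Y-\mu_Y)\right]$ is the covariance between X and Y.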
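  • 11. 7.2.2 Sample correlation coefficient
    Pearson's product-moment sample correlation coefficient r:
    $$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}}$$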
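The sample coefficient can be sketched in Python as the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. This is only an illustration: the function names and the heights below are hypothetical and are not the data of Table 7.1. The second function checks the "mean of the product of the two standardized variables" reading, assuming standard deviations with divisor n.

```python
import math
from statistics import fmean, pstdev


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two observations")
    mean_x, mean_y = fmean(x), fmean(y)
    # Sum of cross-products and sums of squares of the deviations from the means.
    l_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    l_xx = sum((a - mean_x) ** 2 for a in x)
    l_yy = sum((b - mean_y) ** 2 for b in y)
    if l_xx == 0 or l_yy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return l_xy / math.sqrt(l_xx * l_yy)


def mean_standardized_product(x, y):
    """Mean of the products of standardized values (divisor-n standard deviations)."""
    mean_x, mean_y = fmean(x), fmean(y)
    sd_x, sd_y = pstdev(x), pstdev(y)
    products = [((a - mean_x) / sd_x) * ((b - mean_y) / sd_y) for a, b in zip(x, y)]
    return fmean(products)


# Hypothetical heights (cm), made up for illustration; NOT the data of Table 7.1.
fathers = [165.0, 170.0, 172.0, 168.0, 175.0, 180.0]
sons = [168.0, 171.0, 175.0, 170.0, 174.0, 182.0]

r = pearson_r(fathers, sons)
r_check = mean_standardized_product(fathers, sons)  # agrees with r up to rounding
```

Both routes give the same number because the divisors n cancel between the covariance and the two standard deviations.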
  • 12. A measure of linear relationship:
      • 1) Is there a correlation? If the correlation coefficient is 0 or not large enough: no correlation.
      • 2) If the correlation coefficient is large enough:
          • Direction of the correlation: positive or negative?
          • Strength of the correlation: high or not? Is the absolute value large enough?
      • Complete correlation: r = +1 or −1.
  • 13. Example 7.3 Calculate the correlation coefficient between the heights of father and son.
  • 14. 7.3 Inference on Correlation Coefficient
    7.3.1 Hypothesis test
      • r is the sample correlation coefficient; it changes from sample to sample.
      • There is a population correlation coefficient, denoted by ρ.
      • Question: is ρ = 0 or not?
      • Assumption: X and Y follow a bivariate normal distribution.
  • 15. H0: ρ = 0, H1: ρ ≠ 0, α = 0.05
      • (1) Checking a special table (Table 8 in Appendix 2): H0 is rejected, so there is a positive correlation between the heights of father and son.
      • Question: since the P value is very small, can we say that the correlation is very strong? Does a small P value mean that the correlation is strong?
  • 16. Question: If r = 0.90, can you claim that the two variables are correlated with each other?
    Table 8 Critical values for r
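Whether r = 0.90 is significant depends on the sample size. As a rough sketch, critical values of r can be obtained by inverting the usual statistic t = r·sqrt(n − 2)/sqrt(1 − r²): the smallest significant |r| is t_crit/sqrt(t_crit² + ν) with ν = n − 2. The function name is hypothetical, and the two critical t values are ordinary two-sided 5% entries from a t table, not values copied from Table 8.

```python
import math


def critical_r(t_crit, df):
    """Smallest |r| whose t statistic reaches t_crit, for df = n - 2."""
    if df < 1:
        raise ValueError("df = n - 2 must be at least 1")
    return t_crit / math.sqrt(t_crit ** 2 + df)


# Two-sided 5% critical values of t, taken from a standard t table.
T_CRIT = {1: 12.706, 18: 2.101}

r_crit_n3 = critical_r(T_CRIT[1], 1)     # n = 3 pairs of observations
r_crit_n20 = critical_r(T_CRIT[18], 18)  # n = 20 pairs of observations

# The same r = 0.90 is judged differently for the two sample sizes.
significant_n3 = 0.90 > r_crit_n3
significant_n20 = 0.90 > r_crit_n20
```

With only three pairs, even r = 0.90 does not reach the critical value, whereas with 20 pairs a much smaller r would already be significant. This is also why a small P value by itself does not show that a correlation is strong.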
  • 17. (2) t test (assuming a normal distribution)
    H0: ρ = 0, H1: ρ ≠ 0
    $$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad \nu = n-2$$
      • If the P value < α, then reject H0 and conclude that the population correlation coefficient is significantly different from 0.
      • For the heights of father and son, ν = 20 − 2 = 18; the population correlation coefficient might not be 0.
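A minimal sketch of this t test, assuming the usual statistic t = r·sqrt(n − 2)/sqrt(1 − r²) with ν = n − 2 degrees of freedom. The value r = 0.5 is hypothetical and is not the result of Example 7.3. The critical value 2.101 is the two-sided 5% point of t with 18 degrees of freedom from a standard t table.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic and degrees of freedom for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("at least three pairs are needed (df = n - 2 >= 1)")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return t, df


T_CRIT_DF18 = 2.101  # two-sided 5% critical value of t with 18 degrees of freedom

# Hypothetical sample correlation coefficient for n = 20 pairs.
t, df = correlation_t_statistic(0.5, 20)
reject_h0 = abs(t) > T_CRIT_DF18
```

Comparing |t| with a tabulated critical value avoids needing the t distribution function. A statistics package would normally report the P value directly.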
  • 18. 7.3.2 Interval estimation
      • Assumption: X and Y follow a bivariate normal distribution.
      • Pre-knowledge: (1) the hyperbolic tangent and its inverse
          • Hyperbolic tangent: $\tanh z = \dfrac{e^{z}-e^{-z}}{e^{z}+e^{-z}}$
          • Inverse hyperbolic tangent: $\tanh^{-1} r = \dfrac{1}{2}\ln\dfrac{1+r}{1-r}$
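  • 19. Taking a transformation of r: $z = \tanh^{-1} r$
      • z approximately follows a normal distribution with mean $\tanh^{-1}\rho$ and variance $1/(n-3)$.
      • Confidence interval of $\tanh^{-1}\rho$: $z \pm u_{\alpha/2}/\sqrt{n-3}$, where $u_{\alpha/2}$ is the two-sided critical value of the standard normal distribution (1.96 for 95%),
      • or, transformed back, the confidence interval of ρ: $\left(\tanh\left(z - u_{\alpha/2}/\sqrt{n-3}\right),\ \tanh\left(z + u_{\alpha/2}/\sqrt{n-3}\right)\right)$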
  • 20. Example 7.5 After obtaining the sample correlation coefficient r, find a 95% confidence interval for the population correlation coefficient ρ.
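An interval of this kind can be sketched with the inverse hyperbolic tangent (Fisher's z transformation). The sketch assumes that z = atanh(r) is approximately normal with standard error 1/sqrt(n − 3). The function name and the inputs r = 0.5, n = 20 are hypothetical and are not the figures of Example 7.5.

```python
import math
from statistics import NormalDist


def correlation_confidence_interval(r, n, confidence=0.95):
    """Approximate confidence interval for rho via the inverse hyperbolic tangent."""
    if n < 4:
        raise ValueError("n must be at least 4 so that n - 3 is positive")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                      # transform r to the z scale
    u = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)  # about 1.96 for 95%
    half_width = u / math.sqrt(n - 3)      # normal quantile times the standard error
    # Transform the limits back to the correlation scale.
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical inputs, for illustration only.
lower, upper = correlation_confidence_interval(0.5, 20)
```

The interval is symmetric on the z scale but not on the r scale, so the limits are not equidistant from r.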
  • 21. 7.4 Rank Correlation
    7.4.1 Spearman's rank correlation coefficient
    It is useful for:
      • ranked data;
      • measurement data that do not follow a normal distribution, whose distribution is uncertain, or that are not precisely measured;
      • cases where X or Y is an ordinal variable.
  • 22. Spearman's rank correlation coefficient
      • n pairs of observations, (x_1, y_1), …, (x_n, y_n)
      • Sort (x_1, x_2, …, x_n) and get rank p_i for x_i
      • Sort (y_1, y_2, …, y_n) and get rank q_i for y_i
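The ranking steps can be sketched in Python as follows. The sketch assumes that Spearman's coefficient is computed as Pearson's coefficient applied to the ranks p_i and q_i, and that tied values receive the average of their ranks. The slides do not show which form of the formula they use, so treat this as one common choice. The function names and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks 1..n in ascending order; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's coefficient computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two observations")
    p, q = average_ranks(x), average_ranks(y)
    mean_rank = (len(p) + 1) / 2  # mean of the ranks, with or without ties
    l_pq = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(p, q))
    l_pp = sum((a - mean_rank) ** 2 for a in p)
    l_qq = sum((b - mean_rank) ** 2 for b in q)
    if l_pp == 0 or l_qq == 0:
        raise ValueError("r_s is undefined when all values of a variable are tied")
    return l_pq / math.sqrt(l_pp * l_qq)


# Hypothetical exposure and death-rate values; NOT the data of Example 7.6.
exposure = [0.5, 1.2, 0.9, 3.1, 2.4, 4.0]
death_rate = [12.0, 15.5, 14.0, 30.2, 31.0, 45.1]
r_s = spearman_rs(exposure, death_rate)
```

When there are no ties, this gives the same value as the shortcut formula 1 − 6Σd²/(n(n² − 1)), where d is the difference between the paired ranks.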
  • 23. Example 7.6 An etiological study of liver cancer collected data on the liver-cancer-specific death rate and the relative content of aflatoxin in certain foods for 10 counties. The ranks are substituted into the formula for Spearman's rank correlation coefficient.
  • 24. 7.4.2 Hypothesis test for the population rank correlation coefficient ρ_s
    (1) Checking a special table (Table 9, Critical values for Spearman's rank correlation coefficient): P < 0.02, so the correlation is significant.
  • 25. (2) t test: the same as the t test for Pearson's correlation coefficient. If the P value is small, then reject H0.
  • 26.  
