Chapter 08correlation
Transcript

  • 1. Chapter 8 Simple Linear Correlation
  • 2.
    • Up to now, the statistical methods you have learnt concern a single variable only
    • Such as
    • To estimate the average height among high school students
    • To compare the average height of high school students between the city and the countryside
    • However, in practice we are often interested in the relationship between two variables:
    • Example: For high school students,
    • Height and Age – linear relation?
    • Height and Weight – linear relation?
  • 3.
    • In this chapter, we are going to study:
    • two variables,
    • linear relationship between two variables
    • Two types of questions:
    • Whether there is a linear relationship?
    • -- Linear correlation
    • How to predict one variable by another variable?
    • -- Linear regression
  • 4.
    • Example 7.1   To explore the correlation between
    • systolic pressure and diastolic pressure (mmHg),
    • 665 girls aged from 6 to 10 years were measured.
    • Two random variables X and Y
    • Sample: 665 girls
    • The individuals in the sample should be
    • independent of each other.
  • 5.
    • Example 7.2 To explore the correlation between
    • the heights of father and son, 20 graduate male
    • students were randomly selected from a name
    • list of graduates in a high school. The heights
    • (cm) of fathers and sons were measured
    • Table 7.1 Heights (cm) of 20 pairs of father and son
  • 6. 7.1 Linear correlation
  • 7. Scatter Diagram : Fig. 7.1 Scatter diagram of systolic and diastolic blood pressures (mmHg) of 665 girls of 6 to 10 years old
  • 8.  
  • 9.  
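  • 10. 7.2 Correlation Coefficient
    • 7.2.1 Population correlation coefficient
    • Pearson's product-moment linear correlation coefficient
    • ---- simple correlation coefficient
    • The mean of the "product of the two standardized variables":
    • $\rho = E\left[\dfrac{X-\mu_X}{\sigma_X}\cdot\dfrac{Y-\mu_Y}{\sigma_Y}\right] = \dfrac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}$
    • $\operatorname{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$ -- covariance between X and Y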
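  • 11. 7.2.2 Sample correlation coefficient
    • Pearson's product-moment sample correlation coefficient r:
    • $r = \dfrac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}}$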
  • 12. A measurement of linear relationship:
    • 1) Whether there is a correlation
    • If the correlation coefficient is 0 or not big enough -- no correlation
    • 2) If the correlation coefficient is big enough
    • The direction of correlation? -- positive or negative
    • The strength of correlation? -- high or not? Is the absolute value big enough?
    • Complete correlation: +1 or -1
  • 13. Example 7.3 Calculate the correlation coefficient between the heights of father and son.
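    The calculation asked for in Example 7.3 can be sketched in a few lines of Python. This is only an illustration of the sample correlation formula: the function name `pearson_r` is an assumption, and the heights below are hypothetical values, not the data of Table 7.1 (which is not reproduced in this transcript).

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    l_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    l_xx = sum((a - mean_x) ** 2 for a in x)
    l_yy = sum((b - mean_y) ** 2 for b in y)
    return l_xy / math.sqrt(l_xx * l_yy)

# Hypothetical heights in cm (NOT the values of Table 7.1)
father = [165.0, 168.0, 170.0, 172.0, 175.0, 178.0]
son = [167.0, 170.0, 169.0, 174.0, 176.0, 177.0]

r = pearson_r(father, son)
print(round(r, 3))
```

    The sign of the result gives the direction of the correlation and its absolute value gives the strength; a value of +1 or -1 would mean complete correlation.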
  • 14. 7.3 Inference on Correlation Coefficient
    • 7.3.1 Hypothesis test
    • r is the sample correlation coefficient; it changes from sample to sample
    • There is a population correlation coefficient, denoted by ρ
    • Question: Whether ρ = 0 or not?
    • Assumption:
    • X and Y follow a bivariate normal distribution
  • 15.
    • $H_0: \rho = 0$, $H_1: \rho \neq 0$, $\alpha = 0.05$
    • (1) Checking a special table (Table 8 in Appendix 2)
    • $H_0$ is rejected
    • ---- positive correlation between the heights of father and son.
    • Question:
    • Since the P value is very small, can we say
    • the correlation is very strong?
    • Does a small P value mean that the correlation
    • is strong?
  • 16. Table 8 Critical values for r
    • Question: If r = 0.90, can you claim the two variables are correlated with each other?
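  • 17. (2) t test (assume normal distribution): $H_0: \rho = 0$, $H_1: \rho \neq 0$
    • Test statistic: $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$, with $\nu = n - 2$ degrees of freedom
    • If P-value < α, then reject $H_0$ and conclude that
    • the population correlation coefficient is significantly different from 0.
    • For the father-son example, $\nu = 20 - 2 = 18$.
    • The population correlation coefficient might not be 0.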
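    The t test for $H_0: \rho = 0$ can be sketched as below. The function name `t_statistic_for_r` and the value r = 0.5 are assumptions made for illustration, since the r obtained in the example is not given in this transcript. The critical value 2.101 is the standard two-sided 5% point of the t distribution with 18 degrees of freedom, as read from a t table.

```python
import math

def t_statistic_for_r(r, n):
    """Return (t, degrees of freedom) for testing H0: rho = 0."""
    if n < 3:
        raise ValueError("need at least 3 pairs of observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df

# Two-sided 5% critical value of t with 18 degrees of freedom (from a t table)
T_CRIT_18 = 2.101

# r = 0.5 is a hypothetical value, used only to show the mechanics with n = 20
t, df = t_statistic_for_r(0.5, 20)
reject_h0 = abs(t) > T_CRIT_18
print(df, round(t, 3), reject_h0)

# With a large sample, even a weak correlation yields a large t
t_large, df_large = t_statistic_for_r(0.1, 665)
print(df_large, round(t_large, 3))
```

    The second call bears on the earlier question about P values: with n = 665, a weak r = 0.1 already gives a t above 1.96, so a small P value indicates evidence that ρ is not 0 and does not by itself indicate a strong correlation.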
  • 18. 7.3.2 Interval estimation
    • Assumption: X and Y follow a bivariate normal distribution.
    • Pre-knowledge:
    • (1) hyperbolic tangent and its inverse
    • Hyperbolic tangent: $\tanh z = \dfrac{e^{2z}-1}{e^{2z}+1}$
    • Inverse of hyperbolic tangent: $\tanh^{-1} r = \dfrac{1}{2}\ln\dfrac{1+r}{1-r}$
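  • 19.
    • Taking the transformation $z = \tanh^{-1} r$ of r,
    • z approximately follows a normal distribution with mean $\tanh^{-1}\rho$ and variance $1/(n-3)$
    • Confidence interval of $\tanh^{-1}\rho$: $z \pm u_{\alpha/2}/\sqrt{n-3}$, where $u_{\alpha/2}$ is the two-sided critical value of the standard normal distribution
    • Confidence interval of $\rho$: $\left(\tanh\left(z - u_{\alpha/2}/\sqrt{n-3}\right),\ \tanh\left(z + u_{\alpha/2}/\sqrt{n-3}\right)\right)$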
  • 20. Example 7.5 After getting r, find a 95% confidence interval for the population correlation coefficient ρ.
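    The interval estimation through the inverse hyperbolic tangent can be sketched with the Python standard library. The function name `fisher_ci` and the inputs r = 0.5, n = 20 are assumptions for illustration only; the r obtained in the example is not reproduced in this transcript.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, confidence=0.95):
    """Approximate confidence interval for rho via the atanh transformation."""
    if n <= 3:
        raise ValueError("need more than 3 pairs of observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                  # transformed sample correlation
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    # Two-sided standard normal critical value, about 1.96 for 95%
    u = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Back-transform the interval endpoints to the correlation scale
    return math.tanh(z - u * se), math.tanh(z + u * se)

# Hypothetical inputs: r = 0.5 from n = 20 pairs
lower, upper = fisher_ci(0.5, 20)
print(round(lower, 3), round(upper, 3))
```

    Because the back-transformation is nonlinear, the interval is not symmetric around r unless r = 0, and its endpoints always stay inside (-1, 1).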
  • 21. 7.4 Rank Correlation
    • 7.4.1 Spearman's rank correlation coefficient
    • It is useful for:
    • ranked data
    • as well as measurement data that
    • ---- do not follow a normal distribution;
    • or whose distribution is uncertain;
    • or are not precisely measured;
    • or when X or Y is an ordinal variable
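  • 22. Spearman's rank correlation coefficient
    • n pairs of observations, $(x_1, y_1), \ldots, (x_n, y_n)$
    • sort $(x_1, x_2, \ldots, x_n)$, get rank $p_i$ for $x_i$
    • sort $(y_1, y_2, \ldots, y_n)$, get rank $q_i$ for $y_i$
    • $r_s$ is Pearson's correlation coefficient computed on the ranks $(p_i, q_i)$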
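    The ranking steps above can be sketched as follows. The helper names `average_ranks` and `spearman_rs` are assumptions, tied values are given the average of their ranks (a common convention, assumed here), and the data are hypothetical rather than taken from any example in this chapter.

```python
import math

def average_ranks(values):
    """Ranks from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    p = average_ranks(x)
    q = average_ranks(y)
    mean_rank = (len(p) + 1) / 2.0  # mean of the ranks 1..n, also with ties
    num = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mean_rank) ** 2 for a in p)
                    * sum((b - mean_rank) ** 2 for b in q))
    return num / den

# Hypothetical observations for illustration only
x = [0.7, 1.0, 1.7, 3.7, 4.0]
y = [21.5, 18.9, 14.4, 46.5, 27.3]
print(average_ranks(x), average_ranks(y))
print(round(spearman_rs(x, y), 3))
```

    Working on ranks means only the ordering of the observations matters, which is why the method suits ordinal or imprecisely measured data.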
  • 23. Example 7.6 An etiology study on liver cancer collected data on the liver-cancer-specific death rate and the relative content of aflatoxin in certain food for 10 counties.
    • Putting the ranks into the formula of Spearman's correlation coefficient
  • 24. 7.4.2 Hypothesis test for the population rank correlation coefficient
    • (1) Checking a special table (Table 9: Critical values for Spearman's rank correlation coefficient)
    • P < 0.02, so the correlation is significant
  • 25.
    • (2) t test
    • Same as the t test for Pearson’s correlation
    • coefficient
    • If P is small, then reject $H_0$
  • 26.