
# Chapter 6 simple regression and correlation

biometry

Published in: Education

1. 1. CORRELATION (r) and REGRESSION (b)
2. 2. SIMPLE REGRESSION AND CORRELATION Both regression and correlation measure the strength of the relationship between two variables. In linear regression, we examine the amount of variability in one variable (Y, the dependent variable) that is explained by changes in another variable (X, the independent variable). Specifically, we look for straight-line (linear) changes in Y as X changes. Regression analysis is usually done in situations in which we control the X variable and can measure it essentially without error. For simplicity, we will not discuss curvilinear relationships between variables.
3. 3. Regression and correlation (continued) Correlation analysis is used when both variables are experimental (neither is fixed by the investigator) and measured with error. It is more preliminary than regression analysis and generally measures the degree of association between two variables of interest. Let us consider two examples to further highlight the differences between regression and correlation analysis.
4. 4. Example 10.1 A biology student wishes to determine the relationship between temperature and heart rate (beats/minute) in the common leopard frog. He manipulates the temperature in 2 °C increments ranging from 2 to 18 °C and records the heart rate at each interval. His data are presented in the table below.

    | Rec. No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
    |---|---|---|---|---|---|---|---|---|---|
    | Temp. (°C), X | 2 | 4 | 6 | 8 | 10 | 12 | 14 | 16 | 18 |
    | Heart rate (beats/min), Y | 5 | 11 | 11 | 14 | 22 | 23 | 32 | 29 | 32 |
5. 5. Example 10.1 (continued) How should he proceed to describe the relationship between these variables (temperature and heart rate)? Clearly the two variables have functional dependence: as the temperature increases, the heart rate increases. Here the temperature is controlled by the student and can take exactly the same values in another experiment with a different frog. Temperature is the INDEPENDENT or “predictor” variable (X). Heart rate is determined by temperature and is, therefore, the DEPENDENT or “response” variable (Y).
6. 6. Example 10.2 A biologist interested in the morphology of West Indian chitons measured the length and width of each of 10 chitons:

    | Animal | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Length (cm) | 10.7 | 11.0 | 9.5 | 11.1 | 10.3 | 10.7 | 9.9 | 10.6 | 10.0 | 12.0 |
    | Width (cm) | 5.8 | 6.0 | 5.0 | 6.0 | 5.3 | 5.8 | 5.2 | 5.7 | 5.3 | 6.3 |
7. 7. Example 10.2 (continued) This data set is fundamentally different from the data in Example 10.1 because neither variable is under the biologist’s control. Trying to predict length from width is as logical as trying to predict width from length. Both variables are free to vary (Fig 10.2). A correlational study is more appropriate here than a regression analysis. Because some of the calculations are similar, regression and correlation are often confused.
8. 8. Fig 10.2
9. 9. SIMPLE LINEAR REGRESSION We assume:

    | X | Y |
    |---|---|
    | 1. Independent variable | 1. Dependent variable |
    | 2. Measured without error, fixed and repeatable | 2. Free to vary |
10. 10. Linear Model Assumptions 1. The X’s are fixed and measured without error. 2. The expected or mean value of the variable Y for a given value of X is described by a linear function $\mu_{Y|X} = \alpha + \beta X$, where $\alpha$ and $\beta$ are constant real numbers and $\beta \neq 0$. $\alpha$ and $\beta$ represent the intercept and slope, respectively, of the linear relationship between X and Y.
11. 11. Linear Model Assumptions 3. For any fixed value of X, there may be several corresponding values of the dependent variable Y. For example, at a fixed temperature several frogs may give several different heart rates. However, we assume that for any given value $X_i$, the $Y_i$’s are independent of each other and normally distributed. We can represent each $Y_i$ value as $Y_i = \alpha + \beta X_i + e_i$. That is, $Y_i$ is described as the expected value ($\alpha + \beta X_i$) plus a deviation ($e_i$) from that expectation. We assume the $e_i$’s are normally distributed error terms with a mean of zero.
12. 12. Linear Model Assumptions (continued) 4. The variances of the distributions of Y for different values of X are assumed to be equal. To describe the regression relationship between Y and X from experimental data, we need to do the following: a) graph the data to ascertain that an apparent linear relationship exists; b) find the best-fitting straight line for the data set; c) test whether or not the fitted line explains a significant portion of the variability in Y, i.e. test whether the linear relationship is real or not.
13. 13. Regression coefficient The regression coefficient or slope (b) is

    $$b = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sum X^2 - \dfrac{(\sum X)^2}{n}}$$

    b is the amount by which Y changes for every unit change in X. Therefore, b is expressed in the units of the original data (units of Y per unit of X). Once we have the value of ‘b’, we can calculate the value of the intercept ‘a’ from $\bar{Y} = a + b\bar{X}$.
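14. 14. Calculation of b Referring to the example of the temperature and heart rate relationship in the frog, we have $n = 9$, $\sum X = 90$, $\bar{X} = 10.0$, $\sum X^2 = 1140$, $\sum Y = 179$, $\bar{Y} = 19.9$, $\sum Y^2 = 4365$ and $\sum XY = 2216$, so

    $$b = \frac{2216 - \dfrac{(90)(179)}{9}}{1140 - \dfrac{(90)^2}{9}} = \frac{426}{240} = 1.78$$

    THIS MEANS THAT FOR EVERY 1 °C INCREASE IN TEMPERATURE, HEART RATE INCREASES BY 1.78 BEATS/MIN (AND DECREASES BY THE SAME AMOUNT FOR EVERY 1 °C DECREASE).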
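
The slope calculation above can be checked with a short script. This is only a sketch in plain Python: the function name `fit_line` and the variable names are illustrative choices, not part of the slides, and the intercept it reports (about 2.14) is computed here from $\bar{Y} = a + b\bar{X}$ rather than quoted from the source.

```python
# Frog data from Example 10.1: temperature (deg C) and heart rate (beats/min).
temps = [2, 4, 6, 8, 10, 12, 14, 16, 18]
rates = [5, 11, 11, 14, 22, 23, 32, 29, 32]


def fit_line(x, y):
    """Return (a, b): intercept and slope from the sums-based formula for b."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: corrected sum of cross products over corrected sum of squares of X.
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: rearranged from Y-bar = a + b * X-bar.
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = fit_line(temps, rates)
print(f"slope b = {b:.3f} beats/min per deg C, intercept a = {a:.3f} beats/min")
```

Working from the raw data rather than the pre-computed sums also confirms the intermediate totals (ΣX = 90, ΣXY = 2216, and so on) listed on the slide.
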
15. 15. Simple Linear Correlation Analysis Correlation analysis is used to measure the intensity of association observed between any pair of variables. We are largely concerned with whether two variables are interdependent or co-vary. Here we do not express one variable as a function of the other and do not imply that Y is dependent on X as we did with regression analysis. Both X and Y are measured with error and we wish to estimate the degree to which these variables vary together.
16. 16. Correlation (continued) A widely used index of the association of two quantitative variables is the Pearson Product-Moment Correlation Coefficient, usually called the correlation coefficient (r).

    $$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right]}}$$
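17. 17. Correlation (continued) $-1 \le r \le 1$, and

    $$r^2 = \frac{\text{Explainable variability}}{\text{Total variability}}$$

    $r^2$ is the coefficient of determination. Correlation coefficients (r) and the corresponding coefficients of determination ($r^2$):

    | r | 0.00 | ±0.10 | ±0.20 | ±0.30 | ±0.40 | ±0.50 | ±0.60 | ±0.70 | ±0.80 | ±0.90 | ±1.00 |
    |---|---|---|---|---|---|---|---|---|---|---|---|
    | r² | 0.00 | 0.01 | 0.04 | 0.09 | 0.16 | 0.25 | 0.36 | 0.49 | 0.64 | 0.81 | 1.00 |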
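
To make the "explainable over total variability" reading of r² concrete, the sketch below fits a least-squares line to the frog data of Example 10.1 and compares the ratio of sums of squares with the square of r. The function name `explained_fraction` is hypothetical, and the value of r for the frog data (about 0.969) is computed here, not taken from the slides.

```python
# Frog data from Example 10.1.
temps = [2, 4, 6, 8, 10, 12, 14, 16, 18]
rates = [5, 11, 11, 14, 22, 23, 32, 29, 32]


def explained_fraction(x, y):
    """Return (explained SS / total SS, r) for the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sp_xy / ss_x
    a = mean_y - b * mean_x
    # Variability of the fitted values around the mean of Y.
    ss_explained = sum((a + b * xi - mean_y) ** 2 for xi in x)
    r = sp_xy / (ss_x * ss_total) ** 0.5
    return ss_explained / ss_total, r


fraction, r = explained_fraction(temps, rates)
print(f"explained / total = {fraction:.4f}, r squared = {r ** 2:.4f}")
```

The two printed numbers agree, which is the sense in which r² measures the share of the variability in Y accounted for by the linear relationship.
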
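18. 18. Correlation (continued) The standard error of the correlation coefficient is

    $$s_r = \sqrt{\frac{1 - r^2}{n - 2}}$$

    Using this standard error we can develop a test of hypothesis for $\rho$, the population correlation coefficient: $H_0: \rho = 0$ versus $H_a: \rho \neq 0$, with the test statistic

    $$t = \frac{r - 0}{s_r} = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$$

    with $\nu = n - 2$ degrees of freedom.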
19. 19. Example 10.4: Analysis of Example 10.2 as a correlation problem Let X be the chiton length (cm) and Y be the chiton width (cm). The data for the problem and the preliminary calculations:

    | Length | Width | Length | Width |
    |---|---|---|---|
    | 10.7 | 5.8 | 10.7 | 5.8 |
    | 11.0 | 6.0 | 9.9 | 5.2 |
    | 9.5 | 5.0 | 10.6 | 5.7 |
    | 11.1 | 6.0 | 10.0 | 5.3 |
    | 10.3 | 5.3 | 12.0 | 6.3 |
20. 20. Example 10.4 (continued) $n = 10$, $\sum X = 105.8$, $\bar{X} = 10.58$, $\sum Y = 56.4$, $\bar{Y} = 5.64$, $\sum X^2 = 1123.9$, $\sum Y^2 = 319.68$, $\sum XY = 599.31$

    $$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \dfrac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right]}} = 0.969$$
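
As a cross-check on the hand calculation, here is a minimal sketch of the same sums-based formula for r applied to the ten chiton measurements of Example 10.4. The helper name `pearson_r` is an illustrative choice; only the standard library is used.

```python
import math

# Chiton data from Example 10.4: length (cm) and width (cm), paired by animal.
lengths = [10.7, 11.0, 9.5, 11.1, 10.3, 10.7, 9.9, 10.6, 10.0, 12.0]
widths = [5.8, 6.0, 5.0, 6.0, 5.3, 5.8, 5.2, 5.7, 5.3, 6.3]


def pearson_r(x, y):
    """Pearson product-moment correlation from raw sums, as on the slide."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # Corrected sum of cross products and corrected sums of squares.
    sp_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_x = sum(xi * xi for xi in x) - sum_x ** 2 / n
    ss_y = sum(yi * yi for yi in y) - sum_y ** 2 / n
    return sp_xy / math.sqrt(ss_x * ss_y)


r = pearson_r(lengths, widths)
print(f"r = {r:.3f}")
```

Because the formula is symmetric in X and Y, `pearson_r(widths, lengths)` gives the same value, which matches the point that neither variable is treated as dependent in a correlation analysis.
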
21. 21. Correlation (continued) Test whether there is a significant correlation with α = 0.05 and ν = n − 2 = 10 − 2 = 8. The hypotheses are $H_0: \rho = 0$ and $H_a: \rho \neq 0$.

    $$s_r = \sqrt{\frac{1 - r^2}{n - 2}} = \sqrt{\frac{1 - (0.969)^2}{10 - 2}} = 0.087$$

    So the test statistic is

    $$t = \frac{r - 0}{s_r} = \frac{0.969 - 0}{0.087} = 11.14$$
22. 22. Correlation The critical values from Table C.4 for ν = 8 with α = 0.05 are ±2.306. Since 11.14 >> 2.306, we reject H0 and find a STRONG LINEAR CORRELATION between the length and width of chiton shells.
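
The test for the chiton data can be sketched as a small function. The name `correlation_t_test` is hypothetical, and the critical value 2.306 is the one quoted from Table C.4 rather than something the code looks up. Starting from the rounded r = 0.969, the script gives t of roughly 11.1; the small difference from 11.14 comes from rounding in the hand calculation.

```python
import math


def correlation_t_test(r, n, t_critical):
    """Return (s_r, t, reject_h0) for H0: rho = 0 against Ha: rho != 0."""
    s_r = math.sqrt((1 - r ** 2) / (n - 2))  # standard error of r
    t = (r - 0) / s_r                        # test statistic with n - 2 d.f.
    return s_r, t, abs(t) > t_critical


# r and n from Example 10.4; 2.306 is the tabulated two-tailed critical value
# for 8 degrees of freedom at alpha = 0.05, as quoted from Table C.4.
s_r, t, reject_h0 = correlation_t_test(r=0.969, n=10, t_critical=2.306)
print(f"s_r = {s_r:.3f}, t = {t:.2f}, reject H0: {reject_h0}")
```
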
23. 23. Solve the problem The following are records of the amount of feed ingested (kg) and live weight (kg) of broilers. Test whether there is any significant correlation between feed intake and body weight. How much weight does a broiler gain per 1 kg of feed?

    | Bird No. | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Feed (kg) | 3.6 | 3.9 | 4.1 | 4.0 | 3.9 | 4.4 | 4.2 | 4.0 | 3.9 | 4.6 |
    | Wt. (kg) | 2.1 | 2.2 | 2.4 | 2.3 | 2.0 | 2.9 | 2.8 | 2.5 | 2.7 | 2.7 |

    r = 0.726, sig. (2-tailed) p < 0.017
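
One way to work the exercise is sketched below. It assumes the birds are paired with feed and weight in the order listed (this pairing reproduces the reported r = 0.726), reuses the critical value 2.306 for 8 degrees of freedom quoted earlier, and reads "weight gained per 1 kg of feed" as the slope of weight on feed. The helper name `corrected_sums` is illustrative.

```python
import math

# Broiler data: feed ingested (kg) and live weight (kg) for birds 1 to 10.
feed = [3.6, 3.9, 4.1, 4.0, 3.9, 4.4, 4.2, 4.0, 3.9, 4.6]
weight = [2.1, 2.2, 2.4, 2.3, 2.0, 2.9, 2.8, 2.5, 2.7, 2.7]
T_CRITICAL = 2.306  # two-tailed, alpha = 0.05, 8 d.f. (value quoted in the slides)


def corrected_sums(x, y):
    """Return (SS_x, SS_y, SP_xy) computed from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return ss_x, ss_y, sp_xy


ss_x, ss_y, sp_xy = corrected_sums(feed, weight)
r = sp_xy / math.sqrt(ss_x * ss_y)                 # correlation coefficient
t = r / math.sqrt((1 - r ** 2) / (len(feed) - 2))  # test statistic, 8 d.f.
b = sp_xy / ss_x                                   # kg live weight per kg feed

print(f"r = {r:.3f}, t = {t:.2f}, significant: {abs(t) > T_CRITICAL}")
print(f"slope b = {b:.3f} kg live weight per kg feed")
```

Under these assumptions the correlation is significant at α = 0.05, and the slope suggests a live-weight difference of a little under 0.8 kg for each additional kilogram of feed.
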