
Chapter 13
Simple Regression and Correlation Analysis

LEARNING OBJECTIVES

The overall objective of this chapter is to give you an understanding of bivariate regression and correlation analysis, thereby enabling you to:

1. Compute the equation of a simple regression line from a sample of data and interpret the slope and intercept of the equation.
2. Understand the usefulness of residual analysis in testing the assumptions underlying regression analysis and in examining the fit of the regression line to the data.
3. Compute a standard error of the estimate and interpret its meaning.
4. Compute a coefficient of determination and interpret it.
5. Test hypotheses about the slope of the regression model and interpret the results.
6. Estimate values of y using the regression model.

CHAPTER TEACHING STRATEGY

This chapter covers all aspects of simple (bivariate, linear) regression. Early in the chapter, through scatter plots, the student begins to understand that the object of simple regression is to fit a line through the points. Fairly soon in the process, the student learns how to solve for the slope and y-intercept and develop the equation of the regression line. Most of the remaining material on simple regression concerns determining how good the fit of the line is and whether the assumptions underlying the process are met. The student begins to understand that by entering values of the independent variable into the regression model, predicted values can be determined. The question
then becomes: Are the predicted values good estimates of the actual dependent values? One rule to emphasize is that the regression model should not be used to predict for independent-variable values that are outside the range of values used to construct the model. MINITAB issues a warning when such a prediction is attempted. There are many instances where the relationship between x and y is linear over a given interval, but outside that interval the relationship becomes curvilinear or unpredictable. Of course, with this caution having been given, many forecasters use such regression models to extrapolate to values of x outside the domain of those used to construct the model. Whether the forecasts obtained under such conditions are any better than "seat of the pants" or "crystal ball" estimates remains to be seen.

The concept of residual analysis is a good one to show graphically and numerically how the model relates to the data and the fact that it fits some points more closely than others. A graphical or numerical analysis of residuals demonstrates that the regression line fits the data in a manner analogous to the way a mean fits a set of numbers: the regression line passes through the points such that the geometric distances sum to zero. The fact that the residuals sum to zero points out the need to square the errors (residuals) in order to get a handle on total error. This leads to the sum of squares of error and then on to the standard error of the estimate. In addition, students can learn why the process is called least squares analysis (the slope and intercept formulas are derived by calculus such that the sum of squares of error is minimized; hence "least squares"). Students can learn that by examining the values of se, the residuals, r², and the t ratio used to test the slope, they can begin to make a judgment about the fit of the model to the data.
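The residuals-sum-to-zero and least-squares properties described above can be demonstrated numerically. A minimal sketch in Python (the data are from problem 13.1 in the solutions below; variable names are illustrative):

```python
# Fit a least-squares line from raw sums and verify two properties noted
# above: the residuals sum to (essentially) zero, and squaring them gives
# the SSE that least squares minimizes.
xs = [12, 21, 28, 8, 20]   # data from problem 13.1
ys = [17, 15, 22, 19, 24]
n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
b1 = ss_xy / ss_xx                 # slope
b0 = sum_y / n - b1 * (sum_x / n)  # y-intercept
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
sse = sum(r * r for r in residuals)
print(b1, b0)          # about 0.162 and 16.51
print(sum(residuals))  # essentially zero
print(sse)
```

Because the fitted line balances the points, the raw residuals always cancel, which is why SSE, not the residual sum, is used to measure total error.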
Many of the chapter problems ask the student to comment on these items (se, r², etc.). It is my view that for many of these students, an important facet of this chapter lies in understanding the "buzz words" of regression, such as standard error of the estimate, coefficient of determination, etc. They may well encounter regression again only as some type of computer printout to be deciphered. The concepts then become as important as, or perhaps more important than, the calculations.

CHAPTER OUTLINE

13.1 Introduction to Simple Regression Analysis
13.2 Determining the Equation of the Regression Line
13.3 Residual Analysis
     Using Residuals to Test the Assumptions of the Regression Model
     Using the Computer for Residual Analysis
13.4 Standard Error of the Estimate
13.5 Coefficient of Determination
     Relationship Between r and r²
13.6 Hypothesis Tests for the Slope of the Regression Model and Testing the Overall Model
     Testing the Slope
     Testing the Overall Model
13.7 Estimation
     Confidence Intervals to Estimate the Conditional Mean of y: µy|x
     Prediction Intervals to Estimate a Single Value of y
13.8 Interpreting Computer Output

KEY TERMS

Coefficient of Determination (r²)    Prediction Interval
Confidence Interval                  Probabilistic Model
Dependent Variable                   Regression Analysis
Deterministic Model                  Residual
Heteroscedasticity                   Residual Plot
Homoscedasticity                     Scatter Plot
Independent Variable                 Simple Regression
Least Squares Analysis               Standard Error of the Estimate (se)
Outliers                             Sum of Squares of Error (SSE)
SOLUTIONS TO CHAPTER 13

13.1    x     y
       12    17
       21    15
       28    22
        8    19
       20    24

Σx = 89   Σy = 97   Σxy = 1,767   Σx² = 1,833   Σy² = 1,935   n = 5

b1 = SSxy/SSxx = [Σxy - (Σx)(Σy)/n] / [Σx² - (Σx)²/n]
   = [1,767 - (89)(97)/5] / [1,833 - (89)²/5] = 0.162

b0 = Σy/n - b1(Σx/n) = 97/5 - 0.162(89/5) = 16.5

ŷ = 16.5 + 0.162x
13.2    x     y
       140    25
       119    29
       103    46
        91    70
        65    88
        29   112
        24   128

Σx = 571   Σy = 498   Σxy = 30,099   Σx² = 58,293   Σy² = 45,154   n = 7

b1 = SSxy/SSxx = [30,099 - (571)(498)/7] / [58,293 - (571)²/7] = -0.898

b0 = Σy/n - b1(Σx/n) = 498/7 - (-0.898)(571/7) = 144.414

ŷ = 144.414 - 0.898x
13.3   Advertising (x)   Sales (y)
            12.5            148
             3.7             55
            21.6            338
            60.0            994
            37.6            541
             6.1             89
            16.8            126
            41.2            379

Σx = 199.5   Σy = 2,670   Σxy = 107,610.4   Σx² = 7,667.15   Σy² = 1,587,328   n = 8

b1 = SSxy/SSxx = [107,610.4 - (199.5)(2,670)/8] / [7,667.15 - (199.5)²/8] = 15.24

b0 = Σy/n - b1(Σx/n) = 2,670/8 - 15.24(199.5/8) = -46.29

ŷ = -46.29 + 15.24x

13.4   Prime (x)   Bond (y)
          16           5
           6          12
           8           9
           4          15
           7           7

Σx = 41   Σy = 48   Σxy = 333   Σx² = 421   Σy² = 524   n = 5

b1 = SSxy/SSxx = [333 - (41)(48)/5] / [421 - (41)²/5] = -0.715
b0 = Σy/n - b1(Σx/n) = 48/5 - (-0.715)(41/5) = 15.46

ŷ = 15.46 - 0.715x

13.5    Starts     Failures
       233,710      57,097
       199,091      50,361
       181,645      60,747
       158,930      88,140
       155,672      97,069
       164,086      86,133
       166,154      71,558
       188,387      71,128
       168,158      71,931
       170,475      83,384
       166,740      71,857

Σx = 1,953,048   Σy = 809,405   Σx² = 351,907,107,960   Σy² = 61,566,568,203
Σxy = 141,238,520,688   n = 11

b1 = SSxy/SSxx = [141,238,520,688 - (1,953,048)(809,405)/11] / [351,907,107,960 - (1,953,048)²/11]
   = -0.48042194

b0 = Σy/n - b1(Σx/n) = 809,405/11 - (-0.48042194)(1,953,048/11) = 158,881.1

ŷ = 158,881.1 - 0.48042194x
13.6   No. of Farms (x)   Avg. Size (y)
            5.65               213
            4.65               258
            3.96               297
            3.36               340
            2.95               374
            2.52               420
            2.44               426
            2.29               441
            2.15               460
            2.07               469
            2.17               434

Σx = 34.21   Σy = 4,132   Σx² = 120.3831   Σy² = 1,627,892   Σxy = 11,834.31   n = 11

b1 = SSxy/SSxx = [11,834.31 - (34.21)(4,132)/11] / [120.3831 - (34.21)²/11] = -72.6383

b0 = Σy/n - b1(Σx/n) = 4,132/11 - (-72.6383)(34.21/11) = 601.542

ŷ = 601.542 - 72.6383x
13.7   Steel   New Orders
        99.9      2.74
        97.9      2.87
        98.9      2.93
        87.9      2.87
        92.9      2.98
        97.9      3.09
       100.6      3.36
       104.9      3.61
       105.3      3.75
       108.6      3.95

Σx = 994.8   Σy = 32.15   Σx² = 99,293.28   Σy² = 104.9815   Σxy = 3,216.652   n = 10

b1 = SSxy/SSxx = [3,216.652 - (994.8)(32.15)/10] / [99,293.28 - (994.8)²/10] = 0.05557

b0 = Σy/n - b1(Σx/n) = 32.15/10 - (0.05557)(994.8/10) = -2.31307

ŷ = -2.31307 + 0.05557x
13.8    x     y
        15    47
         8    36
        19    56
        12    44
         5    21

ŷ = 13.625 + 2.303x

Residuals:
  x     y       ŷ        y - ŷ
 15    47    48.1694    -1.1694
  8    36    32.0489     3.9511
 19    56    57.3811    -1.3811
 12    44    41.2606     2.7394
  5    21    25.1401    -4.1401

13.9    x     y    Predicted (ŷ)   Residual (y - ŷ)
       12    17      18.4582          -1.4582
       21    15      19.9196          -4.9196
       28    22      21.0563           0.9437
        8    19      17.8087           1.1913
       20    24      19.7572           4.2428

ŷ = 16.5 + 0.162x

13.10   x     y    Predicted (ŷ)   Residual (y - ŷ)
       140   25      18.6597           6.3403
       119   29      37.5229          -8.5229
       103   46      51.8948          -5.8948
        91   70      62.6737           7.3263
        65   88      86.0280           1.9720
        29  112     118.3648          -6.3648
        24  128     122.8561           5.1439

ŷ = 144.414 - 0.898x
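A residual table like those in 13.8 through 13.10 takes only a few lines to reproduce. A sketch using the 13.8 data with its rounded fitted line (because the coefficients are rounded, the predictions differ from the tabled values in the later decimal places):

```python
# Predicted values and residuals for the 13.8 data, using the rounded
# fitted line y-hat = 13.625 + 2.303x.
b0, b1 = 13.625, 2.303
data = [(15, 47), (8, 36), (19, 56), (12, 44), (5, 21)]
rows = [(x, y, b0 + b1 * x, y - (b0 + b1 * x)) for x, y in data]
for x, y, y_hat, resid in rows:
    print(f"{x:3d} {y:4d} {y_hat:9.4f} {resid:8.4f}")
```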
13.11   x      y     Predicted (ŷ)   Residual (y - ŷ)
        12.5   148      144.2053           3.7947
         3.7    55       10.0953          44.9047
        21.6   338      282.8873          55.1127
        60.0   994      868.0945         125.9055
        37.6   541      526.7236          14.2764
         6.1    89       46.6708          42.3292
        16.8   126      209.7364         -83.7364
        41.2   379      581.5868        -202.5868

ŷ = -46.29 + 15.24x

13.12   x     y    Predicted (ŷ)   Residual (y - ŷ)
        16    5       4.0259           0.9741
         6   12      11.1722           0.8278
         8    9       9.7429          -0.7429
         4   15      12.6014           2.3986
         7    7      10.4575          -3.4575

ŷ = 15.46 - 0.715x

13.13   x     y    Predicted (ŷ)   Residual (y - ŷ)
         5   47      42.2756           4.7244
         7   38      38.9836          -0.9836
        11   32      32.3996          -0.3996
        12   24      30.7537          -6.7537
        19   22      19.2317           2.7683
        25   10       9.3558           0.6442

ŷ = 50.5056 - 1.6460x

No apparent violation of assumptions.
13.14   Miles (x)   Cost (y)     ŷ       y - ŷ
          1,245       2.64     2.5376    .1024
            425       2.31     2.3322   -.0222
          1,346       2.45     2.5628   -.1128
            973       2.52     2.4694    .0506
            255       2.19     2.2896   -.0996
            865       2.55     2.4424    .1076
          1,080       2.40     2.4962   -.0962
            296       2.37     2.2998    .0702

No apparent violation of assumptions.

13.15 The error terms appear to be nonindependent.
13.16 There appears to be nonconstant error variance.

13.17 There appears to be a nonlinear relationship between x and y.

13.18 The MINITAB Residuals vs. Fits graphic is strongly indicative of a violation of the homoscedasticity assumption of regression. Because the residuals are very close together for small values of x, there is little variability in the residuals at the left end of the graph. On the other hand, for larger values of x, the graph flares out, indicating much greater variability at the upper end. Thus, there is a lack of homogeneity of error across the values of the independent variable.
13.19 SSE = Σy² - b0Σy - b1Σxy = 1,935 - (16.51)(97) - 0.1624(1,767) = 46.5692

se = √[SSE/(n - 2)] = √(46.5692/3) = 3.94

Approximately 68% of the residuals should fall within ±1se. Three out of five (60%) of the actual residuals in 13.9 fell within ±1se.

13.20 SSE = Σy² - b0Σy - b1Σxy = 45,154 - 144.414(498) - (-0.89824)(30,099) = 272.0

se = √(272.0/5) = 7.376

Six out of seven (85.7%) fall within ±1se; seven out of seven (100%) fall within ±2se.

13.21 SSE = Σy² - b0Σy - b1Σxy = 1,587,328 - (-46.29)(2,670) - 15.24(107,610.4) = 70,940

se = √(70,940/6) = 108.7

Six out of eight (75%) of the sales estimates are within $108.7 million.

13.22 SSE = Σy² - b0Σy - b1Σxy = 524 - 15.46(48) - (-0.71462)(333) = 19.8885

se = √(19.8885/3) = 2.575

Four out of five (80%) of the estimates are within 2.575 of the actual rate for bonds. This amount of error is probably not acceptable to financial analysts.
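The SSE and standard-error computations in 13.19 through 13.22 follow a single pattern that is easy to script. A sketch using the 13.20 quantities:

```python
import math

# Standard error of the estimate for problem 13.20:
# SSE = sum(y^2) - b0*sum(y) - b1*sum(xy), then s_e = sqrt(SSE / (n - 2)).
sum_y2, sum_y, sum_xy, n = 45154, 498, 30099, 7
b0, b1 = 144.414, -0.89824
sse = sum_y2 - b0 * sum_y - b1 * sum_xy
se = math.sqrt(sse / (n - 2))
print(round(sse, 1), round(se, 3))  # about 272.0 and 7.375
```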
13.23   y - ŷ      (y - ŷ)²
         4.7244     22.3200
        -0.9836       .9675
        -0.3996       .1597
        -6.7537     45.6125
         2.7683      7.6635
         0.6442       .4150
                    Σ(y - ŷ)² = 77.1382

SSE = Σ(y - ŷ)² = 77.1382

se = √[SSE/(n - 2)] = √(77.1382/4) = 4.391

13.24   y - ŷ      (y - ŷ)²
          .1023      .0105
         -.0222      .0005
         -.1128      .0127
          .0506      .0026
         -.0996      .0099
          .1076      .0116
         -.0962      .0093
          .0702      .0049
                    Σ(y - ŷ)² = .0620

SSE = Σ(y - ŷ)² = .0620

se = √(.0620/6) = .1017

The model produces estimates that are within ±.1017, or about 10 cents, 68% of the time. However, the range of milk costs is only 45 cents for these data.
13.25  Volume (x)   Sales (y)
         728.6         10.5
         497.9         48.1
         439.1         64.8
         377.9         20.1
         375.5         11.4
         363.8        123.8
         276.3         89.0

n = 7   Σx = 3,059.1   Σy = 367.7   Σx² = 1,464,071.97   Σy² = 30,404.31   Σxy = 141,558.6

b1 = -.1504   b0 = 118.257

ŷ = 118.257 - .1504x

SSE = Σy² - b0Σy - b1Σxy = 30,404.31 - (118.257)(367.7) - (-0.1504)(141,558.6) = 8,211.6245

se = √(8,211.6245/5) = 40.5256

This is a relatively large standard error of the estimate given the sales values (which range from 10.5 to 123.8).

13.26 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 46.6399/[1,935 - (97)²/5] = .123

This is a low value of r².

13.27 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 272.12/[45,154 - (498)²/7] = .972

This is a high value of r².
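The r² computations in 13.26 and 13.27 reduce to one line once SSE and the total sum of squares are in hand. A sketch with the 13.26 numbers:

```python
# Coefficient of determination for problem 13.26:
# r^2 = 1 - SSE / SS_yy, where SS_yy = sum(y^2) - (sum(y))^2 / n.
sum_y2, sum_y, n = 1935, 97, 5
sse = 46.6399
ss_yy = sum_y2 - sum_y ** 2 / n
r2 = 1 - sse / ss_yy
print(round(r2, 3))  # 0.123
```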
13.28 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 70,940/[1,587,328 - (2,670)²/8] = .898

This value of r² is relatively high.

13.29 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 19.8885/[524 - (48)²/5] = .685

This is a modest value of r²: 68.5% of the variation of y is accounted for by x, but 31.5% is unaccounted for.

13.30 r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 77.1384/[5,837 - (173)²/6] = .909

This is a relatively high value of r². Almost 91% of the variability of y is accounted for by the x values.

13.31   CCI      Median Income
       116.8       37.415
        91.5       36.770
        68.5       35.501
        61.6       35.047
        65.9       34.700
        90.6       34.942
       100.0       35.887
       104.6       36.306
       125.4       37.005

Σx = 323.573   Σy = 824.9   Σx² = 11,640.93413   Σy² = 79,718.79
Σxy = 29,804.4505   n = 9
b1 = SSxy/SSxx = [29,804.4505 - (323.573)(824.9)/9] / [11,640.93413 - (323.573)²/9] = 19.2204

b0 = Σy/n - b1(Σx/n) = 824.9/9 - 19.2204(323.573/9) = -599.3674

ŷ = -599.3674 + 19.2204x

SSE = Σy² - b0Σy - b1Σxy = 79,718.79 - (-599.3674)(824.9) - 19.2204(29,804.4505) = 1,283.13435

se = √(1,283.13435/7) = 13.539

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 1,283.13435/[79,718.79 - (824.9)²/9] = .688

13.32 sb = se/√[Σx² - (Σx)²/n] = 3.94/√[1,833 - (89)²/5] = .2498

b1 = 0.162

H0: β = 0     α = .05
Ha: β ≠ 0

This is a two-tailed test, α/2 = .025. df = n - 2 = 5 - 2 = 3. t.025,3 = ±3.182

t = (b1 - 0)/sb = (0.162 - 0)/.2498 = 0.65

Since the observed t = 0.65 < t.025,3 = 3.182, the decision is to fail to reject the null hypothesis.
13.33 sb = se/√[Σx² - (Σx)²/n] = 7.376/√[58,293 - (571)²/7] = .068145

b1 = -0.898

H0: β = 0     α = .01
Ha: β ≠ 0

Two-tailed test, α/2 = .005. df = n - 2 = 7 - 2 = 5. t.005,5 = ±4.032

t = (b1 - 0)/sb = (-0.898 - 0)/.068145 = -13.18

Since the observed t = -13.18 < t.005,5 = -4.032, the decision is to reject the null hypothesis.

13.34 sb = se/√[Σx² - (Σx)²/n] = 108.7/√[7,667.15 - (199.5)²/8] = 2.095

b1 = 15.240

H0: β = 0     α = .10
Ha: β ≠ 0

For a two-tailed test, α/2 = .05. df = n - 2 = 8 - 2 = 6. t.05,6 = 1.943

t = (b1 - 0)/sb = (15.240 - 0)/2.095 = 7.27

Since the observed t = 7.27 > t.05,6 = 1.943, the decision is to reject the null hypothesis.
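The slope tests in 13.32 through 13.35 all compute sb from se and SSxx and then form a t ratio. A sketch with the 13.33 quantities:

```python
import math

# t test for the slope in problem 13.33: s_b = s_e / sqrt(SS_xx),
# then t = (b1 - 0) / s_b, compared against t.005,5 = +/-4.032.
se, b1 = 7.376, -0.898
sum_x2, sum_x, n = 58293, 571, 7
ss_xx = sum_x2 - sum_x ** 2 / n
sb = se / math.sqrt(ss_xx)
t = (b1 - 0) / sb
print(round(sb, 6), round(t, 2))  # about 0.068146 and -13.18
```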
13.35 sb = se/√[Σx² - (Σx)²/n] = 2.575/√[421 - (41)²/5] = .27963

b1 = -0.715

H0: β = 0     α = .05
Ha: β ≠ 0

For a two-tailed test, α/2 = .025. df = n - 2 = 5 - 2 = 3. t.025,3 = ±3.182

t = (b1 - 0)/sb = (-0.715 - 0)/.27963 = -2.56

Since the observed t = -2.56 > t.025,3 = -3.182, the decision is to fail to reject the null hypothesis.

13.36 Analysis of Variance

Source       df      SS        MS        F
Regression    1    5,165    5,165.00   1.95
Error         7   18,554    2,650.57
Total         8   23,718

Let α = .05. F.05,1,7 = 5.59

Since the observed F = 1.95 < F.05,1,7 = 5.59, the decision is to fail to reject the null hypothesis. There is no overall predictability in this model.

t = √F = √1.95 = 1.40     t.025,7 = 2.365

Since t = 1.40 < t.025,7 = 2.365, the decision is to fail to reject the null hypothesis. The slope is not significantly different from zero.
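The equivalence invoked in 13.36 (t = √F for simple regression) can be checked directly from the ANOVA table's mean squares:

```python
import math

# For simple regression the overall F test and the two-tailed slope t test
# are equivalent: F = t^2. Mean squares from the 13.36 ANOVA table.
ms_reg, ms_err = 5165.00, 2650.57
f = ms_reg / ms_err
t = math.sqrt(f)
print(round(f, 2), round(t, 2))  # 1.95 and 1.4
```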
13.37 F = 8.26 with a p-value of .021. The overall model is significant at α = .05 but not at α = .01.

For simple regression, t = √F = 2.8674. t.05,5 = 2.015 but t.01,5 = 3.365, so the slope is significant at α = .05 but not at α = .01.

13.38 x0 = 25

95% confidence: α/2 = .025. df = n - 2 = 5 - 2 = 3. t.025,3 = ±3.182

x̄ = Σx/n = 89/5 = 17.8     Σx = 89   Σx² = 1,833   se = 3.94

ŷ = 16.5 + 0.162(25) = 20.55

ŷ ± t(α/2, n-2) · se · √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 20.55 ± 3.182(3.94)√[1/5 + (25 - 17.8)²/(1,833 - (89)²/5)]

= 20.55 ± 3.182(3.94)(.63903) = 20.55 ± 8.01

12.54 < E(y25) < 28.56
13.39 x0 = 100

For 90% confidence: α/2 = .05. df = n - 2 = 7 - 2 = 5. t.05,5 = ±2.015

x̄ = Σx/n = 571/7 = 81.57143     Σx = 571   Σx² = 58,293   se = 7.377

ŷ = 144.414 - 0.898(100) = 54.614

ŷ ± t(α/2, n-2) · se · √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 54.614 ± 2.015(7.377)√[1 + 1/7 + (100 - 81.57143)²/(58,293 - (571)²/7)]

= 54.614 ± 2.015(7.377)(1.08252) = 54.614 ± 16.091

38.523 < y < 70.705

For x0 = 130: ŷ = 144.414 - 0.898(130) = 27.674

ŷ ± t(α/2, n-2) · se · √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 27.674 ± 2.015(7.377)(1.1589) = 27.674 ± 17.227

10.447 < y < 44.901

The interval for x0 = 130 is wider than the interval for x0 = 100 because x0 = 100 is nearer to x̄ = 81.57 than is x0 = 130.
13.40 x0 = 20

For 98% confidence: α/2 = .01. df = n - 2 = 8 - 2 = 6. t.01,6 = 3.143

x̄ = Σx/n = 199.5/8 = 24.9375     Σx = 199.5   Σx² = 7,667.15   se = 108.8

ŷ = -46.29 + 15.24(20) = 258.51

For the conditional mean of y:

ŷ ± t(α/2, n-2) · se · √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 258.51 ± (3.143)(108.8)√[1/8 + (20 - 24.9375)²/(7,667.15 - (199.5)²/8)]

= 258.51 ± (3.143)(108.8)(0.36614) = 258.51 ± 125.20

133.31 < E(y20) < 383.71

For a single y value:

ŷ ± t(α/2, n-2) · se · √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 258.51 ± (3.143)(108.8)(1.06492) = 258.51 ± 364.16

-105.65 < y < 622.67

The interval for the single value of y is wider than the interval for the average value of y because individual values of y can vary more than averages of y values.
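The side-by-side comparison in 13.40 can be scripted; the only difference between the two intervals is the extra "1 +" under the radical. A sketch using the 13.40 quantities:

```python
import math

# Confidence interval half-width for E(y|x0) versus prediction-interval
# half-width for a single y at x0 = 20, problem 13.40 quantities. The
# prediction interval carries an extra "1 +" under the radical.
t_crit, se, n = 3.143, 108.8, 8
sum_x, sum_x2 = 199.5, 7667.15
x0 = 20
x_bar = sum_x / n
ss_xx = sum_x2 - sum_x ** 2 / n
core = 1 / n + (x0 - x_bar) ** 2 / ss_xx
half_mean = t_crit * se * math.sqrt(core)        # conditional mean of y
half_single = t_crit * se * math.sqrt(1 + core)  # single value of y
print(round(half_mean, 2), round(half_single, 2))  # about 125.20 and 364.16
```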
13.41 x0 = 10

For 99% confidence: α/2 = .005. df = n - 2 = 5 - 2 = 3. t.005,3 = 5.841

x̄ = Σx/n = 41/5 = 8.20     Σx = 41   Σx² = 421   se = 2.575

ŷ = 15.46 - 0.715(10) = 8.31

ŷ ± t(α/2, n-2) · se · √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 8.31 ± 5.841(2.575)√[1/5 + (10 - 8.2)²/(421 - (41)²/5)]

= 8.31 ± 5.841(2.575)(.488065) = 8.31 ± 7.34

0.97 < E(y10) < 15.65

If the prime interest rate is 10%, we are 99% confident that the average bond rate is between 0.97% and 15.65%.
13.42   x     y
         5     8
         7     9
         3    11
        16    27
        12    15
         9    13

Σx = 52   Σx² = 564   Σy = 83   Σy² = 1,389   Σxy = 865   n = 6

b1 = 1.2853   b0 = 2.6941

a) ŷ = 2.6941 + 1.2853x

b) Predicted (ŷ)   Residual (y - ŷ)
      9.1206          -1.1206
     11.6912          -2.6912
      6.5500           4.4500
     23.2588           3.7412
     18.1176          -3.1176
     14.2618          -1.2618

c) (y - ŷ)²
      1.2557
      7.2426
     19.8025
     13.9966
      9.7194
      1.5921

SSE = 53.6089

se = √(53.6089/4) = 3.661

d) r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 53.6089/[1,389 - (83)²/6] = .777
e) H0: β = 0     α = .01
   Ha: β ≠ 0

Two-tailed test, α/2 = .005. df = n - 2 = 6 - 2 = 4. t.005,4 = ±4.604

sb = se/√[Σx² - (Σx)²/n] = 3.661/√[564 - (52)²/6] = .34389

t = (b1 - 0)/sb = (1.2853 - 0)/.34389 = 3.74

Since the observed t = 3.74 < t.005,4 = 4.604, the decision is to fail to reject the null hypothesis.

f) The r² = 77.74% is modest. There appears to be some prediction with this model. The slope of the regression line is not significantly different from zero using α = .01. However, for α = .05, the null hypothesis of a zero slope is rejected. The standard error of the estimate, se = 3.661, is not particularly small given the range of values for y (27 - 8 = 19).

13.43   x     y
        53     5
        47     5
        41     7
        50     4
        58    10
        62    12
        45     3
        60    11

Σx = 416   Σx² = 22,032   Σy = 57   Σy² = 489   Σxy = 3,106   n = 8

b1 = 0.355   b0 = -11.335

a) ŷ = -11.335 + 0.355x
b) Predicted (ŷ)   Residual (y - ŷ)
      7.480           -2.480
      5.350           -0.350
      3.220            3.780
      6.415           -2.415
      9.255            0.745
     10.675            1.325
      4.640           -1.640
      9.965            1.035

c) (y - ŷ)²
     6.1504
      .1225
    14.2884
     5.8322
      .5550
     1.7556
     2.6896
     1.0712

SSE = 32.4649

d) se = √[SSE/(n - 2)] = √(32.4649/6) = 2.3261

e) r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 32.4649/[489 - (57)²/8] = .608

f) H0: β = 0     α = .05
   Ha: β ≠ 0

Two-tailed test, α/2 = .025. df = n - 2 = 8 - 2 = 6. t.025,6 = ±2.447

sb = se/√[Σx² - (Σx)²/n] = 2.3261/√[22,032 - (416)²/8] = 0.116305
t = (b1 - 0)/sb = (0.3555 - 0)/.116305 = 3.05

Since the observed t = 3.05 > t.025,6 = 2.447, the decision is to reject the null hypothesis. The population slope is different from zero.

g) This model produces only a modest r² = .608. Almost 40% of the variance of y is unaccounted for by x. The range of y values is 12 - 3 = 9, and the standard error of the estimate is 2.33. Given this small range, the se is not small.

13.44 Σx = 1,263   Σx² = 268,295   Σy = 417   Σy² = 29,135   Σxy = 88,288   n = 6

b0 = 25.42778   b1 = 0.209369

SSE = Σy² - b0Σy - b1Σxy = 29,135 - (25.42778)(417) - (0.209369)(88,288) = 46.845468

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 46.845468/153.5 = .695

Coefficient of determination = r² = .695

13.45 a) x0 = 60

Σx = 524   Σx² = 36,224   Σy = 215   Σy² = 6,411   Σxy = 15,125   n = 8

b1 = .5481   b0 = -9.026   se = 3.201

95% confidence interval: α/2 = .025. df = n - 2 = 8 - 2 = 6. t.025,6 = ±2.447

ŷ = -9.026 + 0.5481(60) = 23.86
x̄ = Σx/n = 524/8 = 65.5

ŷ ± t(α/2, n-2) · se · √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 23.86 ± 2.447(3.201)√[1/8 + (60 - 65.5)²/(36,224 - (524)²/8)]

= 23.86 ± 2.447(3.201)(.375372) = 23.86 ± 2.94

20.92 < E(y60) < 26.80

b) x0 = 70

ŷ = -9.026 + 0.5481(70) = 29.341

ŷ ± t(α/2, n-2) · se · √[1 + 1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

= 29.341 ± 2.447(3.201)√[1 + 1/8 + (70 - 65.5)²/(36,224 - (524)²/8)]

= 29.341 ± 2.447(3.201)(1.06567) = 29.341 ± 8.347

20.994 < y < 37.688

c) The interval in (b) is much wider because it is for a single value of y, which produces much greater possible variation. In fact, x0 = 70 in part (b) is slightly closer to the mean x̄ than is x0 = 60, yet the width of the single-value interval is much greater than that of the interval for the average or expected y value in part (a).
13.46 Σy = 267   Σy² = 15,971   Σx = 21   Σx² = 101   Σxy = 1,256   n = 5

b0 = 9.234375   b1 = 10.515625

SSE = Σy² - b0Σy - b1Σxy = 15,971 - (9.234375)(267) - (10.515625)(1,256) = 297.7969

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 297.7969/1,713.2 = .826

If a regression model had been developed to predict the number of cars sold from the number of salespeople, the model would have had an r² of 82.6%. The same would hold true for a model to predict the number of salespeople from the number of cars sold.

13.47 n = 12   Σx = 548   Σx² = 26,592   Σy = 5,940   Σy² = 3,211,546   Σxy = 287,908

b1 = 10.626383   b0 = 9.728511

ŷ = 9.728511 + 10.626383x

SSE = Σy² - b0Σy - b1Σxy = 3,211,546 - (9.728511)(5,940) - (10.626383)(287,908) = 94,337.9762

se = √(94,337.9762/10) = 97.1277

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 94,337.9762/271,246 = .652
t = (b1 - 0) / [se/√(Σx² - (Σx)²/n)] = 10.626383 / [97.1277/√(26,592 - (548)²/12)] = 4.33

If α = .01, then t.005,10 = 3.169. Since the observed t = 4.33 > t.005,10 = 3.169, the decision is to reject the null hypothesis.

13.48   Sales (y)   Number of Units (x)
          17.1            12.4
           7.9             7.5
           4.8             6.8
           4.7             8.7
           4.6             4.6
           4.0             5.1
           2.9            11.2
           2.7             5.1
           2.7             2.9

Σy = 51.4   Σy² = 460.1   Σx = 64.3   Σx² = 538.97   Σxy = 440.46   n = 9

b1 = 0.92025   b0 = -0.863565

SSE = Σy² - b0Σy - b1Σxy = 460.1 - (-0.863565)(51.4) - (0.92025)(440.46) = 99.153926

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 99.153926/166.55 = .405
13.49   1977    2000
          581     571
          213     220
          668     492
          345     221
        1,476   1,760
        1,776   5,750

Σx = 5,059   Σy = 9,014   Σx² = 6,280,931   Σy² = 36,825,446   Σxy = 13,593,272   n = 6

b1 = SSxy/SSxx = [13,593,272 - (5,059)(9,014)/6] / [6,280,931 - (5,059)²/6] = 2.97366

b0 = Σy/n - b1(Σx/n) = 9,014/6 - (2.97366)(5,059/6) = -1,004.9575

ŷ = -1,004.9575 + 2.97366x

For x = 700: ŷ = 1,076.6044

ŷ ± t(α/2, n-2) · se · √[1/n + (x0 - x̄)²/(Σx² - (Σx)²/n)]

α = .05, t.025,4 = 2.776, x0 = 700, n = 6, x̄ = 843.167

SSE = Σy² - b0Σy - b1Σxy = 36,825,446 - (-1,004.9575)(9,014) - (2.97366)(13,593,272) = 5,462,363.69
se = √(5,462,363.69/4) = 1,168.585

Confidence interval = 1,076.6044 ± (2.776)(1,168.585)√[1/6 + (700 - 843.167)²/(6,280,931 - (5,059)²/6)]

= 1,076.6044 ± 1,364.1632

-287.5588 < E(y700) < 2,440.7676

H0: β1 = 0     α = .05   df = 4
Ha: β1 ≠ 0

t.025,4 = 2.776

t = (b1 - 0)/sb = (2.9736 - 0) / (1,168.585/√2,015,350.833) = 2.9736/.8231614 = 3.6124

Since the observed t = 3.6124 > t.025,4 = 2.776, the decision is to reject the null hypothesis.

13.50 Σx = 11.902   Σx² = 25.1215   Σy = 516.8   Σy² = 61,899.06   Σxy = 1,202.867   n = 7

b1 = 66.36277   b0 = -39.0071

ŷ = -39.0071 + 66.36277x

SSE = Σy² - b0Σy - b1Σxy = 61,899.06 - (-39.0071)(516.8) - (66.36277)(1,202.867) = 2,232.343

se = √(2,232.343/5) = 21.13
r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 2,232.343/[61,899.06 - (516.8)²/7] = 1 - .094 = .906

13.51 Σx = 44,754   Σy = 17,314   Σx² = 167,540,610   Σy² = 24,646,062   Σxy = 59,852,571   n = 13

b1 = SSxy/SSxx = [59,852,571 - (44,754)(17,314)/13] / [167,540,610 - (44,754)²/13] = .01835

b0 = Σy/n - b1(Σx/n) = 17,314/13 - (.01835)(44,754/13) = 1,268.685

ŷ = 1,268.685 + .01835x

r² for this model is .002858. There is no predictability in this model.

Test for slope: t = 0.18 with a p-value of 0.8623. Not significant.

13.52 Σx = 323.3   Σy = 6,765.8   Σx² = 29,629.13   Σy² = 7,583,144.64   Σxy = 339,342.76   n = 7

b1 = SSxy/SSxx = [339,342.76 - (323.3)(6,765.8)/7] / [29,629.13 - (323.3)²/7] = 1.82751

b0 = Σy/n - b1(Σx/n) = 6,765.8/7 - (1.82751)(323.3/7) = 882.138

ŷ = 882.138 + 1.82751x

SSE = Σy² - b0Σy - b1Σxy
= 7,583,144.64 - (882.138)(6,765.8) - (1.82751)(339,342.76) = 994,623.07

se = √(994,623.07/5) = 446.01

r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 994,623.07/[7,583,144.64 - (6,765.8)²/7] = 1 - .953 = .047

H0: β = 0     α = .05
Ha: β ≠ 0

t.025,5 = 2.571

SSxx = Σx² - (Σx)²/n = 29,629.13 - (323.3)²/7 = 14,697.29

t = (b1 - 0) / (se/√SSxx) = (1.82751 - 0) / (446.01/√14,697.29) = 0.50

Since the observed t = 0.50 < t.025,5 = 2.571, the decision is to fail to reject the null hypothesis.

13.53 Let water use = y and temperature = x.

Σx = 608   Σx² = 49,584   Σy = 1,025   Σy² = 152,711   Σxy = 86,006   n = 8

b1 = 2.40107   b0 = -54.35604

ŷ = -54.35604 + 2.40107x

ŷ100 = -54.35604 + 2.40107(100) = 185.751

SSE = Σy² - b0Σy - b1Σxy = 152,711 - (-54.35604)(1,025) - (2.40107)(86,006) = 1,919.5146

se = √(1,919.5146/6) = 17.886
r² = 1 - SSE/[Σy² - (Σy)²/n] = 1 - 1,919.5146/[152,711 - (1,025)²/8] = 1 - .09 = .91

Testing the slope:

H0: β = 0     α = .01
Ha: β ≠ 0

Since this is a two-tailed test, α/2 = .005. df = n - 2 = 8 - 2 = 6. t.005,6 = ±3.707

sb = se/√[Σx² - (Σx)²/n] = 17.886/√[49,584 - (608)²/8] = .30783

t = (b1 - 0)/sb = (2.40107 - 0)/.30783 = 7.80

Since the observed t = 7.80 > t.005,6 = 3.707, the decision is to reject the null hypothesis.

13.54 a) The regression equation is: ŷ = 67.2 - 0.0565x

b) For every unit of increase in the value of x, the predicted value of y will decrease by .0565.

c) The t ratio for the slope is -5.50 with an associated p-value of .000. This is significant at α = .10. The t ratio is negative because the slope is negative and the numerator of the t-ratio formula equals the slope minus zero.

d) r² is .627; that is, 62.7% of the variability of y is accounted for by x. This is only a modest proportion of predictability. The standard error of the estimate is 10.32. This is best interpreted in light of the data and the magnitude of the data.

e) The F value, which tests the overall predictability of the model, is 30.25. For simple regression analysis, this equals the value of t², which is (-5.50)².

f) The negative value is not a surprise because the slope of the regression line is also
negative, indicating an inverse relationship between x and y. In addition, taking the square root of r² (.627) yields .7906, which is the magnitude of the value of r, allowing for rounding error.

13.55 The F value for overall predictability is 7.12 with an associated p-value of .0205, which is significant at α = .05. It is not significant at α = .01. The coefficient of determination is .372 with an adjusted r² of .32. This represents very modest predictability. The standard error of the estimate is 982.219, which in units of 1,000 laborers means that about 68% of the predictions are within 982,219 of the actual figures. The regression model is: Number of Union Members = 22,348.97 - 0.0524 (Labor Force). For a labor force of 100,000 (thousand, actually 100 million), substitute x = 100,000 and get a predicted value of 17,108.97 (thousand), which is actually 17,108,970 union members.

13.56 The Residual Model Diagnostics from MINITAB indicate a relatively healthy set of residuals. The histogram indicates that the error terms are generally normally distributed. This is confirmed by the nearly straight-line Normal Plot of Residuals. The I Chart indicates a relatively homogeneous set of error terms throughout the domain of x values. This is confirmed by the Residuals vs. Fits graph. This residual diagnosis indicates no assumption violations.
