BS2506 tutorial3

Published in: Education, Technology

  1. 1. Tutorial 3: Inferential Statistics, Statistical Modelling & Survey Methods (BS2506). Pairach Piboonrungroj (Champ), me@pairach.com
  2. 2. 1. House price (Again)

      Predictor (Variable)   Coefficient (B)   SE (B)
      Constant               -2.5              41.4
      X1                     1.62              0.21
      X2                     0.257             1.88
      X4                     -0.027            0.008

      Analysis of Variance (ANOVA)
      Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares
      Regression            277,895
      Residual              34,727
  3. 3. 1 (a)(i) Write out the estimated regression equation

      Predictor (Variable)   Coefficient (B)   SE (B)
      Constant               -2.5              41.4
      X1                     1.62              0.21
      X2                     0.257             1.88
      X4                     -0.027            0.008

      $\hat{Y} = -2.5 + 1.62 X_1 + 0.257 X_2 - 0.027 X_4$
  4. 4. 1 (a)(ii) Test for the significance of regression equation

      Step 1: Critical value. At 1%, $\alpha = 0.01$:
      $t_{\alpha/2,\,df} = t_{0.01/2,\,15-4} = t_{0.005,\,11} = 3.1058$

      Step 2: t-statistic.
      $t_{\beta_i} = \dfrac{\beta_i}{SE_{\beta_i}}$
  5. 5. 1 (a)(ii) Test for the significance of regression equation

      Step 1: Critical value. At 1%, $\alpha = 0.01$: $t_{0.005,\,11} = 3.1058$

      Step 2: t-statistic, $t_{\beta_i} = \dfrac{\beta_i}{SE_{\beta_i}}$

      $t_1 = \dfrac{1.62}{0.21} = 7.71 > 3.1058$: Reject H0
      $t_2 = \dfrac{0.257}{1.88} = 0.137 < 3.1058$: Do NOT reject H0
      $t_4 = \dfrac{-0.027}{0.008} = -3.375 < -3.1058$: Reject H0
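The coefficient tests on slides 4 and 5 can be checked with a few lines of Python. This is a minimal sketch, not the tutorial's own code: the dictionary and function names are illustrative, and the critical value 3.1058 is copied from the slide rather than computed from a t distribution.

```python
# Coefficients and standard errors copied from the slide's table: (B, SE B).
coefficients = {"X1": (1.62, 0.21), "X2": (0.257, 1.88), "X4": (-0.027, 0.008)}

# Two-sided critical value t_{0.005, 11}, taken from the slide (assumption:
# not recomputed here, so no statistics library is needed).
T_CRITICAL = 3.1058


def t_statistic(estimate, std_error):
    """Return the t statistic for H0: beta_i = 0."""
    return estimate / std_error


# Map each predictor to (t statistic, whether H0 is rejected at the 1% level).
results = {}
for name, (b, se) in coefficients.items():
    t = t_statistic(b, se)
    results[name] = (t, abs(t) > T_CRITICAL)

for name, (t, reject) in results.items():
    print(f"{name}: t = {t:.3f}, reject H0: {reject}")
```

Comparing the absolute value of t with the critical value covers both tails at once, which is why the negative statistic for X4 still leads to rejection.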
  6. 6. 1 (a)(iii) What are the DF for SSR & SSE?

      Predictor (Variable)   Coefficient (B)   SE (B)
      Constant               -2.5              41.4
      X1                     1.62              0.21
      X2                     0.257             1.88
      X4                     -0.027            0.008

      Analysis of Variance (ANOVA)
      Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares
      Regression            277,895          3 (p)
      Residual              34,727           11 (n-p-1)
  7. 7. 1 (a)(iv) Test for a significant relationship between X & Y

      H0: $\beta_1 = \beta_2 = \beta_4 = 0$
      H1: At least one of the coefficients does not equal 0

      Analysis of Variance (ANOVA)
      Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Statistic
      Regression            277,895          3                    92,631         29.341
      Residual              34,727           11                   3,157

      Critical value at $\alpha = 0.01$: $F_{0.01}(3, 11) = 6.217$

      Since 29.341 > 6.217, we reject the null hypothesis: there is a significant relationship between the Xs and Y.
  8. 8. 1 (a)(v) Compute the coefficient of determination and explain its meaning

      $R^2 = 1 - \dfrac{\text{Sum of Squares Error}}{\text{Sum of Squares Total}}$

      Analysis of Variance (ANOVA)
      Source of variation   Sum of Squares   Degrees of Freedom   Mean Squares   F Statistic
      Regression            277,895          3                    92,631         29.341
      Residual              34,727           11                   3,157
      TOTAL                 312,622

      $R^2 = 1 - (34{,}727 / 312{,}622)$
      $R^2 = 1 - 0.111$
      $R^2 = 0.889 = 88.9\%$
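The ANOVA quantities in parts (iii) to (v) all follow from the two sums of squares, n = 15 and p = 3. The sketch below recomputes them; the function name `anova_summary` and its return keys are illustrative choices, not part of the tutorial.

```python
def anova_summary(ssr, sse, n, p):
    """Return mean squares, F statistic and R^2 for a regression with p predictors."""
    df_regression = p
    df_residual = n - p - 1
    msr = ssr / df_regression          # mean square for regression
    mse = sse / df_residual            # mean square for residual
    return {
        "df_regression": df_regression,
        "df_residual": df_residual,
        "msr": msr,
        "mse": mse,
        "f": msr / mse,                # overall F statistic
        "r2": 1 - sse / (ssr + sse),   # SST = SSR + SSE
    }


# Sums of squares from the house price ANOVA table.
summary = anova_summary(ssr=277_895, sse=34_727, n=15, p=3)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

The F statistic would then be compared with the tabulated critical value 6.217 quoted on slide 7.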
  9. 9. 1(b)

      Model 1: $\hat{y} = 1.8 + 1.601 x_1 - 0.026 x_4$, $R^2 = 0.880$
      Model 2: $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$, $R^2 = 0.935$
      Model 3: $\hat{y} = 65.2 + 1.22 x_1 - 0.067 x_2 - 0.026 x_4 + 63.447 x_5 - 65.447 x_6$, $R^2 = 0.936$
  10. 10. 1(b)(i) Compute the adjusted coefficient of determination for the three models

      $R^2_{adj} = \bar{R}^2 = 1 - (1 - R^2)\left(\dfrac{n-1}{n-p-1}\right)$

      $\bar{R}^2_1 = 1 - (1 - 0.880)\left(\dfrac{15-1}{15-2-1}\right) = 0.86$
      $\bar{R}^2_2 = 1 - (1 - 0.935)\left(\dfrac{15-1}{15-4-1}\right) = 0.909$
      $\bar{R}^2_3 = 1 - (1 - 0.936)\left(\dfrac{15-1}{15-5-1}\right) = 0.900$
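The adjusted R-squared formula is easy to wrap in a small helper. The following is a sketch under the slide's values (n = 15; p = 2, 4, 5 predictors); the names `adjusted_r2` and `models` are illustrative.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# (R^2, number of predictors) for each model on slide 9.
models = {"Model 1": (0.880, 2), "Model 2": (0.935, 4), "Model 3": (0.936, 5)}

adjusted = {name: adjusted_r2(r2, n=15, p=p) for name, (r2, p) in models.items()}
for name, value in adjusted.items():
    print(f"{name}: adjusted R^2 = {value:.3f}")
```

Model 3 has the highest plain R-squared but a lower adjusted value than Model 2, because the penalty for the extra predictor outweighs the small gain in fit.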
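  11. 11. 1(b)(ii) Interpret the coefficients on the house type, Beta5 and Beta6 (model 2)

      $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$

      Holding the other predictors fixed, a detached house is priced £63,794 higher than a semi-detached house.
      Holding the other predictors fixed, a terrace house is priced £65,371 lower than a semi-detached house.
      (Semi-detached is the baseline category.)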
  12. 12. 1(b)(iii) At the 0.05 level of significance, determine whether model 2 is superior to model 1

      Model 1: $\hat{y} = 1.8 + 1.601 x_1 - 0.026 x_4$
      Model 2: $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$

      $F = \dfrac{R^2_{Complete} - R^2_{Restricted}}{1 - R^2_{Complete}} \times \dfrac{n-p-1}{p-q}$

      $F = \dfrac{0.935 - 0.880}{1 - 0.935} \times \dfrac{15-4-1}{4-2} = 4.231$

      $F_{\alpha,(p-q,\,n-p-1)} = F_{0.05,(4-2,\,15-4-1)} = F_{0.05,2,10} = 4.103 < 4.231$

      Significant, i.e., Model 2 is better than Model 1.
  13. 13. 1(b)(iv) At the 0.05 level of significance, determine whether model 3 is superior to model 2

      Model 2: $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$
      Model 3: $\hat{y} = 65.2 + 1.22 x_1 - 0.067 x_2 - 0.026 x_4 + 63.447 x_5 - 65.447 x_6$

      $F = \dfrac{R^2_{Complete} - R^2_{Restricted}}{1 - R^2_{Complete}} \times \dfrac{n-p-1}{p-q}$

      $F = \dfrac{0.936 - 0.935}{1 - 0.936} \times \dfrac{15-5-1}{5-4} = 0.141$

      $F_{\alpha,(p-q,\,n-p-1)} = F_{0.05,(5-4,\,15-5-1)} = F_{0.05,1,9} = 5.117 > 0.141$

      NOT significant, i.e., Model 3 is NOT better than Model 2.
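Both nested-model comparisons use the same partial F statistic in its R-squared form, so one helper covers parts (iii) and (iv). This is a sketch: the function name is illustrative, and the critical values 4.103 and 5.117 are copied from the slides rather than computed.

```python
def partial_f_from_r2(r2_complete, r2_restricted, n, p, q):
    """Partial F statistic comparing a complete model (p predictors)
    with a restricted model (q predictors) fitted to n observations."""
    numerator = (r2_complete - r2_restricted) / (p - q)
    denominator = (1 - r2_complete) / (n - p - 1)
    return numerator / denominator


# Model 2 (complete, p = 4) against Model 1 (restricted, q = 2).
f_model2_vs_1 = partial_f_from_r2(0.935, 0.880, n=15, p=4, q=2)
# Model 3 (complete, p = 5) against Model 2 (restricted, q = 4).
f_model3_vs_2 = partial_f_from_r2(0.936, 0.935, n=15, p=5, q=4)

# Critical values as quoted on the slides (assumption: read from F tables).
print(f"Model 2 vs 1: F = {f_model2_vs_1:.3f}, significant: {f_model2_vs_1 > 4.103}")
print(f"Model 3 vs 2: F = {f_model3_vs_2:.3f}, significant: {f_model3_vs_2 > 5.117}")
```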
  14. 14. 1(b)(v) From model 2, estimate the price of a 5-year-old detached house with 250 square meters

      $\hat{y} = 64.05 + 1.23 x_1 - 0.026 x_4 + 63.794 x_5 - 65.371 x_6$
      $\hat{y} = 64.05 + 1.23 \times 250 - 0.026 (250 \times 5) + 63.794 \times 1 - 65.371 \times 0$
      $\hat{y} = 402.844$, i.e. £402,844
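The prediction can be reproduced with a short function. This sketch assumes, based on the substitution shown on the slide, that x1 is floor area in square meters, x4 is area multiplied by age in years, x5 and x6 are 0/1 indicators for detached and terrace houses, and that prices are in £000; the function and parameter names are hypothetical.

```python
def predict_model2(size_m2, age_years, detached, terrace):
    """Predicted price in thousands of pounds under Model 2.

    Assumptions (inferred from the slide's substitution, not stated in it):
    x1 = size_m2, x4 = size_m2 * age_years, x5 = detached, x6 = terrace.
    """
    x4 = size_m2 * age_years
    return (64.05 + 1.23 * size_m2 - 0.026 * x4
            + 63.794 * detached - 65.371 * terrace)


# A 5-year-old detached house of 250 square meters.
price_thousands = predict_model2(size_m2=250, age_years=5, detached=1, terrace=0)
print(f"Predicted price: £{price_thousands * 1000:,.0f}")
```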
  15. 15. 2. Advertising expenditure

      X, Advertising (£000)   Y, Sales (£000)
      5.5                     90
      2.0                     40
      3.2                     55
      6.0                     95
      3.8                     70
      4.4                     80
      6.0                     88
      5.0                     85
      6.5                     92
      7.0                     91

      R square                       0.97
      Adjusted R square              0.96
      Standard error of regression   3.37

      Analysis of variance
                   DF   Sum Square   Mean Square
      Regression        2,904
      Residual          80.0

      Variables in the Equation
      Variable        B        SE B
      Advert          31.79    4.48
      Advert-square   -2.30    0.485
      (constant)      -17.22   9.65
  16. 16. 2 (a) State the regression equation for the curvilinear model.

      Variables in the Equation
      Variable        B        SE B
      Advert          31.79    4.48
      Advert-square   -2.30    0.485
      (constant)      -17.22   9.65

      $\hat{Y}_t = \beta_0 + \beta_1 X + \beta_2 X^2$
      $\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$
  17. 17. 2 (b) Predict the monthly sales (in pounds) for a month with total advertising expenditure of £6,000

      $\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$
      With X = 6:
      $\hat{Y}_t = -17.22 + 31.79(6) - 2.30(6)^2 = 90.720$
      Sales = 90.720 × 1,000 = £90,720
  18. 18. 2 (c) Determine whether there is a significant relationship between sales and advertising expenditure at the 0.01 level of significance

      $\hat{Y}_t = \beta_0 + \beta_1 X + \beta_2 X^2$
      H0: $\beta_1 = \beta_2 = 0$
      H1: At least one of the coefficients does not equal 0

      Analysis of variance
                   DF   Sum Square   Mean Square   F
      Regression   2    2,904        1,452         127.05
      Residual     7    80.0         11.428

      Critical value at $\alpha = 0.01$: $F_{0.01}(2, 7) = 9.547$

      Since 127.05 > 9.547, we reject the null hypothesis: there is a curvilinear relationship between sales and advertising expenditure.
  19. 19. 2 (d) Fit a linear model to the data and calculate SSE for this model

      $\hat{\beta}_1 = \dfrac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2}$
      $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
  20. 20. 2 (d) Fit a linear model to the data and calculate SSE for this model

      ID   X (Advertising)   Y (Sales)
      1    5.5               90
      2    2                 40
      3    3.2               55
      4    6                 95
      5    3.8               70
      6    4.4               80
      7    6                 88
      8    5                 85
      9    6.5               92
      10   7                 91
  21. 21. 2 (d) Fit a linear model to the data and calculate SSE for this model

      ID        X (Advertising)   Y (Sales)   xy      x^2      y^2
      1         5.5               90          495     30.25    8100
      2         2                 40          80      4        1600
      3         3.2               55          176     10.24    3025
      4         6                 95          570     36       9025
      5         3.8               70          266     14.44    4900
      6         4.4               80          352     19.36    6400
      7         6                 88          528     36       7744
      8         5                 85          425     25       7225
      9         6.5               92          598     42.25    8464
      10        7                 91          637     49       8281
      Sum       49.4              786         4127    266.54   64764
      Average   4.94              78.6        412.7   26.654   6476.4
  22. 22. 2 (d) Fit a linear model to the data and calculate SSE for this model

      $\hat{\beta}_1 = \dfrac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \dfrac{4127 - 10(4.94)(78.6)}{266.54 - 10(4.94)^2} = 10.85$
      $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 78.6 - 10.85(4.94) = 25.0$
      $\hat{y} = 25.0 + 10.85 x$
  24. 24. 2 (d) Fit a linear model to the data and calculate SSE for this model

      $\hat{Y}_t = 25 + 10.85 X$

      ID        X (Advertising)   Y (Sales)   xy      x^2      y^2      Predicted Y
      1         5.5               90          495     30.25    8100     84.68
      2         2                 40          80      4        1600     46.70
      3         3.2               55          176     10.24    3025     59.72
      4         6                 95          570     36       9025     90.10
      5         3.8               70          266     14.44    4900     66.23
      6         4.4               80          352     19.36    6400     72.74
      7         6                 88          528     36       7744     90.10
      8         5                 85          425     25       7225     79.25
      9         6.5               92          598     42.25    8464     95.53
      10        7                 91          637     49       8281     100.95
      Sum       49.4              786         4127    266.54   64764
      Average   4.94              78.6        412.7   26.654   6476.4
  25. 25. 2 (d) Fit a linear model to the data and calculate SSE for this model

      ID        X (Advertising)   Y (Sales)   xy      x^2      y^2      Predicted Y   Squared Error
      1         5.5               90          495     30.25    8100     84.68         28.35
      2         2                 40          80      4        1600     46.70         44.92
      3         3.2               55          176     10.24    3025     59.72         22.29
      4         6                 95          570     36       9025     90.10         24.00
      5         3.8               70          266     14.44    4900     66.23         14.20
      6         4.4               80          352     19.36    6400     72.74         52.69
      7         6                 88          528     36       7744     90.10         4.41
      8         5                 85          425     25       7225     79.25         33.05
      9         6.5               92          598     42.25    8464     95.53         12.43
      10        7                 91          637     49       8281     100.95        99.01
      Sum       49.4              786         4127    266.54   64764
      Average   4.94              78.6        412.7   26.654   6476.4
  26. 26. 2 (d) Fit a linear model to the data and calculate SSE for this model

      ID        X (Advertising)   Y (Sales)   xy      x^2      y^2      Predicted Y   Squared Error
      1         5.5               90          495     30.25    8100     84.68         28.35
      2         2                 40          80      4        1600     46.70         44.92
      3         3.2               55          176     10.24    3025     59.72         22.29
      4         6                 95          570     36       9025     90.10         24.00
      5         3.8               70          266     14.44    4900     66.23         14.20
      6         4.4               80          352     19.36    6400     72.74         52.69
      7         6                 88          528     36       7744     90.10         4.41
      8         5                 85          425     25       7225     79.25         33.05
      9         6.5               92          598     42.25    8464     95.53         12.43
      10        7                 91          637     49       8281     100.95        99.01
      Sum       49.4              786         4127    266.54   64764                  335.36
      Average   4.94              78.6        412.7   26.654   6476.4

      SSE (linear model) = 335.36
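The hand calculation in part (d) can be reproduced directly from the ten data points. The sketch below uses the same closed-form least-squares formulas as slide 19; the function names are illustrative. Because it keeps the unrounded slope and intercept, individual squared errors may differ from the table in the last digit.

```python
# Advertising expenditure (X) and sales (Y), both in thousands of pounds.
advertising = [5.5, 2.0, 3.2, 6.0, 3.8, 4.4, 6.0, 5.0, 6.5, 7.0]
sales = [90, 40, 55, 95, 70, 80, 88, 85, 92, 91]


def fit_simple_linear(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
    sxx = sum(a * a for a in x) - n * x_bar ** 2
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Return SSE, the sum of squared residuals of the fitted line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


b0, b1 = fit_simple_linear(advertising, sales)
sse_linear = sum_squared_errors(advertising, sales, b0, b1)
print(f"y-hat = {b0:.2f} + {b1:.2f} x, SSE = {sse_linear:.2f}")
```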
  27. 27. 2 (e) At the 0.01 level of significance, determine whether the curvilinear model is superior to the linear regression model

      Curvilinear model: $\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$
      Linear regression model: $\hat{Y}_t = 25 + 10.85 X$

      $F = \dfrac{SSE_{Linear} - SSE_{Curvilinear}}{SSE_{Curvilinear}} \times \dfrac{n-p-1}{p-q}$

      $F = \dfrac{335 - 80}{80} \times \dfrac{10-2-1}{2-1} = 22.3125$

      $F_{\alpha,(p-q,\,n-p-1)} = F_{0.01,(2-1,\,10-2-1)} = F_{0.01,1,7} = 12.25 < 22.3$

      Significant, i.e., the curvilinear effect makes a significant contribution and should be included in the model.
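The same comparison can be written in terms of the two error sums of squares. This is a minimal sketch with an illustrative function name; it uses the rounded SSE values from the slide (335 and 80) and the slide's critical value 12.25 rather than computing it.

```python
def partial_f_from_sse(sse_restricted, sse_complete, n, p, q):
    """Partial F statistic comparing a complete model (p predictors)
    with a restricted model (q predictors) via their SSE values."""
    numerator = (sse_restricted - sse_complete) / (p - q)
    denominator = sse_complete / (n - p - 1)
    return numerator / denominator


# Linear model is the restricted one (q = 1); curvilinear is complete (p = 2).
f_curvilinear = partial_f_from_sse(sse_restricted=335, sse_complete=80.0,
                                   n=10, p=2, q=1)

F_CRITICAL = 12.25  # F_{0.01, 1, 7} as quoted on the slide
print(f"F = {f_curvilinear:.4f}, significant: {f_curvilinear > F_CRITICAL}")
```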
  28. 28. 2 (f) Draw a scatter diagram between the sales & advertising expenditure. [Figure: scatter plot of Sales (vertical axis, 0 to 100) against advertising expenditure (horizontal axis, 0 to 8); series: Observed]
  29. 29. 2 (f) Sketch the linear regression $\hat{Y}_t = 25 + 10.85 X$. [Figure: the same scatter plot; series: Observed, Linear Regression]
  30. 30. 2 (f) Sketch the quadratic regression $\hat{Y}_t = -17.22 + 31.79 X - 2.30 X^2$. [Figure: the same scatter plot; series: Observed, Linear Regression, Quadratic Regression]
  31. 31. Thank you. Download these slides at www.pairach.com/teaching. Q&A
