Research Methods William G. Zikmund, Ch23

Research Methods
William G. Zikmund

Published in: Business

Research Methods William G. Zikmund, Ch23

  1. 1. Business Research Methods, William G. Zikmund. Chapter 23, Bivariate Analysis: Measures of Associations
  2. 2. Measures of Association • A general term that refers to a number of bivariate statistical techniques used to measure the strength of a relationship between two variables.
  3. 3. Relationships Among Variables • Correlation analysis • Bivariate regression analysis
  4. 4. Type of Measurement: Interval and Ratio Scales • Measure of Association: Correlation Coefficient, Bivariate Regression
  5. 5. Type of Measurement: Ordinal Scales • Measure of Association: Chi-Square, Rank Correlation
  6. 6. Type of Measurement: Nominal • Measure of Association: Chi-Square, Phi Coefficient, Contingency Coefficient
  7. 7. Correlation Coefficient • A statistical measure of the covariation or association between two variables. • Are dollar sales associated with advertising dollar expenditures?
  8. 8. The correlation coefficient for two variables, X and Y, is $r_{xy}$.
  9. 9. Correlation Coefficient • r • r ranges from +1 to -1 • r = +1: a perfect positive linear relationship • r = -1: a perfect negative linear relationship • r = 0 indicates no correlation
  10. 10. Simple Correlation Coefficient: $r_{xy} = r_{yx} = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$
  11. 11. Simple Correlation Coefficient: $r_{xy} = r_{yx} = \dfrac{\sigma_{xy}}{\sqrt{\sigma_x^2 \sigma_y^2}}$
  12. 12. Simple Correlation Coefficient, Alternative Method: $\sigma_x^2$ = variance of X • $\sigma_y^2$ = variance of Y • $\sigma_{xy}$ = covariance of X and Y
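
The two forms of the correlation coefficient on slides 10 to 12 can be sketched in Python as follows. This is a minimal illustration, not the textbook's own procedure: the function names and the advertising/sales figures are made up for the example, and the variances and covariance are taken as population moments (divided by n), which is an assumption. The divisor cancels in the ratio, so both forms give the same r.

```python
import math

def pearson_r(xs, ys):
    """Deviation-score form (slide 10)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def pearson_r_from_moments(xs, ys):
    """Covariance over the root of the product of variances (slides 11-12)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population moments (divide by n); an assumption for this sketch.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical data, invented purely for illustration.
advertising = [10.0, 12.0, 15.0, 17.0, 20.0]
sales = [101.0, 118.0, 130.0, 155.0, 160.0]

r_deviation = pearson_r(advertising, sales)
r_moments = pearson_r_from_moments(advertising, sales)
print(f"r (deviation form) = {r_deviation:.4f}")
print(f"r (moment form)    = {r_moments:.4f}")
```

Swapping the two arguments leaves the result unchanged, which is the $r_{xy} = r_{yx}$ symmetry stated on the slides.
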
  13. 13. Correlation Patterns: No Correlation [scatter plot of Y against X]
  14. 14. Correlation Patterns: Perfect Negative Correlation, r = -1.0 [scatter plot of Y against X]
  15. 15. Correlation Patterns: A High Positive Correlation, r = +.98 [scatter plot of Y against X]
  16. 16. Calculation of r: $r = \dfrac{-6.3389}{\sqrt{(17.837)(5.589)}} = \dfrac{-6.3389}{\sqrt{99.712}} = -.635$ (p. 629)
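
The arithmetic on slide 16 can be checked with a few lines of Python. The slide does not say which of the two variances belongs to X and which to Y, so the names `var_first` and `var_second` are placeholders; only their product matters here.

```python
import math

# Figures copied from slide 16.
cov_xy = -6.3389      # covariance of X and Y (numerator)
var_first = 17.837    # one of the two variances (assignment to X or Y assumed)
var_second = 5.589    # the other variance

r = cov_xy / math.sqrt(var_first * var_second)
print(f"r = {r:.3f}")
```

The negative sign of r comes entirely from the covariance, since the square root in the denominator is always positive.
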
  17. 17. Coefficient of Determination: $r^2 = \dfrac{\text{Explained variance}}{\text{Total variance}}$
  18. 18. Correlation Does Not Mean Causation • High correlation • Rooster’s crow and the rising of the sun – Rooster does not cause the sun to rise. • Teachers’ salaries and the consumption of liquor – Covary because they are both influenced by a third variable
  19. 19. Correlation Matrix • The standard form for reporting correlational results.
  20. 20. Correlation Matrix
             Var1   Var2   Var3
      Var1   1.0    0.45   0.31
      Var2   0.45   1.0    0.10
      Var3   0.31   0.10   1.0
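
As a small illustration of how such a matrix is read, the sketch below stores the values from slide 20 and checks the two properties every correlation matrix has: ones on the diagonal and symmetry about it. The helper names `is_well_formed` and `strongest_pair` are hypothetical and exist only for this example.

```python
labels = ["Var1", "Var2", "Var3"]
corr = [
    [1.0, 0.45, 0.31],
    [0.45, 1.0, 0.10],
    [0.31, 0.10, 1.0],
]

def is_well_formed(matrix, tol=1e-12):
    """True if the matrix is square, symmetric, has a unit diagonal,
    and every entry lies in [-1, 1]."""
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        return False
    for i in range(size):
        if abs(matrix[i][i] - 1.0) > tol:
            return False
        for j in range(size):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False
            if abs(matrix[i][j]) > 1.0 + tol:
                return False
    return True

def strongest_pair(matrix, names):
    """Return (name_i, name_j, r) for the largest |r| above the diagonal."""
    best = None
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            if best is None or abs(matrix[i][j]) > abs(best[2]):
                best = (names[i], names[j], matrix[i][j])
    return best

print(is_well_formed(corr))
print(strongest_pair(corr, labels))
```

Because the matrix is symmetric, only the entries above (or below) the diagonal carry information.
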
  21. 21. Walkup’s First Laws of Statistics • Law No. 1 – Everything correlates with everything, especially when the same individual defines the variables to be correlated. • Law No. 2 – It won’t help very much to find a good correlation between the variable you are interested in and some other variable that you don’t understand any better.
  22. 22. Walkup’s First Laws of Statistics • Law No. 3 – Unless you can think of a logical reason why two variables should be connected as cause and effect, it doesn’t help much to find a correlation between them. In Columbus, Ohio, the mean monthly rainfall correlates very nicely with the number of letters in the names of the months!
  23. 23. Regression • Dictionary definition: going or moving backward; going back to previous conditions • Tall men’s sons
  24. 24. Bivariate Regression • A measure of linear association that investigates a straight-line relationship • Useful in forecasting
  25. 25. Bivariate Linear Regression • A measure of linear association that investigates a straight-line relationship • Y = a + bX, where • Y is the dependent variable • X is the independent variable • a and b are two constants to be estimated
  26. 26. Y intercept • a • An intercepted segment of a line • The point at which a regression line intercepts the Y-axis
  27. 27. Slope • b • The inclination of a regression line as compared to a base line • Rise over run • Δ: notation for “a change in”
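
The roles of the intercept a and the slope b in Y = a + bX can be made concrete with a short sketch. The constants below are arbitrary values chosen for illustration, not estimates from the chapter's data.

```python
def line_value(intercept, slope, x):
    """Evaluate Y = a + bX."""
    return intercept + slope * x

def rise_over_run(x1, y1, x2, y2):
    """Slope between two points: change in Y divided by change in X."""
    return (y2 - y1) / (x2 - x1)

# Hypothetical constants for the example.
a, b = 2.0, 0.5

x1, x2 = 10.0, 14.0
y1 = line_value(a, b, x1)
y2 = line_value(a, b, x2)

slope_check = rise_over_run(x1, y1, x2, y2)   # recovers b
intercept_check = line_value(a, b, 0.0)       # value of Y where X = 0
print(slope_check, intercept_check)
```

Any two distinct points on the same straight line give the same rise over run, which is why a single number b describes the whole line.
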
  28. 28. Scatter Diagram and Eyeball Forecast [scatter plot of Y against X with two hand-drawn lines, “My line” and “Your line”]
  29. 29. Regression Line and Slope [plot of Y against X showing the line $\hat{Y} = \hat{a} + \hat{\beta}X$ and its slope $\Delta\hat{Y} / \Delta X$]
  30. 30. Least-Squares Regression Line [plot of Y against X marking the actual Y and Y “hat” for Dealer 7 and for Dealer 3]
  31. 31. Scatter Diagram of Explained and Unexplained Variation [plot of Y against X marking the total deviation, the deviation explained by the regression, and the deviation not explained]
  32. 32. The Least-Squares Method • Uses the criterion of attempting to make the least amount of total error in prediction of Y from X. More technically, the procedure used in the least-squares method generates a straight line that minimizes the sum of squared deviations of the actual values from this predicted regression line.
  33. 33. The Least-Squares Method • A relatively simple mathematical technique that ensures that the straight line will most closely represent the relationship between X and Y.
  34. 34. Regression, Least-Squares Method: $\sum_{i=1}^{n} e_i^2$ is minimum
  35. 35. $e_i = Y_i - \hat{Y}_i$ (the “residual”) • $Y_i$ = actual value of the dependent variable • $\hat{Y}_i$ = estimated value of the dependent variable (Y hat) • n = number of observations • i = number of the observation
  36. 36. The Logic behind the Least-Squares Technique • No straight line can completely represent every dot in the scatter diagram • There will be a discrepancy between most of the actual scores (each dot) and the predicted score • Uses the criterion of attempting to make the least amount of total error in prediction of Y from X
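
The least-squares criterion gives a way to score any candidate line, including eyeballed ones. The sketch below computes the sum of squared residuals for two hand-picked lines; the data points and both candidate lines are invented for illustration and are not the dealer data from the chapter.

```python
def sum_squared_errors(intercept, slope, xs, ys):
    """Sum of squared residuals e_i = Y_i - Y_hat_i for the line a + bX."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical observations.
xs = [80.0, 100.0, 120.0, 140.0, 160.0]
ys = [75.0, 88.0, 95.0, 110.0, 118.0]

# Two eyeballed candidates as (intercept, slope) pairs; both are assumptions.
my_line = (30.0, 0.55)
your_line = (40.0, 0.45)

sse_my = sum_squared_errors(*my_line, xs, ys)
sse_your = sum_squared_errors(*your_line, xs, ys)

better = "my line" if sse_my < sse_your else "your line"
print(f"SSE my line = {sse_my:.2f}, SSE your line = {sse_your:.2f}, better: {better}")
```

Squaring the residuals keeps positive and negative errors from cancelling and penalizes large misses more heavily than small ones.
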
  37. 37. Bivariate Regression: $\hat{a} = \bar{Y} - \hat{\beta}\bar{X}$
  38. 38. Bivariate Regression: $\hat{\beta} = \dfrac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}$
  39. 39. $\hat{\beta}$ = estimated slope of the line (the “regression coefficient”) • $\hat{a}$ = estimated intercept of the Y axis • Y = dependent variable • $\bar{Y}$ = mean of the dependent variable • X = independent variable • $\bar{X}$ = mean of the independent variable • n = number of observations
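
The slope and intercept formulas on slides 37 and 38 translate directly into code. This is a sketch on made-up data (not the chapter's dealer data); `fit_line` and `sse` are illustrative names. It also checks two consequences of the least-squares fit: the residuals sum to zero, and nudging the fitted slope makes the sum of squared errors larger.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) using the raw-sums formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x_sq = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x_sq - sum_x ** 2)
    intercept = sum_y / n - slope * (sum_x / n)   # a = Y-bar - beta * X-bar
    return intercept, slope

def sse(intercept, slope, xs, ys):
    """Sum of squared residuals for the line a + bX."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical observations.
xs = [80.0, 100.0, 120.0, 140.0, 160.0]
ys = [75.0, 88.0, 95.0, 110.0, 118.0]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

sse_fit = sse(intercept, slope, xs, ys)
sse_nudged = sse(intercept, slope + 0.01, xs, ys)
print(f"a = {intercept:.2f}, b = {slope:.2f}, SSE = {sse_fit:.2f}")
```

The raw-sums form is convenient for hand calculation because it needs only five running totals: n, ΣX, ΣY, ΣXY, and ΣX².
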
  40. 40. $\hat{\beta} = \dfrac{15(193{,}345) - 2{,}806{,}875}{15(245{,}759) - 3{,}515{,}625} = \dfrac{2{,}900{,}175 - 2{,}806{,}875}{3{,}686{,}385 - 3{,}515{,}625} = \dfrac{93{,}300}{170{,}760} = .54638$
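
The slide-40 arithmetic can be reproduced step by step. The figures are copied from the slide; the variable names are illustrative, and the slide gives the products (ΣX)(ΣY) and (ΣX)² directly rather than the individual sums, so the sketch does the same.

```python
n = 15                          # number of observations
sum_xy = 193_345                # sum of X*Y
sum_x_sq = 245_759              # sum of X squared
sum_x_times_sum_y = 2_806_875   # (sum of X) * (sum of Y), as given on the slide
sum_x_all_squared = 3_515_625   # (sum of X) squared, as given on the slide

numerator = n * sum_xy - sum_x_times_sum_y
denominator = n * sum_x_sq - sum_x_all_squared
slope = numerator / denominator

print(numerator, denominator, round(slope, 5))
```
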
  41. 41. $\hat{a} = 99.8 - .54638(125) = 99.8 - 68.3 = 31.5$
  42. 42. $\hat{a} = 99.8 - .54638(125) = 99.8 - 68.3 = 31.5$
  43. 43. $\hat{Y} = 31.5 + .546(X) = 31.5 + .546(89) = 31.5 + 48.6 = 80.1$
  44. 44. $\hat{Y} = 31.5 + .546(X) = 31.5 + .546(89) = 31.5 + 48.6 = 80.1$
  45. 45. Dealer 7 (actual Y value = 129): $\hat{Y}_7 = 31.5 + .546(165) = 121.6$ • Dealer 3 (actual Y value = 80): $\hat{Y}_3 = 31.5 + .546(95) = 83.4$
  46. 46. $e_i = Y_9 - \hat{Y}_9 = 97 - 96.5 = 0.5$
  47. 47. Dealer 7 (actual Y value = 129): $\hat{Y}_7 = 31.5 + .546(165) = 121.6$ • Dealer 3 (actual Y value = 80): $\hat{Y}_3 = 31.5 + .546(95) = 83.4$
  48. 48. $e_i = Y_9 - \hat{Y}_9 = 97 - 96.5 = 0.5$
  49. 49. $\hat{Y}_9 = 31.5 + .546(119)$
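
The worked example on slides 41 to 49 can be traced in a few lines: compute the intercept from the means, then use the rounded equation to predict Y and form residuals. All numbers come from the slides; the dictionary layout and the `predict` helper are illustrative choices. The residuals for Dealers 7 and 3 are not shown on the slides and are simply the same subtraction applied to their values.

```python
# Intercept from the means and the unrounded slope (slide 41).
mean_y = 99.8
mean_x = 125
slope_unrounded = 0.54638
intercept_from_means = mean_y - slope_unrounded * mean_x

# Rounded coefficients as used on slides 43 to 49.
INTERCEPT = 31.5
SLOPE = 0.546

def predict(x):
    """Y-hat from the rounded regression equation."""
    return INTERCEPT + SLOPE * x

# dealer number -> (X value, actual Y value), taken from slides 45, 46 and 49
dealers = {7: (165, 129), 3: (95, 80), 9: (119, 97)}

for dealer, (x, actual) in dealers.items():
    y_hat = predict(x)
    residual = actual - y_hat
    print(f"Dealer {dealer}: Y-hat = {y_hat:.1f}, residual = {residual:.1f}")
```

A positive residual means the line under-predicts that dealer; a negative one means it over-predicts.
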
  50. 50. F-Test (Regression) • A procedure to determine whether there is more variability explained by the regression or unexplained by the regression. • Analysis of variance summary table
  51. 51. Total Deviation can be Partitioned into Two Parts • Total deviation equals • Deviation explained by the regression plus • Deviation unexplained by the regression
  52. 52. “We are always acting on what has just finished happening. It happened at least 1/30th of a second ago. We think we’re in the present, but we aren’t. The present we know is only a movie of the past.” Tom Wolfe in The Electric Kool-Aid Acid Test
  53. 53. Partitioning the Variance: $(Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$ • Total deviation = Deviation explained by the regression + Deviation unexplained by the regression (residual error)
  54. 54. $\bar{Y}$ = mean of the total group • $\hat{Y}$ = value predicted with regression equation • $Y_i$ = actual value
  55. 55. $\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$ • Total variation = Explained variation + Unexplained variation (residual)
  56. 56. Sum of Squares: $SS_t = SS_r + SS_e$
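
The identity $SS_t = SS_r + SS_e$ holds for a least-squares line and can be verified numerically. The sketch below uses invented data and fits the line with the deviation-score form of the slope (cross-products over the sum of squares of X), which is algebraically equivalent to the raw-sums formula; the names are illustrative.

```python
# Hypothetical observations.
xs = [80.0, 100.0, 120.0, 140.0, 160.0]
ys = [75.0, 88.0, 95.0, 110.0, 118.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares fit in deviation-score form.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in xs]

ss_total = sum((y - mean_y) ** 2 for y in ys)                      # SSt
ss_regression = sum((p - mean_y) ** 2 for p in predicted)          # SSr
ss_error = sum((y - p) ** 2 for y, p in zip(ys, predicted))        # SSe

print(f"SSt = {ss_total:.2f}")
print(f"SSr + SSe = {ss_regression + ss_error:.2f}")
```

The partition is exact only for the least-squares line; for an arbitrary line the cross-product term between the explained and unexplained deviations does not vanish.
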
  57. 57. Coefficient of Determination $r^2$ • The proportion of variance in Y that is explained by X (or vice versa) • A measure obtained by squaring the correlation coefficient; that proportion of the total variance of a variable that is accounted for by knowing the value of another variable
  58. 58. Coefficient of Determination: $r^2 = \dfrac{SS_r}{SS_t} = 1 - \dfrac{SS_e}{SS_t}$
  59. 59. Source of Variation • Explained by Regression • Degrees of Freedom – k-1, where k = number of estimated constants (variables) • Sum of Squares – $SS_r$ • Mean Squared – $SS_r/(k-1)$
  60. 60. Source of Variation • Unexplained by Regression • Degrees of Freedom – n-k, where n = number of observations • Sum of Squares – $SS_e$ • Mean Squared – $SS_e/(n-k)$
  61. 61. $r^2$ in the Example: $r^2 = \dfrac{3{,}398.49}{3{,}882.4} = .875$
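
The sums of squares on slide 61 are enough to fill in the analysis of variance table from slides 59 and 60. In the sketch below, the two sums of squares are from the slide. Taking n = 15 from the slope calculation on slide 40 and k = 2 (intercept and slope) are assumptions, and the F ratio is computed as mean square regression over mean square error, which the slides describe but do not evaluate.

```python
ss_regression = 3398.49   # SSr, from slide 61
ss_total = 3882.4         # SSt, from slide 61
n = 15                    # observations (assumed to match slide 40)
k = 2                     # estimated constants: intercept and slope (assumed)

ss_error = ss_total - ss_regression          # SSe = SSt - SSr

r_squared = ss_regression / ss_total
r_squared_alt = 1 - ss_error / ss_total      # same value by the partition

ms_regression = ss_regression / (k - 1)      # mean square, explained
ms_error = ss_error / (n - k)                # mean square, unexplained
f_ratio = ms_regression / ms_error

print(f"r^2 = {r_squared:.3f}")
print(f"F({k - 1}, {n - k}) = {f_ratio:.2f}")
```

The computed F would then be compared with a critical value from an F table at k-1 and n-k degrees of freedom; that lookup is outside this sketch.
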
  62. 62. Multiple Regression • Extension of Bivariate Regression • Multidimensional when three or more variables are involved • Simultaneously investigates the effect of two or more variables on a single dependent variable • Discussed in Chapter 24
  63. 63. Correlation: Player Salary and Ticket Price • Correlation Coefficient, r = .75 [chart of change in ticket price and change in player salary, 1995 to 2001]
