Analysis -- continued <ul><li>Therefore, whatever statistical procedure the manager intends to use, he doesn’t need to wor...
Chi-Square Test for Independence
Background Information <ul><li>Big Office, a chain of large office supply stores, sells an extensive line of desktop and l...
Background Information -- continued <ul><li>Because of limitations in its information system, Big Office does not have the...
PCDEMAND.XLS <ul><li>Each day’s demand for each type of computer is categorized as Low, Medium-Low, Medium-High, or High. ...
PCDEMAND.XLS <ul><li>The individual counts show, for example, that demand was high for both desktops and laptops on 11 of ...
Chi-Square Test for Independence <ul><li>This test is used in situations where a population is categorized in two differen...
Chi-Square Test for Independence -- continued <ul><li>In this example however, we might suspect that these attributes are ...
Chi-Square Test for Independence -- continued <ul><li>The null hypothesis for the test is that the two attributes are inde...
Testing the Data <ul><li>The idea of the test is to compare actual counts in the table with what we would expect them to b...
Testing the Data -- continued <ul><li>What do we expect under independence? </li></ul><ul><li>The totals in row 9 indicate...
Testing the Data -- continued <ul><li>The probability estimates of low desktops from row 5, for example is 4/43 = 0.093. S...
Testing the Data -- continued <ul><li>We can perform the calculations for the test easily with StatPro. </li></ul><ul><li>...
Testing the Data -- continued <ul><li>StatPro then provides the message shown here and it appends a sheet name ChiSqIndep ...
Testing the Data -- continued
Analysis <ul><li>We interpret the  p -value of the test, 0.045, in the usual way. Specifically, we can reject the null hyp...
Analysis -- continued <ul><li>The two tables in rows 8-19 are especially helpful. If the demands  were  independent, the r...
Lecture6 Applied Econometrics and Economic Modeling


  1. 1. Concepts in Hypothesis Testing
  2. 2. Background Information <ul><li>The manager of Pepperoni Pizza Restaurant has recently begun experimenting with a new method of baking its pepperoni pizzas. </li></ul><ul><li>He believes that the new method produces a better-tasting pizza, but he would like to base a decision on whether to switch from the old method to the new method on customer reactions. </li></ul><ul><li>Therefore he performs an experiment. </li></ul>
  3. 3. The Experiment <ul><li>For 100 randomly selected customers who order a pepperoni pizza for home delivery, he includes both an old style and a free new style pizza in the order. </li></ul><ul><li>All he asks is that these customers rate the difference between pizzas on a -10 to +10 scale, where -10 means they strongly favor the old style, +10 means they strongly favor the new style, and 0 means they are indifferent between the two styles. </li></ul><ul><li>Once he gets the ratings from the customers, how should he proceed? </li></ul>
  4. 4. Hypothesis Testing <ul><li>This example’s goal is to explain hypothesis testing concepts. We are not implying that the manager would, or should, use a hypothesis testing procedure to decide whether to switch methods. </li></ul><ul><li>First, hypothesis testing does not take costs into account. In this example, if the new method is more costly, that extra cost would be ignored by hypothesis testing. </li></ul><ul><li>Second, even if costs of the two pizza-making methods are equivalent, the manager might base his decision on a simple point estimate and possibly a confidence interval. </li></ul>
  5. 5. Null and Alternative Hypotheses <ul><li>Usually, the null hypothesis is labeled H 0 and the alternative hypothesis is labeled H a . </li></ul><ul><li>The null and alternative hypotheses divide all possibilities into two nonoverlapping sets, exactly one of which must be true. </li></ul><ul><li>Traditionally, hypothesis testing has been phrased as a decision-making problem, where an analyst decides either to accept the null hypothesis or reject it, based on the sample evidence. </li></ul>
  6. 6. One-Tailed Versus Two-Tailed Tests <ul><li>The form of the alternative hypothesis can be either one-tailed or two-tailed , depending on what the analyst is trying to prove. </li></ul><ul><li>A one-tailed hypothesis is one where the only sample results which can lead to rejection of the null hypothesis are those in a particular direction; in the pizza example, those where the sample mean rating is positive. </li></ul><ul><li>A two-tailed test is one where results in either of two directions can lead to rejection of the null hypothesis. </li></ul>
  7. 7. One-Tailed Versus Two-Tailed Tests -- continued <ul><li>Once the hypotheses are set up, it is easy to detect whether the test is one-tailed or two-tailed. </li></ul><ul><li>One-tailed alternatives are phrased in terms of “>” or “<” whereas two-tailed alternatives are phrased in terms of “≠”. </li></ul><ul><li>The real question is whether to set up hypotheses for a particular problem as one-tailed or two-tailed. </li></ul><ul><li>There is no statistical answer to this question. It depends entirely on what we are trying to prove. </li></ul>
  8. 8. Types of Errors <ul><li>Whether one decides to accept or reject the null hypothesis, it might be the wrong decision. </li></ul><ul><li>One might incorrectly reject the null hypothesis when it is true or incorrectly accept the null hypothesis when it is false. </li></ul><ul><li>These errors are called type I and type II errors. </li></ul><ul><li>In general, we commit a type I error when we incorrectly reject a null hypothesis that is true. We commit a type II error when we incorrectly accept a null hypothesis that is false. </li></ul>
  9. 9. Types of Errors -- continued <ul><li>These ideas appear graphically below. </li></ul><ul><li>While these errors seem to be equally serious, type I errors have traditionally been regarded as the more serious of the two. </li></ul><ul><li>Therefore, the hypothesis-testing procedure favors caution in terms of rejecting the null hypothesis. </li></ul>
  10. 10. Significance Level and Rejection Region <ul><li>The real question is how strong the evidence in favor of the alternative hypothesis must be to reject the null hypothesis. </li></ul><ul><li>The analyst determines the probability of a type I error that he is willing to tolerate. The value is denoted by α and is most commonly equal to 0.05, although α = 0.01 and α = 0.10 are also frequently used. </li></ul><ul><li>The value of α is called the significance level of the test. </li></ul>
  11. 11. Significance Level and Rejection Region -- continued <ul><li>Then, given the value of α , we use statistical theory to determine the rejection region. </li></ul><ul><li>If the sample falls into this region we reject the null hypothesis; otherwise, we accept it. </li></ul><ul><li>Sample evidence that falls into the rejection region is called statistically significant at the α level . </li></ul>
  12. 12. Significance from p -values <ul><li>This approach is currently more popular than the significance level and rejection region approach. </li></ul><ul><li>This approach is to avoid the use of the α level and instead simply report “how significant” the sample evidence is. </li></ul><ul><li>We do this by means of the p- value . The p -value is the probability of seeing a random sample at least as extreme as the sample observed, given that the null hypothesis is true. </li></ul>
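As a rough illustration of how a significance level translates into a rejection region, the sketch below computes cutoffs for a test statistic that is assumed to follow a standard normal distribution. That is an assumption made for simplicity (the pizza example uses a t distribution instead), and the function name `z_rejection_cutoffs` is hypothetical.

```python
from statistics import NormalDist


def z_rejection_cutoffs(alpha):
    """Return (one_tailed_cutoff, two_tailed_cutoff) for a standard normal statistic.

    One-tailed ("greater than") test: reject when z exceeds the first cutoff.
    Two-tailed test: reject when |z| exceeds the second cutoff.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in the right tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    return one_tailed, two_tailed


for alpha in (0.01, 0.05, 0.10):
    one, two = z_rejection_cutoffs(alpha)
    print(f"alpha={alpha:.2f}: one-tailed cutoff {one:.3f}, two-tailed cutoff {two:.3f}")
```

A smaller significance level pushes the cutoff farther into the tail, which is the sense in which the procedure is cautious about rejecting the null hypothesis.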
  13. 13. Significance from p -values -- continued <ul><li>Here “extreme” is relative to the null hypothesis. </li></ul><ul><li>In general smaller p -values indicate more evidence in support of the alternative hypothesis. If a p -value is sufficiently small, almost any decision maker will conclude that rejecting the null hypothesis is the more “reasonable” decision. </li></ul>
  14. 14. Significance from p -values -- continued <ul><li>How small is a “small” p -value? This is largely a matter of semantics, but if the </li></ul><ul><ul><li>p -value is less than 0.01, it provides “convincing” evidence that the alternative hypothesis is true; </li></ul></ul><ul><ul><li>p -value is between 0.01 and 0.05, there is “strong” evidence in favor of the alternative hypothesis; </li></ul></ul><ul><ul><li>p -value is between 0.05 and 0.10, it is in a “gray area”; </li></ul></ul><ul><ul><li>p -value is greater than 0.10, there is weak or no evidence in support of the alternative. </li></ul></ul>
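The rule of thumb above can be written as a small lookup. This is only a sketch: the function name `describe_p_value` is hypothetical, and because the slide does not say which category a value exactly on a boundary belongs to, the choice to place boundary values in the weaker category is an assumption.

```python
def describe_p_value(p):
    """Map a p-value to the informal evidence labels used on the slide."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "convincing"
    if p < 0.05:   # assumption: exactly 0.01 counts as "strong"
        return "strong"
    if p < 0.10:   # assumption: exactly 0.05 counts as "gray area"
        return "gray area"
    return "weak or none"


for p in (0.004, 0.03, 0.07, 0.40):
    print(p, "->", describe_p_value(p))
```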
  15. 15. Hypothesis Tests for a Population Mean
  16. 16. Background Information <ul><li>Recall that the manager of the Pepperoni Pizza Restaurant is running an experiment to test the hypotheses of H 0 : μ ≤ 0 versus H a : μ > 0, where μ is the mean rating in the entire customer population. </li></ul><ul><li>Here, each customer rates the difference between an old-style pizza and a new-style pizza on a -10 to +10 scale, where negative ratings favor the old-style pizza and positive ratings favor the new-style pizza. </li></ul>
  17. 17. PIZZA1.XLS <ul><li>The ratings of 40 randomly selected customers and several summary statistics appear in this file and in the following table. </li></ul>
  18. 18. Summary Statistics <ul><li>From the summary statistics, we see that the sample mean is 2.10 and the sample standard deviation is 4.717. </li></ul><ul><li>The positive sample mean provides some evidence in favor of the alternative hypothesis, but given the rather large standard deviation and the boxplot of ratings shown on the next slide, does it provide enough evidence to reject H 0 ? </li></ul>
  19. 19. Summary Statistics -- continued
  20. 20. Running the Test <ul><li>To run the test, we calculate the test statistic, using the borderline null hypothesis value μ 0 = 0, and report how much probability is beyond it in the right tail of the appropriate t distribution. </li></ul><ul><li>We use the right tail because the alternative is one-tailed of the “greater than” variety. </li></ul><ul><li>The test statistic is t = (x̄ − μ 0 ) / (s / √n) = (2.10 − 0) / (4.717 / √40) ≈ 2.816. </li></ul>
  21. 21. Running the Test -- continued <ul><li>The probability beyond this value in the right tail of the t distribution with n-1 = 39 degrees of freedom is approximately 0.004, which can be found in Excel with the function TDIST(2.816,39,1). </li></ul><ul><li>The probability, 0.004, is the p -value for the test. It indicates that these sample results would be very unlikely if the null hypothesis is true. </li></ul><ul><li>The manager has two choices: he can conclude that the null hypothesis is true or he can conclude that the alternative hypothesis is true - and presumably switch to the new-style pizza. The second choice appears to be more reasonable. </li></ul>
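The calculation on the last two slides can be reproduced roughly as follows. This is a sketch that uses only the summary statistics quoted above (mean 2.10, standard deviation 4.717, n = 40). The helper names `t_pdf` and `t_right_tail` are hypothetical, and the tail probability is obtained by numerically integrating the t density with Simpson's rule as a stand-in for Excel's TDIST.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) by integrating the density over [0, t] (steps must be even)."""
    if t < 0:
        return 1 - t_right_tail(-t, df, steps)
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability lies to the right of 0; subtract the area from 0 to t.
    return 0.5 - total * h / 3


# Summary statistics from the slides.
sample_mean, sample_sd, n, mu0 = 2.10, 4.717, 40, 0.0

t_stat = (sample_mean - mu0) / (sample_sd / math.sqrt(n))
p_value = t_right_tail(t_stat, n - 1)
print(f"t = {t_stat:.3f}, one-tailed p-value = {p_value:.4f}")
```

For a two-tailed alternative, as in the second pizza example, the same tail area would be doubled.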
  22. 22. Using StatPro <ul><li>Another way to interpret the results is in terms of traditional significance levels, but the p -value is the preferred method. </li></ul><ul><li>The StatPro One-Sample procedure can be used to perform this analysis easily. To use it, select the StatPro/Statistical Inference/One-Sample Analysis menu item, and choose the Rating variable as the variable to analyze. </li></ul><ul><li>Then fill in the dialog boxes as shown here. </li></ul>
  23. 23. One-Sample Dialog Box
  24. 24. Hypothesis Test Dialog Box
  25. 25. The Results <ul><li>Most of this output should be familiar; it mirrors the previous calculations. </li></ul><ul><li>The results are significant at the 1% level. </li></ul>
  26. 26. Conclusion <ul><li>Should the manager switch to the new-style pizza on the basis of these sample results? </li></ul><ul><li>We would probably recommend “yes”. There is no indication that the new-style pizza costs any more to make than the old-style pizza, and the sample evidence is fairly convincing that customers, on average, will prefer the new-style pizza. </li></ul><ul><li>Therefore, unless there are reasons for not switching (for example, costs) then we recommend the switch. </li></ul>
  27. 27. Hypothesis Tests for a Population Mean
  28. 28. Background Information <ul><li>Assume that the manager of the Pepperoni Pizza Restaurant currently uses two methods of producing pepperoni pizzas. </li></ul><ul><li>He plans to discontinue one of these methods if the results of a survey indicate that customers favor one of the methods by a “significant” margin. </li></ul><ul><li>The survey is conducted exactly as in the previous example. </li></ul>
  29. 29. Background Information - continued <ul><li>Each of 40 randomly selected customers receives two pizzas, one made by each method. </li></ul><ul><li>These customers are asked to rate the pizzas on a scale of -10 to +10, where negative ratings favor the first method and positive ratings favor the second method. </li></ul>
  30. 30. PIZZA2.XLS <ul><li>The results of the survey appear in this file. </li></ul><ul><li>Is there enough evidence in this sample data to persuade the manager to discontinue one of the methods? </li></ul>
  31. 31. Formulating the Hypotheses <ul><li>We now write the hypotheses as H 0 : μ = 0 versus H a : μ ≠ 0, where μ is the mean rating over the entire customer population. </li></ul><ul><li>A two-tailed alternative is appropriate here because the manager has no idea, before the sample is taken, which method (if either) will be favored. </li></ul><ul><li>It is not appropriate to look at the sample and decide to use the one-tailed variety because of the negative mean. The hypotheses should always be formulated before the sample data are observed. </li></ul>
  32. 32. Test & Findings <ul><li>To run the test, use StatPro’s One-Sample Analysis procedure and choose to run a hypothesis test on the mean. </li></ul><ul><li>The small p -value provides convincing evidence for the manager that there is a difference, on average, between customer reactions to the two methods of making pizzas. </li></ul><ul><li>On average, customers appear to favor the first method of making pizzas, since the sample mean is negative. </li></ul>
  33. 33. In Conclusion <ul><li>Should the manager discontinue the (evidently) less popular second method on the basis of this hypothesis test? </li></ul><ul><li>The answer almost certainly depends on the costs that have not been mentioned. </li></ul><ul><li>The primary reason for discontinuing one of the methods is presumably to save costs by using only one production method instead of two. </li></ul><ul><li>It appears that on average the population favors the first method, but the data show that a good-sized minority favors the second method. </li></ul>
  34. 34. In Conclusion -- continued <ul><li>So why not continue to use the second method if the cost is not prohibitive? </li></ul><ul><li>The company could easily achieve greater overall profit by continuing to make pizzas by both methods than by discontinuing the slightly less popular method. </li></ul><ul><li>Once again, hypothesis testing provides useful information but the decision should be based on a careful cost analysis. </li></ul>
  35. 35. Hypothesis Tests for Other Parameters
  36. 36. Background Information <ul><li>The Walpole Appliance Company has a customer service department that handles customer questions and complaints. </li></ul><ul><li>This department's processes are set up to respond quickly and accurately to customers who phone in their concerns. However, there is a sizable minority of customers who prefer to write letters. </li></ul><ul><li>Traditionally, the customer service department has not been very efficient in responding to these customers. </li></ul>
  37. 37. Background Information -- continued <ul><li>Letter writers first receive a mail-gram asking them to call customer service; and when they do call, the customer service representative who answers the phone typically has no knowledge of the customer’s problem. </li></ul><ul><li>As a result, the department manager estimates that 15% of the letter writers have not obtained a satisfactory response within 30 days of the time their letters were first received. </li></ul><ul><li>The manager’s goal is to reduce this value by at least half, that is, to 7.5% or less. </li></ul>
  38. 38. Background Information -- continued <ul><li>To do so, she changes the process for responding to letter writers. Under the new process, these customers now receive a prompt and courteous form letter that responds to their problem. </li></ul><ul><li>Each form letter states that if the customer still has problems, he or she can call the department. </li></ul><ul><li>The manager also files the original letters so that if customers do call back, the representative will be able to find their letters quickly and respond intelligently. </li></ul>
  39. 39. Background Information -- continued <ul><li>With this new process in place, the manager has tracked 400 letter writers and has found that only 23 of them are classified as “unsatisfied” after a 30-day period. </li></ul><ul><li>Does it appear that the manager has achieved her goal? </li></ul>
  40. 40. Solution <ul><li>The manager’s goal is to reduce the proportion of unsatisfied customers after 30 days from 0.15 to 0.075 or less. </li></ul><ul><li>Because the burden of proof is on her to “prove” that she has accomplished this goal, we set up the hypotheses as H 0 : p ≥ 0.075 versus H a : p < 0.075, where p is the proportion of all the letter writers who are still unsatisfied after 30 days. </li></ul><ul><li>The sample proportion she has observed is 23/400 = 0.0575. This is obviously less than 0.075, but is it enough less to reject the null hypothesis? </li></ul>
  41. 41. LETTERS.XLS <ul><li>The test statistic for the data, using the borderline value p 0 = 0.075, is z = (p̂ − p 0 ) / √(p 0 (1 − p 0 ) / n) = (0.0575 − 0.075) / √(0.075(0.925) / 400) ≈ −1.329. This value appears in cell B10 of this file, which contains the analysis of the new process. </li></ul>
  42. 42. Solution -- continued <ul><li>We find the denominator in cell B8 with the formula =SQRT(HypProp*(1-HypProp)/SampSize) </li></ul><ul><li>The corresponding p -value, 0.092, is found with the formula =NORMSDIST(TestStat) in cell B11. </li></ul><ul><li>It is the probability to the left of -1.329 in the standard normal distribution. </li></ul><ul><li>Also, because np 0 = 400(0.075) = 30 > 5 and n (1-p 0 ) = 400(0.925) = 370 > 5, this test is valid; that is, the sample size is large enough for the normal approximation to hold. </li></ul>
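The spreadsheet steps for the one-proportion test can be mirrored in a few lines. This is a sketch built from the figures quoted on the slides (23 unsatisfied out of 400, borderline value 0.075); the variable names are illustrative and are not taken from LETTERS.XLS.

```python
import math
from statistics import NormalDist

n = 400            # letter writers tracked
unsatisfied = 23   # still unsatisfied after 30 days
p0 = 0.075         # borderline value from the null hypothesis

p_hat = unsatisfied / n

# Standard error under the null hypothesis (the denominator of the test statistic).
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null

# The alternative is "less than", so the p-value is the left-tail probability.
p_value = NormalDist().cdf(z)

# Rule-of-thumb check that the normal approximation is reasonable.
approximation_ok = n * p0 > 5 and n * (1 - p0) > 5

print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, p-value = {p_value:.3f}, valid = {approximation_ok}")
```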
  43. 43. Results <ul><li>The p- value might not be as low as you expected - or as low as the manager would like. </li></ul><ul><li>In spite of the fact that the sample proportion appears to be well below the target proportion of 0.075, the evidence in support of the alternative hypothesis is not overwhelming. </li></ul><ul><li>In statistical terminology, the results are significant at the 10% level, but not at the 5% or 1% level. </li></ul>
  44. 44. Results -- continued <ul><li>The 95% confidence interval extends from 0.035 to 0.080. It includes the target value, 0.075, but just barely. In this sense it also supports the argument that the manager has indeed achieved her goal. </li></ul><ul><li>Analysts might disagree on whether a hypothesis test or a confidence interval is the more appropriate way to present these results. However, we see them as complementary and do not necessarily favor one over the other. </li></ul><ul><li>The bottom line is that they both provide strong, but not totally conclusive, evidence that the manager has achieved her goal. </li></ul>
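The 95% confidence interval quoted above is consistent with the usual large-sample interval for a proportion, sketched below. The slides do not state which formula StatPro uses, so treating it as p̂ ± z·√(p̂(1 − p̂)/n) is an assumption; note that the standard error here uses the sample proportion rather than the hypothesized value.

```python
import math
from statistics import NormalDist

n = 400
p_hat = 23 / n
confidence = 0.95

# Two-sided multiplier, about 1.96 for 95% confidence.
z_mult = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
se = math.sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - z_mult * se
upper = p_hat + z_mult * se
print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
print("contains target 0.075:", lower <= 0.075 <= upper)
```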
  45. 45. Hypothesis Test for Differences Between Population Proportions
  46. 46. Background Information <ul><li>The ArmCo Company, a large manufacturer of automobile parts, has several plants in the United States. </li></ul><ul><li>For years ArmCo employees have complained that their suggestions for improvements in the manufacturing processes are ignored by upper management. </li></ul><ul><li>In the spirit of employee empowerment, ArmCo management at the Midwest plant decided to initiate a number of policies to respond to employee suggestions. </li></ul>
  47. 47. Background Information -- continued <ul><li>No such initiatives were taken at the other ArmCo plants. </li></ul><ul><li>As expected, there was a great deal of employee enthusiasm at the Midwest plant shortly after the new policies were implemented, but the question was whether life would revert to normal and enthusiasm would dampen with time. </li></ul><ul><li>To check this, 100 randomly selected employees at the Midwest plant and 300 employees from other plants were asked to fill out a questionnaire 6 months after the implementation of the new policies . </li></ul>
  48. 48. Background Information - continued <ul><li>Employees were instructed to respond to each item on the questionnaire by checking either a “yes” box or a “no” box. </li></ul><ul><li>Two specific items on the questionnaire were </li></ul><ul><ul><li>Management at this plant is generally responsive to employee suggestions for improvements in the manufacturing processes. </li></ul></ul><ul><ul><li>Management at this plant is more responsive to employee suggestions now than it used to be. </li></ul></ul>
  49. 49. EMPOWER1.XLS <ul><li>The results of the questionnaire for these two items appear in this file and in rows 5 and 6 of the table below. </li></ul>
  50. 50. Questions <ul><li>Does it appear that the policies at the Midwest plant are appreciated? </li></ul><ul><li>Should ArmCo implement these policies in other plants? </li></ul>
  51. 51. Solution <ul><li>For either questionnaire item we let p 1 be the proportion of “yes” responses we would obtain at the Midwest plant if the questionnaire were given to all its employees. </li></ul><ul><li>We define p 2 similarly for the other plants. </li></ul><ul><li>Management certainly hopes to find a larger proportion of “yes” responses (to either item) at the Midwest plant, so we set up the hypotheses as H 0 : p 1 - p 2 ≤ 0 versus H a : p 1 - p 2 > 0. </li></ul>
  52. 52. Solution -- continued <ul><li>The data from this type of questionnaire is usually given as counts of “yes” and “no” responses, but these translate into sample proportions. </li></ul><ul><li>For the first questionnaire item, the sample proportions of “yes” responses are 0.39 and 0.31 for a difference of 0.08. The standard error of this difference, under the assumption that p 1 = p 2 , uses the pooled proportion equal to 0.33. This produces a standard error of 0.054, calculated in cell B13 with the formula =SQRT(PooledProp*(1-PooledProp) *(1/SampSize1+1/SampSize2)) </li></ul>
  53. 53. Solution -- continued <ul><li>Then the test statistic is 0.08/0.054 ≈ 1.473, and the corresponding p- value for the test is the probability to the right of 1.473 in the standard normal distribution. Its value is 0.070, found in cell B15 with the formula =1-NORMSDIST(TestStat) </li></ul><ul><li>A similar analysis for the second questionnaire item leads to a sample difference of 0.15 and a p -value of 0.004. </li></ul>
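The pooled two-proportion test for the first questionnaire item can be sketched as follows. The "yes" counts of 39 and 93 are not quoted on the slides; they are inferred here from the stated sample proportions (0.39 of 100 and 0.31 of 300) and should be treated as an assumption. Variable names are illustrative.

```python
import math
from statistics import NormalDist

n1, n2 = 100, 300    # Midwest plant, other plants
yes1, yes2 = 39, 93  # assumed counts implied by proportions 0.39 and 0.31

p1_hat = yes1 / n1
p2_hat = yes2 / n2
diff = p1_hat - p2_hat

# Under H0 (p1 = p2) both samples estimate one common proportion.
pooled = (yes1 + yes2) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

z = diff / se
# The alternative is p1 - p2 > 0, so the p-value is the right-tail probability.
p_value = 1 - NormalDist().cdf(z)

print(f"pooled = {pooled:.2f}, SE = {se:.3f}, z = {z:.3f}, p-value = {p_value:.3f}")
```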
  54. 54. Results <ul><li>These results should be fairly good news for management. </li></ul><ul><li>There is moderate, but not overwhelming, support for the hypothesis that management at the Midwest plant is more responsive than at the other plants, at least as perceived by employees. </li></ul><ul><li>There is convincing support for the hypothesis that things have improved more at the Midwest plant than at the other plants. </li></ul>
  55. 55. Results -- continued <ul><li>Corresponding 95% confidence intervals for the difference between proportions appear in rows 21 and 22. </li></ul><ul><li>Since they are almost completely positive, they reinforce the hypothesis-test findings. </li></ul><ul><li>Moreover, they provide a range of plausible values for the differences between the population proportions. </li></ul><ul><li>The only real downside to these findings, from Midwest management’s point of view, is the sample proportion for the first item. </li></ul>
  56. 56. Results -- continued <ul><li>Only 39% of the sampled employees at that plant believe that management generally responds to their suggestions, even though 68% believe things are better than they used to be. </li></ul><ul><li>A reasonable conclusion by ArmCo management is that they are on the right track at the Midwest plant, and the policies initiated there ought to be initiated at other plants, but more still needs to be done at all plants. </li></ul>
  57. 57. Tests for Normality
  58. 58. Background Information <ul><li>A company manufactures strips of metal that are supposed to have a width of 10 centimeters. </li></ul><ul><li>For purposes of quality control, the manager plans to run some statistical tests on these strips. </li></ul><ul><li>However, realizing that these statistical procedures assume normally distributed widths, he first tests this normality assumption on 90 randomly sampled strips. </li></ul><ul><li>How should he proceed? </li></ul>
  59. 59. NORMTEST.XLS <ul><li>The sample data appear in this file and in the table below. </li></ul>
  60. 60. Summary Measures <ul><li>A number of summary measures also appear in the table. </li></ul><ul><li>These summary measures help the manager to select “reasonable” categories for a histogram of the data. </li></ul><ul><li>After observing them, the manager chooses 10 categories for the histogram. The extreme categories are “less than or equal to 9.980” and “greater than 10.020”, and the middle eight categories each have a length of 0.005. </li></ul>
  61. 61. The Test <ul><li>The test we will be running is the chi-square goodness-of-fit test. </li></ul><ul><li>This test involves forming a histogram of the sample data and comparing it to the expected histogram we would observe if the data were normally distributed with the same mean and standard deviation as the sample. </li></ul><ul><li>To begin running the test in this example, we select the StatPro/Tests for Normality/Chi-Square Test menu item, which leads to the same dialog box as the histogram procedure. </li></ul>
  62. 62. The Test -- continued <ul><li>After specifying the histogram categories in the usual way we obtain the following message: </li></ul><ul><li>In addition we obtain the following histogram and data table: </li></ul>
  63. 63. Resulting Histogram
  64. 64. Resulting Table
  65. 65. Analysis <ul><li>The normal fit to the data appears to be quite good. </li></ul><ul><li>The message confirms this statistically. </li></ul><ul><li>The values in columns D and E of the table were calculated as the total number of observations multiplied by the normal probability of being in the corresponding category. </li></ul><ul><li>Column E contains the individual chi-square test statistic. </li></ul>
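The expected-count calculation described above can be sketched in Python. The category cutoffs (9.980 to 10.020 in steps of 0.005) and the sample size of 90 come from the slides; the mean and standard deviation below are hypothetical placeholders, since the sample values appear only in the spreadsheet. The function names are illustrative.

```python
from statistics import NormalDist

def expected_counts(n, mean, sd, cutoffs):
    """Expected count per category: n times the normal probability of the category."""
    dist = NormalDist(mean, sd)
    # Cumulative probabilities at the cutoffs, padded with 0 and 1 for the open-ended categories
    cdf = [0.0] + [dist.cdf(c) for c in cutoffs] + [1.0]
    return [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

def chi_square_terms(observed, expected):
    """Per-category terms (observed - expected)^2 / expected; their sum is the test statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Nine cutoffs give the ten categories chosen by the manager
cutoffs = [round(9.980 + 0.005 * i, 3) for i in range(9)]

# mean=10.0 and sd=0.01 are assumed values, not the sample statistics
expected = expected_counts(90, 10.0, 0.01, cutoffs)
print([round(e, 2) for e in expected])
```

Passing the observed histogram counts and these expected counts to `chi_square_terms` and summing the result gives the chi-square statistic that is compared against the chi-square distribution.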
  66. 66. Analysis -- continued <ul><li>The corresponding p -value in cell H5, 0.814, is calculated with the formula =CHIDIST(TestStat,7) </li></ul><ul><li>The large p -value provides no evidence whatsoever of nonnormality. </li></ul><ul><li>It implies that if we repeated the procedure on many random samples, each taken from a population known to be normal, we would obtain a fit at least this poor in about 81% of the samples. </li></ul><ul><li>Stated differently, only about 19% of the fits would be better than the one we observed. </li></ul>
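Excel's CHIDIST(x, df) returns the right-tail probability of a chi-square distribution. The 7 degrees of freedom are consistent with the usual rule for this test: 10 categories, minus 1, minus the 2 parameters (mean and standard deviation) estimated from the sample. The sketch below is a standard-library stand-in for CHIDIST, written for integer degrees of freedom; the function name is hypothetical. The test statistic itself is not given on the slides, so the example uses the well-known 5% critical value for 7 degrees of freedom, 14.067.

```python
import math

def chi2_right_tail(x, df):
    """Right-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting value: df = 1 (odd case) or df = 2 (even case)
    if df % 2 == 1:
        q, nu = math.erfc(math.sqrt(half)), 1
    else:
        q, nu = math.exp(-half), 2
    # Recurrence: Q(x; nu + 2) = Q(x; nu) + (x/2)^(nu/2) * exp(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += math.exp((nu / 2.0) * math.log(half) - half - math.lgamma(nu / 2.0 + 1))
        nu += 2
    return q

# Degrees of freedom: categories - 1 - number of estimated parameters
df = 10 - 1 - 2

# 14.067 is the tabulated 5% critical value for 7 degrees of freedom
print(f"{chi2_right_tail(14.067, df):.3f}")
```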
  67. 67. Analysis -- continued <ul><li>Therefore, whatever statistical procedure the manager intends to use, he doesn’t need to worry about the normality assumption. </li></ul>
  68. 68. Chi-Square Test for Independence
  69. 69. Background Information <ul><li>Big Office, a chain of large office supply stores, sells an extensive line of desktop and laptop computers. </li></ul><ul><li>Company executives want to know whether the demands for these two types of computers are related in any way. </li></ul><ul><li>The products might act as complementary products, where high demand for desktops accompanies high demand for laptops (computers in general are hot), or they might act as substitute products (demand for one takes away demand for the other), or their demand might be unrelated. </li></ul>
  70. 70. Background Information -- continued <ul><li>Because of limitations in its information system, Big Office does not have the exact demands for these products. </li></ul><ul><li>However, it does have daily information on categories of demand, listed in aggregate (that is, over all stores). </li></ul><ul><li>These data appear on the next slide. </li></ul>
  71. 71. PCDEMAND.XLS <ul><li>Each day’s demand for each type of computer is categorized as Low, Medium-Low, Medium-High, or High. </li></ul><ul><li>The table is based on 250 days, so that the counts add to 250. </li></ul>
  72. 72. PCDEMAND.XLS <ul><li>The individual counts show, for example, that demand was high for both desktops and laptops on 11 of the 250 days. </li></ul><ul><li>For convenience, we include row and column totals in the margins. </li></ul><ul><li>Based on these data, can Big Office conclude that demands for these two products are independent? </li></ul>
  73. 73. Chi-Square Test for Independence <ul><li>This test is used in situations where a population is categorized in two different ways. </li></ul><ul><li>For example, we might categorize people by their smoking habits and their drinking habits. The question then is whether these two attributes are independent in a probabilistic sense. </li></ul><ul><li>The answer is yes if information on a person’s drinking habits is of no use in predicting the person's smoking habits (or vice versa). </li></ul>
  74. 74. Chi-Square Test for Independence -- continued <ul><li>In this example, however, we might suspect that these attributes are dependent . </li></ul><ul><li>In particular, we might suspect that heavy drinkers are more likely to be heavy smokers, and we might suspect that nondrinkers are more likely to be nonsmokers. </li></ul><ul><li>The chi-square test for independence enables us to test this empirically. </li></ul>
  75. 75. Chi-Square Test for Independence -- continued <ul><li>The null hypothesis for the test is that the two attributes are independent. Therefore, statistically significant results are those that indicate some sort of dependence. </li></ul><ul><li>The data for the test consist of counts in various combinations of categories. </li></ul><ul><li>We usually arrange these in a rectangular table called a contingency table , a cross-tabs , or, in Excel terminology, a pivot table. </li></ul>
  76. 76. Testing the Data <ul><li>The idea of the test is to compare actual counts in the table with what we would expect them to be under independence. </li></ul><ul><li>If the actual counts are sufficiently far from the expected counts, we can then reject the null hypothesis of independence. </li></ul><ul><li>The “distance” measure used to check how far apart they are is essentially the same chi-square statistic used in the chi-square test for normality. </li></ul>
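The comparison of actual and expected counts can be sketched as follows. Under independence, the expected count in a cell is (row total × column total) / grand total, and the statistic sums (observed - expected)² / expected over all cells. The function name and the 2×2 table are hypothetical, chosen only to keep the arithmetic easy to follow; they are not the Big Office data.

```python
def chi_square_independence(counts):
    """Chi-square statistic and degrees of freedom for a contingency table of counts."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            # Expected count if the row and column attributes were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Degrees of freedom: (rows - 1) * (columns - 1)
    df = (len(counts) - 1) * (len(counts[0]) - 1)
    return stat, df

# Hypothetical 2x2 table of counts
stat, df = chi_square_independence([[10, 20], [30, 40]])
print(f"statistic = {stat:.4f}, df = {df}")
```

A table whose rows are exactly proportional to one another gives a statistic of zero; larger values indicate counts further from what independence predicts.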
  77. 77. Testing the Data -- continued <ul><li>What do we expect under independence? </li></ul><ul><li>The totals in row 9 indicate that demand for desktops was low on 38 of the 250 days. Therefore, if we had to estimate the probability of low demand for desktops, this estimate would be 38/250 = 0.152. </li></ul><ul><li>Now, if demands for the two products were independent, we should arrive at this same estimate from the data in any of the rows 5-8. </li></ul><ul><li>That is, the prediction about desktops should be the same regardless of the demand for laptops. </li></ul>
  78. 78. Testing the Data -- continued <ul><li>The probability estimate of low desktop demand from row 5, for example, is 4/43 = 0.093. Similarly, for rows 6, 7, and 8, it is 8/80 = 0.100, 16/70 = 0.229, and 10/57 = 0.175. </li></ul><ul><li>These calculations provide some evidence that desktops and laptops act as substitute products - the probability of low desktop demand is larger when laptop demand is medium-high or high than when it is low or medium-low. </li></ul>
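The estimates quoted on these slides can be reproduced from the counts they cite. The sketch below uses only those figures (the low-desktop-demand counts and row totals for rows 5-8); the variable names are illustrative.

```python
# Days with low desktop demand, and total days, for each laptop-demand row (rows 5-8),
# as quoted on the slides
low_desktop = [4, 8, 16, 10]
row_totals = [43, 80, 70, 57]

# Overall estimate of the probability of low desktop demand: 38 / 250
overall = sum(low_desktop) / sum(row_totals)

# Estimate of the same probability within each laptop-demand category
by_row = [low / total for low, total in zip(low_desktop, row_totals)]

print(f"overall: {overall:.3f}")
for row_number, estimate in zip(range(5, 9), by_row):
    print(f"row {row_number}: {estimate:.3f}")
```

Under independence the four row estimates would all be close to the overall estimate; the spread among them is what the chi-square test measures formally.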
  79. 79. Testing the Data -- continued <ul><li>We can perform the calculations for the test easily with StatPro. </li></ul><ul><li>We use the StatPro/Statistical Inference/Chi-Square Independence Test menu item. </li></ul><ul><li>There is only one dialog box, which asks for the range of the contingency table - not counting any labels or row or column totals that might surround the table. </li></ul><ul><li>In this case the relevant range has been range-named Counts. </li></ul>
  80. 80. Testing the Data -- continued <ul><li>StatPro then provides the message shown here, and it appends a sheet named ChiSqIndep that contains the calculations shown on the next slide. </li></ul>
  81. 81. Testing the Data -- continued
  82. 82. Analysis <ul><li>We interpret the p -value of the test, 0.045, in the usual way. Specifically, we can reject the null hypothesis of independence at the 5% or 10% significance level, but not at the 1% level. </li></ul><ul><li>There is a good bit of evidence that the demands for the two products are not independent, but it is not overwhelming. </li></ul><ul><li>If we accept that there is some sort of dependence, we can use the output to examine its form. </li></ul>
  83. 83. Analysis -- continued <ul><li>The two tables in rows 8-19 are especially helpful. If the demands were independent, the rows of the first table should be identical, and the columns of the second table should be identical. </li></ul><ul><li>This is because each row in the first table shows the distribution of desktop demand for one category of laptop demand, whereas each column in the second table shows the distribution of laptop demand for one category of desktop demand. </li></ul><ul><li>A close study of these percentages again provides some evidence that the two products are substitutes, but the evidence is not overwhelming. </li></ul>
