
ASQ/Penn State Univ. Great Valley Statistics Symposium 10/3/09


- 1. Proportion Testing Chris Connors, Ph.D. Jay Armstrong, MSc., M.C.E. October 2, 2009 Statistics Symposium
- 2. Applied Statistics in Business, Healthcare, Pharmaceuticals, Education, and Industry
- 3. Outline
  - What we are covering and what we are not covering today
  - Virtual Scavenger Hunt
  - Statistical Decisions and Risk
  - Six Sigma DMAIC application
  - The Business Approach
  - Hypothesis Test Approach
  - Understanding Distributions
  - Sample Size
  - Test of Independence
  - Example 1: Regulatory Compliance Documentation
  - Example 2: Workload Balance (Productivity)
  - References and Web Sites
  - Q&A
- 4. Hypothesis Tests: What we are covering?
  - Continuous Data
    - 1 sample t-test: Δ mean from known test mean
    - 2 sample t-test: Δ mean between 2 independent sample means
    - Paired t-test: Δ mean between 2 dependent sample means
    - One Way ANOVA: At least 1 sample mean Δ between 3 or more samples
    - Kruskal Wallis & Mood’s Median: At least 1 sample median Δ between 3 or more samples
    - F-test, Levene’s test, & Bartlett’s test: At least 1 sample standard deviation Δ between 3 or more samples
    - Correlation/Regression/DOE: 2 or more factors are correlated / Predictor affects the sampled process
  - Attribute Data
    - 1 proportion test: A sample proportion Δ against a known value
    - 2 proportion test: Proportions from the two samples are different
    - Chi Square test: At least one sample proportion Δ from others
- 5. Scavenger Hunt
  - Find another person who can sign off on these statements. Each person can only sign once.
    - 1. Has used Chi-Square or Proportion Test
    - 2. Has more than $50 on them
    - 3. Used Minitab to determine sample size
    - 4. Worked on a project with a value proposition >$1 million
    - 5. Knows Chris Connors' middle name
    - 6. Has more than three children
    - 7. Has met a movie star or celebrity (and was not arrested)
    - 8. Knows the difference between a confidence interval and a confidence level
    - 9. Knows what a quark or a fantod is
    - 10. Has more than one academic degree, license or certification
- 6. Statistical Decision: Setting up your risk level
  - Type I and II errors
  - There are two kinds of errors that can be made in significance testing:
    - a true null hypothesis can be incorrectly rejected, and
    - a false null hypothesis can fail to be rejected.
  - The former error is called a Type I error and the latter error is called a Type II error. These two types of errors are defined in the table.
  - The probability of a Type I error is designated by the Greek letter alpha (α) and is called the Type I error rate; the probability of a Type II error (the Type II error rate) is designated by the Greek letter beta (β). A Type II error is only an error in the sense that an opportunity to reject the null hypothesis correctly was lost. It is not an error in the sense that an incorrect conclusion was drawn, since no conclusion is drawn when the null hypothesis is not rejected.
- 7. Six Sigma DMAIC method: Hypothesis Tests
  - Six Sigma DMAIC method has 5 phases:
    - Define Opportunity/Problem
    - Measure Performance
    - Analyze Process and Performance
    - Improve Process and Performance
    - Control Process and Performance
  - I typically use this diagram to depict the continuous focus of measurement in the Six Sigma method by placing Measure in the center of the DMAIC method.
  - (Diagram: Measure in the center, with Define, Analyze, Improve, and Control around it)
- 8. 6S Black Belt Level of Cognition for Hypothesis Testing
  - Scale*: 1 = learned, 2 = know, 3 = used, 4 = taught
  - Columns: Topic, Level of Cognition, My Development (blank)
    - Introduction to Statistical Comparisons: 2*
    - Normality and Transformation: 2
    - Correlation Analysis: 3
    - Regression Analysis: 3
    - Introduction to Multiple Linear Regression: 1
    - t-tests: 3
    - ANOVA: 3
    - 1 and 2 proportion test: 3
    - Chi-Square Analysis: 3
    - Binary Logistic Regression: 1
- 9. 6S BB Level of Cognition for Hypothesis Testing
  - Scale*: 1 = learned, 2 = know, 3 = used, 4 = taught
  - Columns: Topic, Level of Cognition, My Development (blank)
    - Introduction Experimental Design: 2*
    - Background on Experimental design: 2
    - DOE Designs and terminology: 2
    - Full Factorial design: 2
    - Half factorial designs: 2
    - Robust Designs: 2
    - Checklists for designing and conducting DOE
    - BBC exercise: DOE Simulation: 2
    - Results of DOE Simulation: 2
- 10. The Business Approach: Proportion Tests
  - When we want to make a statistical comparison of a discrete variable with a target, or between two discrete variables, Proportion Tests should be used.
  - (Diagram labels: Business Problem, Statistical Problem, Statistical Solution, Business Solution, Decision, Potential Root Causes Identified, Root Causes Verified)
- 11. Selecting the Right Statistical Tool
  - X discrete, Y discrete: Proportion Tests
  - X continuous, Y discrete: Logistic Regression
  - X discrete, Y continuous: t test, ANOVA, DOE
  - X continuous, Y continuous: Correlation, Regression
- 12. Tests of Proportion
  - Use samples to make inferences about population proportions
  - Determine if a statistically significant difference of proportion exists between:
    - A sample and a target
    - Two independent samples
    - Two samples or more
  - Comparing Proportions:
    - 1 Sample: 1 Proportion Test
    - 2 Samples: 2 Proportion Test
    - More Than 2 Samples: Chi-Square Test
- 13. Proportion Test Approach
  - 1. State the null and alternative hypotheses
    - Null H0: P1 = P2 (number of tails = 2)
    - Null H0: P1 - P2 ≥ 0 (number of tails = 1)
    - Null H0: P1 - P2 ≤ 0 (number of tails = 1)
    - Alternative Ha: P1 - P2 ≠ 0, i.e. P1 ≠ P2 (number of tails = 2)
    - Alternative Ha: P1 - P2 < 0 (number of tails = 1, left)
    - Alternative Ha: P1 - P2 > 0 (number of tails = 1, right)
  - 2. Formulate an analysis plan: 1 Proportion to known value (z) or 2 Proportions test
  - 3. Analyze sample data
    - Independence Test: Fisher’s, Barnard’s, G-Test
    - Pooled sample proportion to compute standard error
    - P value for test statistic
  - 4. Interpret results: for a statistical decision (hopefully a business decision, but not always)
    - If P is low, H0 must be no go
- 14. One Tail or Two Tails: Placing the Alpha Risk
- 15. Useful Discrete Distributions
  - Binomial distribution for:
    - The number X of successes (or failures!) in n trials when p is the chance of success (or failure!) on each trial.
  - Examples:
    - number X of faulty expense reports out of n = 100 submitted in a particular month, when the faulty expense report rate typically runs at p = 0.03 (i.e., 3%)
    - number of voters out of a random sample of n = 800 expressing approval of the President’s performance, when the approval rating in the entire population of voters is p = 0.42 (i.e., 42%)
  - X is discrete: it must be one of 0, 1, 2, … , n
- 16. Binomial - key facts. Useful fact: X has approximately a normal distribution when n is large (more than 25 or 30) and np and n(1 - p) are not too small (say > 5).
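The rule of thumb on this slide can be checked numerically. The sketch below is only an illustration, not part of the original deck: the function names, the cutoff of 25 for n, and the use of a continuity correction are all assumptions made for the example. It applies the rule to the two examples from slide 15 and compares the exact binomial CDF with the normal approximation.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_approx_ok(n, p, min_n=25, min_count=5):
    """Rule of thumb from the slide: n large, np and n(1 - p) both above ~5."""
    return n > min_n and n * p > min_count and n * (1 - p) > min_count


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_cdf_approx(x, n, p):
    """Normal approximation to P(X <= x), with a continuity correction (assumed)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(x + 0.5)


# Slide 15 examples: faulty expense reports and voter approval.
for n, p in [(100, 0.03), (800, 0.42)]:
    x = int(n * p)
    print(n, p, normal_approx_ok(n, p),
          binom_cdf(x, n, p), normal_cdf_approx(x, n, p))
```

For the expense-report example np is only 3, so the rule flags the normal approximation as doubtful, while the voter example passes comfortably.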
- 17. Binomial - Normal Approximation
- 18. Histogram: n=20
- 19. Histogram: n=100
- 20. Sample Size
  - General Guidelines (if not followed, test may not run):
    - Each sample includes at least 10 failures and 10 successes (some texts say 5)
    - The population is at least 10 times the size of the sample
  - Use Minitab sample size calculator
  - Use TI 83 or TI 84 Graphing Calculator (see web)
- 21. Hypothesis testing - terms
  - Null hypothesis (H0) – e.g., µ1 = µ2 - this is the hypothesis to be tested and should be in the form of a true/false statement. This hypothesis states that there is NO DIFFERENCE between the data sets or samples or populations. Null hypotheses are never accepted – we either reject them or fail to reject them. The null hypothesis has PRIORITY and should not be rejected unless there is strong statistical evidence to do so.
  - Alternate hypothesis (H1, HA) – e.g., µ1 ≠ µ2 - the alternative to the null hypothesis – states that there IS A DIFFERENCE between the data sets or populations.
  - Type 1 error – rejecting the null hypothesis when it is really true – e.g., “convicting the innocent”
  - Type 2 error – failing to reject the null hypothesis when it really is false – e.g., “letting the guilty go free”
  - Level (or size) of a test = Alpha (α) – is the probability of a type 1 error – default = 5%
  - Beta (β) – is the probability of a type 2 error – default = 10%
  - Power of a test or power – is the probability of correctly rejecting a false null hypothesis. Since β is the probability of a type 2 error, power is calculated by the formula (1 - β). Power = (1 - β) when the null hypothesis is false. The default value for power is 90%. This means that you have a 90% chance of finding a difference when one really exists.
  - Critical region (rejection region) – set of values of the test statistic that cause the null hypothesis to be rejected. If the test statistic falls into the rejection region, the null hypothesis is rejected.
- 22. Hypothesis testing steps
  - 1. State the null hypothesis H0 and the alternate hypothesis HA (e.g., the mean income of college graduates does not equal that of other people)
  - 2. Choose the level of significance, alpha (α default = 0.05) and the sample size (default n = 25)
  - 3. Choose the appropriate statistical technique (t test, Chi-square, etc.) and test statistic (e.g., mean)
  - 4. Collect the data and calculate the sample value of the test statistic
  - 5. Calculate the p value based on the test statistic and compare it with alpha (α = 0.05)
  - 6. Make a statistical decision – if p is greater than or equal to alpha, fail to reject the null hypothesis. If the p value is less than alpha, reject the null hypothesis.
- 23. Hypothesis tests are either one tailed or two tailed tests
  - One tail test - Answers only ONE question - is the test statistic less than or greater than the known distribution. (Figure: a single Reject H0 region in one tail at the 1% or 5% significance level; Fail to Reject H0 elsewhere.)
  - Two tailed test – Only asks if the test statistic is different from the known distribution – HA usually has “not equal to” in the wording. (Figure: a Reject H0 region in each tail, 2.5% significance level per tail; Fail to Reject H0 in between.)
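To make the placement of the alpha risk concrete, here is a minimal sketch that computes the critical z value for a one-tailed and a two-tailed test. It assumes a standard normal test statistic; the helper name `critical_z` is hypothetical and not from the slides.

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Critical z value when alpha is placed in one tail or split over two."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # With two tails, each tail receives alpha / 2 (e.g. 2.5% per tail for 5%).
    return NormalDist().inv_cdf(1 - alpha / tails)


print(round(critical_z(0.05, 1), 3))  # one-tailed, 5% in a single tail
print(round(critical_z(0.05, 2), 3))  # two-tailed, 2.5% in each tail
```

A one-tailed test at 5% rejects beyond roughly 1.645 standard errors on one side, while a two-tailed test at the same overall level needs roughly 1.96 on either side.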
- 24. Clinical Testing One-tailed example by hand
  - The “Feel Good” Drug company has discovered a new drug which prevents acne. Since the market for skin care products is larger for women than men, the company would like to be able to show a treatment advantage for women vs men. The company statistician chooses a simple random sample of 110 women and 207 men from a population of 100,000 healthy volunteers. After 6 months, 48% of women had developed acne, vs 61% of men. Can the company claim a benefit for women vs men at the 0.01 level of significance?
  - What are the hypotheses?
  - Calculate the pooled sample proportion and the Standard Error and consult the z-score statistic
  - What do the results tell us?
- 25. Clinical Testing One-tailed example by hand
  - 1) What are the hypotheses?
    - Ho: P1 ≥ P2
    - Ha: P1 < P2
    - The null hypothesis will be rejected if the proportion of women developing acne (p1) is substantially smaller than the proportion of men developing acne (p2)
  - 2) Calculate the pooled sample proportion and the Standard Error and consult the z-score statistic:
    - P = (p1 * n1 + p2 * n2) / (n1 + n2)
    - = [(0.48 * 110) + (0.61 * 207)] / (110 + 207)
    - = (52.8 + 126.27) / 317
    - = 0.565
    - SE = sqrt{P * (1 - P) * [(1/n1) + (1/n2)]}
    - = sqrt[0.565 * 0.435 * (1/110 + 1/207)]
    - = sqrt[0.2458 * (0.00909 + 0.00483)]
    - = 0.0585
    - Z = (p1 - p2) / SE = (0.48 - 0.61) / 0.0585
    - = -2.22
    - Since this is a one tailed test, the P value is the probability that the z-score is less than -2.22. The Normal distribution calculator gives P(z < -2.22) = 0.013
  - 3) What do the results tell us?
    - P value = 0.013. Since 0.013 is greater than the chosen significance level (0.01), WE FAIL TO REJECT THE NULL HYPOTHESIS: THERE IS NOT ENOUGH EVIDENCE OF A STATISTICAL DIFFERENCE BETWEEN THE POPULATIONS
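The hand calculation can be reproduced in a few lines. This is a sketch under the slide's own setup (pooled standard error, left-tailed alternative); the function name is hypothetical. Carrying full precision through the intermediate steps gives z near -2.22 and a P value that still rounds to 0.013.

```python
from math import sqrt
from statistics import NormalDist


def pooled_two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z test with a left-tailed alternative (P1 < P2)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_left = NormalDist().cdf(z)  # P(Z < z) under the standard normal
    return pooled, se, z, p_left


# Women: 48% of 110; men: 61% of 207 (figures from the slide).
pooled, se, z, p_value = pooled_two_proportion_z(0.48, 110, 0.61, 207)
alpha = 0.01
print(f"pooled={pooled:.3f} SE={se:.4f} z={z:.2f} p={p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

At the 0.05 level the same data would lead to rejection, which shows how much the conclusion depends on the alpha chosen before the test.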
- 26. Test of Independence
  - Fisher’s Exact Test is most commonly used for 2 x 2 tables to determine if there is a nonrandom relationship between two categorical variables. Fisher’s calculates conditional probability for the observed row and column matrix.
  - Fisher’s exact test in Minitab:

        Rows: adverse   Columns: drug

               new   old   All
        n       90    80   170
        y      210   120   330
        All    300   200   500

        Cell Contents: Count

        Fisher's exact test: P-Value = 0.0265193
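For readers without Minitab, the following sketch computes a two-sided Fisher's exact P value from the hypergeometric distribution. It assumes the common two-sided convention of summing the probabilities of all tables that are no more likely than the observed one; whether Minitab uses exactly this convention is an assumption here, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Counts from the slide: rows adverse (n, y), columns drug (new, old).
print(fisher_exact_two_sided(90, 80, 210, 120))
```

The printed value should land close to the P-Value shown in the Minitab output above.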
- 27. Regulatory Compliance Documentation Sample Size: Minitab
- 28. The Business Approach: 1-Proportion Test
  - (Diagram labels: Business Problem, Statistical Problem, Statistical Solution, Business Solution, Decision, Potential Root Causes Identified, Root Causes Verified)
- 29. Regulatory Compliance Documentation Example
  - A Black Belt is studying the company’s ability to get regulatory compliance documentation to the record center within 5 days from project completion.
  - What is the binomial characteristic?
  - A random sample of 130 project documentation records showed that 74 of them met the 5 day deadline.
  - The business was heard saying “at least we’re over the half way mark!”
  - Test the hypothesis at 95% confidence that more than 50% of engagements met the deadline.
  - What is the Null Hypothesis?
- 30. Regulatory Compliance Documentation Example - Hypothesis
  - Ho: The proportion of compliance documentation filed at the record center on time is 50% (interim target value).
  - Ha: The proportion of compliance documentation filed at the record center on time is greater than 50%.
  - Note: Typically the alternative is stated as “there is a difference.”
  - Why does this example state “greater than?”
- 31. Compliance Documentation Example – Minitab Commands
  - Tool Bar Menu > Stat > Basic Statistics > 1 Proportion Analysis
  - (Screenshot callout: target)
- 32. Compliance Documentation Example – Minitab Results

        Test and CI for One Proportion

        Test of p = 0.5 vs p > 0.5

                                      95% Lower     Exact
        Sample   X    N  Sample p         Bound   P-Value
        1       74  130  0.569231      0.493309     0.068

  - What’s our interpretation?
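The exact P value in this output can be approximated by hand as an upper-tail binomial probability. The sketch below assumes that "Exact" refers to the binomial tail P(X ≥ 74) with n = 130 and p = 0.5; the function name is hypothetical.

```python
from math import comb


def binom_upper_tail(x, n, p0):
    """Exact P(X >= x) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(x, n + 1))


# 74 of 130 records met the deadline; H0: p = 0.5, Ha: p > 0.5.
p_value = binom_upper_tail(74, 130, 0.5)
alpha = 0.05
print(f"p-value = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Because the P value is above 0.05, the data do not support the claim that more than half of the records met the deadline, even though the sample proportion is about 57%.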
- 33. Regulatory Compliance Documentation Sample Size

        Power and Sample Size

        Test for Two Proportions

        Testing proportion 1 = proportion 2 (versus <)
        Calculating power for proportion 2 = 0.7
        Alpha = 0.05

                      Sample  Target
        Proportion 1    Size   Power  Actual Power
                 0.6     388     0.9      0.900148
                 0.6     281     0.8      0.800923

        The sample size is for each group.

  - Is the sample size a concern?
- 34. The Business Approach: 2-Proportion Test
  - (Diagram labels: Business Problem, Statistical Problem, Statistical Solution, Business Solution, Decision, Potential Root Causes Identified, Root Causes Verified)
- 35. Analysis of Proportions for Workload Balance (Jack Lairdieson, MBB, Vanguard)
  - Interpret as an Interval Plot for Multiple Proportions
  - (Plot categories as extracted: Total, Region 5, Region 6, Region 3, Region 1, Region 3, Region 2)
- 36. Workload Balance Example
  - The Workload Balance (WLB) metrics were being discussed at a regional meeting. The Region 1 representative scoffed that Region 2’s “In-range” WLB performance metrics were at the “bottom of the barrel”. The Region 2 representative quickly responded, “Really, Region 1 is no better than Region 2.” Once back at the office, the concerned Region 1 representative gave the following Workload Balance data to a Black Belt.

        WLB Stats   In-Range   Staff
        Region 1         663    1411
        Region 2         141     353

  - Should the Region 1 representative be concerned about his conclusion?
  - What is the null hypothesis?
- 37. Workload Balance Example - Hypothesis
  - Ho: The proportion of Region 1 “In-Range” staff is equal to the proportion of Region 2 “In-Range” staff.
  - Ha: The proportion of Region 1 “In-Range” staff is not equal to the proportion of Region 2 “In-Range” staff.
  - or
  - Ha: The proportion of Region 1 “In-Range” staff is greater than the proportion of Region 2 “In-Range” staff.
- 38. Workload Balance Example – Minitab Commands
  - Tool Bar Menu > Stat > Basic Statistics > 2 Proportion
  - Analysis through MINITAB™
- 39. Workload Balance Example – Minitab Results
  - Session Window Output:

        Test and CI for Two Proportions

        Sample    X     N  Sample p
        1       663  1411  0.469880
        2       141   353  0.399433

        Difference = p (1) - p (2)
        Estimate for difference:  0.0704461
        95% lower bound for difference:  0.0223190
        Test for difference = 0 (vs > 0):  Z = 2.41  P-Value = 0.008

  - What’s our interpretation?
  - What Hypothesis did we choose to test?
  - Is the sample size a concern?
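The figures in this output can be checked with a short calculation. The sketch below assumes an unpooled (separate-variance) standard error, which is an inference from the numbers rather than something the slide states: it reproduces the printed Z and lower bound, whereas a pooled standard error would give a Z of about 2.38. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_test(x1, n1, x2, n2, confidence=0.95):
    """Right-tailed two-proportion z test with an unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = diff / se
    p_value = 1 - NormalDist().cdf(z)  # P(Z > z)
    # One-sided lower confidence bound for the difference p1 - p2.
    lower = diff - NormalDist().inv_cdf(confidence) * se
    return diff, lower, z, p_value


# Region 1: 663 of 1411 in range; Region 2: 141 of 353 in range.
diff, lower, z, p_value = two_proportion_test(663, 1411, 141, 353)
print(f"diff={diff:.7f} lower={lower:.7f} z={z:.2f} p={p_value:.3f}")
```

Since the lower bound for the difference is above zero and the P value is below 0.05, the one-sided alternative (Region 1 greater than Region 2) is supported by these data.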
- 40. Sample Size: Minitab

        Testing proportion 1 = proportion 2 (versus >)
        Calculating power for proportion 2 = 0.399
        Alpha = 0.05

                      Sample  Target
        Proportion 1    Size   Power  Actual Power
               0.469     857     0.9      0.900072
               0.469     619     0.8      0.800094

        The sample size is for each group.
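A per-group sample size of this kind can be estimated with the usual normal-approximation formula for a one-sided two-proportion test. The sketch below is an assumption about the method, not a description of Minitab's internals, and the function name is hypothetical; it does, however, return the same sample sizes as the two Minitab sample-size outputs in this deck (slides 33 and 40).

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.9):
    """Per-group sample size for a one-sided two-proportion z test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Null-hypothesis spread uses the averaged proportion; the alternative
    # uses the two separate proportions.
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((numerator / (p1 - p2)) ** 2)


for p1, p2 in [(0.6, 0.7), (0.469, 0.399)]:
    for power in (0.9, 0.8):
        print(p1, p2, power, n_per_group(p1, p2, power=power))
```

Comparing these targets with the actual group sizes in the Workload Balance example (1411 and 353) is one way to answer the "Is the sample size a concern?" question.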
- 41. References
  - Fisher RA (1925). Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh.
  - Barnard GA (1945). A new test for 2 x 2 tables. Nature 156:177.
  - Chan I (1998). Exact tests of equivalence and efficacy with non-zero lower bound for comparative studies. Statistics in Medicine 17, 1403-1413.
  - Mehta CR and Senchaudhuri P (2003). Conditional versus unconditional tests for comparing two binomials. Cytel Software.
  - Web Sites:
    - http://www.minitab.com/support/documentation/answers/SampleSize2p.pdf
    - www.statsoft.com/textbook/stathome
    - http://sofia.fhda.edu/gallery/statistics/lessons/lesson10-2
- 42. Six Sigma Links
  - Six Sigma
    - Motorola, Inc. - Motorola University
    - Six Sigma - What is Six Sigma?
    - i Six Sigma - Six Sigma Quality Resources for Achieving Six Sigma Results
    - General Electric : Our Company : What is Six Sigma?
  - Quality
    - American Society for Quality - ASQ
    - TQM Virtual CoursePack
    - SPC Press - Home
  - Statistics
    - http://www.statsoft.com/textbook/stathome.html
    - Penn State Statistical Education Resource Kit--Overview of Statistics
    - Data Statistics Video Course
    - The Sofia Open Content Initiative - Elementary Statistics
    - Resource: Learning Math: Data Analysis, Statistics, and Probability
  - Lean Six Sigma
    - Kaizen and Lean Manufacturing Consulting: Gemba Research - Kaizen Products
    - Conquering Complexity, Fast Innovation, Lean Six Sigma Quality. George Group Consulting Six Sigma Training Book
    - LEAN.org - Lean Enterprise Institute
  - Excel Statistics Add on
    - http://www.qimacros.com/
