NG BB 33 Hypothesis Testing Basics
Usage Rights

© All Rights Reserved

NG BB 33 Hypothesis Testing Basics: Presentation Transcript

  • UNCLASSIFIED / FOUO National Guard Black Belt Training Module 33: Hypothesis Testing Basics
  • CPI Roadmap – Analyze
    8-STEP PROCESS: 1. Validate the Problem; 2. Identify Performance Gaps; 3. Set Improvement Targets; 4. Determine Root Cause; 5. Develop Counter-Measures; 6. See Counter-Measures Through; 7. Confirm Results & Process; 8. Standardize Successful Processes
    Phases: Define, Measure, Analyze, Improve, Control
    ACTIVITIES
    • Identify Potential Root Causes
    • Reduce List of Potential Root Causes
    • Confirm Root Cause to Output Relationship
    • Estimate Impact of Root Causes on Key Outputs
    • Prioritize Root Causes
    • Complete Analyze Tollgate
    TOOLS
    • Value Stream Analysis
    • Process Constraint ID
    • Takt Time Analysis
    • Cause and Effect Analysis
    • Brainstorming
    • 5 Whys
    • Affinity Diagram
    • Pareto
    • Cause and Effect Matrix
    • FMEA
    • Hypothesis Tests
    • ANOVA
    • Chi Square
    • Simple and Multiple Regression
    Note: Activities and tools vary by project. Lists provided here are not necessarily all-inclusive.
  • UNCLASSIFIED / FOUOLearning Objectives  Review the terms “Parameters” and “Statistics” as they relate to Populations and Samples.  Introduce Confidence Intervals for expressing the uncertainty when predicting a population parameter using a sample statistic, and how to calculate CI’s for some common situations for different sample sizes.  Show how the Central Limit Theorem and the Standard Error of the Mean applies to the use of Confidence Intervals and Tests Hypothesis Testing - Basic UNCLASSIFIED / FOUO 3
  • Learning Objectives (Cont.)
    • Introduce statistical tests for some common situations and introduce the t-distribution used in testing.
    • Learn how to use Hypothesis Testing, applied in Minitab, to demonstrate a statistically significant difference in process performance.
    • Understand the tradeoffs and influences of sample sizes on statistical tests.
    • Apply knowledge of different classes of statistical errors to the decisions used in process improvement to minimize risk.
  • Application Examples
    • Transactional – A Black Belt has just finished a pilot of a new process for handling blanket Purchase Orders and wants to know if it has a statistically significant: a) shorter cycle time and b) increased accuracy over the old process.
    • Administrative – The manager of an AAFES (Army and Air Force Exchange Service) order entry department wants to compare two order entry procedures to see if one is faster than the other.
    • Service – Medical diagnostic imaging services are provided from two different medical treatment facilities to a central hospital, which wants to know if there are differences in the quality of service, particularly: a) the number of lost records and re-takes, and b) average waiting time for MRIs and X-rays.
  • Population vs. Sample
    • Population: all U.S. registered voters. Sample: 10,000 people are asked who they will vote for President.
    • Population: all sufferers of a certain disease that might be given the new treatment. Sample: 3,000 people are given a new treatment in a clinical study.
    • Population: all appraisals completed that month. Sample: 25 appraisals chosen at random from a given month.
    Since it is not always practical or possible to measure/query every item/person in the population, you take a random sample.
  • Terms and Labels: Population vs. Sample
    Term             Population = Parameter    Sample = Statistic
    Count of items   $N$                       $n$
    Mean             $\mu$                     $\bar{x}$
    Median           $\tilde{\mu}$             $\tilde{x}$
    Standard Dev.    $\sigma$                  $s$
    Estimators: $\hat{\mu} = \bar{x}$ and $\hat{\sigma} = s$
  • Population Parameters vs. Sample Statistics
    [Figure: a population with parameters (μ, σ) and four random samples of size n = 4, each with its own statistics (x-bar 1, s1) through (x-bar 4, s4)]
    • Population Parameters: Mean, μ (mu), and Standard Deviation, σ (sigma)
    • Sample Statistics: Mean, x-bar, and Standard Deviation, s
  • Central Limit Theorem
    • If: x1, x2, …, xn are independent measurements (i.e., a random sample of size n) from a population, where the mean of x is μ and the standard deviation of x is σ,
    • Then: the distribution of the sample mean
      $\bar{X} = \dfrac{X_1 + X_2 + X_3 + \cdots + X_n}{n}$
      has mean and standard deviation given by:
      $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$ (the Standard Error of the Mean)
    • In addition, when n is sufficiently large, the distribution of x-bar is approximately normal (“bell-shaped curve”). More on sample sizes later...
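The standard-error relationship on this slide can be checked with a small simulation. The sketch below is only an illustration: the population values (mean 65, standard deviation 4), the number of repeated samples, and the seed are assumptions chosen for the demonstration, and a normal population is used for convenience.

```python
import random
import statistics

random.seed(33)  # fixed seed so the run is repeatable

MU, SIGMA = 65.0, 4.0   # assumed population mean and standard deviation
N = 4                   # size of each random sample
NUM_SAMPLES = 20_000    # number of repeated samples (arbitrary choice)

# Draw many random samples of size N and record each sample mean (x-bar).
sample_means = [
    statistics.fmean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

observed_mean = statistics.fmean(sample_means)   # should be close to MU
observed_se = statistics.stdev(sample_means)     # spread of the x-bars
predicted_se = SIGMA / N ** 0.5                  # sigma / sqrt(n)

print(f"mean of x-bar: {observed_mean:.3f} (population mean {MU})")
print(f"sd of x-bar:   {observed_se:.3f} (predicted {predicted_se:.3f})")
```

Increasing `N` in the sketch shrinks the spread of the sample means in proportion to the square root of the sample size.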
  • Variability of Means
    • Sample statistics estimate population parameters by inference: for a given sample (x-bar, s, n), we can estimate the population parameters μ and σ by inference.
    • As the sample size increases, we are more confident that our sample statistic is a more valid estimator of the population parameter.
    [Figure: distributions of the sample mean for n = 1, n = 3, and n = 5, narrowing as n grows according to $\sigma_{\bar{x}} = \sigma_x / \sqrt{n}$]
  • National Guard Black Belt Training: Confidence Intervals
  • What Is a Confidence Interval?
    • We know that when we take the average of a sample, it is probably not exactly the same as the average of the population.
    • Confidence intervals help us determine the likely range of the population parameter.
    • For example, if my 95% confidence interval is 5 +/- 2, then I have 95% confidence that the mean of the population is between 3 and 7.
  • What Is a Confidence Interval? (Cont.)
    • Usually, confidence intervals have an additive uncertainty:
      Estimate ± Margin of Error
      Sample Statistic ± [Confidence Factor × Measure of Variability]
    • Example sample statistics: x-bar, s
    Note: Detailed formulas may be found in the appendix.
  • Why Do We Need Confidence Intervals?
    • Sample statistics, such as Mean and Standard Deviation, are only estimates of the population’s parameters.
    • Because there is variability in these estimates from sample to sample, we can quantify our uncertainty using statistically-based confidence intervals. Confidence intervals provide a range of plausible values for the population parameters (μ and σ).
    • Any sample statistic will vary from one sample to another and, therefore, from the true population or process parameter value.
  • Exercise
    • Let’s look at a population that has a normal distribution with:
      • known mean value = 65
      • standard deviation = 4
      (This has been generated in dataset Confidence.mtw)
    • Each member in the class will randomly sample 25 data points from this population. (In Minitab, use Calc>Random Data>Sample from Columns.)
    • Sample 25 rows of data from C1 and store the results in C2.
    • Use graphical descriptive statistics to calculate the 95% confidence interval for the mean and sigma based on your sample of 25 data points. Do they include the mean, 65, and the sigma, 4?
    • Based on a class size of 25, we would expect about 1 confidence interval (25 × 0.05 ≈ 1) that does not contain 65 for the mean, and about 1 that does not include 4 for sigma.
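For readers without Minitab, the class exercise can be approximated in code. This is a sketch under stated assumptions: it draws fresh values from a normal population with mean 65 and standard deviation 4 instead of sampling rows from Confidence.mtw, it covers only the interval for the mean, and the critical value 2.064 for 24 degrees of freedom is taken from a standard t-table rather than from the slides. The helper name `mean_ci` is hypothetical.

```python
import random
import statistics

random.seed(65)  # fixed seed so the run is repeatable

MU, SIGMA = 65.0, 4.0   # population values stated in the exercise
N = 25                  # data points sampled by each class member
CLASS_SIZE = 25         # number of class members
T_CRIT = 2.064          # t(.025, 24 df); assumed from a standard t-table


def mean_ci(sample, t_crit):
    """Return (low, high) for x-bar +/- t * s / sqrt(n)."""
    centre = statistics.fmean(sample)
    margin = t_crit * statistics.stdev(sample) / len(sample) ** 0.5
    return centre - margin, centre + margin


misses = 0
for _ in range(CLASS_SIZE):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    low, high = mean_ci(sample, T_CRIT)
    if not (low <= MU <= high):
        misses += 1  # this member's interval failed to cover the true mean

print(f"{misses} of {CLASS_SIZE} intervals missed the true mean {MU}")
print(f"expected on average: {CLASS_SIZE * 0.05:.2f}")
```

Any single run can produce zero, one, or several misses; only the long-run average is about 5% of the intervals.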
  • Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Known: Example
    • A random sample of size n = 36 is taken and the distribution of x-bar is normal. We are given that the population standard deviation (σ) is 18.0. The value of x-bar is an estimator of the population mean (μ), and the standard error of x-bar is:
      $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 18.0 / \sqrt{36} = 3.0$
      This is known as the Standard Error of the Mean.
    • From the properties of the standardized normal distribution, there is a 95% chance that μ is within the range of x-bar plus and minus 1.96 times the standard error of x-bar.
  • What Values of x-bar Can I Expect?
    [Figure: distribution of x-bar centered on μ, with 0.95 of the area shaded between μ − 1.96(3.0) and μ + 1.96(3.0) and 0.025 in each tail; 3.0 is the Standard Error of the Mean]
    • 95% of all x-bars will fall into the shaded region, defined by μ ± 1.96(3.0).
  • But I Don’t Know μ, I Only Know x-bar!
    • We can turn it around.
    • x-bar lying in the interval μ ± 1.96(3.0) is the same thing as μ lying in the interval x-bar ± 1.96(3.0).
    • Because there is a 95% chance that x-bar lies in the interval μ ± 1.96(3.0), there is a 95% chance that the interval x-bar ± 1.96(3.0) encloses μ.
    • The interval we construct using the observed sample mean is called a 95% confidence interval for μ.
    [Figure: intervals of width ± 1.96(3.0) drawn around the observed sample means of samples A, B, and C, plotted against μ − 1.96(3.0), μ, and μ + 1.96(3.0)]
  • Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Known: Another Example
    An airline needs an estimate of the average number of passengers on a newly scheduled flight. Its experience is that data for the first month of flights is unreliable, but thereafter the passenger loading settles down. Therefore, the mean passenger load is calculated for the first 20 weekdays of the second month after initiation of this particular new flight. If the sample mean (x-bar) is 112.0 and the population standard deviation (σ) is assumed to be 25, find a 95% confidence interval for the true, long-run average number of passengers on this flight.
  • Confidence Interval for the Mean (μ) with Standard Deviation (σ) Known: Solution
    We assume that the hypothetical population of daily passenger loads for weekdays is not badly skewed. Therefore, the sampling distribution of x-bar is approximately normal and the confidence interval results are approximately correct, even for a sample size of only 20 weekdays.
      $\bar{x} = 112.0$, $\sigma = 25$, $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 25 / \sqrt{20} = 5.59$
    For a 95% confidence interval, we use $z_{.025} = 1.96$ in the formula to obtain
      $112 \pm 1.96(5.59)$, or 101.04 to 122.96
    We are 95% confident that the long-run mean, μ, lies in this interval.
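The airline calculation can be reproduced in a few lines. The sketch below uses only the numbers given on the slide; the variable names are illustrative, and the z-value is computed from the standard normal distribution rather than typed in as 1.96.

```python
from statistics import NormalDist

x_bar = 112.0      # sample mean from the slide
sigma = 25.0       # assumed population standard deviation from the slide
n = 20             # number of weekdays sampled
confidence = 0.95  # desired confidence level

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

std_error = sigma / n ** 0.5   # standard error of the mean
margin = z * std_error         # margin of error
low, high = x_bar - margin, x_bar + margin

print(f"z = {z:.2f}, standard error = {std_error:.2f}")
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```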
  • Confidence Interval for the Mean (m) withUNCLASSIFIED / FOUOPopulation Standard Deviation (s) Unknown A very important point to remember is that for this example we assumed that we knew the population standard deviation, and many times that is not the case. Often, we have to estimate both the mean and the standard deviation from the sample.  When s is not known, we use the t-distribution rather than the normal (z) distribution. The t-distribution will be explained next.  In many cases, the true population s is not known, so we must use our sample standard deviation (s) as an estimate for the population standard deviation (s Hypothesis Testing - Basic UNCLASSIFIED / FOUO 21
  • Confidence Interval for the Mean (m) withUNCLASSIFIED / FOUOStandard Deviation (s) Unknown (Cont.)  Since there is less certainty (not knowing m or s ), the t-distribution essentially “relaxes” or “expands” our confidence intervals to allow for this additional uncertainty.  In other words, for a 95% confidence interval, you would multiply the standard error by a number greater than 1.96, depending on the sample size.  1.96 comes from the normal distribution, but the number we will use in this case will come from the t-distribution. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 22
  • What Is This t-Distribution?
    • The t-distribution is actually a family of distributions.
    • They are similar in shape to the normal distribution (symmetric and bell-shaped), although wider, with a lower peak and heavier tails.
    • How wide the specific t-distribution is depends on the sample size. The smaller the sample size, the wider the distribution and the heavier its tails.
    • As sample size increases, the t-distribution approaches the shape of the normal distribution.
  • An Example of a t-Distribution
    [Figure: t-distribution for n = 5, frequency versus t from -3 to 3, with an upper-tail area of 0.025 beyond t = 2.78]
  • Some Selected t-Values
    • Here are values from the t-distribution for various sample sizes (for 95% confidence intervals):
      Sample Size    t-value (.025)*
      2              12.71
      3              4.30
      5              2.78
      10             2.26
      20             2.09
      30             2.05
      100            1.98
      1000           1.96
    * For a 95% CI, α = .05. Therefore, for a two-tailed distribution: α/2 = .05/2 = .025
  • Confidence Interval for the Mean (μ) with Population Standard Deviation (σ) Unknown: Example
    The customer expectation when phoning an order-out pizza shop is that the average amount of time from completion of dialing until they hear the message indicating the time in queue is equal to 55.0 seconds (less than a minute was the response from customers surveyed, so the standard was established at 10% less than a minute). You decide to randomly sample 20 calls at times between 11:30am and 9:30pm on 2 days to determine what the actual average is. In your sample of 20 calls, you find that the sample mean, x-bar, is equal to 54.86 seconds and the sample standard deviation, s, is equal to 1.008 seconds. The actual data was as follows:
    54.1, 53.3, 56.1, 55.7, 54.0, 54.1, 54.5, 57.1, 55.2, 53.8, 54.1, 54.1, 56.1, 55.0, 55.9, 56.0, 54.9, 54.3, 53.9, 55.0
    What is a 95% confidence interval for the true mean call completion time?
  • 95% Confidence Interval for Mean Call Completion Time
    $\bar{x} = 54.860$, $s = 1.008$, $n = 20$, $t_{.025,\,19} = 2.09$
    $\bar{x} \pm t_{\alpha/2,\,n-1} \dfrac{s}{\sqrt{n}} = 54.860 \pm 2.09 \cdot \dfrac{1.008}{\sqrt{20}}$, which gives (54.389, 55.331)
    We’re 95% confident that the actual mean call completion time is somewhere between 54.389 seconds and 55.331 seconds, based on our sample of 20 calls.
    Luckily, we don’t have to worry about the details of how to calculate the t-value. Minitab takes care of that for us.
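As a cross-check outside Minitab, the same interval can be computed from the 20 call times listed in the example. This is a minimal sketch: the t-value 2.09 is copied from the slide's table for a sample size of 20 rather than computed, and the variable names are illustrative.

```python
import statistics

# Call completion times (seconds) from the pizza shop example.
calls = [
    54.1, 53.3, 56.1, 55.7, 54.0, 54.1, 54.5, 57.1, 55.2, 53.8,
    54.1, 54.1, 56.1, 55.0, 55.9, 56.0, 54.9, 54.3, 53.9, 55.0,
]

T_CRIT = 2.09  # t(.025, 19 df), taken from the slide's t-value table

n = len(calls)
x_bar = statistics.fmean(calls)   # sample mean
s = statistics.stdev(calls)       # sample standard deviation (n - 1 divisor)
margin = T_CRIT * s / n ** 0.5    # t * s / sqrt(n)
low, high = x_bar - margin, x_bar + margin

print(f"n = {n}, x-bar = {x_bar:.3f}, s = {s:.3f}")
print(f"95% CI for the mean: {low:.3f} to {high:.3f}")
```

Minitab reports a slightly wider interval (54.388 to 55.332) because it uses an unrounded t-value.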
  • Now Let Minitab Calculate the Confidence Interval
    1. Open the Minitab file PizzaCall.mtw
  • Now Let Minitab Calculate the Confidence Interval (Cont.)
    2. Select Stat> Basic Statistics> Graphical Summary
  • Now Let Minitab Calculate the Confidence Interval (Cont.)
    3. Double click on C-1 to place it in the Variables box
    4. Click on OK
  • Now Let Minitab Calculate the Confidence Interval (Cont.)
    Summary for C1
      Anderson-Darling Normality Test: A-Squared 0.60, P-Value 0.105
      Mean          54.860
      StDev         1.008
      Variance      1.016
      Skewness      0.560026
      Kurtosis      -0.509797
      N             20
      Minimum       53.300
      1st Quartile  54.100
      Median        54.700
      3rd Quartile  55.850
      Maximum       57.100
      95% Confidence Interval for Mean (μ):                54.388 to 55.332
      95% Confidence Interval for Median:                  54.100 to 55.582
      95% Confidence Interval for Standard Deviation (σ):  0.767 to 1.472
    We’re 95% confident that the actual mean is between 54.388 and 55.332. We’re also taking a 5% chance that we’re wrong.
  • Other Types of Confidence Intervals
    • There are other types of confidence intervals that are based on the same principles we have learned:
      • Standard Deviation
      • Proportions
      • Median
      • Others
    • We will discuss some of these later.
  • National Guard Black Belt Training: Hypothesis Testing
  • Extending the Concept of Confidence Intervals
    • Extending the concept of confidence intervals allows us to set up and interpret statistical tests.
    • We refer to these tests as Hypothesis Tests.
    • One way to describe a hypothesis test: determining whether or not a particular value of interest is contained within a confidence interval.
    • Hypothesis testing also gives us the ability to calculate the probability that our conclusion is wrong.
  • UNCLASSIFIED / FOUOThe New Car  You buy a one-year old car from the Lemon Lot in order to save money on gas. The previous owner still had the original features sticker and you were pleased to note that the EPA mileage estimate indicated that the car should get 31 miles per gallon overall. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 35
  • UNCLASSIFIED / FOUOThe New Car (Cont.)  As soon as you buy the car, you fill up the tank so that you’ll be ready to take the family for a drive and to go to work the next day. A few days later, you fill up again and calculate your gas mileage for that tank. After you push the “=“ key on your calculator, the number 27.1 appears. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 36
  • UNCLASSIFIED / FOUOThe New Car (Cont.)  Should you send the car to a mechanic to check for problems?  Do you conclude that the EPA estimate is simply wrong?  Do you leave cruel messages on the seller’s answering machine?  What ARE your conclusions? Hypothesis Testing - Basic UNCLASSIFIED / FOUO 37
  • UNCLASSIFIED / FOUOContinuing the Car Situation  At what value of gas consumption should you become alarmed that you are experiencing anything more than just random variation? Hypothesis Testing - Basic UNCLASSIFIED / FOUO 38
  • The Car Situation (Cont.)
    • What if we knew this?
    [Figure: distribution of gas consumption for this car, with σ = 3.46 and a marked area of 12.8%]
  • Hypothesis Testing
    Hypothesis Testing:
    • Allows us to determine statistically whether or not a value is cause for alarm (or is simply due to random variation)
    • Tells us whether or not two sets of data are different
    • Tells us whether or not a statistical parameter (mean, standard deviation, etc.) is statistically different from a test value of interest
    • Allows us to assess the “strength” of our conclusion (our probability of being correct or wrong)
  • Hypothesis Testing (Cont.)
    Hypothesis Testing Enables Us to:
    • Handle uncertainty using a commonly accepted approach
    • Be more objective (two people will use the same techniques and come to similar conclusions almost all of the time)
    • Disprove or “fail to disprove” assumptions
    • Control our risk of making wrong decisions or coming to wrong conclusions
  • Hypothesis Testing (Cont.)
    [Figure: some possible samples (Sample A, Sample B, Sample C, Sample D) drawn from the true population distribution, with the population mean μ marked]
  • Sample Size Concerns
    • If we sample only one item, how close do we expect to get to the true population mean?
    • How well do you think this one item represents the true mean?
    • How much ability do we have to draw conclusions about the mean?
    • What if we sample 900 items? Now, how close would we expect to get to the true population mean?
  • Sample Size (Cont.)
    The larger our sample, the closer x-bar is likely to be to the true population mean.
    [Figure: population distribution with mean μ, showing the likely range of x-bar with a small sample size and the narrower likely range of x-bar with a large sample size]
  • Standard Deviation
    • What effect would a lot of variation in the population have on our estimate of the population mean from a sample?
    • How would this affect our ability to draw conclusions about the mean?
    • What if there is very little variation in the population?
  • Standard Deviation (Cont.)
    [Figure: two populations with mean μ, one with a lot of variation and one with less variation, each showing the likely range of x-bar for the same sample size, n]
  • Statistical Inferences and Confidence
    • How much confidence do we have in our estimates?
    • How close do you think the true mean, μ, is to our estimate of the mean, x-bar?
    • How certain do we want/need to be about conclusions we make from our estimates?
    • If we want to be more confident about our sample estimate (i.e., we want a lower risk of being wrong), then we must relax our statement of how close we are to the true value.
  • Statistical Inferences and Confidence (Cont.)
    • If we want to have high confidence in our conclusions, we must relax the range in which we say the true mean lies.
    • As we tighten our estimate of the mean, our risk of being wrong increases. Thus, our confidence decreases.
    [Figure: population distribution with mean μ, showing a wide and a narrow range around x-bar]
  • Three Factors Drive Sample Sizes
    • Three concepts affect the conclusions drawn from a single sample data set of (n) items:
      • Variation in the underlying population (sigma)
      • Risk of drawing the wrong conclusions (alpha, beta)
      • How small a Difference is significant (delta)
    [Figure: diagram relating Risk, Variation, and Difference to the sample size (n)]
  • Three Factors: Variation, Risk, Difference
    These 3 factors work together. Each affects the others.
    • Variation: When there’s greater variation, a larger sample is needed to have the same level of confidence that the test will be valid. More variation widens our confidence interval.
    • Risk: If we want to be more confident that we are not going to make a decision error or miss a significant event, we must increase the sample size.
    • Difference: If we want to be confident that we can identify a smaller difference between two test samples, the sample size must increase.
  • Three Factors (Cont.)
    • Larger samples narrow our confidence interval.
    • Lower confidence levels allow smaller samples.
    • All of these translate into a specific confidence interval for a given parameter, set of data, confidence level and sample size.
    • They also translate into what types of conclusions result from hypothesis tests.
    • Testing for larger differences between the samples reduces the required sample size. The size of the difference to be detected is known as delta (Δ).
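The interplay of variation, risk, and difference can be made concrete with a common textbook approximation for a one-sample, two-sided test on a mean: n ≈ ((z_{α/2} + z_β) · σ / δ)². This formula does not appear in the slides; it is a z-based approximation that assumes σ is known, and the function name `sample_size` and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(sigma, delta, alpha=0.05, beta=0.10):
    """Approximate n needed to detect a shift of `delta` in a mean.

    Assumes a one-sample, two-sided z-test with known sigma
    (a textbook approximation, not a formula from the slides).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # guards against Type I error
    z_beta = NormalDist().inv_cdf(1 - beta)        # guards against Type II error
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


baseline = sample_size(sigma=1.0, delta=0.5)
print("baseline:", baseline)
print("more variation:", sample_size(sigma=2.0, delta=0.5))
print("less risk (alpha = 0.01):", sample_size(sigma=1.0, delta=0.5, alpha=0.01))
print("smaller difference:", sample_size(sigma=1.0, delta=0.25))
```

Each of the three changes raises the required sample size relative to the baseline, which matches the direction of the statements on the slides.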
  • UNCLASSIFIED / FOUOAn Example A unit has several quick response forces, QRF. Some forces have over 700 members, with at least 300 on the site at any time.  By regulation, all forces must have a quick response plan, the critical first phase of which is required to be completed in 10 minutes (600 seconds) or less.  There are two teams that are vying for “most responsive.” They have taken somewhat different approaches to implementing their quick response plans and management wants to know which approach is better: Team 1 or Team 2  Each one has 100 data points for actual responses and drills (Minitab file Response.mtw) Hypothesis Testing - Basic UNCLASSIFIED / FOUO 52
  • The Data from Team 1
    598.0 598.8 600.2 599.4 599.6 599.8 598.8 599.6 599.0 601.2 600.0 599.8 599.6 598.4 599.6 599.8 599.2 599.6 599.0 600.2 600.0 599.4 600.2 599.6 600.0 600.0 600.0 599.2 598.8 600.0 598.8 600.2 599.0 599.2 599.4 598.2 600.2 599.6 599.6 599.8 599.4 599.6 600.4 598.6 599.2 599.6 599.0 600.0 599.8 599.6 599.4 599.0 599.0 599.6 599.4 599.4 599.8 599.6 599.2 600.0 600.0 600.8 599.4 599.6 600.0 598.8 598.8 599.2 600.2 599.2 599.2 598.2 597.8 599.8 599.4 599.4 600.0 600.4 599.6 599.6 599.6 599.2 599.6 600.0 599.8 599.0 599.8 600.0 599.6 599.0 599.2 601.2 600.8 599.2 599.6 600.6 600.4 600.4 598.6 599.4
  • The Data from Team 2
    601.6 600.8 599.4 599.8 601.6 600.4 598.6 598.0 602.8 603.4 598.4 600.0 597.6 600.0 597.0 600.0 600.4 598.0 599.6 599.8 596.8 600.8 597.6 602.2 597.8 602.8 600.8 601.2 603.8 602.4 600.8 597.2 599.0 603.6 602.2 603.6 600.4 600.4 601.8 600.6 604.2 599.8 600.6 602.0 596.2 602.4 596.4 599.0 603.6 602.4 598.4 600.4 602.2 600.8 601.4 599.6 598.2 599.8 600.2 599.2 603.4 598.6 599.8 600.4 601.6 600.6 599.6 601.0 600.2 600.4 598.4 599.0 601.6 602.2 598.0 598.2 598.2 601.6 598.0 601.2 602.0 599.4 600.2 598.4 604.2 599.4 599.4 601.8 600.8 600.2 599.4 600.2 601.2 602.8 600.0 600.8 599.0 597.6 597.6 596.8
  • Descriptive Statistics – Team 1
    Summary for Team 1
      Anderson-Darling Normality Test: A-Squared 0.84, P-Value 0.029
      Mean          599.55
      StDev         0.62
      Variance      0.38
      Skewness      -0.082566
      Kurtosis      0.745102
      N             100
      Minimum       597.80
      1st Quartile  599.20
      Median        599.60
      3rd Quartile  600.00
      Maximum       601.20
      95% Confidence Interval for Mean:    599.43 to 599.67
      95% Confidence Interval for Median:  599.40 to 599.60
      95% Confidence Interval for StDev:   0.54 to 0.72
  • Descriptive Statistics – Team 2
    Summary for Team 2
      Anderson-Darling Normality Test: A-Squared 0.29, P-Value 0.615
      Mean          600.23
      StDev         1.87
      Variance      3.51
      Skewness      0.051853
      Kurtosis      -0.518286
      N             100
      Minimum       596.20
      1st Quartile  599.00
      Median        600.20
      3rd Quartile  601.60
      Maximum       604.20
      95% Confidence Interval for Mean:    599.86 to 600.60
      95% Confidence Interval for Median:  599.80 to 600.60
      95% Confidence Interval for StDev:   1.65 to 2.18
  • UNCLASSIFIED / FOUOExample  The average cycle time for Team 1 is 599.55 seconds.  The average cycle time for Team 2 is 600.23 seconds.  The target cycle time for Phase 1 response is 600 seconds.  Is the difference between the two average cycle times statistically significant? Hypothesis Testing - Basic UNCLASSIFIED / FOUO 57
  • UNCLASSIFIED / FOUOExample (Cont.)  The unit wants to determine if the true averages of the two teams are really different.  The unit thinks that the 600.23 average of team 2 is little too high, so there is a need to determine if the data indicates that the true average is really not equal to the target of 600 seconds.  The unit will use hypothesis testing to answer these questions. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 58
  • UNCLASSIFIED / FOUOExample  The first hypothesis test to be performed is to determine whether there is a statistically significant difference between the means of the two teams. This is called a 2-Sample t Test.  The real question is whether or not the means are different enough to indicate that the approaches taken by the two teams really are centered differently, or are they close enough that the difference could simply be a result of random variation?  After that, hypothesis testing can tell us if there is evidence indicating whether or not each team’s average is different from the target of 600 seconds.  First, we need to introduce some terminology. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 59
  • The Null Hypothesis for a 2-Sample t Test
    • The 2-Sample t Test is used to test whether or not the means of two populations are the same.
    • The null hypothesis is a statement that the population means for the two samples are equal. H0: μ1 = μ2
    • We assume the null hypothesis is true unless we have enough evidence to prove otherwise. Without that evidence, we say that we “fail to reject the null.”
    • If we can prove otherwise, then we “reject the null” hypothesis and accept the alternative hypothesis. Ha: μ1 ≠ μ2
  • Null Hypothesis for 2-Sample t Test (Cont.)
    • This is analogous to our judicial system principle of “innocent until proven guilty.”
    • The symbol used for the null hypothesis is H0:
      H0: μ1 = μ2 OR H0: μ1 − μ2 = 0
  • The Alternative Hypothesis for a 2-Sample t Test
    • The alternative hypothesis is a statement that represents reality if there is enough evidence to reject H0.
    • If we reject the null hypothesis, then we accept the alternative hypothesis.
    • This is analogous to being found “guilty” in a court of law.
    • The symbol used for the alternative hypothesis is Ha:
      Ha: μ1 ≠ μ2 OR Ha: μ1 − μ2 ≠ 0
  • Our Emergency Response Team Example
    • In our example, the first hypothesis test will take this form:
      H0: μ1 = μ2
      Ha: μ1 ≠ μ2
    • We can rewrite it in this form:
      H0: μ1 − μ2 = 0
      Ha: μ1 − μ2 ≠ 0
    Reminder: We are conducting a 2-Sample t test to determine if the average cycle times of the Phase 1 response from our two teams are different.
  • UNCLASSIFIED / FOUOOur Emergency Response Team Example(Cont.)  If we wanted to specifically test only whether or not there was enough evidence to indicate that team 2’s average was greater than team 1’s, it would take this form: H o : m1  m 2  0 H a : m1  m 2  0 This is still a 2-Sample t-Test Hypothesis Testing - Basic UNCLASSIFIED / FOUO 64
  • UNCLASSIFIED / FOUOOur Emergency Response Team Example(Cont.)  The second hypothesis test will be a 1-Sample t. It will take this form for each team: H o : m1  600 H a : m1  600 When you are testing whether or not a population mean is equal to a given or Target value, you use a 1-Sample t Hypothesis Testing - Basic UNCLASSIFIED / FOUO 65
  • UNCLASSIFIED / FOUOHypothesis Test in Minitab  We will use Minitab to conduct our hypothesis tests.  Open the Minitab file Response.mtw Hypothesis Testing - Basic UNCLASSIFIED / FOUO 66
  • UNCLASSIFIED / FOUOHypothesis Test in Minitab:2-Sample t-Test Select Stat> Basic Statistics> 2-Sample t to compare Team 1 to Team 2 Hypothesis Testing - Basic UNCLASSIFIED / FOUO 67
  • UNCLASSIFIED / FOUOHypothesis Test in Minitab (Cont.) Team 1 and Team 2 are in different columns, so select Samples in different columns Double click on C1-Supp1 Then double click on C2-Supp2 to place them In First and Second boxes Select Graphs to get the Graphs dialog box Hypothesis Testing - Basic UNCLASSIFIED / FOUO 68
  • UNCLASSIFIED / FOUOHypothesis Test in Minitab (Cont.) In the Graphs dialog box, check both Boxplots of data and Dotplots of data Click OK here, and then click on OK in the previous dialog box Hypothesis Testing - Basic UNCLASSIFIED / FOUO 69
  • Hypothesis Test in Minitab (Cont.)
    [Figure: Boxplot of Team 1, Team 2; vertical axis “Data” from 596 to 605]
  • Hypothesis Test in Minitab (Cont.)
    [Figure: Individual Value Plot of Team 1, Team 2; vertical axis “Data” from 596 to 605]
  • Hypothesis Test in Minitab (Cont.): This descriptive output shows up in your Session Window.
  • Hypothesis Test in Minitab (Cont.): The null hypothesis states that the difference between the two means is zero.
  • Hypothesis Test in Minitab (Cont.): We will cover p-values in more detail a little later. The p-value here is less than 0.05, so we can reject the null hypothesis.
  • Assumptions: The hypothesis tests we have discussed make certain assumptions: independence between and within samples; random samples; normally distributed data; unknown variance. In our example, we did not assume equal variances. This is the safe choice. However, if we had reason to believe the variances were equal, we could have checked the "Assume equal variances" box in the dialog box.
  • The Risks of Being Wrong: Error Matrix
    True State | Conclusion Drawn: Accept Ho | Conclusion Drawn: Reject Ho
    Ho True    | Correct                     | Type I Error (α-Risk)
    Ho False   | Type II Error (β-Risk)      | Correct
  • UNCLASSIFIED / FOUOType I and Type II Errors  Type I Error I’ve discovered  Alpha Risk something that really  Producer Risk isn’t here!  The risk of rejecting the null, and taking action, when no action was necessary  Type II Error I’ve missed a  Beta Risk significant effect!  Consumer Risk  The risk of failing to reject the null when you should have rejected it.  No action is taken when there should have been action. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 77
  • UNCLASSIFIED / FOUOType I and Type II Errors (Cont.)  The Type I Error is determined up front.  It is the alpha value you choose.  The confidence level is one minus the alpha level.  The Type II Error is determined from the circumstances of the situation.  If alpha is made very small, then beta increases (all else being equal).  Requiring overwhelming evidence to reject the null increases the chances of a type II error.  To minimize beta, while holding alpha constant, requires increased sample sizes.  One minus beta is the probability of rejecting the null hypothesis when it is false. This is referred to as the Power of the test. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 78
  • UNCLASSIFIED / FOUOType I and Type II Errors (Cont.)  What type of error occurs when an innocent man is convicted?  What about when a guilty man is set free?  Does the American justice system place more emphasis on the alpha or beta risk? Hypothesis Testing - Basic UNCLASSIFIED / FOUO 79
  • UNCLASSIFIED / FOUOExercise  Draw the Type I & II error matrix for airport security.  Do you think the security system at most airports places more emphasis on the alpha or beta risk? Hypothesis Testing - Basic UNCLASSIFIED / FOUO 80
  • The p-Value: The p-value is the probability, calculated assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. It is not the probability that the null hypothesis is true, and it is not the probability of being wrong when we reject. It is the smallest alpha value at which the null hypothesis would be rejected. If we don't want alpha to be more than 0.05, then we simply reject the null hypothesis when the p-value is 0.05 or less. As we will learn later, it isn't always this simple.
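To make the decision rule concrete, here is a minimal Python sketch that converts a standardized test statistic into a two-sided p-value and compares it with alpha. It assumes, purely for illustration, that the statistic follows a standard normal distribution under the null hypothesis; Minitab's t-tests use the t distribution instead, so its p-values will differ somewhat for small samples. The function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Two-sided tail probability of a result at least as extreme as z,
    assuming z is standard normal when the null hypothesis is true."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject_null(p_value: float, alpha: float = 0.05) -> bool:
    """Decision rule from the slide: reject when the p-value is alpha or less."""
    return p_value <= alpha


# Illustrative statistics only; they are not taken from the Minitab output.
for z in (0.5, 1.96, 3.0):
    p = two_sided_p_value(z)
    print(f"z = {z:4.2f}  p = {p:.4f}  reject at alpha 0.05: {reject_null(p)}")
```

A statistic near 1.96 sits right at the 0.05 boundary, which is why that value appears so often as a critical value.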
  • National Guard Black Belt Training: Power, Delta and Sample Size
  • Beta, Power, and Sample Size: If two populations truly have different means, but only by a very small amount, then you are more likely to conclude they are the same. This means that the beta risk is greater. Beta only comes into play if the null hypothesis truly is false. The "more" false it is, the greater your chances of detecting it, and the lower your beta risk. The power of a hypothesis test is its ability to detect an effect of a given magnitude: $\text{Power} = 1 - \beta$. Minitab will calculate beta for us for a given sample size, but first let's show it graphically.
  • Beta and Alpha: [Figure labels: 95% Confidence Limit (alpha = .05) for mean, μ1 (critical value); Alpha Risk; μ1] Here is our first population and its corresponding alpha risk.
  • Beta and Alpha (Cont.): [Figure labels: 95% Confidence Limit (alpha = .05) for mean, μ1 (critical value); μ1; μ2; Δ] We want to compare these two populations. Do you think that we will easily be able to determine if they are different?
  • Beta and Alpha (Cont.): [Figure labels: 95% Confidence Limit (alpha = .05) for mean, μ1 (critical value); Beta Risk; μ1; μ2; Δ] If our sample from population 2 is in this grey area, we will not be able to see the difference. This is called Beta Risk.
  • Beta and Delta: If we are trying to see a larger change, we have less Beta Risk. [Figure labels: 95% Confidence Limit (alpha = .025) for mean, μ1 (critical value); Beta Risk; μ1; μ2; Δ]
  • Beta and Sigma: Now we're back to our original graphic. What do you think happens to Beta Risk if the standard deviations of the populations decrease? [Figure labels: 95% Confidence Limit (alpha = .05) for mean, μ1 (critical value); Beta Risk; μ1; μ2; Δ]
  • Beta and Sigma (Cont.): If the standard deviation decreases, Beta Risk decreases. Reducing variability has the same effect on Beta Risk as increasing sample size. [Figure labels: 95% Confidence Limit (alpha = .05) for mean, μ1 (critical value); Beta Risk; μ1; μ2; Δ]
  • How Can Power Be Increased? Power is related to risk, variation, sample size, and the size of change that we want to detect. If we want to detect a smaller delta (effect), we typically must increase our sample size.
  • UNCLASSIFIED / FOUOExample:Power  Let’s use Minitab to determine the beta risk of the hypothesis test we performed on the two teams.  First, we’ll have to make some assumptions.  We don’t know the TRUE difference in the means, so we’ll assume that it’s 0.682, the differences in the sample averages.  A variance hypothesis test shows that the variances are not equal.  We will average the variances from Minitab to determine the combined variance using the following formula: 2 2 s1  s 2 s  2 Hypothesis Testing - Basic UNCLASSIFIED / FOUO 91
  • Example: Power (Cont.): Select Stat > Power and Sample Size > 2-Sample t...
  • Example: Power (Cont.): To calculate Power, we need three things: 1. Sample Size; 2. The Difference between the two Means; 3. The Average Standard Deviation of the two samples. We can get all this information from our 2-Sample t-Test conducted earlier: 1. Sample Size = 100; 2. Difference Between Means = 0.682 (600.230 - 599.548 = 0.682); 3. Average Standard Deviation ?? (See Next Slide)
  • Example: Power (Cont.): To Calculate Average Standard Deviation. Remember that Standard Deviations are the Square Roots of the Variances. Since square roots are not additive (we cannot add them and divide by two), we have to convert them back to Variances, which are additive.
    StDev Squared = Variance
    Team 1: 0.619 squared = 0.3832
    Team 2: 1.870 squared = 3.4969
    Sum = 3.8801
    Divide by 2 to get Average = 1.9401
    Square Root of Average = 1.3929
    So the Average Standard Deviation for the two samples is 1.3929.
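The arithmetic on this slide can be checked with a short Python sketch. The two standard deviations are the ones quoted on the slide; the function name is hypothetical. The result agrees with the slide's 1.3929 to about three decimal places (the slide rounds the variances before averaging).

```python
import math


def average_std_dev(std_devs):
    """Average the variances (which are additive), then take the square root."""
    variances = [s ** 2 for s in std_devs]
    return math.sqrt(sum(variances) / len(variances))


team_std_devs = [0.619, 1.870]  # Team 1 and Team 2, as quoted on the slide

combined = average_std_dev(team_std_devs)
naive = sum(team_std_devs) / len(team_std_devs)  # averaging StDevs directly

print(f"Average of the variances, then square root: {combined:.3f}")
print(f"Naive average of the standard deviations:  {naive:.3f}")
```

The naive average comes out noticeably smaller, which is the reason the slide converts to variances first.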
  • Example: Power (Cont.): 1. Type in Sample Size of 100. 2. Type in Difference Between Means of 0.682. 3. Type in Average Standard Deviation of 1.393. 4. Click OK.
  • Example: Power (Cont.): The Power = 0.9312. And since Beta = (1 - Power), Beta = 0.0688. If the TRUE difference between the two support organizations were 0.682, we would have a 6.88% chance of not observing this and therefore concluding that they are the same.
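As a rough cross-check on the Minitab result, the power can be approximated in Python. This sketch is not Minitab's method: it assumes a normal approximation to the two-sample t-test with equal group sizes, so it lands close to, but not exactly on, the 0.9312 reported above. The function name and signature are hypothetical; the inputs are the values typed into the dialog.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample test of means.

    Assumption: the test statistic is treated as normal rather than t,
    which is reasonable for large samples such as n = 100 per group.
    """
    std_normal = NormalDist()
    standard_error = sigma * sqrt(2.0 / n_per_group)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(delta) / standard_error
    # Probability of landing in either rejection region when the true
    # difference is delta.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = two_sample_power(delta=0.682, sigma=1.393, n_per_group=100)
beta = 1.0 - power
print(f"Approximate power: {power:.3f}, approximate beta: {beta:.3f}")
```

The small gap between this figure and Minitab's comes from the t distribution having slightly heavier tails than the normal.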
  • UNCLASSIFIED / FOUOExample:Power (Cont.)  In practice, we evaluate the power of a test to determine its ability to detect a difference of a given magnitude that we deem important, or practically significant.  For example, we could calculate the power of a hypothesis test to see if we could measure a one minute difference in responsiveness between the two teams. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 97
  • UNCLASSIFIED / FOUOExample:Power (Cont.)  Let’s say that if the two support organizations’ cycle times differ by as little as 0.4 seconds, then we need to analyze the reasons for the differences.  What is the power of our test to detect this difference?  What is the probability of making a type II error (concluding that there is no difference when one exists)?  Use Minitab to individually answer these questions. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 98
  • UNCLASSIFIED / FOUOExercise:Sample Size  Now that we understand the relationship between Beta, Power, Delta, and Sample Size, we can use this information to calculate the sample size necessary to give us the information we want.  We simply use the same function in Minitab to solve for sample size rather than power.  This is a very useful and common extension of Hypothesis Testing. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 99
  • UNCLASSIFIED / FOUOExercise:Sample Size (Cont.) Here we enter the Difference (delta) we wish to detect, and the minimum Power value that we are willing to live with. We leave Sample sizes blank. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 100
  • UNCLASSIFIED / FOUOExercise:Sample Size  Let’s extend our response team cycle time example  Determine what sample size we would need to detect a difference of 0.4 seconds at a power of 0.90.  What about at a power of 0.95?  What about at a power of 0.95 and an alpha of 0.025?  Hint: Click the Options button in the Power and Sample Size dialogue box. Hypothesis Testing - Basic UNCLASSIFIED / FOUO 101
  • Other Power and Sample Size Scenarios: We can perform these calculations not only for the difference between two means, but for other tests as well.
  • 1-Sample t-test in Minitab: Now, we will return to Minitab to test the following hypothesis about our two support organizations' cycle times: $H_0: \mu_1 = 600$, $H_a: \mu_1 \neq 600$
  • Back to the Support Organization Example: 1-Sample t-test in Minitab. Choose Stat > Basic Statistics > 1-Sample t to test the mean of each response team against a standard or spec.
  • 1-Sample t-test in Minitab: Double-click on C1 Team 1 and C2 Team 2 to place them in the dialog box. Type in the hypothesized mean, or the standard we are comparing to. Here it is 600. Click the Graphs button to get to the Graphs dialog box.
  • 1-Sample t-test in Minitab (Cont.): Select Histogram of data and Boxplot of data. Click OK here and on the previous screen.
  • 1-Sample t-test in Minitab: [Figure: Histogram of Team 1 (with Ho and 95% t-confidence interval for the mean); x-axis: Team 1, 598.0 to 601.0; y-axis: Frequency] This shows the Target we are testing, along with the Average and the Confidence Interval from the data.
  • 1-Sample t-test in Minitab (Cont.): [Figure: Boxplot of Team 1 (with Ho and 95% t-confidence interval for the mean); x-axis: Team 1, 598.0 to 601.5]
  • 1-Sample t-test in Minitab (Cont.): [Figure: Histogram of Team 2 (with Ho and 95% t-confidence interval for the mean); x-axis: Team 2, 597.0 to 603.0; y-axis: Frequency]
  • 1-Sample t-test in Minitab (Cont.): [Figure: Boxplot of Team 2 (with Ho and 95% t-confidence interval for the mean); x-axis: Team 2, 596 to 605]
  • 1-Sample t-test in Minitab (Cont.): Here is the descriptive output for the 1-Sample t-Test found in the Session Window.
  • 2-Sided and 1-Sided Hypothesis Tests: We have concentrated on 2-sided hypothesis tests. 2-Sided tests determine whether two items are equal or whether a parameter is equal to some value. Whether an item is less than or greater than another item or a value is not sought up front. A 2-sided test is a less specific test. The alternative hypothesis is "Not Equal". Everything we have learned also applies to 1-sided tests. 1-Sided tests determine whether an item is less than (<) or greater than (>) another item or value. The alternative hypothesis is either (<) or (>). This makes for a more powerful test (lower beta at a given alpha and sample size).
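The claim that a 1-sided test is more powerful at the same alpha and sample size can be illustrated with a small Python sketch. It assumes a normal approximation to the two-sample test, and it assumes the true difference lies in the direction named by the 1-sided alternative. The function name is hypothetical; the inputs (delta 0.4, standard deviation 1.393, 100 per group) are borrowed from the power example.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05, two_sided=True):
    """Approximate power for a two-sample comparison of means."""
    std_normal = NormalDist()
    shift = abs(delta) / (sigma * sqrt(2.0 / n_per_group))
    if two_sided:
        # Alpha is split between both tails.
        z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    # All of alpha is placed in the tail named by the alternative hypothesis.
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    return std_normal.cdf(shift - z_crit)


power_two_sided = approx_power(0.4, 1.393, 100, two_sided=True)
power_one_sided = approx_power(0.4, 1.393, 100, two_sided=False)

print(f"2-sided power: {power_two_sided:.3f}")
print(f"1-sided power: {power_one_sided:.3f}")
```

The gain comes entirely from the lower critical value; if the true difference were in the opposite direction, the 1-sided test would have almost no chance of detecting it.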
  • More Detailed Information: Remember to use the Stat Guide button to learn more about the results and to help you interpret them.
  • Hypothesis Test Summary Template (Example, Optional BB Deliverable)
    Hypothesis Test (ANOVA, 1 or 2 sample t-test, Chi Squared, Regression, Test of Equal Variance, etc.) | Factor (x) Tested | p Value | Observations/Conclusion
    Example: ANOVA | Location | 0.030 | Significant factor - 1 hour driving time from DC to Baltimore office causes ticket cycle time to generally be longer for the Baltimore site
    Example: ANOVA | Part vs. No Part | 0.004 | Significant factor - on average, calls requiring parts have double the cycle time (22 vs 43 hours)
    Example: Chi Squared | Department | 0.000 | Significant factor - Department 4 has digitized addition of customer info to ticket and less human intervention, resulting in fewer errors
    Example: Pareto | Region | n/a | South region accounted for 59% of the defects due to their manual process and distance from the parts warehouse
    Describe any other observations about the root cause (x) data.
  • One-Way ANOVA Template (Example, Optional BB Deliverable)
    [Figure: Boxplot: Part/No Part Impact on Ticket Cycle Time. Boxplots of Net Hour by Part/No (means are indicated by solid circles); x-axis: Part/No Part (Part, No Part); y-axis: Net Hours Call Open, 0 to 150]
    After further investigation, possible reasons proposed by the team are OEM backorders, lack of technician certifications and the distance from the OEM to the client site. It is also caused by the need for technicians to make a second visit to the end user to complete the part replacement. Next step will be for the team to confirm these suspected root causes.
    Analysis of Variance for Net Hour
    Source  | DF | SS    | MS   | F    | P
    Part/No | 1  | 7421  | 7421 | 8.65 | 0.004
    Error   | 69 | 59194 | 858  |      |
    Total   | 70 | 66615 |      |      |
    Level   | N  | Mean  | StDev
    No Part | 27 | 21.99 | 19.95
    Part    | 44 | 43.05 | 33.70
    Pooled StDev = 29.29. Individual 95% CIs for the means are plotted on a scale from 12 to 48.
    Because the p-value <= 0.05, we can be confident that calls requiring parts do have an impact on the ticket cycle time.
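The entries in the ANOVA table are linked by simple arithmetic, which the following Python sketch reproduces from the sums of squares and degrees of freedom shown in the template. The function name is hypothetical; the p-value is not recomputed here because the F distribution is not available in the Python standard library.

```python
from math import sqrt


def one_way_anova_summary(ss_factor, df_factor, ss_error, df_error):
    """Return (MS factor, MS error, F) from sums of squares and degrees of freedom."""
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return ms_factor, ms_error, ms_factor / ms_error


# Values copied from the Analysis of Variance table in the template.
ms_factor, ms_error, f_ratio = one_way_anova_summary(
    ss_factor=7421, df_factor=1, ss_error=59194, df_error=69
)
pooled_std_dev = sqrt(ms_error)  # the pooled StDev is the square root of MS error

print(f"MS factor = {ms_factor:.0f}, MS error = {ms_error:.0f}")
print(f"F = {f_ratio:.2f}, pooled StDev = {pooled_std_dev:.2f}")
```

Seeing that the pooled standard deviation is just the square root of the error mean square helps when reading the two halves of the Minitab output together.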
  • Linear Regression Template (Example, Optional BB Deliverable): R-Sq indicates that 94.1% of the variation in "Wait Time" is explained by the "Qty of Deliveries". Fitted Line Plot: Wait Time = 32.05 + 0.5825 Deliveries; S = 1.11885, R-Sq = 94.1%, R-Sq(adj) = 93.9%. [Figure: Fitted Line Plot; x-axis: Deliveries, 10 to 35; y-axis: Wait Time, 35 to 55]
  • UNCLASSIFIED / FOUOTakeaways  Since it is not always practical or possible to measure every item in the population, you take a random sample.  A basic understanding of the terms: Population, Sample, Population Parameter, Sample Statistic, Sample Mean, and Sample Standard Deviation  How to calculate a confidence interval with the population standard deviation known  How to calculate a confidence interval with the population standard deviation unknown Hypothesis Testing - Basic UNCLASSIFIED / FOUO 117
  • UNCLASSIFIED / FOUOTakeaways (Cont.)  How Hypothesis tests help us handle uncertainty  The role of sample size, variation, and confidence level  The null and alternative hypotheses  Type I and Type II errors  Hypothesis tests in Minitab  Stat Guide  p-value Hypothesis Testing - Basic UNCLASSIFIED / FOUO 118
  • UNCLASSIFIED / FOUOTakeaways (Cont.)  How to conduct a 1-way and 2-way t-test  How to conduct a Variance test (see Appendix)  How to conduct a Paired t-test (see Appendix)  Understanding of 1-way and 2-way test of proportions (see Appendix)  Understanding the relationship between Power and sample size and detectable difference (delta) Hypothesis Testing - Basic UNCLASSIFIED / FOUO 119
  • What other comments or questions do you have?
  • References: Hildebrand and Ott, Statistical Thinking for Managers, 4th Edition. Kiemele, Schmidt, and Berdine, Basic Statistics, 4th Edition.
  • National Guard Black Belt Training: APPENDIX
  • Hypothesis Testing - Steps
    Step 1: Define the problem objective
    Step 2: Determine what data to collect (continuous or attribute)
    Step 3: Based on data type, determine the appropriate hypothesis test to use
    Step 4: Specify the null (H0) hypothesis and the alternative (H1) hypothesis
    Step 5: Select a significance level (degree of risk acceptable), usually 0.05
    Step 6: Execute Data Collection plan from step 2
    Step 7: From the sample, conduct the hypothesis test using a statistical tool
    Step 8: Identify the p-value
    Step 9: Compare the p-value to the significance level - if the p-value is less than or equal to your acceptable risk (your alpha), then the null hypothesis is rejected
    Step 10: Translate the decision to the situation
  • Decision Tree Matrix
    Data Type (Step 2) | Hypothesis to be Tested (Step 3) | Tree
    Variable | Testing equality of population MEAN (average) to a specific value | 1
    Variable | Testing equality of population MEANS (averages) from two populations | 2
    Variable | Testing equality of population MEANS (averages) from more than two populations | 3
    Variable | Testing equality of population VARIANCES (standard deviation) from more than two populations | 4
    Attribute - Binomial ("Go/No-Go", "Pass/Fail" or "Defective" data) | Testing equality of population PROPORTIONS (binomial data; e.g., pass/fail, go/no go, is/is not, etc.) from one or more populations | 5
    Attribute - Poisson ("Count" or "Defects" data) | Testing equality of population PROPORTIONS (Poisson data; i.e., frequency of occurrence in time or space) from two or more populations | 6
    Attribute (Contingency Table Data) | Testing for ASSOCIATION (not necessarily causal). Note: For use with attribute data only. For variable data, use correlation or regression. No decision tree required. | 7
  • Decision Tree # 1. Application: Testing Equality of Population Mean to a Specific Value. Type of Data: Variable (Continuous). Example: Has the average button diameter from the welder changed from its historical value?
    Start: Is n > 30?
      Yes: 1-Sample Z-test. Stat > Basic Statistics > 1-Sample Z
      No: Is the population normally distributed? (Anderson-Darling)
        No: 1-Sample Wilcoxon test (random sample from a continuous, symmetric population). Stat > Nonparametrics > 1-Sample Wilcoxon
        Yes: 1-Sample t-test (reasonably robust against normality assumption). Stat > Basic Statistics > 1-Sample t
  • Decision Tree # 2. Application: Testing Equality of Means from Two Populations. Type of Data: Variable (Continuous). Example: Is the average button diameter from Welder A different from that of Welder B?
    Start: Are the two samples dependent?
      Yes: Paired t-test (samples from normal distribution). Stat > Basic Statistics > Paired t
      No: Do n1 and n2 both exceed 30?
        Yes: 2-Sample Z-test. Stat > Basic Statistics > 2-Sample t
        No: Are both populations normally distributed? (Anderson-Darling)
          No: 2-Sample Mann-Whitney (independent, random variables from two populations with same shape, same variance). Stat > Nonparametrics > Mann-Whitney. Note: If the two populations have different shapes or different standard deviations, then use the 2-Sample t-test without pooling variances.
          Yes: Equal variances? (F-test)
            No: 2-Sample t-test without pooling variances. Stat > Basic Statistics > 2-Sample t (do not assume equal variances)
            Yes: 2-Sample t-test with pooled variances (reasonably robust against normality assumption). Stat > Basic Statistics > 2-Sample t, check Assume equal variances
  • Decision Tree # 3. Application: Testing Equality of Means from More than Two Populations. Type of Data: Variable (Continuous). Example: Do the average button diameters from Welders A, B and C differ from one another?
    Start: Are the populations normally distributed? (Anderson-Darling)
      Yes: One-Way Analysis of Variance (ANOVA) (reasonably robust against assumptions of normality and equal variances). Stat > ANOVA > One-way. Did the test show significance?
        No: Stop
        Yes: Tukey's test to conduct pairwise comparisons. Stat > ANOVA > One-way, Comparisons: Tukey's. Note: Use Dunnett's Method if comparing treatments to a control.
      No: Do samples contain outliers? (Box Plot)
        Yes: Mood's Median test (independent, random samples from continuous distributions having same shape). Stat > Nonparametrics > Mood's Median test
        No: Kruskal-Wallis (independent, random samples from continuous distributions having same shape). Stat > Nonparametrics > Kruskal-Wallis
  • Decision Tree # 4. Application: Testing Equality of Variances. Type of Data: Variable (Continuous). Example: Do the variances in button diameter from the three welders differ from one another?
    Start: How many populations are being compared?
      2: Are the populations normally distributed? (Anderson-Darling)
        No: Levene's test. Stat > Basic Statistics > 2 Variances
        Yes: F-test. Stat > Basic Statistics > 2 Variances
      More than 2: Are the populations normally distributed? (Anderson-Darling)
        No: Levene's test. Stat > ANOVA > Test for Equal Variances
        Yes: Bartlett's test. Stat > ANOVA > Test for Equal Variances
    Note: The F-test and Bartlett's test are not robust against the normality assumption.
  • Decision Tree # 5. Application: Testing Equality of Population Proportions. Type of Data: Attribute (Discrete) - Binomial Distribution.
    Case 1: Testing Population Proportion Against a Specific Value. Example: Has the % defective rate on Line 1 changed from its historical value? Stat > Basic Statistics > 1-Proportion
    Case 2: Testing Equality of Proportions from Two Populations. Example: Are Lines 1 and 2 running at the same % defective rate? Stat > Basic Statistics > 2-Proportions. Ho: P1 = P2 (no difference in population proportions). Minitab: in Options, select pooled p.
    Case 3: Testing Equality of Proportions from More than Two Populations. Example: Are Lines 1, 2 and 3 running at the same % defective rate? Use Chi-Square test. Minitab: Stat > Tables > Chi-square test
  • Decision Tree # 6. Application: Testing Equality of Population Defect Rates. Type of Data: Attribute (Discrete) - Poisson Distribution. Examples: 1) Is the number of errors on invoices different between Dept. A and Dept. B? 2) Does the number of seat defects differ among shifts 1, 2 and 3?
    Comparing two Poisson distributions: Use the 2-Sample t-test (Stat > Basic Stat > 2 Sample t) or Stat > Basic Stats > 2-sample Poisson Rate
    Comparing more than two Poisson distributions: Use One-Way Analysis of Variance. Stat > ANOVA > One-way
    Caution: No Extreme Outliers
  • Decision Tree # 7. Application: Testing for Association. Type of Data: Attribute (Contingency Table Data). Example: Does the type of defect that occurs depend on which product is being produced? Chi-square test. Minitab: Stat > Tables > Chi-square test
  • Group 1: Hypothesis Tests for Variation. Use the Minitab electronic docs, Stat Guide, and help to learn about performing hypothesis tests for equality of variance between two populations. You may also use your textbooks if you wish. Prepare a 10-15 minute teachback on hypothesis tests for variation. Be sure to work an example in your teachback. Hint: Using Minitab to conduct a hypothesis test to determine if there is a difference in the amount of variation exhibited by each support organization would be a good example to use.
  • Group 2: Paired t-Test. Use the Minitab electronic docs, Stat Guide, and help to learn about performing paired t-tests. You may also use your textbooks if you wish. Prepare a 10-15 minute teachback on paired t-tests. Be sure to illustrate the difference between a standard 2-Sample t-test and a paired t-test. Be sure to work an example in your teachback. Go through a sample size calculation in your example.
  • Group 3: Hypothesis Tests with Proportions. Use the Minitab electronic docs, Stat Guide, and help to learn about performing hypothesis tests with proportions. You may also use your textbooks if you wish. Prepare a 10-15 minute teachback on hypothesis tests with proportions. Be sure to illustrate the main difference between hypothesis tests of proportions and the other hypothesis tests we have talked about. Include both 1-sample and 2-sample proportion hypothesis tests. Be sure to work an example in your teachback. Go through a sample size calculation in your example.
  • Confidence Interval Formulas. Confidence Intervals for:
    Mean ($\sigma$ known): $\bar{x} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
    Mean ($\sigma$ unknown): $\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$
    Standard Deviation: $s\sqrt{\frac{n-1}{\chi^2_{\alpha/2,\,n-1}}} \leq \sigma \leq s\sqrt{\frac{n-1}{\chi^2_{1-\alpha/2,\,n-1}}}$
    Proportions (Approximate): $\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \leq p \leq \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
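Two of these formulas, the mean with known sigma and the approximate proportion interval, need only normal critical values and can be sketched with the Python standard library. The function names and the input numbers below are hypothetical illustrations, not data from the course. The t-based and chi-square-based intervals are omitted because those distributions are not in the standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Z value that leaves alpha/2 in each tail for the given confidence level."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def mean_ci_sigma_known(x_bar, sigma, n, confidence=0.95):
    """Interval x_bar +/- Z * sigma / sqrt(n)."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


def proportion_ci(successes, n, confidence=0.95):
    """Approximate interval p_hat +/- Z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    half_width = z_critical(confidence) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical inputs chosen only to exercise the formulas.
mean_low, mean_high = mean_ci_sigma_known(x_bar=600.0, sigma=2.0, n=100)
prop_low, prop_high = proportion_ci(successes=20, n=100)

print(f"95% CI for the mean:       ({mean_low:.3f}, {mean_high:.3f})")
print(f"95% CI for the proportion: ({prop_low:.3f}, {prop_high:.3f})")
```

Raising the confidence level or shrinking the sample widens both intervals, which is the trade-off the formulas encode.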
  • Table of Normal Curve Areas (area under the standard normal curve between 0 and z)
    z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
    0.00  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
    0.10  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
    0.20  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
    0.30  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
    0.40  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
    0.50  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
    0.60  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
    0.70  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
    0.80  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
    0.90  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
    1.00  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
    1.10  0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
    1.20  0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
    1.30  0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
    1.40  0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
    1.50  0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
    1.60  0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
    1.70  0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
    1.80  0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
    1.90  0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
    2.00  0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
    2.10  0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
    2.20  0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
    2.30  0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
    2.40  0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
    2.50  0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
    2.60  0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
    2.70  0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
    2.80  0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
    2.90  0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
    3.00  0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990
    z     area
    3.5   0.49976737
    4.0   0.49996833
    4.5   0.49999660
    5.0   0.49999971
    Source: Computed by P.J. Hildebrand. Source: Statistical Thinking for Managers, Hildebrand and Ott, 4th Edition, page 800.
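Entries in a table like this one can be spot-checked with the Python standard library, since each entry is the standard normal cumulative probability at z minus 0.5. The helper name below is hypothetical, and the sketch only regenerates a single row for comparison.

```python
from statistics import NormalDist


def area_zero_to_z(z: float) -> float:
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return NormalDist().cdf(z) - 0.5


# Regenerate the row for z = 1.90 through 1.99 in steps of 0.01.
row = [area_zero_to_z(round(1.9 + hundredths / 100, 2)) for hundredths in range(10)]
print("1.90  " + "  ".join(f"{area:.4f}" for area in row))

# A few individual lookups.
for z in (0.90, 1.00, 1.96, 2.94, 3.5):
    print(f"z = {z:4.2f}  area = {area_zero_to_z(z):.8f}")
```

Doubling the area at z = 1.96 gives roughly 0.95, which is where the familiar 95% confidence multiplier comes from.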
  • UNCLASSIFIED / FOUOCalculation of t Test Statistic  The t test statistic is calculated as follows: (X 1  X 2 )  D0 t sX 1  X 2 where D0 is the hypothesized difference between the two population means.  For an assumption of unequal variances: 2 2 s1 s2 sX   1 X 2 n1 n 2 Hypothesis Testing - Basic UNCLASSIFIED / FOUO 137
  • UNCLASSIFIED / FOUOCalculation of t Test Statistic  For an assumption of equal variances: 1 1 sX  sp  1 X 2 n1 n2 where 2 2 (n1  1)s 1  (n 2  1)s 2 sp  n1  n 2  2 Hypothesis Testing - Basic UNCLASSIFIED / FOUO 138