EN505 Engineering Statistics, Fernando Tovia, Ph.D.

1 RANDOM VARIABLES AND PROBABILITY

1.1 Random Variables

Definition 1.1 A random experiment is an experiment such that the outcome cannot be predicted in advance with absolute precision.

Definition 1.2 The set of all possible outcomes of a random experiment is called the sample space. The sample space is denoted by Ω. An element of the sample space is denoted by ω.

Example 1.1 Construct the sample space for each of the following random experiments:
1. flip a coin
2. toss a die
3. flip a coin twice

Definition 1.3 A subset of Ω is called an event. Events are denoted by italicized capital letters.

Example 1.2 Consider the random experiment consisting of tossing a die. Describe the following events.
1. A = the event that 2 appears
2. B = the event that an even number appears
3. C = the event that an odd number appears
4. D = the event that a number appears
5. E = the event that no number appears

The particular set that we are interested in depends on the problem being considered. However, a good first step in any probability modeling problem is to clearly define all the events of interest.

One graphical method of describing events defined on a sample space is the Venn diagram. The representation of an event using a Venn diagram is given in Figure 1.1. Note that the rectangle corresponds to the sample space, and the shaded region corresponds to the event of interest.

Figure 1.1 Venn Diagram for Event A
Definition 1.4 Let A and B be two events defined on a sample space Ω. A is a subset of B, denoted by A ⊂ B, if and only if (iff) ∀ ω ∈ A, ω ∈ B. (Figure 1.2)

Figure 1.2 Venn Diagram for A ⊂ B

Definition 1.5 Let A be an event defined on a sample space Ω. ω ∈ Ac iff ω ∉ A. Ac is called the complement of A. (Figure 1.3)

Figure 1.3 Venn Diagram for Ac

Definition 1.6 Let A and B be two events defined on the sample space Ω. ω ∈ A ∪ B iff ω ∈ A or ω ∈ B (or both). A ∪ B is called the union of A and B (see Figure 1.4).

Figure 1.4 Venn Diagram for A ∪ B

Let {A1, A2, …} be a collection of events defined on a sample space. ω ∈ ∪_{j=1}^∞ Aj iff ∃ some j = 1, 2, … ∋ ω ∈ Aj. ∪_{j=1}^∞ Aj is called the union of {A1, A2, …}.

Definition 1.7 Let A and B be two events defined on the sample space Ω. ω ∈ A ∩ B iff ω ∈ A and ω ∈ B. A ∩ B is called the intersection of A and B (see Figure 1.5).

Figure 1.5 Venn Diagram for A ∩ B

Let {A1, A2, …} be a collection of events defined on a sample space. ω ∈ ∩_{j=1}^∞ Aj iff ω ∈ Aj ∀ j = 1, 2, …. ∩_{j=1}^∞ Aj is called the intersection of {A1, A2, …}.

Example 1.3 (Example 1.2 continued)
1. Bc = C
2. B ∪ C = D
3. A ∩ B = A

Theorem 1.1 Properties of Complements
Let A be an event defined on a sample space Ω. Then
(a)
(b)

Theorem 1.2 Properties of Unions
Let A, B, and C be events defined on a sample space Ω. Then
(a)
(b)
(c)
(d)
(e)

Example: Prove Theorem 1.2(c).

Theorem 1.3 Properties of Intersections
Let A, B, and C be events defined on the sample space Ω. Then
(a)
(b)
(c)
(d)
(e)

Example 1.6 Prove Theorem 1.3(b).
Theorem 1.4 Distribution of Union and Intersection
Let A, B, and C be events defined on the sample space Ω. Then
(a) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)

Theorem 1.5 DeMorgan's Law
Let A and B be events defined on the sample space Ω. Then
(a) (A ∪ B)c = Ac ∩ Bc
(b) (A ∩ B)c = Ac ∪ Bc

Definition 1.8 Let A and B be two events defined on the sample space Ω. A and B are said to be mutually exclusive or disjoint iff A ∩ B = Ø (Figure 1.6). A collection of events {A1, A2, …}, defined on a sample space Ω, is said to be disjoint iff every pair of events in the collection is mutually exclusive.

Figure 1.6 Venn Diagram for Mutually Exclusive Events

Definition 1.9 A collection of events {A1, A2, …, An} defined on a sample space Ω is said to be a partition (Figure 1.7) of Ω iff
(a) the collection is disjoint
(b) ∪_{j=1}^n Aj = Ω

Figure 1.7 Venn Diagram for a Partition

Example 1.7 (Example 1.2 continued) Using the defined events, identify:
(a) a set of mutually exclusive events
(b) a partition of the sample space
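These set identities can be checked mechanically with Python's built-in set type; a minimal sketch using the die-toss events of Example 1.2:

```python
# Die-toss sample space and events from Example 1.2
omega = {1, 2, 3, 4, 5, 6}
A = {2}          # "2 appears"
B = {2, 4, 6}    # "an even number appears"
C = {1, 3, 5}    # "an odd number appears"

def complement(event):
    """Complement relative to the sample space omega."""
    return omega - event

# Theorem 1.4: distributive laws
assert A & (B | C) == (A & B) | (A & C)
assert A | (B & C) == (A | B) & (A | C)

# Theorem 1.5: DeMorgan's laws
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# Definition 1.8 / 1.9: B and C are mutually exclusive and {B, C} partitions omega
assert B & C == set() and B | C == omega
print("all identities hold")
```

The same checks extend to any finite sample space, since `&`, `|`, and set difference correspond directly to ∩, ∪, and complementation.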
Definition 1.10 A collection of events, F, defined on a sample space Ω, is said to be a field iff
(a) Ω ∈ F,
(b) if A ∈ F, then Ac ∈ F,
(c) if A1, A2, …, An ∈ F, then ∪_{j=1}^n Aj ∈ F.

We use fields to represent all the events that we are interested in studying. To construct a field:
1. start with Ω
2. Ø is included by implication (Definition 1.10(b))
3. add the events of interest
4. add complements and unions

Example 1.8 Suppose we perform a random experiment which consists of observing the type of shirt worn by the next person entering a room. Suppose we are interested in the following events.
L = the shirt has long sleeves
S = the shirt has short sleeves
N = the shirt has no sleeves
Assuming that {L, S, N} is a partition of Ω, construct an appropriate field.

Theorem 1.6 Intersections are in Fields
Let F be a field of events defined on the sample space Ω. If A1, A2, …, An ∈ F, then ∩_{j=1}^n Aj ∈ F.

Example 1.9 Prove that if A, B ∈ F, then A ∩ B ∈ F.
Any meaningful expression containing events of interest, ∪, ∩, and c can be shown to be in the field.

Definition 1.11 Consider a set of elements, such as S = {a, b, c}. A permutation of the elements is an ordered sequence of the elements. The number of permutations of n different elements is n!, where
n! = n × (n − 1) × (n − 2) × … × 2 × 1

Example 1.10 List all the permutations of the elements of S.

Definition 1.12 The number of permutations of subsets of r elements selected from a set of n different elements is
P(n, r) = n!/(n − r)!

Another counting problem of interest is the number of subsets of r elements that can be selected from a set of n elements. Here the order is not important; such subsets are called combinations.

Definition 1.13 The number of combinations, i.e., subsets of size r that can be selected from a set of n elements, is denoted C(n, r), where
C(n, r) = n!/(r!(n − r)!)

Example 1.11 The EN505 class has 13 students. If teams of 2 students are selected, how many different teams are possible?

1.2 Probability

Probability is used to quantify the likelihood, or chance, that an outcome of a random experiment will occur.

Definition 1.14 A random variable is a real-valued function defined on a sample space. Random variables are typically denoted by italicized capital letters. Specific values taken on by a random variable are typically denoted by italicized lower-case letters.

Definition 1.15 A random variable that can take on a countable number of values is said to be a discrete random variable.

Definition 1.16 A random variable that can take on an uncountable number of values is said to be a continuous random variable.
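The counting rules above (Definitions 1.11 to 1.13) are available directly in Python's standard library; a small sketch reproducing Examples 1.10 and 1.11:

```python
import math
from itertools import permutations

# Example 1.10: permutations of S = {a, b, c}
S = ['a', 'b', 'c']
perms = list(permutations(S))
assert len(perms) == math.factorial(3)   # n! = 6 orderings

# Definition 1.12: ordered selections of r from n, P(n, r) = n!/(n - r)!
assert math.perm(13, 2) == 13 * 12       # 156 ordered pairs of students

# Example 1.11: 2-student teams from a class of 13 (order irrelevant)
teams = math.comb(13, 2)                 # C(n, r) = n!/(r!(n - r)!)
print(len(perms), teams)  # 6 78
```

`math.perm` and `math.comb` require Python 3.8 or later; for earlier versions the factorial formulas can be used directly.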
Definition 1.17 The set of possible values of a random variable is referred to as the range of the random variable.

Example 1.12 For each of the following random experiments, define a random variable, identify its range, and classify it as discrete or continuous.
1. flip a coin
2. toss a die until a 6 appears
3. quality inspection of a shipment of manufactured items
4. arrival of customers to a bank

Definition 1.18 Let Ω be the sample space for some random experiment. For any event defined on Ω, Pr(·) is a function which assigns a number to the event. Pr(A) is called the probability of event A provided the following conditions hold:
(a)
(b)
(c)

Probability is used to quantify the likelihood, or chance, that an event will occur within the sample space.
Whenever a sample space consists of N equally likely outcomes, the probability of each outcome is 1/N.

Theorem 1.7 Probability Computational Rules
Let A and B be events defined on a sample space Ω, and let {A1, A2, …, An} be a collection of events defined on Ω. Then
(a)
(b)
(c)
(d)
(e)
(f)

Corollary 1.1 Union of Three or More Events
Let A, B, C, and D be events defined on a sample space Ω. Then
Pr(A ∪ B ∪ C) = Pr(A) + Pr(B) + Pr(C) − Pr(A ∩ B) − Pr(A ∩ C) − Pr(B ∩ C) + Pr(A ∩ B ∩ C)
and
Pr(A ∪ B ∪ C ∪ D) = Pr(A) + Pr(B) + Pr(C) + Pr(D) − Pr(A ∩ B) − Pr(A ∩ C) − Pr(A ∩ D) − Pr(B ∩ C) − Pr(B ∩ D) − Pr(C ∩ D) + Pr(A ∩ B ∩ C) + Pr(A ∩ B ∩ D) + Pr(A ∩ C ∩ D) + Pr(B ∩ C ∩ D) − Pr(A ∩ B ∩ C ∩ D)

Example 1.11 Let A, B, and C be events defined on a sample space Ω ∋
Pr(A) = 0.30, Pr(Bc) = 0.60, Pr(C) = 0.20, Pr(A ∪ B) = 0.50, Pr(B ∩ C) = 0.05, and A and C are mutually exclusive.
Compute the following probabilities:
(a) Pr(B)
(b) Pr(B ∪ C)
(c) Pr(A ∩ B)
(d) Pr(A ∪ C)
(e) Pr(A ∩ C)
(f) Pr(B ∩ Cc)
(g) Pr(A ∪ B ∪ C)

1.3 Independence

Two events are independent if any one of the following equivalent statements is true:
(1) Pr(A|B) = Pr(A)
(2) Pr(B|A) = Pr(B)
(3) Pr(A ∩ B) = Pr(A) Pr(B)

Example 2.29 (book work in class)

1.4 Conditional Probability

Definition 1.19 Let A and B be events defined on a sample space Ω ∋ B ≠ Ø. We refer to Pr(A|B) as the conditional probability of event A given the occurrence of event B. Pr(Ac|B) is the probability of not A given B.
Note that Pr(A|Bc) ≠ 1 − Pr(A|B).

Example 1.12 A semiconductor manufacturing facility is controlled in a manner such that 2% of manufactured chips are subjected to high levels of contamination. If a chip is subjected to high levels of contamination, there is a 12% chance that it will fail testing. What is the probability that a chip is subjected to high levels of contamination and fails upon testing?
C = the chip is subjected to high levels of contamination
F = the chip fails testing
Pr(C) =
Pr(F | C) =
Pr(F ∩ C) =

Example 1.13 An air quality test is designed to detect the presence of two molecules (molecule 1 and molecule 2). 17% of all samples contain both molecules, and 48% of all samples contain molecule 1. If a sample contains molecule 1, what is the probability that it also contains molecule 2?
M1 = the sample contains molecule 1
M2 = the sample contains molecule 2
Pr(M1 ∩ M2) =
Pr(M1) =
Pr(M2 | M1) =

Theorem 1.8 Properties of Conditional Probability
Let A and B be non-empty events defined on a sample space Ω. Then
(a) If A and B are mutually exclusive, then Pr(A|B) = 0.
(b) If A ⊂ B, then Pr(A|B) ≥ Pr(A).
(c) If B ⊂ A, then Pr(A|B) = 1.

Theorem 1.9 Law of Total Probability – Part 1
Let A and B be events defined on a sample space Ω ∋ A ≠ Ø, B ≠ Ø, Bc ≠ Ø. Then

Example 1.14 A certain machine's performance can be characterized by the quality of a key component. 94% of machines with a defective key component will fail, whereas only 1% of machines with a non-defective key component will fail. 4% of machines have a defective key component. What is the probability that the machine will fail?
F = the machine fails
D = the key component is defective
Pr(D) =
Pr(F|D) =
Pr(F|Dc) =
Pr(F) =

Theorem 1.11 Bayes' Theorem – Part 1
Let A and B be events defined on a sample space Ω ∋ A ≠ Ø, B ≠ Ø, Bc ≠ Ø. Then

Example 1.15 (Example 1.14 continued) Suppose the machine fails. What is the probability that the key component was defective?
Pr(D|F) =
Theorem 1.12 Law of Total Probability – Part 2
Let A be a non-empty event defined on a sample space Ω, and let {B1, B2, …, Bn} be a partition of Ω ∋ Bj ≠ Ø ∀ j = 1, 2, …, n. Then
Pr(A) =

Theorem 1.13 Bayes' Theorem – Part 2
Let A be a non-empty event defined on a sample space Ω, and let {B1, B2, …, Bn} be a partition of Ω ∋ Bj ≠ Ø ∀ j = 1, 2, …, n. Then
Pr(Bj | A) = Pr(A | Bj) Pr(Bj) / Pr(A) = Pr(A | Bj) Pr(Bj) / Σ_{i=1}^n Pr(A | Bi) Pr(Bi)
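The law of total probability and Bayes' theorem can be checked numerically; a sketch using the machine-failure figures from the example above (4% defective key components, 94% and 1% conditional failure rates):

```python
# D = defective key component, F = machine fails
p_D = 0.04          # Pr(D)
p_F_given_D = 0.94  # Pr(F | D)
p_F_given_Dc = 0.01 # Pr(F | Dc)

# Law of total probability: Pr(F) = Pr(F|D)Pr(D) + Pr(F|Dc)Pr(Dc)
p_F = p_F_given_D * p_D + p_F_given_Dc * (1 - p_D)

# Bayes' theorem: Pr(D | F) = Pr(F|D)Pr(D) / Pr(F)
p_D_given_F = p_F_given_D * p_D / p_F

print(round(p_F, 4), round(p_D_given_F, 4))  # 0.0472 0.7966
```

So although only 4% of machines have a defective component, a defective component is the explanation for roughly 80% of failures.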
2 DISCRETE RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

A discrete random variable is a random variable that can take on at most a countable number of values.

Definition 2.1 Let X be a discrete random variable having cumulative distribution function F. Let x1, x2, … denote the possible values of X. Then f(x) is the probability mass function (pmf) of X if
a) f(x) = P(X = x)
b) f(xj) > 0, j = 1, 2, …
c) f(x) = 0 if x ≠ xj, j = 1, 2, …
d) Σ_{j=1}^∞ f(xj) = 1

Definition 2.2 The cumulative distribution function of a discrete random variable X is denoted F(x) and given by
F(x) = Σ_{xj ≤ x} f(xj)
and satisfies the following properties:
a) F(x) = P(X ≤ x) = Σ_{xj ≤ x} f(xj)
b) 0 ≤ F(x) ≤ 1
c) if x ≤ y, then F(x) ≤ F(y)

Example 2.1 Suppose X is a discrete random variable having pmf f and cdf F, where f(1) = 0.1, f(2) = 0.4, f(3) = 0.2, f(4) = 0.3.
1. Construct the cumulative distribution function of X.
2. Compute Pr(X ≤ 2).
3. Compute Pr(X < 4).
4. Compute Pr(X ≥ 2).

Definition 2.3 The mean or expected value of X, denoted µ or E(X), is
µ = E(X) = Σ_x x f(x)
The variance of X, denoted Var(X), is given by
σ² = V(X) = E(X − µ)² = Σ_x (x − µ)² f(x) = Σ_x x² f(x) − µ²
The standard deviation of X is σ = √σ².

Definition 2.4 Let X be a discrete random variable with probability mass function f(x). The expected value of X is denoted E(X) and given by
E(X) = Σ_{j=1}^∞ xj f(xj)

2.1 Discrete Distributions

2.1.1 Discrete Uniform Distribution

Suppose a random experiment has a finite set of equally likely outcomes. If X is a random variable such that there is a one-to-one correspondence between the outcomes and the set of integers {a, a + 1, …, b}, then X is a discrete uniform random variable having parameters a and b.

Notation
Range
Probability Mass Function
Parameters
Mean
Variance

Example 2.2 Let X ~ DU(1, 6).
1. Compute Pr(X = 2).
2. Compute Pr(X > 4).

2.1.2 The Bernoulli Random Variable

Consider a random experiment that either "succeeds" or "fails". If the probability of success is p, and we let X = 0 if the experiment fails and X = 1 if it succeeds, then X is a Bernoulli random variable with probability p. Such a random experiment is referred to as a Bernoulli trial.

Notation
Range
Probability Mass Function
Parameter
Mean
Variance

2.1.3 The Binomial Distribution

The binomial random variable denotes the number of successes in n independent Bernoulli trials with probability p of success on each trial.

Notation
Range
Probability Mass Function
Cumulative Distribution Function
Parameters
Mean
Variance

Comments: If n = 1, then X ~ Bernoulli(p).
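A minimal sketch of the binomial pmf using only the standard library; the parameters n = 18, p = 0.10 are chosen to match the air-sample setting used in this section:

```python
import math

def binom_pmf(x, n, p):
    """Binomial pmf: C(n, x) * p**x * (1 - p)**(n - x)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 18, 0.10
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

assert abs(sum(probs) - 1.0) < 1e-12              # pmf sums to 1
mean = sum(x * f for x, f in zip(range(n + 1), probs))
assert abs(mean - n * p) < 1e-12                  # E[X] = np

print(round(binom_pmf(2, n, p), 4))   # Pr(X = 2)  = 0.2835
print(round(1 - sum(probs[:4]), 4))   # Pr(X >= 4) = 0.0982
```

Complement sums of this form (1 minus a short partial sum) are usually preferable to summing a long upper tail directly.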
Example 2.3 Each sample of air has a 10% chance of containing a particular rare molecule.
1. Find the probability that, in the next 18 samples, exactly 2 contain the rare molecule.
2. Determine the probability that at least four samples contain the rare molecule.
3. Determine the probability that there is at least one sample with the rare molecule but fewer than four.

2.1.4 The Negative Binomial Random Variable

The negative binomial random variable denotes the number of trials until the kth success in a sequence of independent Bernoulli trials with probability p of success on each trial.

Notation
Range
Probability Mass Function
Cumulative Distribution Function
Parameters
Mean
Variance

Example 2.4 A high-performance aircraft contains three identical computers. Only one is used to operate the aircraft; the other two are spares that can be activated in case the primary system fails. During one hour of operation, the probability of a failure in the primary computer is 0.0005.
1. Assuming that each hour represents an independent trial, what is the mean time to failure of all three computers?
2. What is the probability that all three computers fail within a 5-hour flight?
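A sketch of the aircraft-computer example above as a negative binomial computation: X is the number of hours (trials) until the k = 3rd computer failure, with the standard pmf f(x) = C(x−1, k−1) p^k (1−p)^(x−k) for x = k, k+1, …:

```python
import math

k, p = 3, 0.0005
mean_hours = k / p  # mean number of trials to the kth success: k/p

# Pr(all three computers fail within 5 hours) = Pr(X <= 5)
p_within_5 = sum(math.comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)
                 for x in range(k, 6))

print(mean_hours)       # 6000.0 hours
print(p_within_5)       # on the order of 1e-9
```

The mean time to failure of all three computers is 6000 hours, and a total failure within a 5-hour flight is vanishingly unlikely.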
Comments: If k = 1, then X ~ geom(p), i.e., X is a geometric random variable having probability of success p.

2.1.5 The Geometric Distribution

In a series of independent Bernoulli trials, with constant probability p of a success, let the random variable X denote the number of trials until the first success. Then X has a geometric distribution.

Notation
Range
Probability Mass Function
Cumulative Distribution Function
Parameters
Mean
Variance

Example 2.3 Consider a sequence of independent Bernoulli trials with a probability of success p = 0.2 for each trial.
(a) What is the expected number of trials to obtain the first success?
(b) After the eighth success occurs, what is the expected number of trials to obtain the ninth success?

2.1.6 The Hypergeometric Random Variable

Consider a population consisting of N members, K of which are denoted successes. Consider a random experiment during which n members are selected at random from the population, and let X denote the number of successes in the random sample. If the members in the sample are selected from the population without replacement, then X is a hypergeometric random variable having parameters N, K, and n.

Notation
Range
Probability Mass Function
Parameters
Comments: If the sample is taken from the population with replacement, then X ~ bin(n, K/N). Therefore, if n << N, we can use the approximation HG(N, K, n) ≈ bin(n, K/N).

Example 2.4 Suppose a shipment of 5000 batteries is received, 150 of them being defective. A sample of 100 is taken from the shipment without replacement. Let X denote the number of defective batteries in the sample.
1. What kind of random variable is X, and what is the range of X?
2. Compute Pr(X = 5).
3. Approximate Pr(X = 5) using the binomial approximation to the hypergeometric.

2.1.7 The Poisson Random Variable

The Poisson random variable denotes the number of events that occur in an interval of length t when events occur at a constant average rate λ.

Notation
Probability Mass Function
Cumulative Distribution Function
Parameters

Comments: The Poisson random variable X equals the number of counts in the time interval t. The counts in disjoint subintervals are independent of one another.

If n is large and p is small (so that np is moderate), we can use the approximation bin(n, p) ≈ Poisson(np).

Mean
Variance

It is important to use consistent units in the calculations of probabilities, means, and variances involving Poisson random variables.

Example 2.5 Contamination is a problem in the manufacture of optical storage disks. The number of particles of contamination that occur on an optical disk has a Poisson
distribution, and the average number of particles per squared centimeter of media surface is 0.1. The area of a disk under study is 100 squared centimeters.
a) Find the probability that 12 particles occur in the area of a disk under study.
b) Find the probability that zero particles occur in the area of the disk under study.
c) Find the probability that 12 or fewer particles occur in the area of a disk under study.

2.1.8 Poisson Process

Up to this time in the course, we have discussed the assignment of probabilities to events and random variables; by manipulating these probabilities we can analyze "snapshots" of system behavior at certain points in time, or under certain conditions. Now we are going to study one of the most commonly used continuous-time stochastic processes, which allows us to study important aspects of system behavior over a time interval t.

Definition 2.5 Let {N(t), t ≥ 0} be a counting process. Then {N(t), t ≥ 0} is said to be a Poisson process having rate λ, λ > 0, iff
a. counting starts from zero, i.e., N(0) = 0
b. The number of outcomes occurring in one time interval (or specified region) is independent of the number that occurs in any other disjoint time interval or region of space; this can be interpreted as the Poisson process having no memory.
c. The number of events in any interval (s, s + t) is a Poisson random variable with mean λt.
d. The probability that more than one outcome occurs at the same instant is negligible.

This is denoted by N(t) ~ PP(λ), where λ refers to the average rate at which events occur. Part (c) of the definition implies that
1)
2)
3)

Note that in order to be a Poisson process the average event-occurrence rate MUST BE CONSTANT over time; otherwise the Poisson process would be an inappropriate model. Also note that t can be interpreted as the specific "time", "distance", "area", or "volume" of interest.

Example 2.6 Customers arrive to a facility according to a Poisson process with rate λ = 120 customers per hour. Suppose we begin observing the facility at some point in time.
a) What is the probability that 8 customers arrive during a 5-minute interval?
b) On average, how many customers will arrive during a 3.2-minute interval?
c) What is the probability that more than 2 customers arrive during a 1-minute interval?
d) What is the probability that 4 customers arrive during the interval that begins 3.3 minutes after we start observing and ends 6.7 minutes after we start observing?
e) On average, how many customers will arrive during the interval that begins 16 minutes after we start observing and ends 17.8 minutes after we start observing?
f) What is the probability that 7 customers arrive during the first 12.2 minutes we observe, given that 5 customers arrive during the first 8 minutes?
g) If 3 customers arrive during the first 1.2 minutes of our observation period, on average, how many customers will arrive during the first 3.7 minutes?
h) If 1 customer arrives during the first 6 seconds of our observations, what is the probability that 2 customers arrive during the interval that begins 12 seconds after we start observing and ends 30 seconds after we start observing?
i) If 5 customers arrive during the first 30 seconds of our observations, on average, how many customers will arrive during the interval that begins 1 minute after we start observing and ends 3 minutes after we start observing?
j) If 3 customers arrive during the interval that starts 1 minute after we start observing and ends 2.2 minutes after we start observing, on average, how many customers will arrive during the first 3.7 minutes?
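The first few parts of Example 2.6 can be sketched directly from the Poisson pmf, using the fact that N(t) ~ Poisson(λt) and that increments over disjoint intervals are independent and depend only on the interval length:

```python
import math

def poisson_pmf(k, mu):
    """Poisson pmf with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

rate = 2.0  # 120 customers/hour = 2 customers/minute

# a) 8 arrivals in a 5-minute interval (mean 10)
p_a = poisson_pmf(8, rate * 5)
# b) expected arrivals in a 3.2-minute interval
mean_b = rate * 3.2
# c) more than 2 arrivals in a 1-minute interval (mean 2)
p_c = 1 - sum(poisson_pmf(k, rate * 1) for k in range(3))
# d) 4 arrivals in (3.3, 6.7): only the length 3.4 matters
p_d = poisson_pmf(4, rate * 3.4)

print(round(p_a, 4), mean_b, round(p_c, 4))  # 0.1126 6.4 0.3233
print(round(p_d, 4))
```

The conditional parts (f) through (j) follow the same pattern: independent increments mean that arrivals already observed in an earlier interval simply add to the Poisson count expected over the remaining, non-overlapping interval.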
Example 2.7 (Binomial approximation) In a manufacturing process where glass products are produced, defects or bubbles occasionally occur, rendering the piece undesirable for marketing. It is known that, on average, 1 in every 1000 of these items produced has one or more bubbles. What is the probability that a random sample of 8000 will yield fewer than 7 items possessing bubbles?
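A sketch of Example 2.7 using the Poisson approximation to the binomial: here n = 8000 and p = 1/1000, so np = 8 and bin(n, p) ≈ Poisson(8):

```python
import math

mu = 8000 * (1 / 1000)   # np = 8
# Pr(X < 7) = Pr(X <= 6) under Poisson(8)
p_fewer_than_7 = sum(math.exp(-mu) * mu**k / math.factorial(k)
                     for k in range(7))
print(round(p_fewer_than_7, 4))  # 0.3134
```

Evaluating the exact binomial sum gives nearly the same answer, which is the point of the approximation: the Poisson sum avoids the very large binomial coefficients.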
3 CONTINUOUS RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

As stated earlier, a continuous random variable is a random variable that can take on an uncountable number of values.

Definition 3.1 The probability density function of a continuous random variable X is a nonnegative function f(x) defined ∀ real x ∋ for any set A of real numbers

Theorem 3.1 Integral of a Density Function
The function f is a density function iff

All probability computations for a continuous random variable can be answered using the density function.

Theorem 3.2 Probability Computational Rules for Continuous Random Variables
Let X be a continuous random variable having cumulative distribution function F and probability density function f. Then
(a)
(b)
(c)
(d)
(e)

The mean or expected value of X, denoted µ or E(X), is

The variance of X, denoted Var(X), is given by

Example 3.1 Consider a continuous random variable X having the following density function, where c is a constant:
f(x) = c(1 − x²), 0 ≤ x ≤ 1; f(x) = 0 otherwise
1. What is the value of c?
2. Construct the cumulative distribution function of X.
3. Compute Pr(0.2 < X ≤ 0.8).
4. Compute Pr(X > 0.5).

Part (d) of Theorem 3.2 states that the probability density function is the derivative of the cumulative distribution function. Although this is true, it does not provide adequate intuition as to the interpretation of the density function. For a discrete random variable, the probability mass function actually assigns probabilities to the possible values of the random variable. Theorem 3.2(b) states that the probability of any specific value for a continuous random variable is 0. The probability density function is not the probability of a specific value. It is, however, the relative likelihood (as compared to other possible values) that the random variable will be near a certain value.

Continuous random variables are typically specified in terms of the form of their probability density functions. In addition, some continuous random variables have been widely used in probability modeling. We will consider some of these more commonly used random variables, including:
1. the uniform random variable,
2. the exponential random variable,
3. the gamma random variable,
4. the Weibull random variable,
5. the normal random variable,
6. the lognormal random variable,
7. the beta random variable.

3.1 The Uniform Continuous Random Variable

Notation
Range
Probability Density Function
Cumulative Distribution Function
Parameters
Mean
Variance

Comments: As its name implies, the uniform random variable is used to represent quantities that occur randomly over some interval of the real line. An observation of a U(0, 1) random variable is referred to as a random number.

Example 3.2 Verify that the equation for the cumulative distribution function of the uniform random variable is correct.

Example 3.3 The magnitude (measured in N) of a load applied to a steel beam is believed to be a U(2000, 5000) random variable. What is the probability that the load exceeds 4200 N?
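A one-line sketch of Example 3.3: for X ~ U(a, b), Pr(X > x) = (b − x)/(b − a) for a ≤ x ≤ b:

```python
# Load ~ U(2000, 5000) N; probability the load exceeds 4200 N
a, b = 2000.0, 5000.0
p_exceeds = (b - 4200.0) / (b - a)
print(round(p_exceeds, 4))  # 0.2667
```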
3.2 The Exponential Random Variable

The random variable X that equals the distance (time) between successive counts of a Poisson process with rate λ (events per time unit, e.g., arrivals per hour, failures per day) has an exponential distribution with parameter λ.

Notation
Range
Probability Density Function
Cumulative Distribution Function
Parameters
Mean
Variance

Comments: λ is called the rate of the exponential distribution.
Example 3.4 In a large computer network, user log-ons to the system can be modeled as a Poisson process with a mean of 25 log-ons per hour.
What is the probability that there are no log-ons in an interval of 6 minutes?
What is the probability that the time until the next log-on is between 2 and 3 minutes? (Convert all units to hours.)
Determine the interval of time such that the probability that no log-on occurs in the interval is 0.90. The question asks for the length of time x such that Pr(X > x) = 0.90.
What is the mean time until the next log-on?
What is the standard deviation of the time until the next log-on?
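A sketch of Example 3.4 with all units converted to hours, using Pr(X > t) = exp(−λt) for the exponential inter-event time:

```python
import math

lam = 25.0  # log-ons per hour

p_none_6min = math.exp(-lam * 0.1)                         # 6 min = 0.1 h
p_2_to_3min = math.exp(-lam * (2 / 60)) - math.exp(-lam * (3 / 60))
x_90 = -math.log(0.90) / lam                               # Pr(X > x) = 0.90
mean_hours = 1 / lam                                       # 0.04 h = 2.4 min
sd_hours = 1 / lam                                         # sd equals the mean

print(round(p_none_6min, 4), round(p_2_to_3min, 4), round(x_90 * 60, 2))
# 0.0821 0.1481 0.25  (the last value is x in minutes)
```

Note how every rate and time is expressed in hours before any probability is computed, exactly as the units warning for Poisson-type models requires.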
Theorem 3.3 The Memoryless Property of the Exponential Distribution
Let X be a continuous random variable. Then X is an exponential random variable iff

Theorem 3.4 The Conditional Form of the Memoryless Property
Let X be a continuous random variable. Then X is an exponential random variable iff

Furthermore, no other continuous random variable possesses this property. There are several implications of the memoryless property of the exponential random variable.
- First, if the exponential random variable is used to model the lifetime of a device, then at every point in time until it fails, the device is as good as new (from a probabilistic standpoint).
- If the exponential random variable is used to model an arrival time, then at every point in time until the arrival occurs, it is as if we just began "waiting" for the arrival.

Example 3.5 Suppose that the life length of a component is an exponential random variable with rate 0.0001, with time units in hours. Determine the following.
a) What is the probability that the component lasts more than 2000 hours?
b) Given that the component lasts at least 1000 hours, what is the probability that it lasts more than 2000 hours?
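A sketch of Example 3.5 that also verifies the memoryless property numerically: with rate λ = 0.0001 per hour, Pr(X > t) = exp(−λt), and the conditional survival probability depends only on the additional time:

```python
import math

lam = 0.0001  # failures per hour

p_a = math.exp(-lam * 2000)   # a) Pr(X > 2000)

# b) memoryless: Pr(X > 2000 | X > 1000) = Pr(X > 1000)
p_b = math.exp(-lam * 1000)
# cross-check against the definition of conditional probability
assert abs(p_b - math.exp(-lam * 2000) / math.exp(-lam * 1000)) < 1e-12

print(round(p_a, 4), round(p_b, 4))  # 0.8187 0.9048
```

Having survived 1000 hours, the component faces the next 1000 hours with the same survival probability a brand-new component would, which is exactly the "good as new" interpretation above.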
Theorem 3.5 Expectation under the Memoryless Property
Let X be an exponential random variable. Then

Example 3.6 (Example 3.5 continued)
a) Given that the component lasts at least 1000 hours, what is the expected value of its life length?
b) Given that the component has survived 1000 hours, on average, how much longer will it survive?

3.3 The Normal Distribution

Notation
Range
Probability Density Function
Cumulative Distribution Function: no closed-form expression
Parameters
Mean
Variance
Comments

Standard Normal Random Variable

If µ = 0 and σ = 1, then X is referred to as the standard normal random variable. The standard normal random variable is often denoted by Z. The cumulative distribution function of the standard normal random variable is denoted
Φ(z) = Pr(Z ≤ z)

Appendix A Table I provides cumulative probabilities for a standard normal random variable. For example, assume that Z is a standard normal random variable. Appendix A Table I provides probabilities of the form Pr(Z ≤ 1.53): find 1.5 in the z column and 0.03 in the row; then Pr(Z ≤ 1.53) = 0.93699.

The same value can be obtained in Excel: click the function icon (fx), choose Statistical, select NORMSDIST(z), and enter 1.53; Excel returns =NORMSDIST(1.53) = 0.936992.

The function Φ(z) is the cumulative distribution function of a standard normal random variable, tabulated in Appendix A Table I (see Figure 4-13, page 124, of the Montgomery book).

Example 3.7 (Example 4-12, Montgomery)

Some useful results concerning a normal distribution are summarized in Figure 4-14 of the textbook. For any random variable:
1)
2)
3)
4)

If X ~ N(µ, σ²), then (X − µ)/σ ~ N(0, 1); this is known as the z-transformation. That is, Z is a standard normal random variable.

Suppose X is a normal random variable with mean µ and standard deviation σ. Then,
Example 3.8 One key characteristic of a certain type of drive shaft is its diameter; the diameter is a normally distributed random variable having µ = 5 cm and σ = 0.08 cm.
a) What is the probability that the diameter of a given drive shaft is between 4.9 and 5.05 cm?
b) What diameter is exceeded by 90% of drive shafts?
c) Provide tolerances, symmetric about the mean, that capture 99% of drive shafts.
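A sketch of Example 3.8 using the standard library's `statistics.NormalDist` in place of the printed normal table; the z value 2.5758 (the standard normal 99.5th percentile) is quoted from standard tables:

```python
from statistics import NormalDist

X = NormalDist(mu=5.0, sigma=0.08)  # drive-shaft diameter, in cm

p_a = X.cdf(5.05) - X.cdf(4.9)      # a) Pr(4.9 < X < 5.05)
d_b = X.inv_cdf(0.10)               # b) 10th percentile: exceeded by 90%
half_width = 2.5758 * 0.08          # c) z_{0.005} * sigma for 99% coverage
tol = (5 - half_width, 5 + half_width)

print(round(p_a, 4), round(d_b, 4), tuple(round(t, 3) for t in tol))
# 0.6284 4.8975 (4.794, 5.206)
```

`inv_cdf` plays the role of reading the normal table "backwards", which is how part (b) would be solved by hand.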
Example 3.9 The diameter of a shaft in an optical storage drive is normally distributed with mean 0.2508 inch and standard deviation 0.0005 inch. The specifications on the shaft are ±0.0015 inch. What proportion of shafts conforms to specifications?

3.3.1 Normal Approximation to the Binomial and Poisson Distributions

Binomial Approximation

If X is a binomial random variable with parameters n and p, then
Z = (X − np) / √(np(1 − p))
is approximately a standard normal random variable. To approximate a binomial probability with a normal distribution, a correction (continuity) factor is applied.
The approximation is good for np > 5 and n(1 − p) > 5.

Poisson Approximation

If X is a Poisson random variable with E(X) = λ and V(X) = λ, then
Z = (X − λ) / √λ
is approximately a standard normal random variable. The approximation is good for λ > 5.

Example 3.10 The manufacturing of semiconductor chips produces 2% defective chips. Assume that chips are independent and that a lot contains 1000 chips.
a) Approximate the probability that more than 25 chips are defective.
b) Approximate the probability that between 20 and 30 chips are defective.
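A sketch of Example 3.10 with the continuity correction, reading "between 20 and 30" as inclusive (an assumption; shifting the correction by one changes the answer slightly). Here np = 20 and n(1 − p) = 980, so the approximation criteria are met:

```python
from statistics import NormalDist
import math

n, p = 1000, 0.02
mu = n * p                          # 20
sigma = math.sqrt(n * p * (1 - p))  # sqrt(19.6)
Z = NormalDist()                    # standard normal

# a) Pr(X > 25) ~ Pr(Z > (25.5 - mu)/sigma), continuity-corrected
p_a = 1 - Z.cdf((25.5 - mu) / sigma)
# b) Pr(20 <= X <= 30) ~ Pr((19.5 - mu)/sigma <= Z <= (30.5 - mu)/sigma)
p_b = Z.cdf((30.5 - mu) / sigma) - Z.cdf((19.5 - mu) / sigma)

print(round(p_a, 4), round(p_b, 4))
```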
3.4 The Lognormal Distribution

Variables in a system sometimes follow an exponential relationship in which the exponent is a random variable, say W: X = exp(W). If W has a normal distribution, then the distribution of X is called a lognormal distribution.

Notation
Range
Probability Density Function
Cumulative Distribution Function: no closed-form expression
Parameters

Comments: If Y ~ N(µ, σ²) and X = e^Y, then X ~ LN(µ, σ²). The lognormal random variable is often used to represent elapsed times, especially equipment repair times, and material properties.

Mean
Variance
Example 3.11 A wood floor system can be evaluated in one way by measuring its modulus of elasticity (MOE), measured in 10^6 psi. One particular type of system is such that its MOE is a lognormal random variable having µ = 0.375 and σ = 0.25.
1. What is the probability that a system's MOE is less than 2?
2. Find the value of MOE that is exceeded by only 1% of the systems.

3.5 The Weibull Distribution

The Weibull distribution is often used to model the time until failure of many different physical systems. It is used in reliability for time-dependent failure models, where the failure distribution may model both increasing and decreasing failure rates.

Notation
  • 45. Range
Probability Density Function
Cumulative Distribution Function
Parameters
Mean
Variance
Comments: If β = 1, then X ~ expon(1/η). The Weibull random variable is most often used to represent elapsed time, especially time to failure of a unit of equipment.
Example 3.12 The time to failure of a power supply is a Weibull random variable having β = 2.0 and η = 1000.0 hours. The manufacturer sells a warranty such that only 5% of the power supplies fail before the warranty expires. What is the time period of the warranty? 45
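Example 3.12 amounts to inverting the Weibull CDF, F(t) = 1 − exp(−(t/η)^β), at F(t) = 0.05. A short check (supplemental to the notes):

```python
from math import log

# T ~ Weibull(beta=2.0, eta=1000.0 hours); CDF F(t) = 1 - exp(-(t/eta)^beta)
beta, eta = 2.0, 1000.0

# Want the warranty time t with F(t) = 0.05:
#   1 - exp(-(t/eta)^beta) = 0.05  =>  t = eta * (-ln(0.95))**(1/beta)
t_warranty = eta * (-log(0.95)) ** (1 / beta)
print(round(t_warranty, 1))  # about 226.5 hours
```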
  • 46. 4 JOINT PROBABILITY DISTRIBUTIONS
Up to this point we have considered issues related to a single random variable. Now we are going to consider situations in which we have two or more random variables that we are interested in studying.
4.1 Two or More Discrete Random Variables
Definition 4.1 The function f(x, y) is a joint probability distribution or probability mass function of discrete random variables X and Y if
1.
2.
3.
Example 4.1 Let X denote the number of times a certain numerical control machine will malfunction (1, 2, or 3 times) on a given day. Let Y denote the number of times a technician is called on an emergency call. Their joint probability distribution is given as

f(x, y)    x = 1    x = 2    x = 3
y = 1      0.05     0.05     0.10
y = 2      0.05     0.10     0.35
y = 3      0.00     0.20     0.10

a) Find P(X < 3, Y = 1)
b) Find the probability that the technician is called at least 2 times and the machine fails no more than 1 time. 46
  • 47. c) Find P(X > Y)
When studying a joint probability distribution we are also interested in the probability distribution of each variable individually, which is referred to as the marginal probability distribution.
Theorem 4.1 Let X and Y be discrete random variables having joint probability mass function f(x, y). Let x1, x2, ... denote the possible values of X, and let y1, y2, ... denote the possible values of Y. Let fx(x) denote the marginal probability mass function of X, and let fy(y) denote the (marginal) probability mass function of Y. Then, 47
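Parts (a)-(c) of Example 4.1 can be checked by summing the appropriate entries of the joint pmf table; a quick sketch (not part of the original notes):

```python
# Joint pmf from Example 4.1, keyed by (x, y)
f = {
    (1, 1): 0.05, (2, 1): 0.05, (3, 1): 0.10,
    (1, 2): 0.05, (2, 2): 0.10, (3, 2): 0.35,
    (1, 3): 0.00, (2, 3): 0.20, (3, 3): 0.10,
}

# a) P(X < 3, Y = 1)
p_a = sum(p for (x, y), p in f.items() if x < 3 and y == 1)

# b) P(Y >= 2, X <= 1): technician called at least twice, machine fails at most once
p_b = sum(p for (x, y), p in f.items() if y >= 2 and x <= 1)

# c) P(X > Y)
p_c = sum(p for (x, y), p in f.items() if x > y)

print(round(p_a, 2), round(p_b, 2), round(p_c, 2))  # 0.1 0.05 0.5
```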
  • 48. Example 4.2 Let X and Y be discrete random variables such that
f(1, 1) = 1/9   f(1, 2) = 1/6   f(1, 3) = 1/8
f(2, 1) = 1/18  f(2, 2) = 1/9   f(2, 3) = 1/9
f(3, 1) = 1/9   f(3, 2) = 1/9   f(3, 3) = 1/6
Find the marginal probability mass functions of X and Y. 48
  • 49. Definition 4.2 The function f(x, y) is a joint probability density function of continuous random variables X and Y if
1.
2.
3.
Example 4.3 A candy company distributes boxes of chocolates with a mixture of creams, toffees, and nuts coated in both light and dark chocolates. For a randomly selected box, let X and Y, respectively, be the proportions of the light and dark chocolates that are creams, and suppose that the joint density function is
f(x, y) = (2/5)(2x + 3y),  0 ≤ x ≤ 1, 0 ≤ y ≤ 1
f(x, y) = 0,  elsewhere
a) verify condition 2 49
  • 50. c) Find P[(X, Y) ∈ A], where A = {(x, y) | 0 ≤ x ≤ 1/2, 1/4 ≤ y ≤ 1/2}
Theorem 4.2 Marginal Probability Density Function
Let X and Y be continuous random variables having joint probability density function f(x, y). Let fx(x) denote the marginal probability density function of X, and let fy(y) denote the (marginal) probability density function of Y. Then, 50
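Both parts of Example 4.3 can be verified numerically (a supplemental sketch, not in the original notes): the density should integrate to 1 over the unit square, and the probability over A works out to 13/160 = 0.08125 by direct integration.

```python
def f(x, y):
    # Joint density from Example 4.3
    return 0.4 * (2 * x + 3 * y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def integrate2d(f, x0, x1, y0, y1, n=400):
    # Midpoint-rule double integral over [x0, x1] x [y0, y1]
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            y = y0 + (j + 0.5) * hy
            total += f(x, y)
    return total * hx * hy

total_mass = integrate2d(f, 0, 1, 0, 1)     # condition 2: should be 1
p_A = integrate2d(f, 0, 0.5, 0.25, 0.5)     # exact value is 13/160 = 0.08125
print(round(total_mass, 4), round(p_A, 5))
```

Since the integrand is linear in x and y, the midpoint rule is exact here up to floating-point rounding.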
  • 51. Example 4.4 Let X and Y be continuous random variables such that f(x, y) = 0.75 e-0.3yFind the marginal probability density function of X and Y. Theorem 4.3 The Law of the Unconscious StatisticianLet X and Y be discrete (continuous) random variables having joint probability mass(density) function f(x,y). Let x1, x2, … denote the possible values of X, and let y1, y2, …denote the possible values of Y. Let g(X,Y) be a real-valued function. Then 51
  • 52. Example 4.5 Suppose X and Y are discrete random variables having joint probabilitymass function f(x,y). Let x1, x2, … denote the possible values of X, and let y1, y2, … denotethe possible values of Y. What is E(X+Y)? 52
  • 53. Theorem 4.4 Expectation of a Sum of Random Variables
Let X1, X2, ..., Xn be random variables, and let a1, a2, ..., an be constants. Then
Example 4.6 What is E(3X − 2Y + 4)?
Theorem 4.5 Independent Discrete Random Variables Let X and Y be random variables having joint probability mass function f(x, y). Let fx(x) denote the marginal probability mass function of X, and let fy(y) denote the marginal probability mass function of Y. Then X and Y are said to be independent iff 53
  • 54. Theorem 4.6 Independent Continuous Random Variables Let X and Y be random variables having joint probability density function f(x, y). Let fx(x) denote the marginal probability density function of X, and let fy(y) denote the marginal probability density function of Y. Then X and Y are said to be independent iff
Example 4.6 Consider Example 4.2. Are X and Y independent?
Example 4.7 Consider Example 4.4. Are X and Y independent?
Definition 4.3 Let X and Y be random variables. The covariance of X and Y is denoted as Cov(X, Y) and given by 54
  • 55. A positive covariance indicates that X tends to increase (decrease) as Y increases(decreases). A negative covariance indicates that X tends to decrease (increase) as Yincreases (decreases).Example 4.8 Example 4.2 continued. Find the covariance of X and Y.Theorem 4.7 Covariance of Independent Random VariablesLet X and Y be random variables. If X and Y are independent, thenCov(X, Y) = 0.Theorem 4.8 Variance of the Sum of Random VariablesLet X1, X2, … , XN be random variables. ThenTheorem 4.9 Variance of the Sum of Independent Random VariablesLet X1, X2, … , XN be independent random variables. Then 55
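A covariance computation can be sketched with the shortcut Cov(X, Y) = E(XY) − E(X)E(Y). This supplemental example uses the joint pmf from Example 4.1 rather than Example 4.2, because the Example 4.2 values as printed do not sum to 1:

```python
# Joint pmf from Example 4.1 (used here because, as printed, the Example 4.2
# entries do not sum to 1)
f = {
    (1, 1): 0.05, (2, 1): 0.05, (3, 1): 0.10,
    (1, 2): 0.05, (2, 2): 0.10, (3, 2): 0.35,
    (1, 3): 0.00, (2, 3): 0.20, (3, 3): 0.10,
}

ex  = sum(x * p for (x, y), p in f.items())      # E(X)
ey  = sum(y * p for (x, y), p in f.items())      # E(Y)
exy = sum(x * y * p for (x, y), p in f.items())  # E(XY)

cov = exy - ex * ey  # Cov(X, Y) = E(XY) - E(X)E(Y)
print(round(ex, 2), round(ey, 2), round(cov, 3))  # 2.45 2.1 0.005
```

The small positive covariance says X and Y tend, very weakly, to move together for this pmf.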
  • 56. Definition 4.4 Let X and Y be two random variables. The correlation between X and Y is denoted by ρxy and given by
Note that correlation and covariance have the same interpretation regarding the relationship between the two variables. However, correlation does not have units and is restricted to the range [−1, 1]. Therefore, the magnitude of the correlation provides some idea of the strength of the relationship between the two random variables. 56
  • 57. 5 RANDOM SAMPLES, STATISTICS AND THE CENTRAL LIMIT THEOREMDefinition 5.1 Independent random variables X1, X2, … ,Xn are called a random sample.A randomly selected sample means that if a sample of n objects is selected, each subsetof size n is equally likely to be selected. If the number of objects in the population ismuch larger than n, the random variables X1, X2, … ,Xn that represents the observationsfrom the sample can be shown to be approximately independent random variables withthe same distribution.Definition 5.2 A statistic is a function of the random variables in a random sample.Given the data, we calculate statistics all the time, such as the sample mean X and thesample standard deviation S. Each statistic has a distribution and it is the distribution thatdetermines how well it estimates a quantity such as μ.We begin our discussions by focusing on a single random variable, X. To perform anymeaningful statistical analysis regarding X, we must have data.Let X be some random variable of interest. A random sample on X consists of nobservations on X: x1, x2, … , xn. We assume that these observations are independent andidentically distributed. The value of n is referred to as the sample size.Definition 5.3 Descriptive statistics refers to the process of collecting data on a randomvariable and computing meaningful quantities (statistics) that characterize the underlyingprobability distribution of the random variable.There are three points of interest regarding this definition. • Performing any type of statistical analysis requires that we collect data on one or more random variables. • A statistic is nothing more than a numerical quantity computed using collected data. • If we knew the probability distribution which governed the random variable of interest, collecting data would be unnecessary.Types of Descriptive Statistics 1. measures of central tendency • sample mean (sample average) • sample median • sample mode (discrete random variables only) 57
  • 58. 2. measures of variability • sample range • sample variance • sample standard deviation • sample quartiles Microsoft Excel has a Descriptive Statistics tool within its Data Analysis ToolPak.Computing the Sample Mean • Most of your calculators have a built-in method for entering data and computing the sample mean. • Note the sample mean is a point estimate of the true mean of X. In other words,Computing the Sample Median To compute the sample median, we first rank the data in ascending order and re- number it: x(1), x(2), …. , x(n). The sample median corresponds to the value that has 50% of the data above it and 50% of the data below it.Computing the Sample Mode The sample mode is the most frequently occurring value in the sample. It is typically only of interest in sample data from a discrete random variable, because sample data on a continuous random variable often does not have any repeated values. 58
  • 59. Computing the Sample Range
Computing the Sample Variance
• Why do we divide by n − 1? We divide by n − 1 because we have n − 1 degrees of freedom. This refers to the fact that if we know the sample mean and n − 1 of the data values, we can compute the remaining data point.
• Note that the sample variance is a point estimate of the true variance. In other words,
Computing the Sample Standard Deviation
• Note that the sample standard deviation is a point estimate of the true standard deviation.
Theorem 5.1 If X1, X2, ..., Xn is a random sample of size n taken from a population with mean μ and variance σ2, and if X̄ is the sample mean, the limiting form of the distribution of 59
  • 60. as n→∞, is the standard normal distribution. 5.1 Populations and Random Samples The field of statistical inference consists of those methods used to drawconclusions about a population. These methods utilize the information contained in arandom sample of observations from the population. Statistical inference may be divided into two major areas: • parameter estimation • hypothesis testingBoth of these areas require a random sample of observations from one or morepopulations, therefore, we will begin our discussions by addressing the concepts ofrandom sampling. Definition 5.4 A population consists of the totality of the observations with whichwe are concerned. • We almost always use a random variable/probability distribution to model the behavior of a population. Definition 5.5 The number of observations in the population is called the size of thepopulation. • Populations may be finite or infinite. However, we can typically assume the population is infinite. • In some cases, a population is conceptual. For example, the population of items to be manufactured is a conceptual population. Definition 5.6 A sample is a subset of observations selected from a population. • We model these observations using random variables. • If our inferences are to be statistically valid, then the sample must be representative of the entire population. In other words, we want to ensure that we take a random sample. 60
  • 61. Definition 5.7 The random variables X1, X2, ..., Xn are a random sample of size n if X1, X2, ..., Xn are independent and identically distributed.
• After the data has been collected, the numerical values of the observations are denoted as x1, x2, ..., xn.
• The next step in statistical inference is to use the collected data to compute one or more statistics of interest.
5.2 Point Estimates
Definition 5.8 A statistic, Θ̂, is any function of the observations in a random sample.
• In parameter estimation, statistics are used to estimate quantities of interest.
• The measures of central tendency and variability we considered in "Descriptive Statistics" are all statistics.
Definition 5.9 A point estimate of some population parameter θ is a single numerical value θ̂ of a statistic Θ̂.
Estimation problems occur frequently in engineering. The quantities that we will focus on are:
• the mean µ of a population
• the standard deviation σ of a population
• the proportion p of items in a population that belong to a class of interest – p is the probability of success for a Bernoulli trial
The point estimates that we use are: 61
  • 62. 5.3 Sampling Distributions A statistic is a function of the observations in the random sample. These observationsare random variables, therefore, the statistic itself is a random variable. All randomvariables have probability distributions. Definition 5.10 The probability distribution of a statistic is called a samplingdistribution. • The sampling distribution of a statistic depends on the probability distribution which governs the entire population, the size of the random sample, and the method of sample selection. Theorem 5.3 The Sampling Distribution of the Mean If X1, X2, … , Xn are IID N(µ,σ2) random variables, then the sample mean is anormal random variable having mean and variance .Thus, if we are sampling from a normal population then the sampling distribution of themean is normal. But what if we are not sampling from a normal population? Theorem 5.4 The Central Limit Theorem If X1, X2, … , Xn is a random sample of size n taken from a population with meanµ and variance σ2, then as n → ∞,is a standard normal random variable. • The quality of the normal approximation depends on the true probability distribution governing the population and the sample size. • For most cases of practical interest, n ≥ 30 ensures a relatively good approximation. • If n < 30, then the underlying probability distribution must not be severely non-normal. Example 5.1 A plastics company produces cylindrical tubes for various industrialapplications. One of their production processes is such that the diameter of a tube isnormally distributed with a mean of 1 inch and a standard deviation of 0.02 inch. (a) What is the probability that a single tube has a diameter of more than 1.015 inches? 62
  • 63. X = diameter of a tube (measured in inches) ~ N( ) (b) What is the probability that the average diameter of five tubes is more than 1.015 inches? n= X = average diameter ~ N( ) (c) What is the probability that the average diameter of 25 tubes is more than 1.015 inches? n= X = average diameter ~ N( ) Example 5.2 The life length of an electronic component, T, is exponentiallydistributed with a mean of 10,000 hours. (a) What is the probability that a single component lasts more than 7500 hours? (b) What is the probability that the average life length for 200 components is more than 9500 hours? E(T) = hours σT = hours 63
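The three parts of Example 5.1 can be reproduced with the sampling distribution of the mean, X̄ ~ N(µ, σ/√n) (a supplemental check, not from the original slides):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 1.0, 0.02  # tube diameter X ~ N(1, 0.02) inches

def p_mean_exceeds(threshold, n):
    # X-bar ~ N(mu, sigma/sqrt(n)) when sampling from a normal population
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

p_single = p_mean_exceeds(1.015, 1)   # (a) one tube
p_five   = p_mean_exceeds(1.015, 5)   # (b) average of 5 tubes
p_25     = p_mean_exceeds(1.015, 25)  # (c) average of 25 tubes

print(round(p_single, 4), round(p_five, 4), round(p_25, 6))
# about 0.2266, 0.0468, 0.000088
```

Note how quickly the probability collapses as n grows: averaging shrinks the standard error by a factor of √n, so a 0.015-inch excursion of the sample mean becomes ever more unlikely.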
  • 64. Note that σT̄ = σT / √200.
(c) What is the probability that the average life length for 10 components is more than 9500 hours?
n is too small to use the CLT approximation.
Note that σT̄ = σT / √10.
If we had tried to use the CLT:
Now consider the case in which we are interested in studying two independent populations. Let the first population have mean µ1 and standard deviation σ1, and let the second population have mean µ2 and standard deviation σ2.
If we are interested in comparing the two means, then the obvious point estimate of interest is µ̂1 − µ̂2 = X̄1 − X̄2.
What is the sampling distribution of this statistic? 64
  • 65. Theorem 5.4 The Sampling Distribution of the Difference in Two Means If we have two independent populations with means µ1 and µ2 and standarddeviations σ1 and σ2, and if a random sample of size n1 is taken from the first populationand a random sample of size n2 is taken from the second population, then the samplingdistribution ofis standard normal as n1 and n2 → ∞. If the two populations are normal, then thesampling distribution of Z is exactly standard normal. • Again, the approximation is relatively accurate if n1 ≥ 30 and n2 ≥ 30. Example 5.3 The life length of batteries produced by Battery Manufacturer A is acontinuous random variable having a mean of 1500 hours and a standard deviation of 100hours. The life length of batteries produced by Battery Manufacturer B is a continuousrandom variable having a mean of 1400 hours and a standard deviation of 200 hours. (a) Suppose 50 batteries of each type are tested. What is the probability that Battery Manufacturer A’s sample average life length exceeds Battery Manufacturer B’s by more than 75 hours? (b) How would your answer change if only 12 batteries of each type were tested? There is not enough information to answer the question. If we assume normality, then we could proceed. 65
  • 66. 5.4 Confidence Intervals A point estimate provides only a single number for drawing conclusions about aparameter. And if another random sample were selected, this point estimate wouldalmost certainly be different. In fact, this difference could be drastic. For this reason, a point estimate typically does not supply adequate information toan engineer. In such cases, it may be possible and useful to construct a confidenceinterval which expresses the degree of uncertainty associated with a point estimate. Definition 5.11 If θ is the parameter of interest, then the point estimate andsampling distribution of θ can be used to identify a 100(1 − α )% confidence interval onθ. This interval is of the form:L and U are called the lower-confidence limit and upper-confidence limit. If L and Uare constructed properly, then . The quantity (1 − α) is called the confidence coefficient. • The confidence coefficient is a measure of the accuracy of the confidence interval. For example, if a 90% confidence interval is constructed, then the probability that the true value of θ is contained in the interval is 0.9. • The length of the confidence interval is a measure of the precision of the point estimate. A general rule of thumb is that increasing the sample size improves the precision of a point estimate.Confidence intervals are closely related to hypothesis testing. Therefore, we will addressconfidence intervals within the context of hypothesis testing. 66
  • 67. 6 FORMULATING STATISTICAL HYPOTHESES
For many engineering problems, a decision must be made as to whether a particular statement about a population parameter is true or false. In other words, we must either accept the statement as being true or reject the statement as being false.
Example 6.1 Consider the following statements regarding the population of engineering students at the Philadelphia University.
1. The average GPA is 3.0.
2. The standard deviation of age is 5 years.
3. 30% are afraid to fly.
4. The average age of mothers is the same as the average age of fathers.
Definition 6.1 A statistical hypothesis is a statement about the parameters of one or more populations.
• It is worthwhile to note that a statistical hypothesis is a statement about the underlying probability distributions, not the sample data.
Example 6.2 (Ex. 6.1 continued) Convert each of the statements into a statistical hypothesis.
1.
2.
3.
4.
To perform a test of hypotheses, we must have a contradictory statement about the parameters of interest.
Example 6.3 Consider the following contradictory statements.
1. No, it's more than that.
2. No, it's not.
3. No, it's less than that. 67
  • 68. 4. No, fathers are older.The result of our original statement and our contradictory statement is a set of twohypotheses.Example 6.4 (Ex. 6.1 continued) Combine the two statements for each of the examples. 1. 2. 3. 4.Our original statement is referred to as the null hypothesis (H0). • The value specified in the null hypothesis may be a previously established value (in which case we are trying to detect changes to that value), a theoretical value (in which case we are trying to verify the theory), or a design specification (in which case we are trying to determine if the specification has been met).The contradictory statement (H1) is referred to as the alternative hypothesis. • Note that an alternative hypothesis can be one-sided (1, 3, 4) or two-sided (2). • The decision as to whether the alternative hypothesis should be one-sided or two- sided depends on the problem of interest.Type I ErrorRejecting the null hypothesis H0 when it is true 68
  • 69. For example, suppose the true mean GPA in Example 6.1 is 3.0. However, for the randomly selected sample we could observe that the test statistic x̄ falls into the critical region. Therefore, we would reject the null hypothesis in favor of the alternative hypothesis H1.
Type II Error
Failing to reject the null hypothesis when it is false.
6.1 Performing a Hypothesis Test
Definition 6.2 A procedure leading to a decision about a particular null and alternative hypothesis is called a hypothesis test.
• Hypothesis testing involves the use of sample data on the population(s) of interest.
• If the sample data is consistent with a hypothesis, then we "accept" that hypothesis and conclude that the corresponding statement about the population is true.
• We "reject" the other hypothesis, and conclude that the corresponding statement is false. However, the truth or falsity of the statements can never be known with certainty, so we need to define our procedure so that we limit the probability of making an erroneous decision.
• The burden of proof is placed upon the alternative hypothesis.
Basic Hypothesis Testing Procedure
A random sample is collected on the population(s) of interest, a test statistic is computed based on the sample data, and the test statistic is used to make the decision to either accept (some people say "fail to reject") or reject the null hypothesis.
Example 6.5 A manufactured product is used in such a way that its most important dimension is its width. Let X denote the width of a manufactured product. Suppose historical data suggests that X is a normal random variable having σ = 4 cm. However, the mean can change due to fluctuations in the manufacturing process. Therefore, we wish to perform the following hypothesis test.
H0:
H1: 69
  • 70. The following procedure has been proposed.Inspect a random sample of 25 products. Measure the width of each product. If thesample mean is less than 188 cm or more than 192 cm, reject H0.For the proposed procedure, identify the following:(a) sample size(b) test statistic(c) critical region(d) acceptance regionIs the procedure defined in Ex. 6.5 a good procedure? Since we are only taking a randomsample, we cannot guarantee that the results of the hypothesis test will lead to us makingthe correct decision. Therefore, the question “Is this a good procedure?” can be brokendown into two additional questions. 1. If the null hypothesis is true, what is the probability that we accept H0? 2. If the null hypothesis is not true, what is the probability that we accept H0?Example 6.6 (Ex. 6.5 continued) If the null hypothesis is true, what is the probabilitythat we accept H0? 70
  • 71. note assumptionsTherefore, if the null hypothesis is true, then there is a 98.76% chance that we will makethe correct decision. However, that also means that there is a 1.24% chance that we willmake the incorrect decision (reject H0 when H0 is true). • Such a mistake is called a Type I error, or a false positive. • α = P(Type I error) = level of significance • In our example, α = 0.0124. When constructing a hypothesis test, we get to specify α.If the null hypothesis is not true (i.e. the alternative hypothesis is true), then accepting H0would be a mistake. • Accepting H0 when H0 is false is called a Type II error, or a false negative. • • • Unfortunately, we can’t answer this question (find a value for β) in general. Since the alternative hypothesis is µ ≠ 190 cm, there are an uncountable number of situations in which the alternative hypothesis is true. • We must identify specific situations of interest and analyze each one individually.Example 6.7 (Ex. 6.5 continued) Find the probability of a Type II error when µ = 189 cmand µ = 193 cm. For µ = 189 cm: 71
  • 72. For µ = 193 cm: Note that as µ moves away from the hypothesized value (190 cm), β decreases.If we experiment with other sample sizes and critical/acceptance regions, we will see thatthe values of α and β can change significantly. However, there are some general “truths”for hypothesis testing. 1. We can explicitly control α (given that the underlying assumptions are true). 2. Type I and Type II error are inversely related. 3. Increasing the sample size is the only way to simultaneously reduce α and β. 4. We can only control β for one specific situation.Since we can explicitly control α, the probability of a Type I error, rejecting H0 is astrong conclusion. However, we can only control Type II errors in a very limitedfashion. Therefore, accepting H0 is a weak conclusion. In fact, many statisticians usethe terminology, “fail to reject H0” as opposed to “accept H0.” • Since “reject H0” is a strong conclusion, we should put the statement about which it is important to make a strong conclusion in the alternative hypothesis.Example 6.8 How would the procedure change if we wished to perform the followinghypothesis test?H0: µ ≥ 190 cm 72
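The α and β calculations of Examples 6.6 and 6.7 (width test: n = 25, σ = 4 cm, accept H0 when 188 ≤ X̄ ≤ 192) can be reproduced numerically; this sketch is supplemental to the notes:

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 4.0, 25        # width X ~ N(mu, 4); sample of 25 products
se = sigma / sqrt(n)      # standard error of the sample mean = 0.8
lo, hi = 188.0, 192.0     # acceptance region for the sample mean

def p_accept(mu):
    # P(188 <= X-bar <= 192) when the true mean is mu
    d = NormalDist(mu, se)
    return d.cdf(hi) - d.cdf(lo)

alpha = 1 - p_accept(190.0)   # Type I error when H0 (mu = 190) is true
beta_189 = p_accept(189.0)    # Type II error when mu = 189
beta_193 = p_accept(193.0)    # Type II error when mu = 193

print(round(alpha, 4), round(beta_189, 4), round(beta_193, 4))
# about 0.0124, 0.8943, 0.1056
```

This makes the slide's point concrete: α is small by construction, but β is large for alternatives close to 190 cm and shrinks as µ moves away.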
  • 73. H1: µ < 190 cm
Proposed hypothesis testing procedure: Inspect a random sample of 25 observations on the width of a product. If the sample mean is less than 188 cm, reject H0.
6.1.1 Generic Hypothesis Testing Procedure
All hypothesis tests follow a common procedure. The textbook identifies eight steps in this procedure.
1. From the problem context and assumptions, identify the parameter of interest.
2. State the null hypothesis, H0.
3. Specify an appropriate alternative hypothesis, H1.
4. Choose a significance level α.
5. State an appropriate test statistic.
6. State the critical region for that statistic.
7. Collect a random sample of observations on the random variable (or from the population) of interest, and compute the test statistic.
8. Compare the test statistic value to the critical region and decide whether or not to reject H0.
6.2 Performing Hypothesis Tests on µ when σ is Known
In this section, we consider making inferences about the mean µ of a single population where the population standard deviation σ is known.
• We will assume that a random sample X1, X2, ..., Xn has been taken from the population.
• We will also assume that either the population is normal or the conditions of the Central Limit Theorem apply.
Suppose we wish to perform the following hypothesis test. 73
  • 74. It is somewhat obvious that inferences regarding µ would be based on the value of thesample mean. However, it is usually more convenient to standardize the sample mean.Using what we know about the sampling distribution of the mean, it is reasonable toconclude that the test statistic will beIf the null hypothesis is true, then the test statistic is a standard normal random variable.Therefore, we only reject the null hypothesis if the value of Z0 is unusual for anobservation on a standard normal random variable.Specifically, we reject H0 if:where α is the specified level of significance. The acceptance region is thereforeObviously, the acceptance and critical regions can be converted to expressions in terms ofthe sample mean. Reject H0 if X > a or X < b where 74
  • 75. Example 6.9 Let X denote the GPA of an engineering student at the PhiladelphiaUniversity. It is widely known that, for this population, σ = 0.5. The population mean isnot widely known, however, it is commonly believed that the average GPA is 3.0. Wewish to test this hypothesis using a sample of size 25 and a level of significance of 0.05.(a) Identify the null and alternative hypotheses.(b) List any required assumptions.(c) Identify the test statistic and the critical region. Reject H0 if(d) Suppose 25 students are sampled and the sample average GPA is 3.18. State and interpret the conclusion of the test. Z0 =(e) What is the probability of a Type I error for this test?(f) How would the results change if we had used α = 0.10? 75
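Parts (d)-(f) of Example 6.9 can be checked numerically (a supplemental sketch, not from the original slides):

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar = 3.0, 0.5, 25, 3.18
z = NormalDist()

z0 = (xbar - mu0) / (sigma / sqrt(n))  # test statistic
p_value = 2 * (1 - z.cdf(abs(z0)))     # two-sided P-value

reject_at_05 = abs(z0) > z.inv_cdf(1 - 0.05 / 2)  # critical value 1.96
reject_at_10 = abs(z0) > z.inv_cdf(1 - 0.10 / 2)  # critical value 1.645

print(round(z0, 2), round(p_value, 4), reject_at_05, reject_at_10)
# 1.8 0.0719 False True
```

With Z0 = 1.8, we fail to reject H0 at α = 0.05 but reject it at α = 0.10, which is exactly the sensitivity to α that the following slides explore via the P-value.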
  • 76. Critical region changes.We may also modify this procedure if the test is one-sided. This modification onlyrequires a change in the critical/acceptance regions. If the alternative hypothesis isthen a negative value of Z0 would not indicate a need to reject H0. Therefore, weonly reject H0 ifLikewise, if the alternative hypothesis isExample 6.10 The Glass Bottle Company (GBC) manufactures brown glass beveragecontainers that are sold to breweries. One of the key characteristics of these bottles istheir volume. GBC knows that the standard deviation of volume is 0.08 oz. They wish toensure that the mean volume is not more than 12.2 oz using a sample size of 30 and alevel of significance of 0.01.(a) Identify the null and alternative hypotheses. 76
  • 77. (b) Identify the test statistic and the critical region.
(c) Suppose 30 bottles are measured and the sample mean is 12.23. State and interpret the conclusion of the test.
6.2.1 Computing P-Values
We have already seen that the choice of the value for the level of significance can impact the conclusions derived from a test of hypotheses. As a result, we may be interested in answering the question: How close did we come to making the opposite conclusion? We answer this question using an equivalent decision approach that can be used as an alternative to the critical/acceptance regions. This approach is called the P-value approach.
Definition 6.3 The P-value for a hypothesis test is the smallest level of significance that would lead to rejection of the null hypothesis.
How we compute the P-value depends on the form of the alternative hypothesis.
We reject H0 if 77
  • 78. Example 6.11 (Ex. 6.9 continued) Compute the P-value for the test.
Note when α = 0.05,
But, when α = 0.10,
Example 6.12 (Ex. 6.10 continued) Compute the P-value for the test.
Since α = 0.01, P > α
6.2.2 Type II Error
In hypothesis testing, we get to specify the probability of a Type I error (α). However, the probability of a Type II error (β) depends on the choice of sample size (n).
Consider first the case in which the alternative hypothesis is H1: µ ≠ µ0.
Before we can proceed, we must be more specific about "H0 is false". We will accomplish this by saying:
µ = µ0 + δ
where δ ≠ 0. 78
  • 79. β = P(−Zα/2 ≤ (X̄ − µ0)/(σ/√n) ≤ Zα/2 | µ = µ0 + δ)
β = P(µ0 − Zα/2 σ/√n ≤ X̄ ≤ µ0 + Zα/2 σ/√n | µ = µ0 + δ)
β = P( (µ0 − Zα/2 σ/√n − (µ0 + δ))/(σ/√n) ≤ Z ≤ (µ0 + Zα/2 σ/√n − (µ0 + δ))/(σ/√n) )
  = P(−Zα/2 − δ√n/σ ≤ Z ≤ Zα/2 − δ√n/σ)
If the alternative hypothesis is H1: µ > µ0, then .
If the alternative hypothesis is H1: µ < µ0, then .
Example 6.13 (Ex. 6.9 continued) Let X denote the GPA of an engineering student at the Philadelphia University. It is widely known that, for this population, σ = 0.5. The population mean is not widely known; however, it is commonly believed that the average GPA is 3.0. We wish to test this hypothesis using a sample of size 25 and a level of significance of 0.05. In Example 6.9, we formulated this hypothesis test as 79
  • 80. The corresponding test statistic and critical region are given by
Z0 = (X̄ − 3.0) / (0.5/√25)
(a) If µ = 3.2, what is the Type II error probability for this test?
δ = µ − µ0 =
β = P(−3.96 ≤ Z ≤ −0.04) = 0.4840
(b) If µ = 2.68, what is the Type II error probability for this test?
δ = µ − µ0 =
β = P(1.24 ≤ Z ≤ 5.16) = 0.1075
(c) If µ = 2.68, what is the power of the test?
power = 80
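The β values of Example 6.13 follow from the two-sided formula β = P(−Z_{α/2} − δ√n/σ ≤ Z ≤ Z_{α/2} − δ√n/σ); a supplemental numerical check:

```python
from math import sqrt
from statistics import NormalDist

sigma, n, z_half = 0.5, 25, 1.96  # two-sided test with alpha = 0.05
z = NormalDist()

def beta_two_sided(delta):
    # beta = P(-z_{a/2} - d*sqrt(n)/sigma <= Z <= z_{a/2} - d*sqrt(n)/sigma)
    shift = delta * sqrt(n) / sigma
    return z.cdf(z_half - shift) - z.cdf(-z_half - shift)

beta_32  = beta_two_sided(3.2 - 3.0)    # mu = 3.2  -> about 0.4840
beta_268 = beta_two_sided(2.68 - 3.0)   # mu = 2.68 -> about 0.1075
power_268 = 1 - beta_268                # power at mu = 2.68, about 0.8925

print(round(beta_32, 4), round(beta_268, 4), round(power_268, 4))
```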
  • 81. (d) If µ = 3.32, what is the power of the test?
power = 0.8925
Example 6.14 (Ex. 6.10 continued) The Glass Bottle Company (GBC) manufactures brown glass beverage containers that are sold to breweries. One of the key characteristics of these bottles is their volume. GBC knows that the standard deviation of volume is 0.08 oz. They wish to ensure that the mean volume is not more than 12.2 oz using a sample size of 30 and a level of significance of 0.01. In Example 6.10, we formulated this hypothesis test as
H0: µ ≤ 12.2
H1: µ > 12.2
The corresponding test statistic and critical region are given by
Reject H0 if
(a) If µ = 12.27 oz, what is the Type II error probability for this test?
δ = µ − µ0 = 0.07
β = P(Z ≤ 2.3263 − 0.07√30/0.08) = P(Z ≤ −2.47) = 0.0068
(b) If µ = 12.15 oz, what is the Type II error probability for this test?
This is a poor question. If µ = 12.15 oz, then "technically" the null hypothesis is true. If we are truly concerned with detecting this, we should have used a two-sided alternative hypothesis. 81
  • 82. 6.2.3 Choosing the Sample Size

The expressions for β allow the determination of an appropriate sample size. To choose the proper sample size for our test, we must specify a value of β for a specified value of δ.

For the case in which H1: µ ≠ µ0, the symmetry of the test allows us to always specify a positive value of δ. If we specify a relatively small value of β (≤ 0.1), then the lower side of the equation becomes negligible, so the equation for β reduces to:

β = P( Z ≤ Zα/2 − δ√n/σ )

This yields:

Z1−β = −Zβ = Zα/2 − δ√n/σ
δ√n/σ = Zα/2 + Zβ
n = (Zα/2 + Zβ)² σ² / δ²

For both cases in which the alternative hypothesis is one-sided:

n = (Zα + Zβ)² σ² / δ²

Example 6.14 (Ex. 6.10 continued) Let X denote the GPA of an engineering student at the Philadelphia University. It is widely known that, for this population, σ = 0.5. The population mean is not widely known; however, it is commonly believed that the average GPA is 3.0. We wish to test this hypothesis using a sample of size n and a level of significance of 0.05. In Example 6.10, we formulated this hypothesis test as

H0: µ = 3.0
H1: µ ≠ 3.0

The corresponding test statistic and critical region are given by

Z0 = (X̄ − 3.0)/(0.5/√n)
Reject H0 if Z0 < −Zα/2 = −Z0.025 = −1.96 or if Z0 > Zα/2 = 1.96

(a) If we want β = 0.10 at µ = 3.2, what sample size should we use?
δ = 0.2
n = (Z0.025 + Z0.10)²(0.5)²/(0.2)² = (1.96 + 1.282)²(0.5)²/(0.2)² = 65.7, so n = 66

(b) If we want β = 0.10 at µ = 3.25, what sample size should we use?
δ = 0.25
n = (Z0.025 + Z0.10)²(0.5)²/(0.25)² = (1.96 + 1.282)²(0.5)²/(0.25)² = 42.04, so n = 43

(c) If we want β = 0.05 at µ = 3.2, what sample size should we use?
δ = 0.2
n = (Z0.025 + Z0.05)²(0.5)²/(0.2)² = (1.96 + 1.645)²(0.5)²/(0.2)² = 81.2, so n = 82

Example 6.15 (Ex. 6.11 continued) The Glass Bottle Company (GBC) manufactures brown glass beverage containers that are sold to breweries. One of the key characteristics of these bottles is their volume. GBC knows that the standard deviation of volume is 0.08 oz. They wish to ensure that the mean volume is not more than 12.2 oz using a sample size of n and a level of significance of 0.01. In Example 6.11, we formulated this hypothesis test as

H0: µ ≤ 12.2
H1: µ > 12.2

The corresponding test statistic and critical region are given by

Z0 = (X̄ − 12.2)/(0.08/√n)
Reject H0 if Z0 > Zα = Z0.01 = 2.3263

If we wish to have a test power of 0.95 at µ = 12.25 oz, what is the required sample size for this test?
δ = 0.05, β = 0.05
n = (Z0.01 + Z0.05)²(0.08)²/(0.05)² = (2.326 + 1.645)²(0.08)²/(0.05)² = 40.4, so n = 41

6.3 Statistical Significance

A hypothesis test is a test for statistical significance. When we reject H0, we are stating that the data indicate a statistically significant difference between the true mean and the hypothesized value of the mean. When we accept H0, we are stating that there is not a statistically significant difference. Statistical significance and practical significance are not the same thing. This is especially important to recognize when the sample size is large.

6.3.1 Introduction to Confidence Intervals

As we have previously discussed, the sample mean is the most often used point estimate for the population mean. However, we also pointed out that two different samples would most likely result in two different sample means. Therefore, we define confidence intervals as a means of quantifying the uncertainty in our point estimate.

If θ is the parameter of interest, then the point estimate and sampling distribution of θ can be used to identify a 100(1 − α)% confidence interval on θ. This interval is of the form L ≤ θ ≤ U. L and U are called the lower-confidence limit and upper-confidence limit. If L and U are constructed properly, then P(L ≤ θ ≤ U) = 1 − α.

The quantity (1 − α) is called the confidence coefficient. The confidence coefficient is a measure of the accuracy of the confidence interval. For example, if a 90% confidence interval is constructed, then the probability that the true value of θ is contained in the interval is 0.9.
  • 88. The length of the confidence interval is a measure of the precision of the point estimate. A general rule of thumb is that increasing the sample size improves the precision of a point estimate.

6.3.2 Confidence Interval on µ when σ is Known

We can use what we have learned to construct a 100(1 − α)% confidence interval on the mean, assuming that (a) the population standard deviation is known, and (b) the population is normally distributed (or the conditions of the Central Limit Theorem apply).

P( −Zα/2 ≤ Z ≤ Zα/2 ) = 1 − α
P( −Zα/2 ≤ (X̄ − µ)/(σ/√n) ≤ Zα/2 ) = 1 − α
P( X̄ − Zα/2(σ/√n) ≤ µ ≤ X̄ + Zα/2(σ/√n) ) = 1 − α

Such a confidence interval is called a two-sided confidence interval. We can also construct one-sided confidence intervals for the same set of assumptions (σ known, normal population or Central Limit Theorem conditions apply).

The 100(1 − α)% upper-confidence interval is given by

P( µ ≤ X̄ + Zα(σ/√n) ) = 1 − α

and the 100(1 − α)% lower-confidence interval is given by

P( µ ≥ X̄ − Zα(σ/√n) ) = 1 − α.

Example 6.16 Let X denote the GPA of an engineering student at the Philadelphia University. It is widely known that, for this population, σ = 0.5. The population mean is not widely known; however, we have collected a sample of size 25 from the population. The resulting sample mean was 3.18.
  • 89. (a) What assumptions, if any, are required to use this data to construct a confidence interval on the mean GPA? GPA is normally distributed.(b) Construct a 95% confidence interval on µ and interpret its meaning. σ 0.5 X ± Z 0.025 = 3.18 ± 1.96 n 25 2.984 ≤ µ ≤ 3.376 P( 2.984 ≤ µ ≤ 3.376 ) = 0.95(c) Construct a 99% confidence interval on µ and compare it to the confidence interval obtained in part (b). σ 0.5 X ± Z 0.005 = 3.18 ± 2.58 n 25 2.922 ≤ µ ≤ 3.438 more accurate, but less precise(d) Construct a 95% upper-confidence interval on µ and interpret its meaning. σ 0.5 X + Z 0.05 = 3.18 + 1.645 n 25 µ ≤ 3.3445 P( µ ≤ 3.3445) = 0.95(e) Construct a 95% lower-confidence interval on µ and interpret its meaning. σ 0.5 X − Z 0.05 = 3.18 − 1.645 n 25 µ ≥ 3.0155 P( µ ≥ 3.0155) = 0.95(f) Combine the two confidence intervals obtained in parts (d) and (e). Is this confidence interval superior to the one constructed in part (b)? 89
  • 90. 3.0155 ≤ µ ≤ 3.3445 No, it is only a 90% confidence interval.6.3.3 Choosing the Sample Size for a Confidence Interval on µ when σ is Known The percentage of a confidence interval is a measure of the accuracy of theconfidence interval. The half-width of the confidence interval, E, is a measure of the precision of theconfidence interval. For a two-sided confidence interval, E = (U – L)/2. For an upper-confidence interval, E = U − θ and for a lower-confidence interval, E = θ − L. For a given level of accuracy (α), we can control the precision of the confidenceinterval using the sample size. For the two-sided confidence interval on µ, we specify avalue of E and note that: σ E = Zα / 2 . n Then, we can solve for n. 2 Z σ  n =  α/2   E  For the one-sided confidence intervals: 2 Z σ  n= α  .  E Example 6.17 (Ex. 6.16 continued)(a) If we wish to construct a 95% confidence interval on µ that has a half-width of 0.1, how many students should we survey? 2 2  Z σ   1.96 ⋅ 0.5  n =  0.025  =   = 96.04  E   0.1  n = 97 90
  • 91. (b) If we wish to construct a 95% upper-confidence interval on µ that has a half-width of 0.1, how many students should we survey?

n = (Z0.05 σ / E)² = (1.645 · 0.5 / 0.1)² = 67.65
n = 68

(c) If we wish to construct a 90% confidence interval on µ that has a half-width of 0.1, how many students should we survey?

n = (Z0.05 σ / E)² = (1.645 · 0.5 / 0.1)² = 67.65
n = 68

6.3.4 Using Confidence Intervals to Perform Hypothesis Tests on µ when σ is Known

Thus far, we have considered two methods of evaluating hypothesis tests: critical regions and P-values. A third, equivalent method is to use a confidence interval.
1. Specify: µ0, α, n
2. If H1: µ ≠ µ0, construct a 100(1 − α)% confidence interval on µ. If H1: µ > µ0, construct a 100(1 − α)% lower-confidence interval on µ. If H1: µ < µ0, construct a 100(1 − α)% upper-confidence interval on µ.
3. Reject H0 if µ0 is not contained in that confidence interval.

Example 6.17 (Ex. 6.10 continued) Let X denote the GPA of an engineering student at the Philadelphia University. It is widely known that, for this population, σ = 0.5. The population mean is not widely known; however, it is commonly believed that the average GPA is 3.0. We wish to test this hypothesis using a sample of size 25 and a level of significance of 0.05.

From Ex. 6.10:
H0: µ = 3.0
H1: µ ≠ 3.0
  • 92. Suppose the sample mean is 3.18. Use a confidence interval to evaluate the hypothesistest. α = 0.05, H1: ≠ 95% confidence interval From Ex. 6.16: 2.984 ≤ µ ≤ 3.376 3.0 is in the confidence interval fail to reject H0Example 6.18 (Ex. 6.11 continued) The Glass Bottle Company (GBC) manufacturesbrown glass beverage containers that are sold to breweries. One of the key characteristicsof these bottles is their volume. GBC knows that the standard deviation of volume is0.08 oz. They wish to ensure that the mean volume is not more than 12.2 oz using asample size of 30 and a level of significance of 0.01.From Ex. 3.2.2: H0: µ ≤ 12.2 H1: µ > 12.2Suppose the sample mean is 12.23. Use a confidence interval to evaluate the hypothesistest. α = 0.01, H1: > 99% lower-confidence interval σ 0.08 X − Z 0.01 = 12.23 − 2.3263 n 30 µ ≥ 12.1960 12.2 is in the confidence interval fail to reject H0 92
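The confidence-interval recipes of sections 6.3.2–6.3.4 can be sketched in a few lines. This is an illustrative Python sketch (the function names are ours); it reproduces Example 6.16(b), Example 6.17(a), and the confidence-interval form of the hypothesis test:

```python
from math import ceil, sqrt

def ci_mean_sigma_known(xbar, sigma, n, z_half_alpha):
    # two-sided 100(1-alpha)% CI: xbar +/- z_{alpha/2} * sigma / sqrt(n)
    half = z_half_alpha * sigma / sqrt(n)
    return xbar - half, xbar + half

def n_for_halfwidth(z, sigma, half_width):
    # smallest n such that z * sigma / sqrt(n) <= half_width
    return ceil((z * sigma / half_width) ** 2)

# Example 6.16(b): 95% CI on mean GPA from xbar = 3.18, sigma = 0.5, n = 25
lo, hi = ci_mean_sigma_known(3.18, 0.5, 25, 1.96)
# CI-based test of H0: mu = 3.0 (Example 6.17): reject iff 3.0 falls outside
reject = not (lo <= 3.0 <= hi)
```

With these inputs the interval is (2.984, 3.376), so `reject` is `False`, matching the "fail to reject H0" conclusion, and `n_for_halfwidth(1.96, 0.5, 0.1)` returns the 97 of Example 6.17(a).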
  • 93. 6.4 Hypothesis Tests on µ when σ is Unknown

Suppose we are interested in studying the mean of a population, but we do not know the value of the population standard deviation.
• We can use the procedures defined in section 2.3 and replace σ with S, provided that the sample size is large (n ≥ 30).
• When the sample size is small and σ is unknown, then we must assume that the population is normally distributed.

The t Distribution

Suppose we wish to perform the following hypothesis test.
H0: µ = µ0
H1: µ ≠ µ0

Suppose we have collected a random sample of size n and that we have used this sample data to compute the sample mean X̄ and the sample standard deviation S. If σ were known, then we would compute the test statistic:

Z0 = (X̄ − µ0)/(σ/√n).

Therefore, a logical approach is to replace σ with S. The resulting test statistic is:

T0 = (X̄ − µ0)/(S/√n).

Before we can proceed, we should analyze the sampling distribution of this test statistic.

Theorem 6.1 The t Distribution
Let X1, X2, …, Xn be a random sample from a normal population having mean µ. The quantity

T = (X̄ − µ)/(S/√n)
  • 94. has a t distribution with n – 1 degrees of freedom.

While we won't discuss the details of the t distribution, it is important to recognize two points regarding the t probability density function.
• First, it is symmetric about 0.
• Second, as the number of degrees of freedom increases, the t distribution approaches the standard normal distribution. This explains why it is OK to use the procedures from section 2.3 when n ≥ 30 (at 29 degrees of freedom there is little difference between t and Z).

Example 6.19 Suppose T has a t distribution with 7 degrees of freedom. Find the following:

(a) P(T > 2.365)
In Excel, use TDIST(x, degrees of freedom, number of tails); note that Excel returns P(T > x).
P(T > 2.365) = TDIST(2.365, 7, 1) = 0.025
(b) P(T > 1.415) = 0.10
(c) P(T < −3.499) = P(T > 3.499) = 0.005
(d) P(T > −2.8) = 1 − P(T > 2.8) = 0.9867
(e) the value a such that P(T > a) = 0.05: a = t0.05,7 = 1.895
(f) the value of a such that P(T > a) = 0.01: a = t0.01,7 = 2.998
(g) the value of a such that P(T < a) = 0.9975: a = t0.0025,7 = 4.029
  • 95. The Critical Region If the null hypothesis is true, then T0 has a t distribution with n − 1 degrees offreedom. Therefore, we reject H0 if T0 > tα/2,n-1 or T0 < −tα/2,n-1. • If the alternative hypothesis is H1: µ > µ0, then we reject H0 if T0 > tα,n-1. • If the alternative hypothesis is H1: µ < µ0, then we reject H0 if T0 < −tα,n-1.Example 6.20 Let X denote the age of an Engineering Statistics student at thePhiladelphia University. Suppose we wish to test the hypothesis that the mean age is21.0 years using a sample of size 14 and a level of significance of 0.05.(a) State the null and alternative hypothesis. H0: µ = 21 H1: µ ≠ 21(b) What assumptions are required before we can proceed with this test of hypotheses? Age is normally distributed.(c) How could we test the validity of this assumption? Probability plot of the data.(d) What would have to be done to avoid making this assumption? Survey at least 16 more students (n ≥ 30).(e) Identify the test statistic and acceptance region for this test. X − µ0 X − 21 T0 = = S n S 14 Fail to reject H0 if −tα/2,n-1 ≤ T0 ≤ tα/2,n-1 Fail to reject H0 if −2.160 ≤ T0 ≤ 2.160 NOTE: In Excel ® the function is TINV(probability, degrees of freedom) Probability: is the associated probability with a two-sided t-distribution 95
  • 96. (f) Suppose the sample mean is 22.67 years and the sample standard deviation is 4.08 years. State and interpret the conclusion for this test. 22.67 − 21 T0 = = 1.532 4.08 14 Fail to reject H0 and conclude that µ = 21.(g) What would be required to solve this problem using a Z-test instead of a t-test? n ≥ 30 or σ knownExample 6.21 A construction company purchases shear pins for use in several of theirprojects. These pins are supposed to have a mean tensile strength of at least 500 psi, butdestructive testing is required to measure the tensile strength of a pin. Therefore, asample of 12 pins are evaluated and used to verify that the tensile strength is acceptableusing a level of significance of 0.10.(a) State the null and alternative hypothesis. H0: H1:(b) Identify the test statistic and critical region for this test. Reject H0 if. Reject H0 if. Note in Excel use TINV(2*(0.1),11) because it is one-sided test and excel return two-sided test.(c) Suppose the sample mean is 491.3 psi and the sample standard deviation is 13.2 psi. State and interpret the conclusion for this test. 491.3 − 500 T0 = = −2.2832 13.2 12 96
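The t statistic itself is a one-line computation, so the two examples above are easy to verify. A hedged Python sketch (function name is ours; the critical values are table lookups, and we read Example 6.21's blank hypotheses as H0: µ ≥ 500 versus H1: µ < 500, which the problem statement implies):

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    # T0 = (xbar - mu0) / (s / sqrt(n)); compare against t table with n-1 df
    return (xbar - mu0) / (s / sqrt(n))

# Example 6.20(f): T0 is inside (-2.160, 2.160) -> fail to reject H0: mu = 21
t_age = t_statistic(22.67, 21.0, 4.08, 14)

# Example 6.21(c): with t_{0.10,11} = 1.363 (table value), T0 < -1.363,
# so the data suggest rejecting H0 and concluding mu < 500 psi
t_pins = t_statistic(491.3, 500.0, 13.2, 12)
```

Here `t_age` comes out near the 1.532 in part (f) and `t_pins` near the −2.2832 in Example 6.21(c).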
  • 97. P-Values As before, we can use P-values to draw conclusions regarding a t-test. H1: µ ≠ µ0 P = 2 P ( t n −1 > T0 ) H1: µ > µ0 P = P(tn-1 > T0) H1: µ < µ0 P = P(tn-1 < T0) We reject H0 if P < α.Example 6.22 (Ex. 6.20 continued) Perform the hypothesis test using a P-value.Example 6.23 (Ex. 6.21 continued) Perform the hypothesis test using a P-value. Type II Errors and Sample SizeUnfortunately, we cannot construct an expression for β for these tests, and as a result, wecannot construct expressions for computing an appropriate sample size. Some cases havebeen analyzed numerically, and the corresponding OC curves are in your text – ChartsVII(e), VII(f), VII(g) and VII(h). The value on the horizontal axis is µ − µ0 d= . σ Note that the value in the denominator must be our best guess for σ. 97
  • 98. Example 6.24 (Ex. 6.20 continued)(a) If the true population mean is 22.0 years and the true population standard deviation is 1.5 years, then what is the probability of a Type II error? Chart V (a)(b) Suppose the true population standard deviation is 1.25 years and we wish to have a power of 0.9 when the true population mean is 22.5 years. What sample size should we use? Confidence Intervals As before, we can also use the sample data to construct a 100(1−α)% confidenceinterval on the population mean.  S S  P X − t α / 2, n −1  ≤ µ ≤ X + t α / 2, n −1  = 1−α   n n The 100(1 − α)% upper-confidence interval is given by  S  P µ ≤ X + t α , n −1   = 1−α   nand the 100(1 − α)% lower-confidence interval is given by  S  P µ ≥ X − t α , n −1   = 1−α .   nWe can then use these confidence intervals to perform the hypothesis test. Specify: µ0, α, n 98
  • 99. If H1: µ ≠ µ0, construct a 100(1 − α)% confidence interval on µ. If H1: µ > µ0, construct a 100(1 − α)% lower-confidence interval on µ. If H1: µ < µ0, construct a 100(1 − α)% upper-confidence interval on µ. Reject H0 if µ0 is not contained in that confidence interval.

Example 6.25 (Ex. 6.20 continued) Perform the hypothesis test using an appropriate confidence interval.
α = 0.05, H1: ≠ ⇒ 95% confidence interval

Example 6.26 (Ex. 6.21 continued) Perform the hypothesis test using an appropriate confidence interval.
α = 0.10, H1: < ⇒ 90% upper-confidence interval
Therefore, with 90% confidence, µ ≤ 496.49. Since 500 is not in the confidence interval, we reject H0.

Z-Tests versus t-Tests

Even if we know the population standard deviation, we could use the t-test instead of the Z-test. The reasons we don't do this are:
• A Z-test is more powerful than the corresponding t-test.
• A confidence interval based on Z is tighter than the corresponding confidence interval based on t.
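The t-based confidence bounds can be sketched the same way as the Z-based ones, substituting S and a t table quantile. An illustrative Python sketch (our function names; t quantiles hardcoded from a standard table, so treat them as assumed inputs):

```python
from math import sqrt

def t_upper_bound(xbar, s, n, t_alpha):
    # 100(1-alpha)% upper confidence bound: xbar + t_{alpha,n-1} * s / sqrt(n)
    return xbar + t_alpha * s / sqrt(n)

def t_two_sided_ci(xbar, s, n, t_half_alpha):
    # xbar +/- t_{alpha/2,n-1} * s / sqrt(n)
    half = t_half_alpha * s / sqrt(n)
    return xbar - half, xbar + half

# Example 6.26: t_{0.10,11} = 1.363 -> upper bound near 496.49 psi
ub_pins = t_upper_bound(491.3, 13.2, 12, 1.363)

# Example 6.25: t_{0.025,13} = 2.160; 21 falls inside -> fail to reject
lo_age, hi_age = t_two_sided_ci(22.67, 4.08, 14, 2.160)
```

The upper bound reproduces the µ ≤ 496.49 of Example 6.26, and the two-sided interval for Example 6.25 contains 21, consistent with failing to reject H0 there.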
  • 100. 6.5 Inference on the Variance of a Normal Population

The Chi-Square Distribution

Sometimes it is of interest to perform a hypothesis test on the population standard deviation. For example, we may wish to test the value of σ prior to testing µ.

Assumptions
• the population is normally distributed (we cannot make use of the Central Limit Theorem)

Potential Tests

Test Statistic

Theorem 6.2 The Chi-Square Distribution
Let X1, X2, …, Xn be a random sample from a normal population. The quantity

χ² = (n − 1)S²/σ²

has a chi-square distribution with n – 1 degrees of freedom.
• A table of probabilities for the chi-square distribution is provided in the Appendix of your text.
• Note that the chi-square distribution is not symmetric. In fact, the range of the chi-square random variable is (0, ∞).

Example 6.27 Suppose Y has a chi-square distribution with 13 degrees of freedom. Find the following:
(a)
(b) P(Y < 5.89) =
  • 101. (c) P(Y > 0)(d) the value a such that P(Y >a) = 0.005In Excel CHIINV( probability, degrees of freedom )(e) the value of a such that P(Y < a) = 0.01 Performing the TestCritical RegionsH1: σ ≠ σ 0 2 2H1: σ > σ 0 2 2H1: σ < σ 0 2 2P-ValuesH1: σ ≠ σ 0 2 2H1: σ > σ 0 2 2H1: σ < σ 0 2 2OC Function and Sample Size σmust use Charts VII(i, j, k, l, m, n) with λ = σ0Confidence Intervals 101
  • 102. Confidence Intervals

P( (n − 1)S²/χ²α/2,n−1 ≤ σ² ≤ (n − 1)S²/χ²1−α/2,n−1 ) = 1 − α   two-sided
P( σ² ≤ (n − 1)S²/χ²1−α,n−1 ) = 1 − α   upper
P( σ² ≥ (n − 1)S²/χ²α,n−1 ) = 1 − α   lower

Example 6.28 In an effort to improve their quality, the Glass Bottle Company (GBC) has completed a variance reduction initiative. As a result of this initiative, GBC now believes that the standard deviation of the volume of their brown glass bottles is 0.05 oz. They wish to test this hypothesis using a sample of size 20 and a level of significance of 0.05.

(a) State any required assumptions.
Volume is normally distributed.
(b) What if further testing revealed that this assumption was invalid?
You would be stuck.
(c) State the null and alternative hypotheses, the test statistic and critical region.
H0:
H1:
(d) Suppose the sample standard deviation is 0.065 oz. State and interpret the conclusion of the test.
  • 103. (e) Construct a 95% confidence interval on the population standard deviation. Example 6.29 Let X denote the age of an Engineering Statistics student at thePhiladelphia University and consider the following hypothesis test. H0: H1: We wish to perform this test using a level of significance of 0.05.(a) Suppose we wish to detect a true standard deviation of 6.0 years with probability 0.95. What sample size should we use? power of the test = 0.95 Chart VII(k) ⇒ n = 15(b) If this sample size is used, what is the probability of a Type II error if σ = 4.5 years?(c) Suppose this sample size is used and S = 4.4025 years. Use a P-value to determine the conclusion resulting from the test. 103
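The chi-square statistic and the variance confidence interval from the previous pages are short computations. A Python sketch under stated assumptions (our function names; the χ² quantiles for 19 degrees of freedom, χ²₀.₀₂₅,₁₉ = 32.852 and χ²₀.₉₇₅,₁₉ = 8.907, are table lookups):

```python
def chi2_statistic(s, sigma0, n):
    # chi-square test statistic: (n-1) S^2 / sigma0^2
    return (n - 1) * s ** 2 / sigma0 ** 2

def var_ci(s, n, chi2_upper_tail, chi2_lower_tail):
    # two-sided CI on sigma^2: ((n-1)S^2 / chi2_{a/2}, (n-1)S^2 / chi2_{1-a/2})
    num = (n - 1) * s ** 2
    return num / chi2_upper_tail, num / chi2_lower_tail

# Example 6.28(d): statistic for s = 0.065 oz, sigma0 = 0.05 oz, n = 20
stat = chi2_statistic(0.065, 0.05, 20)           # about 32.11
# Example 6.28(e): 95% CI on sigma^2, then take square roots for sigma
v_lo, v_hi = var_ci(0.065, 20, 32.852, 8.907)
```

Taking square roots of `v_lo` and `v_hi` gives roughly 0.049 ≤ σ ≤ 0.095 oz for part (e); the statistic 32.11 would then be compared against the same χ² critical values for part (d).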
  • 104. 6.6 Population Proportions Sometimes we wish to perform a hypothesis test on a population proportion p.Examples of population proportions are: • proportion of registered voters that favor term limits • proportion of manufactured items that are defective • probability of winning a game • probability of successfully completing a mission The general definition of p is the proportion of members of the population thatbelong to the class of interest. In some cases, p corresponds to the probability of successfor a Bernoulli trial. Testing Population ProportionsAssumptions• population is very large (or infinite)Potential TestsTest Statistic For a random sample of size n, we compute X = the number of members sampled that belong to the class of interest or X = the number of successes. Note that we are going to take advantage of the normal approximation to thebinomial. Therefore, all of our results are “approximate”, and the larger n is, the betterthe approximation is.Critical Regions H1: p ≠ p0 H1: p > p0 104
  • 105. H1: p < p0P-Values H1: p ≠ p0 H1: p > p0 H1: p < p0OC Function H1: p ≠ p0  p 0 (1 − p 0 ) p 0 (1 − p 0 )   p0 − p − Zα / 2 p0 − p + Zα / 2   n n  β = P ≤Z≤   p (1 − p ) p (1 − p )     n n   p 0 (1 − p 0 )   p0 − p + Zα   n  H1: p > p0 β = P Z ≤   p (1 − p )     n   p 0 (1 − p 0 )   p0 − p − Zα   n  H1: p < p0 β = P Z ≥   p (1 − p )     n Choice of Sample Size  Z α / 2 p 0 (1 − p 0 ) + Z β p (1 − p )  2 two-sided test n=   p − p0    p 0 (1 − p 0 ) + Z β p (1 − p )  2  Zα one-sided test n=   p − p0   Confidence Intervals X Recall p = ˆ . n  p (1 − p ) ˆ ˆ p (1 − p )  ˆ ˆ two-sided P p − Z α / 2  ˆ ≤ p ≤ p + Zα / 2 ˆ  = 1−α   n n  105
  • 106.
P( p ≤ p̂ + Zα √(p̂(1 − p̂)/n) ) = 1 − α   upper
P( p ≥ p̂ − Zα √(p̂(1 − p̂)/n) ) = 1 − α   lower

Reject H0 if p0 is not in the confidence interval.

Choice of Sample Size for Confidence Intervals
two-sided: n = (Zα/2/E)² p̂(1 − p̂), or conservatively n = (Zα/2/E)² (0.25)
one-sided: n = (Zα/E)² (0.25)

Example 6.30 A beverage producer purchases their brown glass bottles from GBC. However, they purchase their caps from another supplier. Prior to using a shipment of these caps in their bottling process, they inspect 500 of them. Each inspected cap is classified as defective or non-defective. The producer has agreed with their supplier that the true proportion defective should not exceed 0.04. They wish to use the inspection data to perform an appropriate test using a level of significance of 0.10.

(a) State the null and alternative hypotheses, the test statistic and the critical region.
H0: p = 0.04
H1: p > 0.04
(b) For a particular shipment, suppose that 27 of the inspected caps are defective. State and interpret the conclusion of the test.
Z0 = 1.5975
(c) Compute the P-value for the test.
  • 107. (d) What is the power of the test when p = 0.08?  0.04(1 − 0.04 )   0.04 − 0.08 + 1.282  β = P Z ≤  = P ( Z ≤ −2.3709 ) = 0.0089 500  0.08(1 − 0.08)       500 (e) If the producer wishes to have a Type II error probability of 0.05 when p = 0.1, then what sample size should they use?Example 6.31 Let p denote the proportion of engineering students at the PhiladelphiaUniversity that are afraid to fly.(a) Suppose we wish to construct a 95% confidence interval on p such that the error in our point estimate is 15% or less. What sample size should we use?(b) Suppose this sample size is used and 9 of the surveyed students are afraid to fly. Construct a point estimate of p.(c) Construct a 95% confidence interval on p.(d) Use this confidence interval to evaluate the following hypothesis test. H0: p = 0.3 H1: p ≠ 0.3 0.3 is in the confidence interval ⇒ fail to reject H0(e) What is the level of significance for the test evaluated in part (d)? 107
  • 108. 0.05 (95% confidence interval) 108
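The proportion test statistic, the two-sided confidence interval, and the conservative sample-size rule from section 6.6 can all be checked with a short sketch. Illustrative Python (our function names; the normal approximation to the binomial is assumed, as in the text):

```python
from math import ceil, sqrt

def z_proportion(x, n, p0):
    # Z0 = (phat - p0) / sqrt(p0 (1 - p0) / n), normal approximation
    phat = x / n
    return (phat - p0) / sqrt(p0 * (1 - p0) / n)

def prop_ci(x, n, z_half_alpha):
    # two-sided CI: phat +/- z * sqrt(phat (1 - phat) / n)
    phat = x / n
    half = z_half_alpha * sqrt(phat * (1 - phat) / n)
    return phat - half, phat + half

def n_for_prop_halfwidth(z, half_width):
    # conservative sizing using p(1 - p) <= 0.25
    return ceil((z / half_width) ** 2 * 0.25)

# Example 6.30(b): Z0 exceeds z_{0.10} = 1.282, suggesting rejection of H0
z_caps = z_proportion(27, 500, 0.04)
# Example 6.31: n = 43 students, of whom 9 are afraid to fly
n_fly = n_for_prop_halfwidth(1.96, 0.15)
lo_fly, hi_fly = prop_ci(9, n_fly, 1.96)
```

`z_caps` reproduces the 1.5975 of Example 6.30(b), `n_fly` gives the sample size for Example 6.31(a), and the resulting interval contains 0.3, matching the fail-to-reject conclusion in part (d).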
  • 109. 7 DECISION MAKING FOR TWO SAMPLESComparing Two PopulationsIn many cases, we wish to use hypothesis testing to compare two populations. We denotethese as Population 1 and Population 2, and we use the population numbers as subscriptson parameters, statistics, and sample sizes. Population 1 • parameter of interest: µ1, σ1 or p1 • sample size: n1 ˆ • statistic of interest: X 1 , S1 or p1 Population 2 • parameter of interest: µ2, σ2 or p2 • sample size: n2 ˆ • statistic of interest: X 2 , S2 or p 2 7.1 Comparing Two Population Means, Variance KnownAssumptions • the two populations are independent • σ1 and σ2 are known • the populations are normally distributed (or the conditions of the Central Limit Theorem hold)Parameter of Interest µ1 − µ2Point EstimatePotential Tests H0: µ1 − µ2 = ∆0 H0: µ1 − µ2 ≤ ∆0 H0: µ1 − µ2 ≥ ∆0 109
  • 110. Test StatisticCritical Region H1: µ1 − µ2 ≠ ∆0 Reject H0 if Z0 > Zα/2 or if Z0 < −Zα/2 H1: µ1 − µ2 > ∆0 Reject H0 if Z0 > Zα H1: µ1 − µ2 < ∆0 Reject H0 if Z0 < −ZαP-Values H1: µ1 − µ2 ≠ ∆0 H1: µ1 − µ2 > ∆0 H1: µ1 − µ2 < ∆0OC Function ∆ = µ1 − µ2 H1:µ1− µ2 ≠ ∆0      ∆ − ∆0 ∆ − ∆0  β = P − Z α / 2 − ≤ Z ≤ Zα / 2 −   σ 12 σ 2 2 σ 12 σ 2  2  + +   n1 n 2 n1 n 2       ∆ − ∆0  β = P Z ≤ Z α −   σ 12 σ 2  2  +   n1 n 2  H1: µ1 − µ2 > ∆0      ∆ − ∆0  β = P Z ≥ − Z α −   σ 12 σ 2  2  +   n1 n 2  H1: µ1 − µ2 < ∆0 110
  • 111. Choice of Sample Size n = n1 = n2 two-sided test one-sided testUsing OC Curves for Computing β and Sample Size H1: µ1 − µ2 ≠ ∆0 use Charts VII(a) and VII(b) H1: µ1 − µ2 > ∆0 use Charts VII(c) and VII(d) H1: µ1 − µ2 < ∆0 use Charts VII(c) and VII(d)Confidence Intervals two-sided  σ 12 σ 2 2 σ 12 σ 2  2 P ( X − X ) − Z + ≤ µ1 − µ 2 ≤ ( X 1 − X 2 ) + Z α / 2 +  = 1−α  1 2 α/2 n1 n 2 n1 n 2    upper  σ 12 σ 2  2 P µ 1 − µ 2 ≤ ( X 1 − X 2 ) + Z α +  = 1−α  n1 n 2    lower  σ 12 σ 2  2 P µ 1 − µ 2 ≥ ( X 1 − X 2 ) − Z α +  = 1−α  n1 n 2    Note: Fail to reject if zero (0) is within the confidence intervalChoice of Sample Size for Confidence Intervals 111
  • 112. two-sided one-sidedExamplesExample 7.1 An automobile manufacturer wishes to purchase its batteries from anoutside supplier. The key quality characteristic of these batteries is voltage. Twosuppliers are competing for the manufacturer’s business (Supplier 1 and Supplier 2).Both suppliers have normally distributed voltage and a standard deviation in voltage of0.15 v. Therefore, the manufacturer only needs to compare their means. The QC directorhas decided to make this comparison using a two-sample hypothesis test and a level ofsignificance of 0.05.(a) State the null and alternative hypothesis. H0: µ1 − µ2 = 0 H1: µ1 − µ2 ≠ 0(b) Identify the test statistic.(c) What sample size should be used if the manufacturer wishes to detect a difference of 0.25 volts with probability 0.9?(d) Identify the critical region. Reject H0 if Z0 > 1.96 or if Z0 < −1.96(e) If the sample size prescribed in part (c) is used and the resulting sample means are X 1 = 12.06 and X 2 = 11.98 , then what is the conclusion resulting from the test? 112
  • 113. Fail to reject H0 and conclude that µ1 − µ2 = 0.(f) Find the P-value for this test. P = 2 P ( Z > 1.0 6 ) = 0.2861(g) What is the Type II error probability for this test if µ1 − µ2 = 0.15? β = P(−3.96 ≤ Z ≤ −0.04) = 0.4840(h) Construct a 95% confidence interval on µ1 − µ2. 0.15 2 0.15 2 (12.06 − 11.98) − 1.96 + ≤ µ1 − µ 2 8 8 0.15 2 0.15 2 ≤ (12.06 − 11.98) + 1.96 + 8 8 −0.067 ≤ µ1 − µ2 ≤ 0.227(i) If we wanted the confidence interval in part (h) to have a half-width of 0.05 volts or less, how many additional batteries would have to be tested? 113
  • 114. Example 7.2 Let X1 denote the GPA of a female engineering student at the PhiladelphiaUniversity, and let X2 denote the GPA of a male engineering student at the PhiladelphiaUniversity. It is common knowledge that σ1 = σ2 = 0.5. A female student has made thedangerous claim that females are better engineering students than males, and she wishesto support this claim with GPA data on 8 female and 12 male engineering students. Theresults of this data collection are X 1 = 3.44 and X 2 = 3.27 .(a) Formulate an appropriate hypothesis test. H0: µ1 − µ2 = 0 H1: µ1 − µ2 > 0(b) What assumption(s), if any, do we need to make to proceed with this hypothesis test? GPA is normally distributed. The two populations are independent.(c) Does the data support the female student’s claim? Demonstrate that all three hypothesis testing approaches yield the same result. Use a level of significance of 0.10. Critical Region Reject H0 if Z0 > Zα = 1.282P-Value (smallest level of significance)Confidence Interval 90% lower-confidence interval µ1 − µ 2 ≥ −0.1226 114
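The two-sample Z machinery of section 7.1 can be sketched compactly. Illustrative Python (our function names; it covers the test statistic, the equal-n sample-size rule, and the lower confidence bound used in Examples 7.1 and 7.2):

```python
from math import ceil, sqrt

def z_two_sample(x1, x2, sigma1, sigma2, n1, n2, delta0=0.0):
    # Z0 = (x1bar - x2bar - Delta0) / sqrt(sigma1^2/n1 + sigma2^2/n2)
    return (x1 - x2 - delta0) / sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)

def n_two_sample(z_a, z_b, sigma1, sigma2, delta):
    # per-population sample size: (z_a + z_b)^2 (sigma1^2 + sigma2^2) / delta^2
    return ceil((z_a + z_b) ** 2 * (sigma1 ** 2 + sigma2 ** 2) / delta ** 2)

def lower_bound_diff(x1, x2, sigma1, sigma2, n1, n2, z_a):
    # 100(1-alpha)% lower confidence bound on mu1 - mu2
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    return (x1 - x2) - z_a * se

# Example 7.1(c): n = 8 batteries per supplier
n_batt = n_two_sample(1.96, 1.282, 0.15, 0.15, 0.25)
# Example 7.1(e): Z0 is about 1.07, inside (-1.96, 1.96) -> fail to reject
z_batt = z_two_sample(12.06, 11.98, 0.15, 0.15, 8, 8)
# Example 7.2(c): 90% lower bound of about -0.1226 on mu1 - mu2
lb_gpa = lower_bound_diff(3.44, 3.27, 0.5, 0.5, 8, 12, 1.282)
```

These reproduce n = 8, Z0 ≈ 1.07 (the P-value 0.2861 in part (f) follows from it), and the −0.1226 lower bound at the end of Example 7.2.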
  • 115. 7.2 Inference on the Means of Two Populations, Variances Unknown If the population variances are unknown but the sample sizes exceed 30, then wecan use the two-sample Z-test (lecture 20), replacing σ1 with S1 and σ2 with S2.Otherwise, we must use one of three two-sample t-tests. 7.2.1 Variances Unknown but Equal Assumptions • the two populations are independent • σ1 and σ2 are unknown, but σ = σ1 = σ2 • the populations are normally distributedParameter of Interest µ1 − µ2Point Estimate X1 − X 2Potential Tests H0: µ1 − µ2 = ∆0 H0: µ1 − µ2 = ∆0 H0: µ1 − µ2 = ∆0 H1: µ1 − µ2 ≠ ∆0 H1: µ1 − µ2 > ∆0 H1: µ1 − µ2 < ∆0Test Statistic 2The pooled estimator of σ2, denoted by S p is defined by 115
• 116. Critical Region
    ν = n1 + n2 − 2
    H1: µ1 − µ2 ≠ ∆0    Reject H0 if T0 > tα/2,ν or if T0 < −tα/2,ν
    H1: µ1 − µ2 > ∆0    Reject H0 if T0 > tα,ν
    H1: µ1 − µ2 < ∆0    Reject H0 if T0 < −tα,ν

P-Values
    H1: µ1 − µ2 ≠ ∆0    P = 2P(tν > |T0|)
    H1: µ1 − µ2 > ∆0    P = P(tν > T0)
    H1: µ1 − µ2 < ∆0    P = P(tν < T0)

Using OC Curves for Computing β and Sample Size
    ∆ = µ1 − µ2        n = n1 = n2
    d = |∆ − ∆0| / (2σ)
    sample sizes on the OC curves are actually n* = 2n − 1
    two-sided test: use Charts VII(e) and VII(f)
    one-sided test: use Charts VII(g) and VII(h)
    n = (n* + 1) / 2

Confidence Intervals
    two-sided
        P( (X̄1 − X̄2) − tα/2,ν Sp √(1/n1 + 1/n2) ≤ µ1 − µ2 ≤ (X̄1 − X̄2) + tα/2,ν Sp √(1/n1 + 1/n2) ) = 1 − α
    upper
        P( µ1 − µ2 ≤ (X̄1 − X̄2) + tα,ν Sp √(1/n1 + 1/n2) ) = 1 − α

• 117. lower
        P( µ1 − µ2 ≥ (X̄1 − X̄2) − tα,ν Sp √(1/n1 + 1/n2) ) = 1 − α

Example 7.3 A bridge construction firm purchases its shear pins from an outside supplier. The key characteristic of these pins is their tensile strength. Currently, the firm purchases its pins from Supplier 1. Recently, a sales representative from Supplier 2 visited the firm. Supplier 2 claimed that, although their pins have the same standard deviation as Supplier 1's, their mean tensile strength is larger. As a result of this visit, the construction firm has asked you to test this claim using 15 of each supplier's pins and a level of significance of 0.01.

(a) State the null and alternative hypotheses.

(b) State any assumptions that are required to perform this test.
    Tensile strength is normally distributed.

(c) Identify the test statistic and the critical region.

The results of the test were:
    X̄1 = 22351.5 psi    S1 = 98.8 psi
    X̄2 = 22461.5 psi    S2 = 103.4 psi

(d) Estimate the two suppliers' common standard deviation.
    Sp² = [14(98.8)² + 14(103.4)²] / 28 = 10226.5

• 118. Sp = 101.1 psi

(e) State the conclusion resulting from the test and find the P-value for the test.
    T0 = (22351.5 − 22461.5) / [101.1 √(1/15 + 1/15)] = −2.9797

(f) What is the power of the test if Supplier 2's mean tensile strength is 100 psi more than Supplier 1's?

(g) If we wish to detect a superiority (for Supplier 2) of 150 psi with probability 0.90, then what sample size should we have used?

(h) Construct a 99% confidence interval that is relevant to this problem.

7.2.2 Variances Unknown and Unequal

Assumptions
• 119.
• the two populations are independent
• σ1 and σ2 are unknown and not believed to be equal
• the populations are normally distributed

Parameter of Interest
    µ1 − µ2

Point Estimate
    X̄1 − X̄2

Potential Tests

Test Statistic
    T0 = [(X̄1 − X̄2) − ∆0] / √(S1²/n1 + S2²/n2)

When the null hypothesis is true, this statistic has approximately a t distribution. Therefore, all the methods in this section are approximate.

Critical Region
    ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 + 1) + (S2²/n2)²/(n2 + 1) ] − 2    (round down)
    H1: µ1 − µ2 ≠ ∆0    Reject H0 if T0 > tα/2,ν or if T0 < −tα/2,ν
    H1: µ1 − µ2 > ∆0    Reject H0 if T0 > tα,ν
    H1: µ1 − µ2 < ∆0    Reject H0 if T0 < −tα,ν

P-Values
    H1: µ1 − µ2 ≠ ∆0    P = 2P(tν > |T0|)
    H1: µ1 − µ2 > ∆0    P = P(tν > T0)
    H1: µ1 − µ2 < ∆0    P = P(tν < T0)
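The test statistic and the degrees-of-freedom rule for the unequal-variance case translate directly to code. This is a minimal sketch using the df approximation given in these notes (the "subtract 2, divide by n + 1" variant); the summary statistics in the demonstration call are made up for illustration and do not come from any example here:

```python
import math

def welch_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Two-sample t statistic and df when variances are unknown and unequal.

    Degrees of freedom use the approximation from these notes:
    nu = (s1^2/n1 + s2^2/n2)^2
         / [(s1^2/n1)^2/(n1+1) + (s2^2/n2)^2/(n2+1)] - 2,
    rounded down.
    """
    v1, v2 = s1**2 / n1, s2**2 / n2
    t0 = (xbar1 - xbar2 - delta0) / math.sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 + 1) + v2**2 / (n2 + 1)) - 2
    return t0, math.floor(nu)

# Hypothetical summary statistics, for illustration only
t0, nu = welch_t(10.2, 1.5, 12, 9.1, 3.0, 18)
```

The returned `nu` is then used with the t tables exactly as in the critical-region and P-value rules above.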
• 120. Confidence Intervals
    two-sided
        P( (X̄1 − X̄2) − tα/2,ν √(S1²/n1 + S2²/n2) ≤ µ1 − µ2 ≤ (X̄1 − X̄2) + tα/2,ν √(S1²/n1 + S2²/n2) ) = 1 − α
    upper
        P( µ1 − µ2 ≤ (X̄1 − X̄2) + tα,ν √(S1²/n1 + S2²/n2) ) = 1 − α
    lower
        P( µ1 − µ2 ≥ (X̄1 − X̄2) − tα,ν √(S1²/n1 + S2²/n2) ) = 1 − α

Example 7.4 The bridge construction firm purchases another type of shear pin from an outside supplier. Currently, the firm purchases its pins from Supplier A. Recently, a sales representative from Supplier B visited the firm. Supplier B claimed that their mean tensile strength is larger. As a result of this visit, the construction firm has asked you to test this claim using 23 of each supplier's pins and a level of significance of 0.05.

(a) State the null and alternative hypotheses.
    H0: µA − µB = 0
    H1: µA − µB < 0

(b) Identify the test statistic and the critical region.
    Reject H0 if T0 < −t0.05,ν

The results of the test were:
    X̄A = 5263.8 psi    SA = 37.3 psi
    X̄B = 5512.2 psi    SB = 52.8 psi
• 121. (c) State the conclusion resulting from the test and find the P-value for the test.

    T0 = (5263.8 − 5512.2) / √(37.3²/23 + 52.8²/23) = −18.43

    ν = (37.3²/23 + 52.8²/23)² / [ (37.3²/23)²/(23 + 1) + (52.8²/23)²/(23 + 1) ] − 2 = 41.2, rounded down to 41

(d) Construct a 95% confidence interval that is relevant to this problem.

7.3 Pairs of Observations, the Paired t-Test

A special case of the two-sample t-tests occurs when the observations of the two populations of interest are collected in pairs. For example, an automobile manufacturer wishes to compare the mean tread wear for front-axle tires as compared to rear-axle tires. Tires are installed on 10 cars, and the cars are driven under defined experimental conditions. The resulting observations would be:

(X11, X21) (total wear on front tires of car 1, total wear on rear tires of car 1)
(X12, X22) (total wear on front tires of car 2, total wear on rear tires of car 2)

We only use this approach when we believe that pairing the observations results in a large positive correlation between the observations within each pair. If we don't believe this to be true, then we are better off using the tests from the previous two sections.

7.3.1 The Paired t-Test

Let (X11, X21), (X12, X22), … , (X1n, X2n) be a set of n paired observations. Define the differences between each pair of observations as

    Dj = X1j − X2j,    j = 1, 2, … , n.
• 122. Having computed the difference for each pair, we then compute the average of the differences, D̄, and the standard deviation of the differences, SD.

Assumptions
• the Dj's are iid normal random variables

Parameter of Interest
    µD = µ1 − µ2

Point Estimate
    D̄

Potential Tests
    H0: µ1 − µ2 = ∆0        H0: µ1 − µ2 = ∆0        H0: µ1 − µ2 = ∆0
    H1: µ1 − µ2 ≠ ∆0        H1: µ1 − µ2 > ∆0        H1: µ1 − µ2 < ∆0

Test Statistic
    T0 = (D̄ − ∆0) / (SD/√n)

Critical Region
    H1: µ1 − µ2 ≠ ∆0    Reject H0 if T0 > tα/2,n−1 or if T0 < −tα/2,n−1
    H1: µ1 − µ2 > ∆0    Reject H0 if T0 > tα,n−1
    H1: µ1 − µ2 < ∆0    Reject H0 if T0 < −tα,n−1

P-Values
    H1: µ1 − µ2 ≠ ∆0    P = 2P(tn−1 > |T0|)
    H1: µ1 − µ2 > ∆0    P = P(tn−1 > T0)
    H1: µ1 − µ2 < ∆0    P = P(tn−1 < T0)

Confidence Intervals
    two-sided
        P( D̄ − tα/2,n−1 SD/√n ≤ µ1 − µ2 ≤ D̄ + tα/2,n−1 SD/√n ) = 1 − α

• 123. upper
        P( µ1 − µ2 ≤ D̄ + tα,n−1 SD/√n ) = 1 − α
    lower
        P( µ1 − µ2 ≥ D̄ − tα,n−1 SD/√n ) = 1 − α

Example 7.5 A consumer ratings magazine has developed a test to compare the mean life length of two brands of batteries. Five electronic toys were used to make the comparison. Two of each type of toy were purchased, and one toy of each type was powered by each type of battery (Type 1 and Type 2). The resulting durations of toy usage are given below.

    Toy    Battery Type 1    Battery Type 2
    1      52.6 hr           61.4 hr
    2      103.4 hr          112.8 hr
    3      68.2 hr           67.1 hr
    4      88.4 hr           92.3 hr
    5      111.6 hr          121.5 hr

(a) State the null and alternative hypotheses.
    H0: µ1 − µ2 = 0
    H1: µ1 − µ2 ≠ 0

(b) State any assumptions that you are making.
    Dj = X1j − X2j, j = 1, 2, … , 5
    D1, D2, … , D5 are iid normal random variables.

(c) Identify the test statistic and the critical region (use α = 0.05).
• 124. Reject H0 if T0 > tα/2,n−1 = 2.776 or if T0 < −tα/2,n−1 = −2.776

(d) State the conclusion resulting from the test and find the P-value for the test.
    Reject H0 and conclude that µ1 − µ2 ≠ 0.

(e) Construct a 95% confidence interval that is relevant to this problem.
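As a check on the arithmetic in Example 7.5, the paired differences and test statistic can be computed with the Python standard library; the critical value t0.025,4 = 2.776 is the one quoted in the notes:

```python
import math
from statistics import mean, stdev

# Battery life data from Example 7.5 (hours)
type1 = [52.6, 103.4, 68.2, 88.4, 111.6]
type2 = [61.4, 112.8, 67.1, 92.3, 121.5]

d = [x1 - x2 for x1, x2 in zip(type1, type2)]  # paired differences D_j
n = len(d)
dbar = mean(d)                 # average difference, -6.18
sd = stdev(d)                  # sample std dev of the differences, about 4.72
t0 = dbar / (sd / math.sqrt(n))  # paired-t statistic, about -2.93

t_crit = 2.776                 # t_{0.025,4} from the notes
reject = abs(t0) > t_crit      # True: reject H0, as concluded in part (d)

# 95% confidence interval on mu_D = mu1 - mu2
half = t_crit * sd / math.sqrt(n)
ci = (dbar - half, dbar + half)
```

The interval excludes 0, which is consistent with rejecting H0 at α = 0.05.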
• 125. 7.4 Inference on the Ratio of Variances of Two Normal Populations

7.4.1 The F Distribution

One of the most useful distributions in statistics is the F distribution. Let W and Y be two independent chi-square random variables with u and v degrees of freedom, respectively. Then the ratio

    F = (W/u) / (Y/v)

has the probability density function

    f(x) = [ Γ((u + v)/2) (u/v)^(u/2) x^(u/2 − 1) ] / [ Γ(u/2) Γ(v/2) ((u/v)x + 1)^((u + v)/2) ],    0 < x < ∞

The ratio is then said to follow the F distribution with u degrees of freedom in the numerator and v degrees of freedom in the denominator.

Theorem 7.1 The F Distribution
Consider two independent normal populations having standard deviations σ1 and σ2, respectively. Suppose a random sample of size n1 is taken from the first population, and let S1 denote the sample standard deviation. Suppose a random sample of size n2 is taken from the second population, and let S2 denote the sample standard deviation. Then:

    F = (S1²/σ1²) / (S2²/σ2²)

has an F distribution with n1 − 1 numerator degrees of freedom and n2 − 1 denominator degrees of freedom.

Example 7.6 Suppose W has an F distribution with 8 numerator degrees of freedom and 13 denominator degrees of freedom.

(a) P(W > 1.49) =
    Note that Excel and the tables give you P(W > x).
• 126. (b) P(W < 2.77) =

(c) P(W > 0) = 1

(e) Find w such that P(W > w) = 0.025.

(f) Find w such that P(W < w) = 0.99 ⇒ P(W > w) = 1 − 0.99 = 0.01
    w = f0.01,8,13 = FINV(0.01, 8, 13) = 4.30208 = 1/0.232447 = 1/FINV(0.99, 13, 8)

Note that
    f1−α,n1−1,n2−1 = 1 / fα,n2−1,n1−1

7.4.2 Hypothesis Testing on the Ratio of Two Variances

Suppose we wish to compare the standard deviations of two populations.

Assumptions
• the two populations are independent
• the populations are normally distributed

Parameter of Interest

Point Estimate

Potential Tests
    H0: σ1² = σ2²        H0: σ1² = σ2²        H0: σ1² = σ2²
    H1: σ1² ≠ σ2²        H1: σ1² > σ2²        H1: σ1² < σ2²
• 127. Test Statistic
    F0 = S1² / S2²
• If the null hypothesis is true, then the test statistic has an F distribution.

Performing the Test

Critical Region
    H1: σ1² ≠ σ2²    Reject H0 if F0 > fα/2,n1−1,n2−1 or if F0 < f1−α/2,n1−1,n2−1
    H1: σ1² > σ2²    Reject H0 if F0 > fα,n1−1,n2−1
    H1: σ1² < σ2²    Reject H0 if F0 < f1−α,n1−1,n2−1

P-Values
    H1: σ1² ≠ σ2²    P = 2 min{ P(fn1−1,n2−1 > F0), P(fn1−1,n2−1 < F0) }
    H1: σ1² > σ2²    P = P(fn1−1,n2−1 > F0)
    H1: σ1² < σ2²    P = P(fn1−1,n2−1 < F0)

Using OC Curves for Computing β and Sample Size
    λ = σ1/σ2        n = n1 = n2
    two-sided test: use Charts VII(o) and VII(p)
    one-sided test: use Charts VII(q) and VII(r)
    Use λ = σ2/σ1 for H1: σ1² < σ2².

Confidence Intervals
    two-sided    P( (S1²/S2²) f1−α/2,n2−1,n1−1 ≤ σ1²/σ2² ≤ (S1²/S2²) fα/2,n2−1,n1−1 ) = 1 − α
    upper        P( σ1²/σ2² ≤ (S1²/S2²) fα,n2−1,n1−1 ) = 1 − α
    lower        P( σ1²/σ2² ≥ (S1²/S2²) f1−α,n2−1,n1−1 ) = 1 − α
• 128. Reject H0 if 1 is not in the confidence interval.

Example 7.7 A company which manufactures aircraft is studying the purchase of the glass used to make the windshields for one of its aircraft. Two suppliers, Supplier 1 and Supplier 2, are competing for this business. The characteristic of the glass that is of particular interest is its thickness. It has already been determined that both suppliers have the same mean thickness. However, Supplier 1 claims to have superior performance with respect to variability. The manufacturer wishes to test this claim using a sample of size 10 (for each supplier) and a level of significance of 0.01.

(a) State the null and alternative hypotheses, the test statistic, and the critical region.
    F0 = S1² / S2²

(b) State any required assumptions.
    Glass thickness is normally distributed, and the two suppliers are independent.

(c) Suppose S1 = 0.023 in and S2 = 0.028 in. State and interpret the conclusion of this test.

(d) What is the P-value for this test?
    Because of the nature of the F distribution tables, I do not recommend using P-values to perform this hypothesis test.
• 129. (e) Construct a confidence interval on σ1/σ2 that is relevant to this problem.
    99% upper confidence interval

(f) What is the power of this test if σ1 = 0.25σ2?

(g) What sample size should we have used if we want the Type II error probability to be 0.1 when σ1 = 0.2σ2?

Example 7.8 Let X1 denote the GPA of a female engineering student at Philadelphia University, and let X2 denote the GPA of a male engineering student at Philadelphia University. Consider the following hypothesis test.
• 130. Suppose we wish to perform this test using a level of significance of 0.05.

(a) If we want the power of the test to be 0.9 when σ1 = 3.2σ2, then what sample size should we use?

(b) If this sample size is used, what is the probability of a Type II error when σ1 = 2.4σ2?

(c) If this sample size is used, what is the probability of a Type I error?
    α = 0.05

(d) Identify the test statistic and the critical region for this test.

(e) Suppose the test is implemented and S1 = 0.5315 and S2 = 0.5476. State and interpret the conclusion resulting from this test.

(f) Verify that a confidence interval would have led to the same conclusion.
    95% confidence interval
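A small sketch of the variance-ratio statistic from section 7.4.2, applied to the sample standard deviations quoted in Examples 7.7(c) and 7.8(e). Critical values must still come from the F tables (or Excel's FINV), so no accept/reject decision is computed here; the reciprocal step reuses the FINV(0.01, 8, 13) value quoted in Example 7.6:

```python
def f_ratio(s1, s2):
    """Variance-ratio test statistic F0 = S1^2 / S2^2."""
    return s1**2 / s2**2

# Example 7.7(c): glass thickness, Supplier 1 vs Supplier 2
f_glass = f_ratio(0.023, 0.028)   # about 0.675

# Example 7.8(e): GPA standard deviations
f_gpa = f_ratio(0.5315, 0.5476)   # about 0.942

# Reciprocal property from the notes: f_{1-a, u, v} = 1 / f_{a, v, u}.
# Lower-tail critical values follow from upper-tail table entries, e.g.
# using FINV(0.01, 8, 13) = 4.30208 quoted in Example 7.6:
f_lower = 1 / 4.30208             # = f_{0.99, 13, 8}, about 0.2324
```

Each F0 is then compared against table critical values with n1 − 1 and n2 − 1 degrees of freedom, as in the critical-region rules above.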
• 131. 7.5 Performing the Test on Two Population Proportions

Assumptions
• the two populations are independent, and both samples are large

Note that all procedures are approximate, based on the normal approximation to the binomial.

Parameter of Interest
    p1 − p2

Point Estimate
    p̂1 − p̂2

Potential Tests
    H0: p1 = p2        H0: p1 = p2        H0: p1 = p2
    H1: p1 ≠ p2        H1: p1 > p2        H1: p1 < p2

Test Statistic
    Z0 = (p̂1 − p̂2) / √( p̂(1 − p̂)(1/n1 + 1/n2) )
    where p̂ = (X1 + X2)/(n1 + n2)

Critical Region
    H1: p1 ≠ p2    Reject H0 if Z0 > Zα/2 or if Z0 < −Zα/2
    H1: p1 > p2    Reject H0 if Z0 > Zα
    H1: p1 < p2    Reject H0 if Z0 < −Zα

P-Values
    H1: p1 ≠ p2    P = 2P(Z > |Z0|)
    H1: p1 > p2    P = P(Z > Z0)
    H1: p1 < p2    P = P(Z < Z0)

OC Function
    q1 = 1 − p1        q2 = 1 − p2
    p̄ = (n1p1 + n2p2)/(n1 + n2)        q̄ = (n1q1 + n2q2)/(n1 + n2)

• 132. σ = √( p1q1/n1 + p2q2/n2 )

    H1: p1 ≠ p2
        β = P( [−Zα/2 √(p̄q̄(1/n1 + 1/n2)) − (p1 − p2)]/σ ≤ Z ≤ [Zα/2 √(p̄q̄(1/n1 + 1/n2)) − (p1 − p2)]/σ )
    H1: p1 > p2
        β = P( Z ≤ [Zα √(p̄q̄(1/n1 + 1/n2)) − (p1 − p2)]/σ )
    H1: p1 < p2
        β = P( Z ≥ [−Zα √(p̄q̄(1/n1 + 1/n2)) − (p1 − p2)]/σ )

Choice of Sample Size
    n = n1 = n2
    two-sided test
        n = [ Zα/2 √((p1 + p2)(q1 + q2)/2) + Zβ √(p1q1 + p2q2) ]² / (p1 − p2)²
    one-sided test
        n = [ Zα √((p1 + p2)(q1 + q2)/2) + Zβ √(p1q1 + p2q2) ]² / (p1 − p2)²

Confidence Intervals
    q̂1 = 1 − p̂1        q̂2 = 1 − p̂2
    σ̂ = √( p̂1q̂1/n1 + p̂2q̂2/n2 )
    two-sided    P( p̂1 − p̂2 − Zα/2 σ̂ ≤ p1 − p2 ≤ p̂1 − p̂2 + Zα/2 σ̂ ) = 1 − α
    upper        P( p1 − p2 ≤ p̂1 − p̂2 + Zα σ̂ ) = 1 − α
    lower        P( p1 − p2 ≥ p̂1 − p̂2 − Zα σ̂ ) = 1 − α
    Reject H0 if 0 is not in the confidence interval.

Example 7.9 The California legislature is considering a new bill which will affect the citizens of Los Angeles and San Francisco. Newspaper reports have indicated that 80% of the citizens of these cities oppose the provisions of the bill. The legislature has hired an independent firm to determine if, in fact, the proportion is the same for citizens of both cities. Other sources have indicated that the proportion is actually larger for citizens of Los Angeles. The firm intends to use a level of significance of 0.05.

• 133. (a) State the null and alternative hypotheses, the test statistic and the critical region.
    p1 = proportion of Los Angeles citizens who oppose the bill
    p2 = proportion of San Francisco citizens who oppose the bill

(b) If p1 = 0.85 and p2 = 0.75, the firm wishes to have a Type II error probability of 0.10. How many citizens of each city should they survey?

(c) Suppose the sample size identified in part (b) is used and the results are X1 = 228 and X2 = 211. Evaluate the results of the test using a P-value.
• 134. (d) What is the power of this test if p1 = 0.82 and p2 = 0.78?

(e) Construct a confidence interval that is relevant to this problem.
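The section 7.5 formulas translate directly to code. This is a sketch using only the Python standard library; the counts in the demonstration calls are made up for illustration and are not the answers to Example 7.9:

```python
import math
from statistics import NormalDist

def two_prop_z(x1, n1, x2, n2):
    """Test statistic Z0 for H0: p1 = p2, using the pooled estimate p-hat."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)   # pooled proportion
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

def sample_size_one_sided(p1, p2, alpha, beta):
    """n = n1 = n2 for a one-sided test, from the formula in section 7.5."""
    q1, q2 = 1 - p1, 1 - p2
    za = NormalDist().inv_cdf(1 - alpha)
    zb = NormalDist().inv_cdf(1 - beta)
    n = (za * math.sqrt((p1 + p2) * (q1 + q2) / 2)
         + zb * math.sqrt(p1 * q1 + p2 * q2)) ** 2 / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical counts: 160 of 200 vs 150 of 200 oppose the bill
z0 = two_prop_z(160, 200, 150, 200)
p_value = 1 - NormalDist().cdf(z0)   # upper-tailed, for H1: p1 > p2

# Hypothetical planning values, for illustration only
n_demo = sample_size_one_sided(0.80, 0.70, 0.05, 0.10)
```

`z0` is compared against Zα (or Zα/2) exactly as in the critical-region rules; rounding n up with `ceil` guarantees the Type II error probability is no more than the requested β.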