
U unit7 ssb


vtu m4 notes

Published in: Engineering, Technology, Education

1. 1. Probability-II: PART – B, Unit VII
Engineering Mathematics-IV, Subject Code: 10Mat41
Part B, Unit VI: Probability-I
Dr. S. S. Benchalli, Associate Professor and Head, Department of Mathematics, Basaveshwar Engineering College, Bagalkot – 587102, Karnataka
Email: sbenchalli@gmail.com Mobile: 8762644634

Random Variables: In most statistical problems we are concerned with one number or a few numbers that are associated with the outcomes of experiments. In the inspection of a manufactured product we may be interested only in the number of defectives; in the analysis of a road test we may be interested only in the average speed and the average fuel consumption. All these numbers are associated with situations involving an element of chance. In other words, they are values of random variables. In the study of random variables we are usually interested in their probability distributions, namely, in the probabilities with which they take on the various values in their range.

For example, in tossing a coin the outcomes are H (heads) or T (tails), and in tossing a die the outcomes are integers. However, we frequently wish to assign a specific number to each outcome of the experiment. In coin tossing it may be convenient to assign 1 to H and 0 to T; such an assignment of numerical values is called a random variable. More generally, we have the following definition.

Definition: A random variable X is a rule that assigns a numerical value to each outcome in a sample space S.
2. 2. In other words, if f is a function from S into the set R of all real numbers and X = f(s), s ∈ S, then X is called a random variable on S.

For the coin-tossing experiment the sample space is S = {H, T}. Define a function f : S → R by f(s) = 1 if s = H and f(s) = 0 if s = T. Then X = f(s) is a random variable on S. For the outcome H the value of this random variable is 1, and for the outcome T its value is 0.

If X and Y are two random variables defined on the sample space S, and a and b are two real numbers, then
i) aX + bY is a random variable; in particular, X – Y is a random variable,
ii) XY is a random variable,
iii) if X(s) ≠ 0 for all s ∈ S, then 1/X is also a random variable.

Definition: The event consisting of all outcomes for which X = x is denoted by {X = x}, and the probability of this event is denoted by P(X = x).

Random variables are classified as discrete random variables (DRV) and continuous random variables (CRV). A random variable that assumes values in steps, that is, at most a countable number of values, is called a discrete random variable. A random variable is said to be continuous if it assumes all the values between two limits. For example, the number of defective items in a lot is a DRV, and the length of life of an electric bulb is a CRV.

Example: In the experiment of tossing two coins, the sample space is S = {HH, TH, HT, TT}. We assign uniform probability 1/4 to each element of S. Consider the random variable X which assigns to each element of S the number of heads in that element. Thus X : S → R is given by X(HH) = 2, X(HT) = X(TH) = 1, X(TT) = 0, and the range of X is {0, 1, 2}.
Now $X^{-1}(0)$ = {s ∈ S : X(s) = 0} = {TT}, $X^{-1}(1)$ = {HT, TH}, $X^{-1}(2)$ = {HH}.
Therefore the probabilities of the events are P(X = 0) = 1/4, P(X = 1) = 1/2, P(X = 2) = 1/4.
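The two-coin example can be reproduced by enumerating the sample space and applying the rule X(s) = number of heads to every outcome. The following Python sketch is only an illustration; the function name heads_distribution is an assumption introduced here and does not come from the notes.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def heads_distribution(n_coins):
    """Return {x: P(X = x)} for X = number of heads in n_coins fair tosses."""
    outcomes = list(product("HT", repeat=n_coins))  # the sample space S
    weight = Fraction(1, len(outcomes))             # uniform probability on S
    pmf = defaultdict(Fraction)                     # missing keys start at 0
    for s in outcomes:
        x = s.count("H")                            # X(s) = number of heads in s
        pmf[x] += weight                            # accumulate P(X = x)
    return dict(pmf)


two = heads_distribution(2)
for x in sorted(two):
    print(x, two[x])
```

Exact fractions are used so that the results can be compared directly with the values 1/4, 1/2, 1/4 obtained by hand.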
3. 3. Definition: The probability distribution f(x) of a random variable X is a description of the set of possible values of X (the range of X), along with the probability associated with each possible value x.

Example: The probability distribution of the random variable X = number of heads in the previous example is

X = x             0     1     2
f(x) = P(X = x)   1/4   1/2   1/4

We are interested not only in the probability f(x) that the random variable takes the value x, but also in the probability F(x) that the value of the random variable is less than or equal to x. We refer to the function that assigns a value F(x) to each x within the range of a random variable as the cumulative distribution function.

Definition: The cumulative distribution function of a random variable X is defined by F(x) = P(X ≤ x), where x is any real number (i.e. −∞ < x < ∞).

Two important properties of F(x), for a < b, are
i) P(a < X ≤ b) = F(b) – F(a),
ii) P(a ≤ X ≤ b) = P(X = a) + F(b) – F(a).

Discrete Probability Distributions:
Definition: Let X be a discrete random variable. The discrete probability function f(x) for X is given by f(x) = P(X = x) for real x.

Example: In tossing three coins, if X = number of heads, the probability distribution is

X = x             0     1     2     3
f(x) = P(X = x)   1/8   3/8   3/8   1/8

4. 4. Since probabilities cannot be negative, a probability function f(x) cannot assume negative values. The probability associated with the whole sample space is 1. Thus if we add the values of f(x) over all possible values of X, the total should be 1. In fact, these two properties completely characterize the probability function of a discrete random variable.

Properties that identify a probability function for a discrete random variable:
1. f(x) ≥ 0 for each real number x
2. $\sum_{\text{all } x} f(x) = 1$

Note that the discrete probability function f(x) is also called the probability mass function. Any function f(x) satisfying properties 1 and 2 is a discrete probability function (probability mass function).

Example: Check whether the following can serve as (discrete) probability functions:
a) f(x) = (x − 2)/2 for x = 1, 2, 3, 4
b) h(x) = $x^2/25$ for x = 0, 1, 2, 3, 4
Solution:
a) The function cannot serve as a probability distribution because f(1) = −1/2 is negative.
b) The function cannot serve as a probability distribution because the sum of the five probabilities is 6/5 and not 1.

Example: Five defective bulbs are accidentally mixed with twenty good ones. It is not possible to just look at a bulb and tell whether or not it is defective. Find the probability distribution of the number of defective bulbs if four bulbs are drawn at random from this lot.
Solution: Let X denote the number of defective bulbs among the 4 drawn. Clearly X can take the values 0, 1, 2, 3 or 4.
Number of defective bulbs = 5, number of good bulbs = 20, total number of bulbs = 25.
P(X = 0) = P(no defective) = P(all 4 good ones) = $^{20}C_4 / ^{25}C_4$ = 969/2530.
5. 5. P(X = 1) = P(1 defective and 3 good ones) = $(^{5}C_1 \times ^{20}C_3) / ^{25}C_4$ = 1140/2530.
P(X = 2) = P(2 defective and 2 good ones) = $(^{5}C_2 \times ^{20}C_2) / ^{25}C_4$ = 380/2530.
P(X = 3) = P(3 defective and 1 good one) = $(^{5}C_3 \times ^{20}C_1) / ^{25}C_4$ = 40/2530.
P(X = 4) = P(all 4 defective) = $^{5}C_4 / ^{25}C_4$ = 1/2530.
Therefore the probability distribution of the random variable X is

X      0          1           2          3         4
P(X)   969/2530   1140/2530   380/2530   40/2530   1/2530

Example: Four bad apples are accidentally mixed with 20 good apples. Obtain the probability distribution of the number of bad apples in a draw of 2 apples at random.
Solution: Let X denote the number of bad apples drawn. Then X is a random variable which can take the values 0, 1 or 2. There are 4 + 20 = 24 apples in all, and the exhaustive number of cases of drawing two apples is $^{24}C_2$. Therefore
P(X = 0) = $^{20}C_2 / ^{24}C_2$ = 95/138,
P(X = 1) = $(^{4}C_1 \times ^{20}C_1) / ^{24}C_2$ = 40/138,
P(X = 2) = $^{4}C_2 / ^{24}C_2$ = 3/138.
Hence the probability distribution of X is

X      0        1        2
P(x)   95/138   40/138   3/138
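The bulb and apple examples follow the same counting pattern, and the two properties that identify a probability function can be tested mechanically. The sketch below is an illustration only: the names draw_distribution and is_probability_function are assumptions made here, not notation from the notes.

```python
from fractions import Fraction
from math import comb


def draw_distribution(bad, good, draws):
    """Return {x: P(X = x)}, X = number of bad items among `draws` items
    taken at random, without replacement, from bad + good items."""
    total = comb(bad + good, draws)  # exhaustive number of cases
    return {
        x: Fraction(comb(bad, x) * comb(good, draws - x), total)
        for x in range(min(bad, draws) + 1)
    }


def is_probability_function(f):
    """Property 1: every f(x) >= 0.  Property 2: the values sum to 1."""
    return all(p >= 0 for p in f.values()) and sum(f.values()) == 1


bulbs = draw_distribution(bad=5, good=20, draws=4)
apples = draw_distribution(bad=4, good=20, draws=2)

# The two candidate functions from the earlier "can this serve as a
# probability function" check.
f = {x: Fraction(x - 2, 2) for x in (1, 2, 3, 4)}
h = {x: Fraction(x * x, 25) for x in (0, 1, 2, 3, 4)}

for name, dist in [("bulbs", bulbs), ("apples", apples), ("f", f), ("h", h)]:
    print(name, is_probability_function(dist), sum(dist.values()))
```

Fraction reduces every value to lowest terms, so 40/138 is stored as 20/69; the comparison with the hand-computed values is unaffected.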
6. 6. Theoretical Distributions: In the previous section we studied experimental frequency distributions, in which actual data were collected, classified and tabulated in the form of a frequency distribution. Such data are usually based on sample studies. Statistical measures such as the averages, dispersion, skewness, kurtosis and correlation for the sample frequency distributions not only describe the nature and form of the sample data but also help us formulate ideas about the characteristics of the population. However, a more scientific way of drawing inferences about population characteristics is through the study of theoretical distributions, which we discuss in this section. We have already defined the random variable, mathematical expectation, probability and distribution function in terms of the probability function. These provide the necessary tools for the study of theoretical distributions.

Binomial Distribution: The Binomial distribution is also known as the 'Bernoulli' distribution after the Swiss mathematician James Bernoulli, who discovered it in 1700; it was first published in 1713, eight years after his death. This distribution can be used under the following conditions:
i) The random experiment is performed repeatedly a finite and fixed number of times. In other words, n, the number of trials, is finite and fixed.
ii) The outcome of each trial may be classified into two mutually disjoint categories, called success (the occurrence of the event) and failure (the non-occurrence of the event).
iii) All the trials are independent, i.e. the result of any trial is not affected in any way by the preceding trials and does not affect the result of succeeding trials.
iv) The probability of success (happening of an event) in any trial is p and is constant for each trial; q = 1 − p is then termed the probability of failure and is also constant for each trial.
For example, if we toss a fair coin n times (n fixed and finite), then the outcome of any trial is one of the mutually exclusive events head (success) and tail (failure). Further, all the trials are independent, since the result of any throw of a coin does not affect, and is not affected by, the result of other throws. Moreover, the probability of success (head) in any trial is 1/2, which is constant for each trial. Hence coin-tossing problems give rise to the Binomial distribution.

More precisely, we expect a binomial distribution under the following conditions:
i) n, the number of trials, is finite.
ii) The trials are independent.
iii) p, the probability of success, is constant for each trial. Then q = 1 − p is the probability of failure in any trial.
7. 7. Probability function of the Binomial distribution: Consider the probability distribution table

$x_i$: $0,\ 1,\ 2,\ \dots,\ x,\ \dots,\ n$
$P(x_i)$: $q^n,\ {}^{n}C_1 q^{n-1}p,\ {}^{n}C_2 q^{n-2}p^2,\ \dots,\ {}^{n}C_x q^{n-x}p^x,\ \dots,\ p^n$

where n is a given positive integer, p is a real number such that 0 < p < 1, and q = 1 − p. The probability function for this distribution is denoted by b(n, p, x) and is given by
$b(n,p,x) = {}^{n}C_x\, q^{n-x} p^{x}$, x = 0, 1, 2, …, n.
This probability function is called the Binomial probability function and the corresponding distribution is called the Binomial distribution. It has the following properties:
i) b(n, p, x) > 0 for x = 0, 1, 2, …, n
ii) $\sum_{x=0}^{n} b(n,p,x) = 1$
Mean: µ = np
Variance: $\sum_i x_i^2 P(x_i) - \mu^2 = npq$
Standard deviation: $\sigma = \sqrt{npq}$

Example: Let X be a binomially distributed random variable based on 6 repetitions of an experiment. If p = 0.3, evaluate the following probabilities: i) P(X ≤ 3), ii) P(X > 4).
Solution: Given p = 0.3 and n = 6, we have q = 1 − p = 0.7 and $b(n,p,x) = b(6, 0.3, x) = {}^{6}C_x (0.7)^{6-x} (0.3)^x = P(x)$, say.
i) Here X ≤ 3, so X can take the values 0, 1, 2 and 3. Therefore
$P(X \le 3) = P(0) + P(1) + P(2) + P(3) = (0.7)^6 + {}^{6}C_1 (0.7)^5 (0.3) + {}^{6}C_2 (0.7)^4 (0.3)^2 + {}^{6}C_3 (0.7)^3 (0.3)^3 \approx 0.9295$.
ii) Here X > 4, so X takes the values 5 and 6. Therefore
$P(X > 4) = P(5) + P(6) = {}^{6}C_5 (0.7)(0.3)^5 + (0.3)^6 \approx 0.0109$.

Example: The probability that a pen manufactured by a company will be defective is 0.1. If 12 such pens are selected at random, find the probability that
i) exactly two pens will be defective,
ii) at most two pens will be defective,
8. 8. iii) none will be defective.
Solution: Let p be the probability that a manufactured pen is defective. Then p = 0.1, q = 1 − p = 0.9 and n = 12. Hence $b(n,p,x) = {}^{12}C_x (0.9)^{12-x} (0.1)^x = P(x)$, say.
i) Probability that exactly two pens will be defective = $P(X = 2) = {}^{12}C_2 (0.9)^{10} (0.1)^2 = 0.2301$.
ii) Probability that at most two pens will be defective = $P(X \le 2) = P(0) + P(1) + P(2) = (0.9)^{12} + {}^{12}C_1 (0.9)^{11} (0.1) + {}^{12}C_2 (0.9)^{10} (0.1)^2 \approx 0.8891$.
iii) Probability that none of the pens will be defective = $P(X = 0) = P(0) = (0.9)^{12} = 0.2824295$.

Poisson Distribution (as a limiting case of the Binomial distribution): The Poisson distribution was derived in 1837 by the French mathematician Simeon D. Poisson. It may be obtained as a limiting case of the Binomial probability distribution under the following conditions:
i) n, the number of trials, is indefinitely large, i.e. n → ∞.
ii) p, the constant probability of success for each trial, is indefinitely small, i.e. p → 0.
iii) np = µ (say) is finite.
Under these three conditions the Binomial probability function $b(n,p,x) = {}^{n}C_x\, p^x q^{n-x}$ tends to the probability function of the Poisson distribution,
$P(X = x) = \dfrac{\mu^x e^{-\mu}}{x!}$, x = 0, 1, 2, …
where X is the number of successes (occurrences of the event), µ = np, e = 2.71828 and x! = x(x − 1)(x − 2)…(3)(2)(1).

Note that:
1) The Poisson probability function is usually denoted by P(µ, x).
2) The Poisson distribution is a discrete probability distribution, since the variable X can take only the integral values 0, 1, 2, …
3) Putting x = 0, 1, 2, … in the above definition, we obtain the probabilities of 0, 1, 2, … successes respectively, which are tabulated below.

Number of successes: $0,\ 1,\ 2,\ 3,\ \dots,\ r,\ \dots$
Probability $P(x_i)$: $e^{-\mu},\ \dfrac{\mu e^{-\mu}}{1!},\ \dfrac{\mu^2 e^{-\mu}}{2!},\ \dfrac{\mu^3 e^{-\mu}}{3!},\ \dots,\ \dfrac{\mu^r e^{-\mu}}{r!},\ \dots$
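As a rough numerical check of the worked Binomial examples and of the limiting relation stated above, the following Python sketch evaluates b(n, p, x) and the Poisson probabilities directly. The function names binom_pmf and poisson_pmf, and the choice µ = 2 for the comparison, are illustrative assumptions and not part of the original notes.

```python
from math import comb, exp, factorial


def binom_pmf(n, p, x):
    """b(n, p, x) = nCx * q^(n-x) * p^x with q = 1 - p."""
    return comb(n, x) * (1 - p) ** (n - x) * p ** x


def poisson_pmf(mu, x):
    """P(mu, x) = mu^x * e^(-mu) / x!"""
    return mu ** x * exp(-mu) / factorial(x)


# Example with n = 6 trials and p = 0.3.
p_le_3 = sum(binom_pmf(6, 0.3, x) for x in range(4))   # P(X <= 3)
p_gt_4 = binom_pmf(6, 0.3, 5) + binom_pmf(6, 0.3, 6)   # P(X > 4)

# Pen example with n = 12 and p = 0.1.
p_two = binom_pmf(12, 0.1, 2)                               # exactly two defective
p_at_most_two = sum(binom_pmf(12, 0.1, x) for x in range(3))
p_none = binom_pmf(12, 0.1, 0)

# Limiting case: hold mu = n * p fixed (mu = 2 is an arbitrary choice)
# and let n grow; the largest gap over x = 0..5 should shrink.
mu = 2.0
gaps = []
for n in (10, 100, 10000):
    gap = max(abs(binom_pmf(n, mu / n, x) - poisson_pmf(mu, x)) for x in range(6))
    gaps.append(gap)
    print(n, gap)
```

The printed gaps give a feel for how large n must be before the Poisson formula is an acceptable stand-in for the Binomial one.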
9. 9. From the above table:
i) P(µ, x) > 0
ii) $\sum_{i=0}^{\infty} P(\mu, x_i) = 1$
Mean: The mean of the Poisson distribution is µ, which is finite.
Variance: V = npq = µ(1 − p) = µ − µp. For the Poisson distribution µ is finite and p is small (p → 0), hence V = µ.

Utility or importance of the Poisson distribution: The Poisson distribution can be used to explain the behaviour of discrete random variables where the probability of occurrence of the event is very small and the total number of possible cases is sufficiently large. As such, the Poisson distribution has found application in a variety of fields such as Queuing Theory (waiting time problems), Insurance, Physics, Biology, Business, Economics and Industry. The following are some practical situations where the Poisson distribution can be used:
i) Number of telephone calls arriving at a telephone switchboard in unit time (say, per minute).
ii) Number of customers arriving at a supermarket, say per hour.
iii) Number of bacteria per unit, for instance per unit volume of a sample.
iv) Number of accidents taking place per day on a busy road.

Example: Between 2 pm and 4 pm the average number of phone calls per minute coming into the switchboard of a company is 2.35. Find the probability that during one particular minute there will be at most 2 phone calls. [Given $e^{-2.35} = 0.095374$]
Solution: If the random variable X denotes the number of telephone calls per minute, then X follows the Poisson distribution with parameter µ = 2.35 and probability function
$P(X = x) = \dfrac{\mu^x e^{-\mu}}{x!} = \dfrac{(2.35)^x e^{-2.35}}{x!}$, x = 0, 1, 2, …
The probability that during one particular minute there will be at most 2 phone calls is
$P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = e^{-2.35}\left(1 + 2.35 + \dfrac{(2.35)^2}{2!}\right) = 0.5828543$.

Example: It is known from past experience that in a certain plant there are on average 4 industrial accidents per month. Find the probability that in a given month there will be less than 4 accidents. Assume the Poisson distribution ($e^{-4} = 0.0183$).
Solution: In the usual notation we are given µ = 4.
If the random variable X denotes the number of accidents in the plant per month, then by the Poisson probability law
10. 10. $P(X = x) = \dfrac{\mu^x e^{-\mu}}{x!} = \dfrac{4^x e^{-4}}{x!}$, x = 0, 1, 2, …
The required probability that there will be less than 4 accidents is
$P(X < 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = e^{-4}\left(1 + 4 + \dfrac{4^2}{2!} + \dfrac{4^3}{3!}\right) \approx 0.433$.

Exponential Distribution: A continuous random variable X assuming all non-negative values is said to have an exponential distribution with parameter α > 0 if its probability density function, denoted by e(α, x), is given by
$e(\alpha, x) = \begin{cases} \alpha e^{-\alpha x} & \text{for } x > 0 \\ 0 & \text{for } x \le 0 \end{cases}$
We have
i) e(α, x) ≥ 0
ii) $\int_{-\infty}^{\infty} e(\alpha, x)\,dx = \int_{0}^{\infty} \alpha e^{-\alpha x}\,dx = \alpha\left[\dfrac{e^{-\alpha x}}{-\alpha}\right]_{0}^{\infty} = 1$
Mean: The mean of the exponential distribution is 1/α.
Variance = $1/\alpha^2$
Standard deviation = 1/α
Writing σ = 1/α for the mean, the probability density function of the exponential distribution is
$f(x) = \begin{cases} \dfrac{1}{\sigma} e^{-x/\sigma} & \text{for } x > 0 \\ 0 & \text{for } x < 0 \end{cases}$

Example: The lifetime of a certain kind of battery is a random variable which has an exponential distribution with a mean of 200 hours. Find the probability that such a battery will i) last at most 100 hours, ii) last anywhere from 400 to 600 hours.
Solution: If the random variable X denotes the lifetime of the batteries, then X follows the exponential distribution with parameter σ = 200 hours. The probability density function of X is given by
$f(x) = \dfrac{1}{\sigma} e^{-x/\sigma} = \dfrac{1}{200} e^{-x/200}$, x > 0.
11. 11. i) $P(X \le 100) = \int_{0}^{100} \dfrac{1}{200} e^{-x/200}\,dx = \dfrac{1}{200}\left[-200\, e^{-x/200}\right]_{0}^{100} = 1 - e^{-0.5} \approx 0.3935$
ii) $P(400 \le X \le 600) = \int_{400}^{600} \dfrac{1}{200} e^{-x/200}\,dx = \left[-e^{-x/200}\right]_{400}^{600} = e^{-2} - e^{-3} \approx 0.0855$

Example: The length of a telephone conversation has been found to have an exponential distribution with mean 3 minutes. What is the probability that a call may last i) more than 1 minute, ii) less than 3 minutes?
Solution: If the random variable X denotes the length of a telephone conversation in minutes, then X follows the exponential distribution with parameter σ = 3 minutes. The p.d.f. is given by
$f(x) = \dfrac{1}{\sigma} e^{-x/\sigma} = \dfrac{1}{3} e^{-x/3}$, x > 0.
The required probabilities are
i) $P[\text{more than 1 min.}] = P[X > 1] = 1 - P[X \le 1] = 1 - \int_{0}^{1} \dfrac{1}{3} e^{-x/3}\,dx = 1 - \left[-e^{-x/3}\right]_{0}^{1} = e^{-1/3} \approx 0.7165$
ii) $P[\text{less than 3 min.}] = P[X < 3] = \int_{0}^{3} \dfrac{1}{3} e^{-x/3}\,dx = 1 - e^{-1} \approx 0.6321$

Normal Distribution: The Binomial and Poisson distributions discussed above are discrete probability distributions, since the variables under study were discrete random variables. Now we confine the discussion to continuous probability distributions, which arise when the underlying variable is a continuous one. The normal distribution is the most important continuous distribution in statistics. It is used extensively in sampling theory and statistical quality control.

Definition: A continuous random variable X having the probability density function
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, −∞ < x < ∞,
is said to have a normal or Gaussian distribution with mean µ and variance $\sigma^2$.
Note: The mean µ and standard deviation σ are called the parameters of the normal distribution.
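Before turning to the properties of the normal curve, the Poisson and exponential worked examples above can be cross-checked with a few lines of Python. This is only a sketch: the helper names poisson_cdf and expon_cdf are assumptions introduced here, and expon_cdf takes the mean σ = 1/α as its parameter, following the notation used in the battery and telephone examples.

```python
from math import exp, factorial


def poisson_cdf(mu, k):
    """P(X <= k) for a Poisson variable with mean mu."""
    return sum(mu ** x * exp(-mu) / factorial(x) for x in range(k + 1))


def expon_cdf(mean, x):
    """P(X <= x) for an exponential variable with the given mean (sigma = 1/alpha)."""
    return 1.0 - exp(-x / mean) if x > 0 else 0.0


# Poisson examples.
calls = poisson_cdf(2.35, 2)   # at most 2 calls in one minute, mu = 2.35
accidents = poisson_cdf(4, 3)  # fewer than 4 accidents, mu = 4

# Exponential examples.
battery_100 = expon_cdf(200, 100)                           # lasts at most 100 h
battery_400_600 = expon_cdf(200, 600) - expon_cdf(200, 400)  # lasts 400 to 600 h
call_over_1 = 1.0 - expon_cdf(3, 1)                         # call longer than 1 min
call_under_3 = expon_cdf(3, 3)                              # call shorter than 3 min

for name, value in [
    ("calls", calls),
    ("accidents", accidents),
    ("battery_100", battery_100),
    ("battery_400_600", battery_400_600),
    ("call_over_1", call_over_1),
    ("call_under_3", call_under_3),
]:
    print(f"{name}: {value:.4f}")
```

Small differences in the last decimal place relative to the hand calculations are to be expected, because the notes use rounded values of the exponentials.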
12. 12. Properties of Normal Distribution: The normal probability curve with mean µ and standard deviation σ is given by
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, −∞ < x < ∞.
The standard normal probability curve, in terms of z = (x − µ)/σ, is given by the equation
$\varphi(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$, −∞ < z < ∞.
It has the following properties.
1) The graph of f(x) is the famous bell-shaped curve; the top of the bell is directly above the mean µ.
2) The curve is symmetrical about the line x = µ (or z = 0), i.e. it has the same shape on either side of the line x = µ (or z = 0). This is because the equation of the curve φ(z) remains unchanged if we change z to −z.
3) The maximum value of f(x) occurs when x = µ and is given by $f(\mu) = \dfrac{1}{\sigma\sqrt{2\pi}}$.
4) The points of inflexion occur at x = µ − σ and x = µ + σ.
5) The actual size of the bell-shaped normal curve depends on the values of µ and σ.
6) No portion of the curve lies below the x-axis, since f(x), being a probability density, can never be negative.

Definition: The probability that x lies between a and b is written P(a ≤ x ≤ b) and is given by the area under the normal curve between a and b.
Note: Since the function is difficult to integrate, ready-made tables are used.
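Properties 2, 3 and 4 of the normal curve can be checked numerically. The sketch below is illustrative only: the function names normal_pdf and second_difference are assumptions, and the values µ = 10, σ = 2 are arbitrary choices made for the check, not data from the notes.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density f(x) with mean mu and standard deviation sigma."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


def second_difference(x, mu, sigma, h=1e-3):
    """Finite-difference stand-in for the sign of f''(x)."""
    return (normal_pdf(x + h, mu, sigma)
            - 2 * normal_pdf(x, mu, sigma)
            + normal_pdf(x - h, mu, sigma))


mu, sigma = 10.0, 2.0  # hypothetical parameters chosen for illustration

# Property 3: the maximum is at x = mu and equals 1 / (sigma * sqrt(2 * pi)).
peak = normal_pdf(mu, mu, sigma)

# Property 2: symmetry about x = mu.
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)

# Property 4: the curvature changes sign at x = mu + sigma
# (concave just inside, convex just outside).
inside = second_difference(mu + sigma - 0.1, mu, sigma)
outside = second_difference(mu + sigma + 0.1, mu, sigma)

print(peak, left, right, inside, outside)
```

The same sign change occurs at x = µ − σ by symmetry.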
13. 13. Use of the table: To use the table for all possible values of µ and $\sigma^2$ we perform a process known as standardizing x, that is, z = (x − µ)/σ, to obtain the standard normal variable, which is given the special symbol z.

The standard normal variable Z: Z is the normal variable with mean 0 and variance 1, so Z ~ N(0, 1). We can find the area under the standard normal curve by referring to standard normal tables, which give cumulative probabilities. The symbol Φ(z) is used for the cumulative probability, i.e. Φ(z) = P(Z < z). Thus
$P(a \le x \le b) = \int_{A}^{B} \varphi(z)\,dz$, where $\varphi(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$,
which is the area under the standard normal curve between A and B, where A = (a − µ)/σ and B = (b − µ)/σ are the standardized limits.

The area under the standard normal curve between 0 and a positive value of z is given in the normal probability table; using this table, we can evaluate the probability.

Example 1) Find the area under the standard normal curve a) between z = 0 and z = 1.2, b) between z = −0.68 and z = 0.
Solution:
a) $P(0 \le z \le 1.2) = \int_{0}^{1.2} \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz = 0.3849$ (from the table).
b) Required area = $P(-0.68 \le z \le 0) = \int_{-0.68}^{0} \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz$ = area between z = 0 and z = +0.68 (by symmetry) = $\int_{0}^{0.68} \dfrac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz = 0.2518$.
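Where a printed table is not at hand, the cumulative probability Φ(z) can be evaluated from the error function, using the identity Φ(z) = (1 + erf(z/√2))/2. The following Python sketch is one possible way to do this; the names Phi and area_between are assumptions introduced here, and the last digit may differ from a printed table because of rounding.

```python
from math import erf, sqrt


def Phi(z):
    """Cumulative probability P(Z < z) for the standard normal variable."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def area_between(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), obtained by standardizing
    the limits with z = (x - mu) / sigma."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)


area_a = area_between(0.0, 1.2)     # between z = 0 and z = 1.2
area_b = area_between(-0.68, 0.0)   # between z = -0.68 and z = 0
area_b_mirror = area_between(0.0, 0.68)  # same area, by symmetry

print(area_a, area_b, area_b_mirror)
```

Passing mu and sigma lets the same function handle a non-standard normal variable without looking up standardized limits by hand.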