Introduction to Probability and Statistics, 6th Week (4/12): Special Probability Distributions (1)
Chebyshev’s Inequality
Chebyshev’s inequality guarantees that in any data sample or probability distribution, "nearly all" values are close to the mean. Precisely, no more than $1/k^2$ of the distribution’s values can be more than $k$ standard deviations away from the mean: $P(|X - \mu| \ge k\sigma) \le 1/k^2$. The inequality has great utility because it can be applied to completely arbitrary distributions (unknown except for mean and variance); for example, it can be used to prove the weak law of large numbers.
Pafnuty Lvovich Chebyshev (1821–1894)
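The bound can be checked empirically on any sample. The sketch below is only an illustration: the exponential sample, the seed, and the helper name `tail_fraction` are assumptions chosen for the demonstration and are not part of the lecture.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is reproducible

# Any distribution works; an exponential sample is used here as an arbitrary choice.
sample = [random.expovariate(1.0) for _ in range(10_000)]
mu = statistics.fmean(sample)
sigma = statistics.pstdev(sample)  # population standard deviation of the sample


def tail_fraction(data, mu, sigma, k):
    """Fraction of values lying more than k standard deviations from the mean."""
    return sum(abs(x - mu) > k * sigma for x in data) / len(data)


for k in (2, 3, 4):
    # Chebyshev: the observed fraction should never exceed 1/k^2.
    print(k, tail_fraction(sample, mu, sigma, k), "bound:", 1 / k**2)
```

The observed fractions are typically far below the bound, which reflects that Chebyshev’s inequality is a worst-case statement over all distributions.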
Law of Large Numbers The law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.
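A small simulation can make the statement concrete. This is a sketch under assumed settings (a fair six-sided die, a fixed seed, and the hypothetical helper `mean_of_die_rolls`); the expected value of one roll is 3.5.

```python
import random

random.seed(42)  # fixed seed so the run is reproducible


def mean_of_die_rolls(n_rolls):
    """Average of n_rolls independent rolls of a fair six-sided die."""
    total = 0
    for _ in range(n_rolls):
        total += random.randint(1, 6)
    return total / n_rolls


# Sample means for increasing numbers of trials; the expected value is 3.5.
means = {n: mean_of_die_rolls(n) for n in (10, 1_000, 100_000)}
for n, m in means.items():
    print(n, m, "distance from 3.5:", abs(m - 3.5))
```

For a single run the distance is not guaranteed to shrink at every step, but for large n it is very likely to be small.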
Discrete Probability Distribution: What kinds of probability distributions (PD) do we have to know to solve real-world problems?
Discrete Uniform Distribution
• Consider rolling a fair die.
• Each value of the random variable has the same probability → uniform distribution.
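Discrete Uniform Distribution (values $1, 2, \ldots, n$)
• Probability function: $f(x) = 1/n$, $x = 1, 2, \ldots, n$
• Expectation: $E(X) = (n + 1)/2$
• Variance: $\mathrm{Var}(X) = (n^2 - 1)/12$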
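• Example
Suppose that we have a box containing 45 numbered balls. We randomly select a ball, and its number is X. Find:
(1) the probability distribution of X
(2) the expectation and variance of X
(3) P(X > 40)
• Solution (assuming the balls are numbered 1 to 45)
(1) $f(x) = 1/45$, $x = 1, 2, \ldots, 45$
(2) $E(X) = (45 + 1)/2 = 23$, $\mathrm{Var}(X) = (45^2 - 1)/12 = 2024/12 \approx 168.67$
(3) $P(X > 40) = 5/45 = 1/9 \approx 0.111$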
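The three quantities in this example can be computed directly from the definition of a discrete uniform distribution. The sketch below assumes the balls are numbered 1 to 45 (the slide does not state the numbering explicitly) and uses exact fractions.

```python
from fractions import Fraction

n = 45                       # assumed numbering: balls 1..45
values = range(1, n + 1)
p = Fraction(1, n)           # every ball is equally likely

mean = sum(x * p for x in values)
variance = sum((x - mean) ** 2 * p for x in values)
p_gt_40 = sum(p for x in values if x > 40)

print(mean, variance, p_gt_40)
```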
(Discrete) Binomial Distribution
Bernoulli experiment: only two kinds of results are possible, for example success with probability p = 0.85 and failure with probability q = 1 − p = 0.15.
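(Discrete) Binomial Distribution
Binomial distribution: if X is the number of successes in n independent Bernoulli trials, each with success probability p, then $X \sim B(n, p)$ with probability function $f(x) = \binom{n}{x} p^x q^{n-x}$, $x = 0, 1, \ldots, n$, where $q = 1 - p$.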
(Discrete) Binomial Distribution
Some Properties of the Binomial Distribution. For $X \sim B(n, p)$ with $q = 1 - p$:
• $E(X) = np$
• $\mathrm{Var}(X) = npq$
• If $X \sim B(n, p)$ and $Y \sim B(m, p)$ are independent, then $X + Y \sim B(n + m, p)$.
(Discrete) Binomial Distribution
Some Properties of the Binomial Distribution: shape
• p = 0.5: symmetric about μ = n/2 (symmetric binomial distribution)
• p < 0.5: tail on the right
• p > 0.5: tail on the left
Example
Two factories, S and L, produce smartphones, and each has a failure rate of 5%. If you buy 7 phones from S and 13 phones from L, what is the probability of having at least one failed phone? What is the probability of having exactly one failed phone? Assume that failures are independent.
Solution
X (from S) and Y (from L): $X \sim B(7, 0.05)$, $Y \sim B(13, 0.05)$, with X and Y independent, so $X + Y \sim B(20, 0.05)$.
Exactly one phone fails: $P(X + Y = 1) = \binom{20}{1}(0.05)(0.95)^{19} \approx 0.3774$
At least one phone fails: $P(X + Y \ge 1) = 1 - (0.95)^{20} \approx 0.6415$
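The two probabilities in this example follow from the binomial probability function with n = 20 and p = 0.05. A minimal sketch, where `binom_pmf` is a helper name introduced here for illustration:

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# 7 phones from S plus 13 from L, same failure rate, independent failures.
n, p = 7 + 13, 0.05

p_exactly_one = binom_pmf(1, n, p)
p_at_least_one = 1 - binom_pmf(0, n, p)

print(round(p_exactly_one, 4), round(p_at_least_one, 4))
```

Computing "at least one" through the complement avoids summing the twenty terms for x = 1, ..., 20.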
Criteria for a Binomial Probability Experiment
An experiment is said to be a binomial experiment provided:
1. The experiment is performed a fixed number of times. Each repetition of the experiment is called a trial.
2. The trials are independent. This means the outcome of one trial will not affect the outcome of the other trials.
3. For each trial, there are two mutually exclusive outcomes, success or failure.
4. The probability of success is fixed for each trial of the experiment.
Notation Used in the Binomial Probability Distribution
• There are n independent trials of the experiment.
• Let p denote the probability of success, so that 1 – p is the probability of failure.
• Let x denote the number of successes in n independent trials of the experiment, so 0 ≤ x ≤ n.
EXAMPLE Identifying Binomial Experiments
Which of the following are binomial experiments?
(a) A player rolls a pair of fair dice 10 times. The number X of 7’s rolled is recorded.
(b) The 11 largest airlines had an on-time percentage of 84.7% in November 2001, according to the Air Travel Consumer Report. In order to assess reasons for delays, an official with the FAA randomly selects flights until she finds 10 that were not on time. The number of flights X that need to be selected is recorded.
(c) In a class of 30 students, 55% are female. The instructor randomly selects 4 students. The number X of females selected is recorded.
EXAMPLE Constructing a Binomial Probability Distribution
According to the Air Travel Consumer Report, the 11 largest air carriers had an on-time percentage of 84.7% in November 2001. Suppose that 4 flights are randomly selected from November 2001 and the number of on-time flights X is recorded. Construct a probability distribution for the random variable X using a tree diagram.
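A tree diagram for 4 flights has 2^4 = 16 paths, and the probability of each value of X is the sum of the probabilities of the paths with that many on-time flights. The sketch below enumerates those paths; it assumes the flights are independent with the same on-time probability 0.847, and the variable names are illustrative.

```python
from itertools import product

p_on_time = 0.847
n_flights = 4

# dist[k] accumulates the probability of exactly k on-time flights.
dist = {k: 0.0 for k in range(n_flights + 1)}

# Each path through the tree is a tuple of True (on time) / False (late).
for path in product((True, False), repeat=n_flights):
    prob = 1.0
    for on_time in path:
        prob *= p_on_time if on_time else 1 - p_on_time
    dist[sum(path)] += prob  # sum(path) counts the on-time flights

for k, prob in dist.items():
    print(k, round(prob, 6))
```

Grouping the 16 paths by the number of on-time flights gives the same values as the binomial probability function with n = 4 and p = 0.847.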
(Discrete) Hypergeometric Distribution
It is similar to the binomial distribution; the difference is the method of sampling.
• Binomial experiment: sampling with replacement. Each trial has the same probability (analogy: normal shooting).
• Hypergeometric experiment: sampling without replacement. Each trial may have a different probability (analogy: Russian roulette).
(Discrete) Hypergeometric Distribution
A box contains N balls, of which r are white (r < N). Suppose that we randomly select n balls from the box without replacement, and let X be the number of white balls selected. Then
$P(X = x) = \binom{r}{x}\binom{N-r}{n-x} \Big/ \binom{N}{n}$, $\max(0, n - N + r) \le x \le \min(n, r)$.
(Figure: n items drawn from N items.)
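Example
A box contains 50 chips, 4 of which are out of order (failed chips). If you select 5 chips, find:
(1) the probability distribution of the number of failed chips among the selected chips
(2) the probability of having one or two failed chips
(3) the mathematical expectation and variance
Solution: let the random variable X be the number of failed chips selected, with N = 50, r = 4, n = 5.
(1) $f(x) = \binom{4}{x}\binom{46}{5-x} \Big/ \binom{50}{5}$, $x = 0, 1, 2, 3, 4$
(2) $P(X = 1) + P(X = 2) = (652740 + 91080)/2118760 \approx 0.3511$
(3) $E(X) = nr/N = 0.4$, $\mathrm{Var}(X) = n \cdot \frac{r}{N}\left(1 - \frac{r}{N}\right)\frac{N-n}{N-1} = 0.4 \times 0.92 \times \frac{45}{49} \approx 0.338$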
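The chip example can be worked through with the standard hypergeometric probability function, P(X = x) = C(r, x) C(N − r, n − x) / C(N, n). The sketch below uses exact fractions; `hypergeom_pmf` is a helper name introduced here for illustration.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, r, n):
    """P(X = x): x marked items among n drawn without replacement from N, r of them marked."""
    return Fraction(comb(r, x) * comb(N - r, n - x), comb(N, n))


N, r, n = 50, 4, 5              # 50 chips, 4 failed, 5 selected
support = range(0, min(n, r) + 1)

dist = {x: hypergeom_pmf(x, N, r, n) for x in support}
p_one_or_two = dist[1] + dist[2]
mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

print(float(p_one_or_two), float(mean), float(variance))
```

The mean and variance obtained by direct summation agree with the closed forms nr/N and n(r/N)(1 − r/N)(N − n)/(N − 1).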
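Multivariate Hypergeometric Distribution
A box contains N items of three kinds, with $r_1$, $r_2$, $r_3$ items of each kind ($r_1 + r_2 + r_3 = N$). If n items are drawn without replacement and $X_1, X_2, X_3$ count the items of each kind drawn, the joint probability function is
$f(x_1, x_2, x_3) = \binom{r_1}{x_1}\binom{r_2}{x_2}\binom{r_3}{x_3} \Big/ \binom{N}{n}$, $x_1 + x_2 + x_3 = n$.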
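Example
In a box, there are 3 red balls, 2 blue balls, and 5 yellow balls. You select 4 balls. Let X, Y, and Z be the numbers of red, blue, and yellow balls selected. Find:
(1) the joint probability function of X, Y, and Z
(2) the probability of selecting 1 red ball, 1 blue ball, and 2 yellow balls
Solution
(1) Joint probability function: $f(x, y, z) = \binom{3}{x}\binom{2}{y}\binom{5}{z} \Big/ \binom{10}{4}$, $x + y + z = 4$
(2) $f(1, 1, 2) = \binom{3}{1}\binom{2}{1}\binom{5}{2} \Big/ \binom{10}{4} = 60/210 = 2/7 \approx 0.2857$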
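The ball example can be checked with the standard multivariate hypergeometric probability function (a product of binomial coefficients for each colour divided by the number of ways to draw 4 of 10 balls). A minimal sketch with illustrative names:

```python
from fractions import Fraction
from math import comb

counts = {"red": 3, "blue": 2, "yellow": 5}
n_draw = 4
total_balls = sum(counts.values())


def joint_pmf(x, y, z):
    """P(X = x, Y = y, Z = z) for red, blue, yellow counts in a draw of n_draw balls."""
    if min(x, y, z) < 0 or x + y + z != n_draw:
        return Fraction(0)
    ways = comb(counts["red"], x) * comb(counts["blue"], y) * comb(counts["yellow"], z)
    return Fraction(ways, comb(total_balls, n_draw))


p_1_1_2 = joint_pmf(1, 1, 2)

# The probabilities over all feasible (x, y, z) should add up to 1.
total = sum(
    joint_pmf(x, y, n_draw - x - y)
    for x in range(counts["red"] + 1)
    for y in range(counts["blue"] + 1)
)

print(p_1_1_2, total)
```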
(Discrete) Poisson Distribution
- Describes events that rarely happen.
- Events occurring in non-overlapping periods are mutually independent.
- The probability of an occurrence is proportional to the length of the period.
- The probability of two or more occurrences is negligible if the period is very short.
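(Discrete) Poisson Distribution
It is often used as a model for the number of events (such as the number of telephone calls at a business, number of customers in waiting lines, number of defects in a given surface area, airplane arrivals, or the number of accidents at an intersection) in a specific time period.
Probability function: $f(x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$, $x = 0, 1, 2, \ldots$, with $\lambda > 0$.
It satisfies the probability function (PF) condition: $f(x) \ge 0$ and $\sum_{x=0}^{\infty} f(x) = e^{-\lambda} e^{\lambda} = 1$.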
(Discrete) Poisson Distribution
Ex. 1. On an average Friday, a waitress gets no tip from 5 customers. Find the probability that she will get no tip from 7 customers this Friday.
The waitress averages 5 customers who leave no tip on Fridays: λ = 5.
Random variable X: the number of customers who leave her no tip this Friday.
We are interested in $P(X = 7) = e^{-5}\,5^7/7! \approx 0.1044$.
Ex. 2. During a typical football game, a coach can expect 3.2 injuries. Find the probability that the team will have at most 1 injury in this game.
A coach can expect 3.2 injuries: λ = 3.2.
Random variable X: the number of injuries the team has in this game.
We are interested in $P(X \le 1) = e^{-3.2}(1 + 3.2) \approx 0.1712$.
(Discrete) Poisson Distribution
Ex. 3. A small life insurance company has determined that on average it receives 6 death claims per day. Find the probability that the company receives at least seven death claims on a randomly selected day.
P(X ≥ 7) = 1 − P(X ≤ 6) = 0.393697
Ex. 4. The number of traffic accidents that occur on a particular stretch of road during a month follows a Poisson distribution with a mean of 9.4. Find the probability that fewer than two accidents will occur on this stretch of road during a randomly selected month.
P(X < 2) = P(X = 0) + P(X = 1) = 0.000860
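The four Poisson exercises (waitress, injuries, death claims, traffic accidents) can all be evaluated with the standard Poisson probability function e^(−λ) λ^x / x!. A minimal sketch, with `poisson_pmf` and `poisson_cdf` as helper names introduced here:

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x), summed term by term."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


ex1 = poisson_pmf(7, 5)          # Ex. 1: P(X = 7),  lambda = 5
ex2 = poisson_cdf(1, 3.2)        # Ex. 2: P(X <= 1), lambda = 3.2
ex3 = 1 - poisson_cdf(6, 6)      # Ex. 3: P(X >= 7), lambda = 6
ex4 = poisson_cdf(1, 9.4)        # Ex. 4: P(X < 2),  lambda = 9.4

print(round(ex1, 4), round(ex2, 4), round(ex3, 6), round(ex4, 6))
```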
(Discrete) Poisson Distribution
Figure: Comparison of the Poisson distribution (black dots) and the binomial distribution with n = 10 (red line), n = 20 (blue line), and n = 1000 (green line). All distributions have a mean of 5. The horizontal axis shows the number of events k. Notice that as n gets larger, the Poisson distribution becomes an increasingly better approximation for the binomial distribution with the same mean.
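The comparison in the figure can be reproduced numerically. The sketch below fixes the mean at 5 (so p = 5/n) and measures the largest gap between the binomial and Poisson probabilities over k = 0, ..., 30; the cutoff of 30 and the helper names are choices made here for illustration.

```python
from math import comb, exp, factorial

MEAN = 5.0


def binom_pmf(k, n, p):
    # comb(n, k) is 0 when k > n, so the term vanishes outside the support.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)


def max_gap(n, k_max=30):
    """Largest absolute difference between B(n, MEAN/n) and Poisson(MEAN) for k <= k_max."""
    p = MEAN / n
    return max(abs(binom_pmf(k, n, p) - poisson_pmf(k, MEAN)) for k in range(k_max + 1))


gaps = {n: max_gap(n) for n in (10, 20, 1000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks as n grows, which matches the visual impression from the figure.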
Discrete Probability Distributions: Summary
• Uniform Distribution
• Binomial Distributions
• Multinomial Distributions
• Geometric Distributions
• Negative Binomial Distributions
• Hypergeometric Distributions
• Poisson Distribution
Continuous Probability Distributions: What kinds of probability distributions (PD) do we have to know to solve real-world problems?
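Example
From a survey, the time X (in months) between traffic accidents has the probability density $f(x) = 3e^{-3x}$, $x \ge 0$.
(1) What is the probability that the second accident is observed more than one month after the first accident?
(2) What is the probability that the second accident is observed within 2 months of the first accident?
(3) Supposing that a month is 30 days, what is the average number of days between accidents?
Solution
(1) $P(X > 1) = e^{-3} \approx 0.0498$
(2) $P(X \le 2) = 1 - e^{-6} \approx 0.9975$
(3) μ = 1/3 month, accordingly 10 days.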
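For the density f(x) = 3e^(−3x), the distribution function is F(x) = 1 − e^(−3x), which gives all three answers. The sketch below assumes X is measured in months and reads question (1) as P(X > 1) and question (2) as P(X ≤ 2); the names are illustrative.

```python
from math import exp

rate = 3.0  # lambda in f(x) = lambda * exp(-lambda * x), with x in months


def expon_cdf(x, rate):
    """F(x) = P(X <= x) for an exponential distribution with the given rate."""
    return 1 - exp(-rate * x) if x > 0 else 0.0


p_after_one_month = 1 - expon_cdf(1, rate)   # (1) P(X > 1)
p_within_two_months = expon_cdf(2, rate)     # (2) P(X <= 2)
mean_days = (1 / rate) * 30                  # (3) mean 1/lambda months, 30 days per month

print(round(p_after_one_month, 4), round(p_within_two_months, 4), mean_days)
```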
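• Survival function: $S(x) = P(X > x) = 1 - F(x)$; for the exponential distribution, $S(x) = e^{-\lambda x}$
• Hazard rate (failure rate): $h(x) = f(x)/S(x)$; for the exponential distribution, $h(x) = \lambda$ (constant)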
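Example
A patient was told that he can survive an average of 100 days. Suppose that the probability density function is given by $f(x) = \frac{1}{100} e^{-x/100}$, $x > 0$.
(1) What is the probability that he dies within 150 days?
(2) What is the probability that he survives 200 days?
Solution
Since λ = 0.01, the distribution function and the survival function are $F(x) = 1 - e^{-x/100}$ and $S(x) = e^{-x/100}$.
(1) Probability that the patient dies within 150 days: $P(X < 150) = F(150) = 1 - e^{-1.5} = 1 - 0.2231 = 0.7769$
(2) Probability that the patient survives 200 days or more: $P(X \ge 200) = S(200) = e^{-2.0} = 0.1353$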
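The patient example can be checked numerically from the survival function S(x) = e^(−x/100). The sketch below also evaluates the hazard rate using the standard definition h(x) = f(x)/S(x), which for an exponential distribution is constant; the function names are illustrative.

```python
from math import exp

lam = 0.01  # rate: mean survival of 100 days


def pdf(x):
    return lam * exp(-lam * x)


def survival(x):
    """S(x) = P(X > x)."""
    return exp(-lam * x)


def hazard(x):
    """h(x) = f(x) / S(x), using the standard definition of the hazard rate."""
    return pdf(x) / survival(x)


p_dies_within_150 = 1 - survival(150)
p_survives_200 = survival(200)
hazards = [hazard(x) for x in (0, 50, 150, 400)]  # should all equal lam

print(round(p_dies_within_150, 4), round(p_survives_200, 4), hazards)
```

The constant hazard reflects the memoryless property: the chance of failing in the next instant does not depend on how long the patient has already survived.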
(Continuous) Exponential Distribution
⊙ Relation with the Poisson process
If events occur according to a Poisson process with rate λ, the waiting time T between neighboring events follows an exponential distribution with parameter λ (mean 1/λ).
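The relation can be illustrated by simulation in the reverse direction: generate exponential waiting times and check that the number of events per unit interval behaves like a Poisson count, whose mean and variance both equal the rate. This is only a sketch; the rate of 3, the horizon, and the seed are arbitrary assumptions.

```python
import random

random.seed(0)       # fixed seed so the run is reproducible
rate = 3.0           # events per unit time (assumed value for the demonstration)
horizon = 20_000     # number of unit-length intervals to simulate

# Place events on the time axis using exponential gaps between neighbours.
counts = [0] * horizon
t = random.expovariate(rate)
while t < horizon:
    counts[int(t)] += 1          # event falls in the interval [int(t), int(t) + 1)
    t += random.expovariate(rate)

mean_count = sum(counts) / horizon
var_count = sum((c - mean_count) ** 2 for c in counts) / horizon

# For a Poisson count both values should be close to the rate.
print(mean_count, var_count)
```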
(Continuous) Gamma Distribution
⊙ Relation with the Exponential Distribution
The exponential distribution is a special gamma distribution with α = 1. If X1, X2, …, Xn are independent exponential random variables with the same rate 1/β (mean β), their sum S = X1 + X2 + … + Xn follows a gamma distribution, Γ(n, β).
Example
Suppose the time X (in months) until a traffic accident is observed in a region has the probability density $f(x) = 3e^{-3x}$, $0 < x < \infty$. Find the probability that the second accident is observed between the first and second months. Assume that all accidents are independent.
Solution
X1: time until the first accident
X2: time between the first and second accidents
$X_i \sim \mathrm{Exp}(1/3)$, i = 1, 2
S = X1 + X2: time until the second accident, $S \sim \Gamma(2, 1/3)$
Probability density function of S: $f(s) = 9 s e^{-3s}$, $s > 0$
Answer: $P(1 < S < 2) = \int_1^2 9 s e^{-3s}\,ds = 4e^{-3} - 7e^{-6} \approx 0.1818$
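Reading the question as P(1 < S < 2) for S = X1 + X2 (an interpretation assumed here), the answer can be computed two ways: from the closed-form distribution function of a gamma distribution with shape 2, and by numerically integrating its density. The sketch below does both; the Simpson helper and function names are introduced for illustration.

```python
from math import exp

RATE = 3.0  # rate of each exponential waiting time (per month)


def gamma2_pdf(s, rate=RATE):
    """Density of the sum of two independent exponentials: rate^2 * s * exp(-rate * s)."""
    return rate**2 * s * exp(-rate * s)


def gamma2_cdf(s, rate=RATE):
    """Closed-form P(S <= s) for shape 2: 1 - exp(-rate*s) * (1 + rate*s)."""
    return 1 - exp(-rate * s) * (1 + rate * s)


def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


p_closed = gamma2_cdf(2) - gamma2_cdf(1)
p_numeric = simpson(gamma2_pdf, 1, 2)

print(round(p_closed, 4), round(p_numeric, 4))
```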
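(Continuous) Chi Square Distribution
A special gamma distribution with α = r/2 and β = 2, where r is the degrees of freedom.
PD: $f(x) = \dfrac{1}{\Gamma(r/2)\,2^{r/2}}\, x^{r/2 - 1} e^{-x/2}$, $x > 0$
$E(X) = r$
$\mathrm{Var}(X) = 2r$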
(Continuous) Chi Square Distribution
A random variable X follows a chi-square distribution with 5 degrees of freedom. Calculate the critical value $x_0$ satisfying $P(X < x_0) = 0.95$.
Since $P(X < x_0) = 0.95$, $P(X > x_0) = 0.05$. From the table, find the entry with d.f. = 5 and α = 0.05: $x_0 = \chi^2_{0.05}(5) \approx 11.07$.
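Instead of a table lookup, the critical value can be approximated numerically. The sketch below integrates the standard chi-square density with Simpson's rule and solves P(X < x0) = 0.95 by bisection. It is an illustration under stated assumptions: it requires at least 2 degrees of freedom (so the density is finite at 0), the search interval [0, 100] is arbitrary, and the function names are introduced here.

```python
from math import exp, gamma


def chi2_pdf(x, r):
    """Chi-square density with r degrees of freedom (gamma with alpha = r/2, beta = 2)."""
    return x ** (r / 2 - 1) * exp(-x / 2) / (2 ** (r / 2) * gamma(r / 2))


def chi2_cdf(x, r, n=2000):
    """P(X <= x) by composite Simpson integration on [0, x]; assumes r >= 2."""
    if x <= 0:
        return 0.0
    h = x / n
    total = chi2_pdf(0.0, r) + chi2_pdf(x, r)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, r)
    return total * h / 3


def chi2_critical(prob, r, lo=0.0, hi=100.0, iterations=60):
    """Bisection for x0 with P(X < x0) = prob; assumes x0 lies in [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, r) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


x0 = chi2_critical(0.95, 5)
print(round(x0, 2))
```

Bisection is used because the distribution function is increasing, so the root is unique and the method cannot diverge.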
(Continuous) Chi Square Distribution: Why do we have to bother with it?