# QT1 - 05 - Probability Distributions

Class notes used in Quantitative Techniques - I course at Praxis Business School, Calcutta

1. 1. Probability Distributions (Quantitative Techniques 1)
2. 2. Contents <ul><li>Probability Distributions </li></ul><ul><li>Random Variable </li></ul><ul><ul><li>Expected Value of a Random Variable </li></ul></ul><ul><li>Bernoulli Random Variable </li></ul><ul><ul><li>Bernoulli Distribution </li></ul></ul><ul><li>Binomial Random Variable </li></ul><ul><ul><li>Binomial Distribution </li></ul></ul><ul><li>Poisson Random Variable </li></ul><ul><ul><li>Poisson Distribution </li></ul></ul><ul><li>How can we predict the length of queues at a retail counter ? </li></ul>
3. 3. Frequency Distributions (Again!) <ul><li>The results of two experiments are given here </li></ul><ul><ul><li>To test hearing ability, words are read out and the listener has to repeat what he has heard </li></ul></ul><ul><ul><li>People desiring to join the army are tested – among other things – on the size of their chest </li></ul></ul>
4. 4. From Frequency Distributions to Probability Distributions <ul><li>Actual frequency observed in a group of army recruits </li></ul><ul><li>Is it possible for us to define a mathematical expression </li></ul><ul><ul><li>that will allow us to predict the relative frequency distribution ? </li></ul></ul><ul><ul><li>Without actual experiments ? </li></ul></ul><ul><li>If YES then that mathematical expression is a probability distribution </li></ul>
5. 5. Probability Distribution <ul><li>Frequency Distribution </li></ul><ul><ul><li>Is a listing of the observed frequencies of all the outcomes of an experiment that did occur when the experiment was performed </li></ul></ul><ul><li>Probability Distribution </li></ul><ul><ul><li>Is a listing of the probabilities of all the possible outcomes that could occur if the experiment was performed </li></ul></ul><ul><li>Discrete Probability Distribution </li></ul><ul><ul><li>Can take on only a countable set of values that can be listed </li></ul></ul><ul><ul><ul><li>Size of chest </li></ul></ul></ul><ul><ul><ul><li>Score in hearing test </li></ul></ul></ul><ul><li>Continuous Probability Distribution </li></ul><ul><ul><li>Can take on any value within a certain range </li></ul></ul><ul><ul><ul><li>Wealth level in a population </li></ul></ul></ul><ul><ul><ul><li>Protein content in eggs </li></ul></ul></ul>
6. 6. Prediction <ul><li>Can we predict the relative frequency or probability of </li></ul><ul><ul><li>Acetic Acid in any collection of eggs ? </li></ul></ul><ul><ul><li>Billionaires in any year </li></ul></ul><ul><li>Based on the data given here ? </li></ul><ul><li>YES : If you can find the probability distribution </li></ul>
7. 7. Models !
8. 8. Basic Modelling Technique
Experimental data or observation -> can be represented by a set of random numbers that obeys certain rules:
Toss a Coin; Head or Tail -> 1 or 0
How many Heads in 10 Coin Tosses -> Integer between 0 and 10
Chest Size of Army Recruits -> Integer between 30 and 50
Amount of Protein in an Egg -> Any number between 3 and 5
Number of Members in a Family -> Integer between 0 and 10
Money spent by a family at a multiplex -> Any positive number less than 2000
Number of Defects per thousand -> Integer between 0 and 5
Percentage of People who die before 30 -> Any number from 0 to 100
9. 9. Random Variable <ul><li>A variable is random if it takes on different values as a result of the outcomes of a genuinely random experiment </li></ul><ul><li>A random variable can be </li></ul><ul><ul><li>Discrete : takes on a fixed set of values </li></ul></ul><ul><ul><li>Continuous : takes on any value in a range </li></ul></ul><ul><li>Can be thought of as a value or magnitude that changes from occurrence to occurrence without any predictable sequence </li></ul><ul><ul><li>While meeting some necessary conditions </li></ul></ul>
10. 10. Expected Value <ul><li>Expected value is a fundamental idea in the study of probability distributions </li></ul><ul><li>A RV X can take on values x1, x2, x3, ..., xn </li></ul><ul><li>Expected Value </li></ul><ul><ul><li>Summation of ( value of the RV x probability of occurrence of that value ) across all possible values of the Random Variable </li></ul></ul><ul><ul><li>E(X) = Σ xi P(X = xi) </li></ul></ul><ul><li>The expected value of a RV is equivalent to the mean of the population / sample that it is representing or modelling </li></ul><ul><ul><li>So is the case with the standard deviation </li></ul></ul>
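The expected-value rule above can be illustrated with a short Python sketch (an illustration added to these notes, not part of the original slides), using a fair six-sided die:

```python
# E(X) = sum over all values of: (value of the RV) x (probability of that value)
# Example: a fair six-sided die, each face appearing with probability 1/6
values = [1, 2, 3, 4, 5, 6]
probs = [1/6] * 6

expected = sum(x * p for x, p in zip(values, probs))
print(expected)  # 3.5
```

The result, 3.5, is the mean one would observe over many rolls, which is exactly the equivalence between expected value and population mean stated above.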
11. 11. Expected Value / Mean
12. 12. Tossing Coins
13. 13. Bernoulli Random Variable <ul><li>Consider a variable R that can take on only two values : 0 and 1 </li></ul><ul><ul><li>An experiment can result in only two outcomes, A and B </li></ul></ul><ul><ul><ul><li>R = 1 when Event A happens </li></ul></ul></ul><ul><ul><ul><li>R = 0 when Event B happens </li></ul></ul></ul><ul><ul><li>The probabilities for the two events are as follows </li></ul></ul><ul><ul><ul><li>P(A) = P(R=1) = p <= 1 </li></ul></ul></ul><ul><ul><ul><li>P(B) = P(R=0) = q <= 1 </li></ul></ul></ul><ul><ul><ul><li>p + q = 1 </li></ul></ul></ul>[Chart: probability ( p for R=1, q for R=0 ) and cumulative probability, p + q = 1]
14. 14. Physical Bernoulli Processes <ul><li>Experiment : Coin is tossed </li></ul><ul><ul><li>Event A : Head </li></ul></ul><ul><ul><li>Event B : Tail </li></ul></ul><ul><ul><li>P(A) = 0.5 </li></ul></ul><ul><ul><li>P(B) = 0.5 </li></ul></ul><ul><ul><li>P(R=1) = 0.5 = p </li></ul></ul><ul><ul><li>P(R=0) = 0.5 = q </li></ul></ul><ul><ul><li>p + q = 1 </li></ul></ul><ul><li>Experiment : A die is rolled </li></ul><ul><ul><li>Event A : 1 or 2 is on top </li></ul></ul><ul><ul><li>Event B : 3,4,5,6 on top </li></ul></ul><ul><ul><li>P(A) = 2/6 </li></ul></ul><ul><ul><li>P(B) = 4/6 </li></ul></ul><ul><ul><li>P(R=1) = 1/3 = p </li></ul></ul><ul><ul><li>P(R=0) = 2/3 = q </li></ul></ul><ul><ul><li>p + q = 1 </li></ul></ul><ul><li>How would the situation change if side 1 is loaded and made heavy ? </li></ul>
15. 15. Another Bernoulli Event <ul><li>A yoga club consists of 10 members of whom </li></ul><ul><ul><li>2 Doctors, 3 Engineers, 3 CA and 2 Other </li></ul></ul><ul><ul><li>4 women, 6 men </li></ul></ul><ul><ul><li>7 married, 3 single </li></ul></ul><ul><li>If we were to choose a club secretary by lottery, what is the probability that the secretary is a </li></ul><ul><ul><li>Doctor ? </li></ul></ul><ul><ul><li>Woman ? </li></ul></ul><ul><ul><li>Married ? </li></ul></ul><ul><li>Case 1 </li></ul><ul><ul><li>Event A : Doctor </li></ul></ul><ul><ul><li>Event B : Not Doctor </li></ul></ul><ul><ul><li>P(R=1) = 2/10 </li></ul></ul><ul><ul><li>P(R=0) = 8/10 </li></ul></ul><ul><li>Case 2 </li></ul><ul><ul><li>Event A : Woman </li></ul></ul><ul><ul><li>Event B : Not Woman </li></ul></ul><ul><ul><li>P(R=1) = 4/10 </li></ul></ul><ul><ul><li>P(R=0) = 6/10 </li></ul></ul><ul><li>Case 3 ? </li></ul>
16. 16. Another Bernoulli Scenario <ul><li>A Quality Control inspector checks cartons of breakfast cereals for </li></ul><ul><ul><li>Correctness of weight </li></ul></ul><ul><ul><li>Damage to cartons </li></ul></ul><ul><li>Historical data shows that </li></ul><ul><ul><li>1% cartons are underweight </li></ul></ul><ul><ul><li>0.5% cartons are damaged </li></ul></ul><ul><ul><li>0.5% cartons have both flaws </li></ul></ul><ul><li>If he inspects 1000 cartons how many would he expect to reject ? </li></ul><ul><li>Hence .. what is the probability of rejecting any carton ? </li></ul><ul><li>Event W = underweight </li></ul><ul><ul><li>P(W) = 0.01 </li></ul></ul><ul><li>Event D = damaged </li></ul><ul><ul><li>P(D) = 0.005 </li></ul></ul><ul><li>Event WD = both </li></ul><ul><ul><li>P(WD) = 0.005 </li></ul></ul><ul><li>Event A = reject </li></ul><ul><ul><li>A happens if EITHER W happens OR D happens </li></ul></ul><ul><ul><li>W and D are not mutually exclusive </li></ul></ul><ul><ul><li>P(A) = P(W) + P(D) – P(WD) = 0.01 </li></ul></ul><ul><li>Event B = accept </li></ul>
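The inclusion-exclusion step in this quality-control example can be checked with a few lines of Python (an illustrative sketch added here, using the probabilities given on the slide):

```python
# W = underweight, D = damaged; W and D are NOT mutually exclusive,
# so P(reject) = P(W) + P(D) - P(W and D)
p_w, p_d, p_wd = 0.01, 0.005, 0.005

p_reject = p_w + p_d - p_wd
print(round(p_reject, 4))      # 0.01
print(round(1000 * p_reject))  # about 10 rejected cartons expected per 1000
```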
17. 17. Characteristics of a Bernoulli Event <ul><li>Each event has only two possible outcomes </li></ul><ul><ul><li>head or tail, success or failure, yes or no </li></ul></ul><ul><li>Probability of a “success” ( or head, or “yes” ...) remains constant across multiple occurrences </li></ul><ul><ul><li>If the demographics of the membership change, then the probability of a Doctor ( or woman, or married person ) being the secretary of the club might change. In that case they are no longer two events from the same distribution </li></ul></ul><ul><li>Any two Bernoulli events of the same family are independent of each other </li></ul><ul><ul><li>If a member cannot become a secretary a second time, then our Bernoulli distribution is no longer relevant </li></ul></ul>
18. 18. Simulating a few Bernoulli Events <ul><li>Generate a random number between 0 and 1 </li></ul><ul><li>If the number is less than “p” </li></ul><ul><ul><li>Event is “Success”, Bernoulli Random Variable = 1 </li></ul></ul><ul><li>Else </li></ul><ul><ul><li>Event is “Failure”, Bernoulli Random Variable = 0 </li></ul></ul><ul><li>By changing the success probability we can simulate any kind of event </li></ul>
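The simulation recipe above translates directly into Python (a minimal sketch added to these notes; the seed and trial count are arbitrary choices for reproducibility):

```python
import random

def bernoulli(p):
    """Return 1 ("success") with probability p, else 0 ("failure")."""
    return 1 if random.random() < p else 0

# Simulate many trials; the observed success rate should approach p
random.seed(42)
p = 0.2  # e.g. probability of a doctor being chosen as club secretary
trials = [bernoulli(p) for _ in range(100_000)]
rate = sum(trials) / len(trials)
print(rate)  # close to 0.2
```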
19. 19. Simulating Many Bernoulli Events
20. 20. Tossing More Coins
21. 21. Bernoulli to Binomial <ul><li>Yoga Club </li></ul><ul><ul><li>Doctor as club secretary </li></ul></ul><ul><ul><li>P(A) = 0.2 </li></ul></ul><ul><ul><li>P(R=1) = 0.2 </li></ul></ul><ul><ul><li>P(R=0) = 0.8 </li></ul></ul><ul><li>What is the probability that in the next 10 years there will be a doctor as secretary </li></ul><ul><ul><li>In exactly three years </li></ul></ul><ul><ul><li>In three or more years </li></ul></ul><ul><li>Quality Control </li></ul><ul><ul><li>Carton is rejected </li></ul></ul><ul><ul><li>P(A) = 0.01 </li></ul></ul><ul><ul><li>P(R=1) = 0.01 </li></ul></ul><ul><ul><li>P(R=0) = 0.99 </li></ul></ul><ul><li>What is the probability that in a batch of 1000 cartons the inspector will find </li></ul><ul><ul><li>Exactly 50 defective cartons </li></ul></ul><ul><ul><li>More than 50 defective cartons </li></ul></ul>
22. 22. Binomial Random Variable <ul><li>Binomial RV is defined as the </li></ul><ul><ul><li>Probability of r number of successes </li></ul></ul><ul><ul><li>In n Bernoulli experiments </li></ul></ul><ul><ul><li>Where the Bernoulli success probability is p </li></ul></ul><ul><li>Bin(n, p, r) = [ n! / ( r! (n-r)! ) ] p^r (1-p)^(n-r) </li></ul><ul><li>Yoga Club </li></ul><ul><ul><li>Probability of exactly 3 doctor secretaries in 10 years </li></ul></ul><ul><ul><li>Bin(n = 10, p = 0.2, r = 3) </li></ul></ul><ul><li>Quality Control </li></ul><ul><ul><li>Probability of exactly 50 defective cartons in 1000 </li></ul></ul><ul><ul><li>Bin(n = 1000, p = 0.01, r = 50) </li></ul></ul>
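The binomial formula can be sketched in Python (an illustration added here, not part of the slides), using the standard-library `math.comb` for the n! / ( r! (n-r)! ) term:

```python
from math import comb

def binomial_pmf(n, p, r):
    """P(exactly r successes in n Bernoulli trials, each with success probability p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

# Yoga club: exactly 3 doctor secretaries in 10 years, p = 0.2
print(binomial_pmf(10, 0.2, 3))      # ~0.201
# Quality control: exactly 50 defective cartons in 1000, p = 0.01
print(binomial_pmf(1000, 0.01, 50))  # vanishingly small, since the mean is only 10
```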
23. 23. Binomial Distribution : Experiment and Theory
24. 24. Practical Considerations <ul><li>Binomial as a collection of Bernoulli events </li></ul><ul><ul><li>The probability of success of each Bernoulli event must be identical </li></ul></ul><ul><ul><li>The Bernoulli events must be statistically independent. </li></ul></ul><ul><li>In a real situation .. </li></ul><ul><ul><li>A machine that is used to manufacture products gets – slowly but inevitably – worn out and so the probability of a defective part increases with time </li></ul></ul><ul><ul><li>Selections and elections may not be pure random processes. </li></ul></ul><ul><ul><ul><li>Club rules may debar people from contesting a second time ! </li></ul></ul></ul>Determination of the p value – the Bernoulli success probability – is very important but equally difficult
25. 25. Central Tendency & Dispersion of Binomial Distribution <ul><li>If R is a random variable which follows the binomial distribution </li></ul><ul><li>Where </li></ul><ul><ul><li>n = number of trials </li></ul></ul><ul><ul><li>p = probability of success in any trial </li></ul></ul><ul><ul><ul><li>= Bernoulli probability </li></ul></ul></ul><ul><li>Mean of R </li></ul><ul><ul><li>Expected Value of R </li></ul></ul><ul><li>μ = np </li></ul><ul><li>Standard Deviation of R </li></ul><ul><li>σ = √( np(1-p) ) </li></ul><ul><li>σ = √( npq ) </li></ul>
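As a rough check (a sketch added to these notes; n, p and the number of draws are arbitrary choices), the formulas μ = np and σ = √( np(1-p) ) can be verified by simulation:

```python
import random
from math import sqrt

n, p = 100, 0.3
mu = n * p                     # theoretical mean: np = 30
sigma = sqrt(n * p * (1 - p))  # theoretical standard deviation: ~4.58

# Each binomial draw is the number of successes in n Bernoulli trials
random.seed(0)
draws = [sum(random.random() < p for _ in range(n)) for _ in range(20_000)]

sample_mean = sum(draws) / len(draws)
sample_sd = sqrt(sum((d - sample_mean) ** 2 for d in draws) / len(draws))
print(sample_mean, sample_sd)  # both close to the theoretical 30 and 4.58
```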
26. 26. Impact of 'p' on Binomial Probability
27. 27. Impact of 'n' on Binomial Probability
28. 28. Arrivals
29. 29. Poisson Distribution <ul><li>Describes processes that model the number of events in one unit of time </li></ul><ul><ul><li>Arrivals of cars or trucks at a toll booth </li></ul></ul><ul><ul><li>Arrival of customers at a store </li></ul></ul><ul><ul><li>Arrival of products at a machine station </li></ul></ul><ul><ul><li>Number of accidents at a road junction </li></ul></ul><ul><li>Can be described by a discrete random variable </li></ul><ul><li>Coming From Binomial Distribution </li></ul><ul><ul><li>When the number of trials in a binomial distribution becomes large ( > 30 ) while p remains small ( np held constant ), it approaches Poisson </li></ul></ul><ul><li>Going Towards Exponential Distribution </li></ul><ul><ul><li>Poisson Distribution can be used to estimate the interval between two events </li></ul></ul><ul><ul><li>Used in Operations Simulation </li></ul></ul>
30. 30. Poisson and Binomial <ul><li>Define an interval of time T ( say hour or day ) </li></ul><ul><ul><li>Determine the average number of arrivals ( λ ) during T </li></ul></ul><ul><li>Define a small unit of time t ( say minute or second ) </li></ul><ul><ul><li>n small units of time in the interval T : n = T/t </li></ul></ul><ul><ul><ul><li>Only 1 arrival can happen in one unit of time </li></ul></ul></ul><ul><ul><ul><li>Probability of arrival (“success”) in 1 unit of time = p </li></ul></ul></ul><ul><ul><ul><li>Number of arrivals ( that is “successes” ) is a Binomial Distribution with parameters n, p </li></ul></ul></ul><ul><ul><ul><li>Mean of this μ = np </li></ul></ul></ul><ul><ul><ul><li>= average number of arrivals in interval T </li></ul></ul></ul><ul><li>Binomial(n,p) is equivalent to Poisson( λ ) </li></ul><ul><ul><li>if λ = μ = np </li></ul></ul>
31. 31. Equivalence of Binomial & Poisson <ul><li>Binomial Probability of x successes in n trials with success probability p </li></ul><ul><li>P(x | n, p) = [ n! / ( x! (n-x)! ) ] p^x (1-p)^(n-x) </li></ul><ul><li>Where np = average number of successes </li></ul><ul><li>Poisson Probability of exactly x arrivals in time interval T </li></ul><ul><li>P(x | λ) = λ^x e^(-λ) / x! </li></ul><ul><li>Where λ = average number of arrivals in interval T </li></ul>The Poisson probability is a good approximation of the Binomial probability when the number of trials becomes very high Example : Events (“arrivals”) are happening every minute or second !
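The convergence claimed here is easy to see numerically. The sketch below (added for illustration) holds the mean λ = np fixed while n grows and p = λ/n shrinks:

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, x):
    # Binomial probability of x successes in n trials
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(lam, x):
    # Poisson probability of exactly x arrivals when the mean count is lam
    return lam**x * exp(-lam) / factorial(x)

# Hold lam = np fixed while n grows and p = lam/n shrinks
lam, x = 5, 3
for n in (10, 100, 1000):
    print(n, binomial_pmf(n, lam / n, x), poisson_pmf(lam, x))
# The binomial column approaches the Poisson value ~0.1404 as n increases
```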
32. 32. Bernoulli => Binomial => Poisson
33. 33. Similarity of Processes <ul><li>Bernoulli Process </li></ul><ul><ul><li>Single event with success probability p </li></ul></ul><ul><li>Binomial Process </li></ul><ul><ul><li>Number of successes in n Bernoulli trials each with success probability p </li></ul></ul><ul><ul><li>Mean number of successes = μ = np </li></ul></ul><ul><li>Poisson Process </li></ul><ul><ul><li>Number of arrivals in n time slots ( n is very high > 30 ) where the probability of arrival in a single time slot is p </li></ul></ul><ul><ul><li>Mean number of arrivals = λ = np </li></ul></ul>
34. 34. Poisson Distribution in terms of average number of events <ul><li>Probability of exactly x arrivals in time interval T </li></ul><ul><li>P(x) = λ^x e^(-λ) / x! </li></ul><ul><li>Where λ = average number of arrivals in interval T </li></ul><ul><li>Assumptions </li></ul><ul><ul><li>The average ( mean ) number of events / unit time can be estimated from past data </li></ul></ul><ul><ul><li>Events are NOT simultaneous </li></ul></ul><ul><ul><ul><li>There will be some small gap ( t ) between events </li></ul></ul></ul><ul><ul><li>Events are independent of each other </li></ul></ul><ul><ul><li>Events are equally likely over the entire time unit </li></ul></ul><ul><ul><ul><li>p is constant </li></ul></ul></ul><ul><ul><ul><li>Not always true </li></ul></ul></ul>
35. 35. Poisson @ childbirth <ul><li>Probability of exactly x childbirths in one day </li></ul><ul><li>P(x) = λ^x e^(-λ) / x! </li></ul><ul><li>Where λ = average number of childbirths in a day = 5 </li></ul><ul><li>P(x=0) = 0.00674 </li></ul><ul><li>P(x=1) = 0.03370 </li></ul><ul><li>P(x=2) = 0.08425 </li></ul><ul><li>P(x=3) = 0.14042 </li></ul><ul><li>P(x=4) = 0.17552 </li></ul><ul><li>What is the probability of three or fewer babies ? </li></ul><ul><li>What is the probability of at least 4 babies in a day ? </li></ul>
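The two questions at the end of this slide can be answered with cumulative sums of the Poisson probabilities (a sketch added here, using the same λ = 5):

```python
from math import exp, factorial

def poisson_pmf(lam, x):
    """P(exactly x events) when the average count is lam."""
    return lam**x * exp(-lam) / factorial(x)

lam = 5  # average childbirths per day
# Three or fewer babies: P(0) + P(1) + P(2) + P(3)
p_three_or_fewer = sum(poisson_pmf(lam, x) for x in range(4))
# At least 4 babies: the complement of "three or fewer"
p_at_least_four = 1 - p_three_or_fewer

print(round(p_three_or_fewer, 4))  # 0.265
print(round(p_at_least_four, 4))   # 0.735
```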
36. 36. Two Hospitals <ul><li>Probability of exactly x childbirths in one day in one hospital </li></ul><ul><li>P(x) = λ^x e^(-λ) / x! </li></ul><ul><li>Where λ = average number of childbirths in a day </li></ul><ul><li>Probability of exactly x childbirths in one day across both hospitals </li></ul><ul><li>P(x) = (λ1 + λ2)^x e^(-(λ1 + λ2)) / x! </li></ul><ul><li>Where λ1, λ2 = average number of births in the two hospitals </li></ul>
37. 37. Poisson Distribution <ul><li>Probability of exactly x arrivals in interval 0 -> T </li></ul><ul><li>P(x) = λ^x e^(-λ) / x! </li></ul><ul><li>Where λ = average number of arrivals in interval [0, T] </li></ul><ul><li>Probability of exactly x events in interval 0 -> t </li></ul><ul><li>P(x) = λ0^x e^(-λ0) / x! </li></ul><ul><li>Where λ0 = average number of arrivals in interval [0, t] </li></ul><ul><li>λ0 = λ/n = λt/T </li></ul><ul><li>λ0 = (λ/T) t = λr t </li></ul><ul><li>λr = rate at which arrivals happen ! </li></ul>
38. 38. Poisson Distribution <ul><li>Probability of exactly x arrivals in interval 0 -> t </li></ul><ul><li>P(x) = λ0^x e^(-λ0) / x! </li></ul><ul><li>Where λ0 = average number of arrivals in interval [0, t] </li></ul><ul><li>λ0 = λ/n = λt/T </li></ul><ul><li>λ0 = (λ/T) t = λr t </li></ul><ul><li>λr = rate at which arrivals happen ! </li></ul><ul><li>Probability of exactly x arrivals in interval 0 -> t </li></ul><ul><li>P(x) = (λr t)^x e^(-λr t) / x! </li></ul><ul><li>λr = number of arrivals per unit time </li></ul><ul><li>λr t = mean number of arrivals in interval t </li></ul>
39. 39. Exponential Distribution : Interval between two Arrivals <ul><li>Number of arrivals is modelled by the Poisson Distribution </li></ul><ul><li>Probability of exactly x arrivals in interval 0 -> t </li></ul><ul><li>P(x) = (λr t)^x e^(-λr t) / x! </li></ul><ul><li>λr = number of arrivals per unit time </li></ul><ul><li>λr t = mean number of arrivals in interval t </li></ul><ul><li>The interval between two arrivals is modelled by the Exponential Distribution </li></ul><ul><li>Probability density of an interval of t units between two arrivals </li></ul><ul><li>P(t) = λr e^(-λr t) </li></ul><ul><li>Mean of t = 1/λr </li></ul>
40. 40. Close Cousins in Randomness <ul><li>Poisson Variable </li></ul><ul><ul><li>Discrete Random Variable </li></ul></ul><ul><ul><li>Takes Integer Values </li></ul></ul><ul><ul><li>Represents number of events ( or arrivals ) over a period of time </li></ul></ul><ul><ul><li>Mean Number = λr </li></ul></ul><ul><ul><ul><li>Rate of arrival </li></ul></ul></ul><ul><ul><ul><li>Number of arrivals per unit time </li></ul></ul></ul><ul><li>P(x) = (λr t)^x e^(-λr t) / x! </li></ul><ul><li>Exponential Variable </li></ul><ul><ul><li>Continuous Random Variable </li></ul></ul><ul><ul><li>Takes any positive value, not necessarily integer </li></ul></ul><ul><ul><li>Represents the interval between two events ( or arrivals ) </li></ul></ul><ul><ul><li>Mean interval = 1/λr </li></ul></ul><ul><li>P(t) = λr e^(-λr t) </li></ul>
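The pairing of the two "cousins" can be demonstrated by simulation (a sketch added to these notes; the rate and time horizon are arbitrary choices): drawing exponential inter-arrival gaps reproduces both the mean interval 1/λr and the arrival rate λr of the Poisson counts.

```python
import random

rate = 2.0          # lambda_r: average arrivals per unit time
horizon = 10_000.0  # total simulated time

# Draw exponential inter-arrival gaps (mean 1/rate) until time runs out
random.seed(1)
t, intervals = 0.0, []
while True:
    gap = random.expovariate(rate)
    if t + gap > horizon:
        break
    t += gap
    intervals.append(gap)

mean_interval = sum(intervals) / len(intervals)
arrival_rate = len(intervals) / horizon
print(mean_interval)  # ~0.5 = 1/rate, mean of the exponential intervals
print(arrival_rate)   # ~2.0 = rate, mean number of arrivals per unit time
```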
41. 41. Modelling Queues