# Discrete probability

• Note on independent events, from the multiplication rule:
P(A ∩ B) = P(A|B) . P(B)
For independent events, P(A ∩ B) = P(A) . P(B), so P(A) . P(B) = P(A|B) . P(B)
Hence, when P(B) > 0, P(A|B) = P(A)
### Discrete probability

1. Discrete Probability
2. Overview • Basic probability • Conditional probability • Bayes' theorem • Random variables – Probability distribution – Expectation, variance, standard deviation – Correlation
3. Sample space • The sample space of an experiment, denoted by S, is the set of all possible outcomes of that experiment • One experiment consists of examining a single fuse to see whether it is defective. The sample space for this experiment can be abbreviated as S = {N, D}, where N represents non-defective and D represents defective • If we examine three fuses in sequence then the outcome for the entire experiment is any sequence of N and D of length 3, so – S = {NNN, NND, NDN, NDD, DNN, DND, DDN, DDD}
4. Events • An event is any collection (subset) of outcomes contained in the sample space S • An event is said to be simple if it consists of exactly one outcome and compound if it consists of more than one outcome • Event A is said to occur if the resulting experiment outcome is contained in A • A = {NND, NDN, DNN} = the event that exactly one of the fuses is defective • B = {NNN, NND, NDN, DNN} = the event that at most one of the fuses is defective • C = {NND, NDN, NDD, DNN, DND, DDN, DDD} = the event that at least one of the fuses is defective
5. Some relations from set theory • An event is nothing but a set • The union of two events A and B, denoted by A ∪ B (read as A or B), is the event consisting of all outcomes for which both A and B occur as well as outcomes for which exactly one occurs • The intersection of two events A and B, denoted by A ∩ B (read as A and B), is the event consisting of all outcomes that are in both A and B • The complement of an event A, denoted by A′, is the set of all outcomes in S that are not contained in A • When events A and B have no outcomes in common, they are said to be mutually exclusive or disjoint events
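
The fuse example and the set relations above map directly onto Python sets. The following is a minimal sketch; the variable names are chosen here for illustration and are not from the slides.

```python
from itertools import product

# Sample space for examining three fuses: N = non-defective, D = defective.
S = {"".join(seq) for seq in product("ND", repeat=3)}

# Events from the slides, built by counting defective fuses in each outcome.
A = {s for s in S if s.count("D") == 1}   # exactly one defective
B = {s for s in S if s.count("D") <= 1}   # at most one defective
C = {s for s in S if s.count("D") >= 1}   # at least one defective

union = A | C            # A ∪ C
intersection = B & C     # B ∩ C
complement_C = S - C     # C′, everything in S that is not in C

print(sorted(A))            # ['DNN', 'NDN', 'NND']
print(intersection == A)    # True: "at most one" and "at least one" means "exactly one"
print(complement_C)         # {'NNN'}
```
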
6. Axioms • Given an experiment and a sample space S, the objective of probability is to assign to each event A a number P(A), called the probability of the event A, which gives a precise measure of the chance that A will occur • Axiom 1: – For any event A, P(A) ≥ 0 • Axiom 2: – P(S) = 1 • Axiom 3: – If A1, A2, … is a collection of mutually exclusive events then • P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + … = Σ P(Ai)
7. Example • Consider an experiment in which a single coin is tossed • Sample space S = {H, T} • Axiom 2 specifies P(S) = 1 • Since {H} and {T} are disjoint events with {H} ∪ {T} = S, Axiom 3 gives P(H) + P(T) = P(S) = 1 • => P(T) = 1 - P(H) • If P(H) = 0.5 then P(T) = 0.5
8. Properties of Probability • For any event A, P(A) = 1 – P(A′) • If A and B are mutually exclusive then P(A ∩ B) = 0 • For any two events A and B, P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
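
These properties can be checked numerically on a small example. The sketch below assumes one roll of a fair die with equally likely outcomes; the die and the events A and B are illustrative choices, not taken from the slides.

```python
from fractions import Fraction

S = set(range(1, 7))   # assumed example: one roll of a fair die
A = {2, 4, 6}          # the roll is even
B = {4, 5, 6}          # the roll is greater than 3

def P(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

complement_rule_holds = P(A) == 1 - P(S - A)
lhs = P(A | B)                    # P(A ∪ B)
rhs = P(A) + P(B) - P(A & B)      # P(A) + P(B) - P(A ∩ B)

print(complement_rule_holds)  # True
print(lhs, rhs)               # 2/3 2/3
```
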
9. Finite Probability space • Finite equiprobable space – P(A) = (number of elements in A) / (number of elements in S) • Finite probability spaces – Let S = {a1, a2, …, an} – Assign each point ai in S a real number pi, called the probability of ai, satisfying the following properties • pi ≥ 0 • p1 + p2 + p3 + … + pn = 1 – We get the following probability distribution:

    | outcome | a1 | a2 | … | an |
    |---|---|---|---|---|
    | probability | p1 | p2 | … | pn |
10. Example • Three coins are tossed and the number of heads is observed. S = {0, 1, 2, 3} • Let A be the event that at least one head appears – A = {1, 2, 3} – P(A) = P(1) + P(2) + P(3) = 3/8 + 3/8 + 1/8 = 7/8

    | outcome | 0 | 1 | 2 | 3 |
    |---|---|---|---|---|
    | probability | 1/8 | 3/8 | 3/8 | 1/8 |
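
The table for the three-coin example can be reproduced by enumerating the eight equally likely toss sequences. This is a sketch assuming fair, independent coins; the names `tosses` and `pmf` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely sequences of three fair coin tosses.
tosses = list(product("HT", repeat=3))

# Count how many sequences give each number of heads.
counts = Counter(t.count("H") for t in tosses)
pmf = {k: Fraction(c, len(tosses)) for k, c in counts.items()}

# P(at least one head), computed directly and via the complement rule.
p_at_least_one = sum(pmf[k] for k in (1, 2, 3))
p_via_complement = 1 - pmf[0]

print(sorted(pmf.items()))
print(p_at_least_one, p_via_complement)  # 7/8 7/8
```
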
11. Conditional probability • For any two events A and B with P(B) > 0, the conditional probability of A given that B has occurred is defined by – P(A|B) = P(A ∩ B) / P(B) • Example: A pair of fair dice is tossed. S consists of 36 ordered pairs (a, b). The probability of any point is 1/36. Find the probability that at least one of the dice shows 2, given that the sum is 6 • E = {sum is 6} and A = {2 appears on at least one die} • A = {(2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (1,2), (3,2), (4,2), (5,2), (6,2)} • E = {(1,5), (2,4), (3,3), (4,2), (5,1)} • A ∩ E = {(2,4), (4,2)} • P(A|E) = P(A ∩ E) / P(E) = (2/36) / (5/36) = 2/5
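
The dice example can be verified by brute-force enumeration of the 36 ordered pairs. A minimal sketch, with variable names chosen for illustration:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs (a, b) for two fair dice.
S = list(product(range(1, 7), repeat=2))

E = [(a, b) for (a, b) in S if a + b == 6]        # the sum is 6
A_and_E = [(a, b) for (a, b) in E if 2 in (a, b)]  # ...and a 2 appears

p_E = Fraction(len(E), len(S))
p_A_and_E = Fraction(len(A_and_E), len(S))

# Definition of conditional probability: P(A|E) = P(A ∩ E) / P(E).
p_A_given_E = p_A_and_E / p_E

print(A_and_E)       # [(2, 4), (4, 2)]
print(p_A_given_E)   # 2/5
```
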
12. Multiplication rule • The definition of conditional probability yields the following multiplication rule – P(A ∩ B) = P(A|B) . P(B)
13. Bayes' theorem • Bayes' theorem states that – P(A|B) = P(B|A) P(A) / P(B) • Alternative form of Bayes' theorem • By the law of total probability – P(B) = P(A ∩ B) + P(A′ ∩ B) = P(B|A) P(A) + P(B|A′) P(A′) – P(A|B) = P(B|A) P(A) / ( P(B|A) P(A) + P(B|A′) P(A′) )
14. Example • A school has 60% boys and 40% girls. The girls wear trousers or skirts in equal numbers. All boys wear trousers. An observer sees a student from a distance wearing trousers. What is the probability that the student is a girl? • Event A: the student is a girl • Event B: the student is wearing trousers • Find P(A|B) • P(A) = 0.4 • P(A′) = 0.6
15. Example • P(B|A) = 0.5 (probability that the student is wearing trousers, given that the student is a girl) • P(B|A′) = 1 (probability that the student is wearing trousers, given that the student is a boy) • P(B), the probability that a student is wearing trousers regardless of any other information, = P(B|A) P(A) + P(B|A′) P(A′) = 0.5 * 0.4 + 1 * 0.6 = 0.8 • P(A|B) = P(B|A) P(A) / P(B) = 0.5 * 0.4 / 0.8 = 0.25
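
The school example is a direct application of the alternative form of Bayes' theorem. The sketch below wraps that formula in a small helper; the function name `posterior` and its parameter names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_complement):
    """P(A|B) from P(A), P(B|A) and P(B|A′), via the law of total probability."""
    evidence = likelihood * prior + likelihood_complement * (1 - prior)  # P(B)
    return likelihood * prior / evidence

# Numbers from the example: P(A) = 0.4, P(B|A) = 0.5, P(B|A′) = 1.
p_girl_given_trousers = posterior(
    prior=Fraction(2, 5),
    likelihood=Fraction(1, 2),
    likelihood_complement=Fraction(1),
)

print(p_girl_given_trousers)  # 1/4
```

Exact fractions are used instead of floats so that the result matches 0.25 without rounding error.
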
16. Independent events • A and B are independent if and only if – P(A ∩ B) = P(A) . P(B) • For P(A) > 0 and P(B) > 0, P(A ∩ B) = P(A) . P(B) => P(B|A) = P(B) and P(A|B) = P(A)
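
The product test for independence can be tried on the two-dice sample space used earlier. The events below are illustrative assumptions, not from the slides: A = "first die is even", B = "sum is 7", C = "sum is 6".

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two fair dice, 36 equally likely pairs

def P(event):
    return Fraction(len(event), len(S))

def independent(X, Y):
    """Product test: X and Y are independent iff P(X ∩ Y) = P(X) P(Y)."""
    return P(X & Y) == P(X) * P(Y)

A = {(a, b) for (a, b) in S if a % 2 == 0}   # first die is even
B = {(a, b) for (a, b) in S if a + b == 7}   # sum is 7
C = {(a, b) for (a, b) in S if a + b == 6}   # sum is 6

print(independent(A, B))  # True:  P(A ∩ B) = 1/12 = (1/2)(1/6)
print(independent(A, C))  # False: P(A ∩ C) = 1/18, but P(A) P(C) = 5/72
```
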
17. Random variables • Given a sample space S of some experiment, a random variable X is any rule that associates a number with each outcome in S • Consider an experiment in which a fair coin is tossed 3 times and the sequence of heads (H) and tails (T) is observed • S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} • Let X be the random variable that assigns to each point in S the largest number of successive heads that occurs • X is a random variable with range space Rx = {0, 1, 2, 3}

    | Sample point | TTT | HTH | HTT | THT | TTH | HHT | THH | HHH |
    |---|---|---|---|---|---|---|---|---|
    | X | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 3 |
18. Random variable • Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable • Two types of random variables – Discrete: the range space is finite or countably infinite – Continuous: the range space is continuous, such as an interval or a union of intervals
19. Probability distribution • Let X be a discrete random variable, and suppose that the possible values it can assume are x1, x2, x3, …, arranged in some order. Suppose also that these values are assumed with probabilities given by – P(X = xk) = f(xk), k = 1, 2, … (1) • It is convenient to introduce the probability function, also referred to as the probability distribution, given by – P(X = x) = f(x) (2) • For x = xk, (2) reduces to (1), while for other values of x, f(x) = 0 • In general, f(x) is a probability function if – f(x) ≥ 0 – Σx f(x) = 1, where the sum is taken over all possible values of x
20. Probability distribution (Example) • Find the probability function corresponding to the random variable X:

    | Sample point | TTT | HTH | HTT | THT | TTH | HHT | THH | HHH |
    |---|---|---|---|---|---|---|---|---|
    | X | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 3 |

    Assuming that the coin is fair, we have • P(X=0) = P(TTT) = 1/8 • P(X=1) = P(HTH ∪ HTT ∪ THT ∪ TTH) = P(HTH) + P(HTT) + P(THT) + P(TTH) = 1/8 + 1/8 + 1/8 + 1/8 = 1/2 • P(X=2) = P(HHT ∪ THH) = P(HHT) + P(THH) = 1/8 + 1/8 = 1/4 • P(X=3) = P(HHH) = 1/8 • Thus the probability function is given by the table:

    | x | 0 | 1 | 2 | 3 |
    |---|---|---|---|---|
    | f(x) | 1/8 | 1/2 | 1/4 | 1/8 |
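
The probability function of X (largest number of successive heads) can also be obtained by enumeration. In this sketch the helper `longest_head_run` is a hypothetical name introduced for illustration; a fair coin is assumed, as in the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby, product

def longest_head_run(seq):
    """Largest number of successive heads in a sequence of 'H'/'T'."""
    return max((len(list(run)) for face, run in groupby(seq) if face == "H"),
               default=0)

outcomes = list(product("HT", repeat=3))   # 8 equally likely outcomes
counts = Counter(longest_head_run(o) for o in outcomes)

# f(x) = P(X = x) as exact fractions.
f = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

print(f)
print(sum(f.values()))   # 1, as required of a probability function
```
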
21. Expected value (or mean) of X • The expected value or mean value of X, denoted by E(X) or μx, is E(X) = μx = Σ x f(x) • A coin is tossed 3 times. Let X denote the largest number of successive heads • E(X) = 0(1/8) + 1(4/8) + 2(2/8) + 3(1/8) = 11/8 = 1.375

    | x | 0 | 1 | 2 | 3 |
    |---|---|---|---|---|
    | f(x) | 1/8 | 4/8 | 2/8 | 1/8 |
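
The expected value is a weighted sum, which is easy to compute from the table. A minimal sketch; the function name `expectation` is illustrative.

```python
from fractions import Fraction

# Probability function f(x) for the largest run of heads in 3 fair tosses.
f = {0: Fraction(1, 8), 1: Fraction(4, 8), 2: Fraction(2, 8), 3: Fraction(1, 8)}

def expectation(pmf):
    """E(X) = sum of x * f(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

mean = expectation(f)
print(mean, float(mean))  # 11/8 1.375
```
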
22. Expectation ... • Expectation, also called the mean, is a measure of central tendency. One of its uses is in the average-case analysis of algorithms • Note: E[X] = Σ xi p(xi) if X is discrete, and E[X] = ∫ x f(x) dx if X is continuous
23. Average Case Analysis • Sequential search: consider an array of n elements. If the element is in the ith position, then the number of comparisons needed to find it is i • Assume first that the element we are searching for is in the array and is equally likely to be in any of the n positions. Then – Asucc ::= Σ pr(the element is in the ith location) * Ti for 0 < i < n+1 ::= Σ (1/n) * i = (n(n+1)/2) * (1/n) = (n+1)/2 – Afail ::= expected number of comparisons if the element is not in the array = n • Here Asucc = expected number of comparisons needed to find the element, given that it is in the array, and Ti = number of comparisons needed to find the element at position i
24. If q is the probability that the element we are searching for is in the array, then 1-q is the probability that it is not in the array. Therefore, with A(n) = expected number of comparisons made by the sequential search algorithm, A(n) = q*Asucc + (1-q)*Afail = q*(n+1)/2 + (1-q)*n. If q = ½, that is, there is a 50-50 chance that the element is in the array, then A(n) = 3n/4 + ¼. Hence on average about ¾ of the array has to be checked.
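
The average-case formula can be checked by counting comparisons in an actual linear scan. This is a sketch under the slides' assumptions (a present key is equally likely to be at any position, and one comparison is charged per element examined); the function names are hypothetical.

```python
from fractions import Fraction

def sequential_search_comparisons(arr, key):
    """Number of element comparisons made by a linear scan for key."""
    for i, value in enumerate(arr, start=1):
        if value == key:
            return i          # found at position i after i comparisons
    return len(arr)           # not found: every element was compared

def expected_comparisons(n, q):
    """A(n) = q * Asucc + (1 - q) * Afail, measured on an array of n distinct keys."""
    arr = list(range(n))
    a_succ = Fraction(sum(sequential_search_comparisons(arr, k) for k in arr), n)
    a_fail = sequential_search_comparisons(arr, -1)   # -1 is not in arr
    return q * a_succ + (1 - q) * a_fail

n = 8
measured = expected_comparisons(n, Fraction(1, 2))
formula = Fraction(3 * n, 4) + Fraction(1, 4)
print(measured, formula)  # 25/4 25/4
```
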
25. What if we are given the information that the array is sorted in ascending order? If K is the element we are searching for and it is not in the array, then we can stop comparing as soon as the array element we are comparing against is greater than K, since all elements after it are also greater than K, and declare that K is not found • Example: in the array a1 a2 a3 a4 a5 a6 a7 a8 a9, stop at a5 if a5 > K • If Gi is the ith gap, there are n+1 gaps: G1 a1 G2 a2 G3 a3 G4 a4 G5 a5 G6 a6 G7 a7 G8 a8 G9 a9 G10 • If K is not in the array, then it falls in one of the n+1 gaps Gi. If K is in the array, then it is in one of the n locations ai.
26. Asucc = expected number of comparisons made if the element K is in the array ::= Σ pr(K is in the ith location) * Ti for 0 < i < n+1 ::= Σ (1/n) * i = (n(n+1)/2) * (1/n) = (n+1)/2 • Afail = expected number of comparisons made if K is not in the array, where Ti = number of comparisons made to find that K is in the ith gap Gi (Ti = i for the first n gaps, and n for the last gap) • Afail ::= Σ pr(K is in the ith gap) * Ti for 0 < i < n+2 ::= Σ (1/(n+1)) * j + (1/(n+1)) * n for 0 < j < n+1 ::= (n(n+1) + 2n) / (2(n+1)) = n/2 + n/(n+1)
27. A(n) = expected number of comparisons made by the search algorithm on a sorted array. If q is the probability that the element is in the array, then A(n) ::= q*Asucc + (1-q)*Afail ::= q(n+1)/2 + (1-q)(n/2 + n/(n+1)) ~ n/2. Therefore we end up doing about n/2 comparisons if the array is sorted.
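
The sorted-array analysis can be checked the same way. The sketch below assumes the slides' cost model (one comparison charged per element examined, a missing key equally likely to fall in any of the n+1 gaps); the array of even numbers and the function name are illustrative choices.

```python
from fractions import Fraction

def sorted_search_comparisons(arr, key):
    """Elements examined by a linear scan of an ascending array with early stop."""
    for i, value in enumerate(arr, start=1):
        if value >= key:      # found key, or passed the place where it would be
            return i
    return len(arr)           # key is larger than every element

n = 9
arr = [2 * k for k in range(1, n + 1)]        # 2, 4, ..., 18
gap_keys = [2 * k + 1 for k in range(n + 1)]  # 1, 3, ..., 19: one key per gap

a_succ = Fraction(sum(sorted_search_comparisons(arr, k) for k in arr), n)
a_fail = Fraction(sum(sorted_search_comparisons(arr, k) for k in gap_keys), n + 1)

print(a_succ, Fraction(n + 1, 2))                     # 5 5
print(a_fail, Fraction(n, 2) + Fraction(n, n + 1))    # 27/5 27/5
```
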
28. Classical Problems: Monty Hall • In the 1970s, there was a game show hosted by Monty Hall and his assistant Carol. At one stage of the game, a contestant is shown three doors. There is a prize behind one door and there are goats behind the other two. The contestant picks a door. To build suspense, Carol always opens a different door, revealing a goat. The contestant can then stick with his original door or switch to the other unopened door. He wins the prize only if he now picks the correct door. Should the contestant "stick" with his original door, "switch" to the other door, or does it not matter?
29. Step 1: The Sample Space • Let us consider an outcome as a triple of door numbers: 1. The number of the door concealing the prize. 2. The number of the door initially chosen by the contestant. 3. The number of the door Carol opens to reveal a goat. • For example, the outcome (2, 1, 3) represents the case where the prize is behind door 2, the contestant initially chooses door 1, and Carol reveals the goat behind door 3. In this case, a contestant using the "switch" strategy wins the prize. • Not every triple of numbers is an outcome; for example, (1, 2, 1) is not an outcome, because Carol never opens the door with the prize. Similarly, (1, 2, 2) is not an outcome, because Carol does not open the door initially selected by the contestant.
30. Steps 2 & 3: Event of interest & Assigning Probabilities
31. Step 4: Computing probability • So the probability of the contestant winning with the "switch" strategy is the sum of the probabilities Pr{W} ::= Pr{(1, 2, 3)} + Pr{(1, 3, 2)} + Pr{(2, 1, 3)} + Pr{(2, 3, 1)} + Pr{(3, 1, 2)} + Pr{(3, 2, 1)} ::= 2/3
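
The four steps can be followed mechanically by enumerating the outcome triples. The probability assignment in this sketch is an assumption, since the slide for steps 2 and 3 carries no text: the prize door and the contestant's pick are taken to be uniform and independent, and Carol is taken to choose uniformly among the doors she is allowed to open.

```python
from fractions import Fraction

doors = (1, 2, 3)
prob = {}   # outcome (prize, pick, opened) -> probability

for prize in doors:
    for pick in doors:
        # Carol never opens the prize door or the contestant's door.
        allowed = [d for d in doors if d != prize and d != pick]
        for opened in allowed:
            # Assumed: uniform prize, uniform pick, uniform choice by Carol.
            prob[(prize, pick, opened)] = Fraction(1, 3) * Fraction(1, 3) / len(allowed)

# Switching wins exactly when the initial pick was not the prize door.
switch_wins = sum(p for (prize, pick, _), p in prob.items() if prize != pick)
stick_wins = sum(p for (prize, pick, _), p in prob.items() if prize == pick)

print(len(prob), sum(prob.values()))  # 12 1
print(switch_wins, stick_wins)        # 2/3 1/3
```
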
32. The Birthday problem • What is the probability that two persons in a group of 50 have the same birthday? • The probability that 2 persons have different birthdays is pr(D) ::= (365*364)/(365*365) = 364/365. Let us call it q • The probability that 2 persons have the same birthday is pr(S) ::= 1 – pr(D) ::= 1/365. Let us call it p • With 50 people we can make 50C2 = 1225 pairs • Going through the pairs in order until one matches: pr(1st pair has the same birthday) = p, pr(2nd pair is the first with the same birthday) = q*p, pr(3rd pair is the first with the same birthday) = q*q*p, ..., pr(1225th pair is the first with the same birthday) = q^1224 * p
33. Then the probability that at least one of the pairs has the same birthday is Pr(A) ::= Pr(1st pair or 2nd pair ... or 1225th pair has the same birthday) ::= pr(1st pair) + pr(2nd pair) + ... + pr(1225th pair) ::= p + q*p + q^2*p + q^3*p + ... + q^1224*p ::= p*(1 + q + q^2 + q^3 + ... + q^1224) ::= p*(1 - q^1225)/(1 - q) ::= 1 - q^1225. Here q = 364/365, therefore Pr(A) ≈ 0.96! There is a 96% chance that a group of 50 students will have at least one pair with the same birthday. What went wrong?
34. There are two assumptions underlying these assertions. First, we assume that all birth dates are equally likely. Second, we assume that birthdays are mutually independent. Neither of these assumptions is really true. Birthdays follow seasonal patterns, so they are not uniformly distributed!
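
For comparison, the pair-by-pair calculation can be set against the standard exact computation under the same two assumptions (365 equally likely birthdays, independent people). The pairwise version additionally treats the 1225 pairs as independent trials, which is only an approximation. Function names in this sketch are illustrative.

```python
from math import comb, prod

def exact_shared_birthday(n, days=365):
    """1 - P(all n birthdays distinct), assuming uniform independent birthdays."""
    p_all_distinct = prod((days - k) / days for k in range(n))
    return 1 - p_all_distinct

def pairwise_approximation(n, days=365):
    """1 - q^(number of pairs), treating every pair as an independent trial."""
    pairs = comb(n, 2)
    q = 1 - 1 / days
    return 1 - q ** pairs

exact = exact_shared_birthday(50)
approx = pairwise_approximation(50)
print(comb(50, 2))                       # 1225
print(round(exact, 3), round(approx, 3))
```

Both numbers come out close to each other, so the surprisingly high probability is a property of the problem rather than an artifact of the pairwise shortcut.
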
35. Variance and standard deviation • The variance of X, denoted by V(X) or σx², is – V(X) = Σ (xi - μ)² f(xi) = E((X - μ)²) • The standard deviation of X, denoted by σx, is – σx = sqrt(V(X))
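
Applying these definitions to the earlier random variable X (largest number of successive heads in three fair tosses) gives a concrete value. A minimal sketch with illustrative names:

```python
from fractions import Fraction
from math import sqrt

# Probability function of X from the earlier example.
f = {0: Fraction(1, 8), 1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8)}

mu = sum(x * p for x, p in f.items())                    # E(X)
variance = sum((x - mu) ** 2 * p for x, p in f.items())  # Σ (x - μ)² f(x)
std_dev = sqrt(variance)                                 # σ = sqrt(V(X))

print(mu, variance)        # 11/8 47/64
print(round(std_dev, 4))
```
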
36. Joint distribution or joint probability function • Let X and Y be random variables on S • Rx = {x1, x2, …, xn} and Ry = {y1, y2, …, ym} • The joint distribution or joint probability function of X and Y is the function h given by – h(xi, yj) ≡ P(X = xi, Y = yj) ≡ P({s ∈ S : X(s) = xi, Y(s) = yj}) • The function h has the following properties – h(xi, yj) ≥ 0 – Σi Σj h(xi, yj) = 1
37. Correlation • If X and Y have joint distribution h(xi, yj) and means μX and μY, then the covariance of X and Y is – Cov(X, Y) = Σi Σj (xi - μX)(yj - μY) h(xi, yj) = E[(X - μX)(Y - μY)] • The correlation of X and Y, denoted by ρ(X, Y), is – ρ(X, Y) = Cov(X, Y) / (σX σY)
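
As an illustration of the joint distribution, covariance and correlation, the sketch below pairs two random variables on the three-toss sample space: X = largest number of successive heads and Y = total number of heads. Pairing these two variables is an assumption made for this example; the slides do not work one through.

```python
from collections import Counter
from fractions import Fraction
from itertools import groupby, product
from math import sqrt

def longest_head_run(seq):
    return max((len(list(run)) for face, run in groupby(seq) if face == "H"),
               default=0)

outcomes = list(product("HT", repeat=3))   # 8 equally likely outcomes

# Joint probability function h(x, y) = P(X = x, Y = y).
pairs = Counter((longest_head_run(o), o.count("H")) for o in outcomes)
h = {xy: Fraction(c, len(outcomes)) for xy, c in pairs.items()}

mu_x = sum(x * p for (x, y), p in h.items())
mu_y = sum(y * p for (x, y), p in h.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in h.items())
var_x = sum((x - mu_x) ** 2 * p for (x, y), p in h.items())
var_y = sum((y - mu_y) ** 2 * p for (x, y), p in h.items())
rho = float(cov) / sqrt(float(var_x * var_y))

print(sum(h.values()))   # 1
print(cov)               # 11/16
print(round(rho, 3))
```
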
38. References • Introduction to Probability and Statistics – Schaum's series • Probability and Statistics for Engineering and the Sciences – Jay L. Devore • http://www.stat.sc.edu/~west/javahtml/LetsMakeaDeal.html