- 1. Probability
- 2. Example: Soccer Game
  - You are off to soccer and love being the Goalkeeper, but that depends on who is the Coach today:
  - with Coach Sam the probability of being Goalkeeper is 0.5
  - with Coach Alex the probability of being Goalkeeper is 0.3
  - Sam is Coach more often, about 6 out of every 10 games (a probability of 0.6).
  - So, what is the probability you will be a Goalkeeper today?
  - Let's build the tree diagram. First we show the two possible coaches: Sam or Alex.
- 3. The probability of getting Sam is 0.6, so the probability of Alex must be 0.4 (together the probabilities sum to 1).
  - If you get Sam, there is a 0.5 probability of being Goalie (and 0.5 of not being Goalie).
  - If you get Alex, there is a 0.3 probability of being Goalie (and 0.7 of not).
- 4. Multiply the probabilities along each branch of the tree:
  - Taking the 0.6 chance of Sam being Coach together with the 0.5 chance that Sam lets you be Goalkeeper gives 0.6 × 0.5 = 0.3.
  - But we are not done yet! We haven't included Alex as Coach: a 0.4 chance of Alex as Coach, followed by a 0.3 chance, gives 0.4 × 0.3 = 0.12.
  - Now we add the two branches: 0.3 + 0.12 = 0.42 probability of being a Goalkeeper today (a 42% chance).
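The tree-diagram calculation can be sketched in a few lines of Python. This is only an illustration of the multiply-then-add step; the function name `total_probability` and the dictionary layout are assumptions made for this sketch, not part of the slides.

```python
# Branch probabilities taken from the soccer example.
p_coach = {"Sam": 0.6, "Alex": 0.4}
p_goalie_given_coach = {"Sam": 0.5, "Alex": 0.3}

def total_probability(prior, conditional):
    """Sum P(branch) * P(event | branch) over every branch of the tree."""
    return sum(prior[branch] * conditional[branch] for branch in prior)

p_goalie = total_probability(p_coach, p_goalie_given_coach)
print(round(p_goalie, 2))  # 0.42
```

Each term in the sum is one path through the tree, so adding a third coach would only mean adding one more entry to each dictionary.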
- 5. Conditional Probability
- 6. Fundamentals for Bayes' Theorem
  - 0 ≤ P(A) ≤ 1, where P(A) is the probability of an event A.
  - P(A) = 0 indicates that event A is impossible (it is certain not to occur).
  - P(A) = 1 indicates that event A is certain to occur.
  - Event: each possible outcome of a variable is called an event.
  - Sample space: the collection of all possible events is called the sample space.
  - Random variables: random variables are used to represent the events and objects in the real world.
  - Prior probability: the probability of an event computed before observing new information.
  - Posterior probability: the probability calculated after all evidence or information has been taken into account. It combines the prior probability with the new information.
- 7. Bayes' theorem, also known as Bayes' rule or Bayes' law, is the basis of Bayesian reasoning: it determines the probability of an event under uncertain knowledge.
  - In probability theory, it relates the conditional and marginal probabilities of two random events.
  - Bayes' theorem is named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem and is fundamental to Bayesian statistics.
  - It is a way to calculate the value of P(B|A) from knowledge of P(A|B).
- 8. Bayes' Theorem is a way of finding a probability when we know certain other probabilities. The formula is: P(A|B) = P(A) P(B|A) / P(B)
  - It tells us how often A happens given that B happens, written P(A|B),
  - when we know how often B happens given that A happens, written P(B|A),
  - how likely A is on its own, written P(A),
  - and how likely B is on its own, written P(B).
- 9. Bayes' theorem is expressed mathematically by the following equation: $P(X \mid Y) = \frac{P(Y \mid X)\,P(X)}{P(Y)}$, where X and Y are events and P(Y) ≠ 0.
  - P(X|Y) is the conditional probability of event X occurring given that Y is true.
  - P(Y|X) is the conditional probability of event Y occurring given that X is true.
  - P(X) and P(Y) are the probabilities of observing X and Y independently of each other. These are known as marginal probabilities.
  - Let us say P(Fire) means how often there is fire, and P(Smoke) means how often we see smoke. Then P(Fire|Smoke) means how often there is fire when we can see smoke, and P(Smoke|Fire) means how often we can see smoke when there is fire.
  - So the formula tells us the "forwards" probability P(Fire|Smoke) when we know the "backwards" probability P(Smoke|Fire).
- 10. Example: dangerous fires are rare (1%), but smoke is fairly common (10%) due to barbecues, and 90% of dangerous fires make smoke. We can then find the probability of a dangerous Fire when there is Smoke: P(Fire|Smoke) = P(Fire) P(Smoke|Fire) / P(Smoke) = (1% × 90%) / 10% = 9%
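As a quick check of the fire and smoke arithmetic, the formula can be written as a small Python function. The name `bayes` and its parameter names are chosen here for illustration only.

```python
def bayes(prior_a, likelihood_b_given_a, marginal_b):
    """Return P(A|B) = P(A) * P(B|A) / P(B)."""
    if marginal_b <= 0:
        raise ValueError("P(B) must be positive")
    return prior_a * likelihood_b_given_a / marginal_b

# Fire and smoke numbers from the slide: P(Fire), P(Smoke|Fire), P(Smoke).
p_fire_given_smoke = bayes(prior_a=0.01, likelihood_b_given_a=0.90, marginal_b=0.10)
print(f"{p_fire_given_smoke:.0%}")  # 9%
```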
- 11. Bayes' theorem can be derived using the product rule and the conditional probability of event A given a known event B.
  - From the product rule we can write: P(A ⋀ B) = P(A|B) P(B)
  - Similarly, using the probability of event B given a known event A: P(A ⋀ B) = P(B|A) P(A)
  - Equating the two right-hand sides and dividing by P(B) gives Bayes' rule: $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
  - P(A|B) is the posterior, which we need to calculate. It is read as the probability of hypothesis A given that evidence B has occurred.
  - P(B|A) is the likelihood: assuming the hypothesis is true, it is the probability of the evidence.
  - P(A) is the prior probability: the probability of the hypothesis before considering the evidence.
  - P(B) is the marginal probability: the probability of the evidence on its own.
  - In general, we can write $P(B) = \sum_{i=1}^{K} P(A_i)\,P(B \mid A_i)$, so Bayes' rule can be written as $P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{K} P(A_j)\,P(B \mid A_j)}$, where $A_1, A_2, \ldots, A_K$ is a set of mutually exclusive and exhaustive events.
- 12. Example: In a particular pain clinic, 10% of patients are prescribed narcotic pain killers. Overall, 5% of the clinic's patients are addicted to narcotics (including pain killers and illegal substances). Out of all the people prescribed pain pills, 8% are addicts. If a patient is an addict, what is the probability that they will be prescribed pain pills?
  - A: being prescribed pain pills, P(A) = 10%
  - B: being an addict, P(B) = 5%
  - B|A: being an addict given a pain pill prescription, P(B|A) = 8%
  - P(A|B) = P(B|A) P(A) / P(B) = (0.08 × 0.1) / 0.05 = 0.16
  - That means that if a patient is an addict, there is a 16% chance that they will be prescribed pain pills.
- 13. According to the given statement, the statistics are:
  - The prior probability of having cancer in the population is P(Cancer) = 0.008.
  - The prior probability of not having cancer is P(¬Cancer) = 1 − 0.008 = 0.992.
  - The cancer test is 98% accurate for patients who have cancer: P(+|Cancer) = 0.98, so 2% of their results are negative: P(−|Cancer) = 1 − 0.98 = 0.02.
  - The probability of a positive result without cancer is P(+|¬Cancer) = 1 − 0.97 = 0.03.
  - The probability of a negative result without cancer is P(−|¬Cancer) = 0.97.
- 14. Bayes' Rule: Let $S_1, S_2, S_3, \ldots, S_k$ be mutually exclusive and exhaustive events with prior probabilities $P(S_1), P(S_2), \ldots, P(S_k)$. If an event A occurs, the posterior probability of $S_i$, given that A occurred, is $P(S_i \mid A) = \frac{P(S_i)\,P(A \mid S_i)}{\sum_{j=1}^{k} P(S_j)\,P(A \mid S_j)}$ for $i = 1, 2, \ldots, k$.
  - Proof: $P(A \mid S_i) = \frac{P(AS_i)}{P(S_i)} \;\Rightarrow\; P(AS_i) = P(S_i)\,P(A \mid S_i)$
  - $P(S_i \mid A) = \frac{P(AS_i)}{P(A)} = \frac{P(S_i)\,P(A \mid S_i)}{\sum_{j=1}^{k} P(S_j)\,P(A \mid S_j)}$
- 15. Example: From a previous example, we know that 49% of the population are female. Of the female patients, 8% are high risk for heart attack, while 12% of the male patients are high risk. A single person is selected at random and found to be high risk. What is the probability that it is a male?
  - Define H: high risk, F: female, M: male.
  - We know: P(F) = .49, P(M) = .51, P(H|F) = .08, P(H|M) = .12
  - $P(M \mid H) = \frac{P(M)\,P(H \mid M)}{P(M)\,P(H \mid M) + P(F)\,P(H \mid F)} = \frac{.51\,(.12)}{.51\,(.12) + .49\,(.08)} = .61$
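The same calculation works for any number of mutually exclusive and exhaustive hypotheses. The sketch below is one possible way to write it in Python; the function name `posterior` and the dictionary-based interface are assumptions for illustration.

```python
def posterior(priors, likelihoods):
    """Posterior P(S_i | A) for every hypothesis S_i in a partition.

    priors:      {hypothesis: P(S_i)}, assumed to sum to 1
    likelihoods: {hypothesis: P(A | S_i)}
    """
    joint = {s: priors[s] * likelihoods[s] for s in priors}
    evidence = sum(joint.values())  # P(A), by the law of total probability
    if evidence == 0:
        raise ValueError("the observed event has zero probability")
    return {s: joint[s] / evidence for s in joint}

# Heart-attack risk example: the hypotheses are M (male) and F (female),
# and the observed event A is "high risk".
result = posterior(
    priors={"M": 0.51, "F": 0.49},
    likelihoods={"M": 0.12, "F": 0.08},
)
print(round(result["M"], 2))  # 0.61
```

Because the function normalizes by the summed joint probabilities, the returned posteriors always add up to 1.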
- 16. Example: Suppose a rare disease infects one out of every 1000 people in a population. Suppose also that there is a good, but not perfect, test for this disease: if a person has the disease, the test comes back positive 99% of the time. On the other hand, the test also produces some false positives: 2% of uninfected people also test positive. Someone has just tested positive. What are his chances of having this disease?
- 17. Example: Define A: has the disease, B: tests positive. We want to know P(A|B).
  - We know: $P(A) = .001$, $P(A^c) = .999$, $P(B \mid A) = .99$, $P(B \mid A^c) = .02$
  - $P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c)} = \frac{.001 \times .99}{.001 \times .99 + .999 \times .02} = .0472$
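One way to see why the answer is so low is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical cohort of 100,000 people, a size chosen only so that every count is a whole number. The rates are the ones given in the example.

```python
from fractions import Fraction

population = 100_000  # hypothetical cohort size, not from the slides

infected = population * Fraction(1, 1000)        # 100 people have the disease
uninfected = population - infected               # 99,900 people do not

true_positives = infected * Fraction(99, 100)    # 99 infected people test positive
false_positives = uninfected * Fraction(2, 100)  # 1,998 uninfected people test positive

# Among everyone who tests positive, what share is actually infected?
p_disease_given_positive = true_positives / (true_positives + false_positives)
print(f"{float(p_disease_given_positive):.4f}")  # 0.0472
```

The false positives outnumber the true positives by roughly 20 to 1 because the uninfected group is so much larger, which is what the denominator of Bayes' rule captures.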
- 18. Example: A survey of job satisfaction² of teachers was taken, giving the following results (rows: teaching level, columns: job satisfaction).

  | Level | Satisfied | Unsatisfied | Total |
  |---|---|---|---|
  | College | 74 | 43 | 117 |
  | High School | 224 | 171 | 395 |
  | Elementary | 126 | 140 | 266 |
  | Total | 424 | 354 | 778 |

  ² "Psychology of the Scientist: Work Related Attitudes of U.S. Scientists" (Psychological Reports (1991): 443-450).
- 19. Example: If all the cells are divided by the total number surveyed, 778, the resulting table is a table of empirically derived probabilities (rows: teaching level, columns: job satisfaction).

  | Level | Satisfied | Unsatisfied | Total |
  |---|---|---|---|
  | College | 0.095 | 0.055 | 0.150 |
  | High School | 0.288 | 0.220 | 0.508 |
  | Elementary | 0.162 | 0.180 | 0.342 |
  | Total | 0.545 | 0.455 | 1.000 |
- 20. Example: For convenience, let C stand for the event that the teacher teaches college, S stand for the teacher being satisfied, and so on. Let's look at some probabilities and what they mean.
  - $P(C) = 0.150$ is the proportion of teachers who are college teachers.
  - $P(S) = 0.545$ is the proportion of teachers who are satisfied with their job.
  - $P(C \cap S) = 0.095$ is the proportion of teachers who are college teachers and who are satisfied with their job.
- 21. Example:
  - $P(C \mid S) = \frac{P(C \cap S)}{P(S)} = \frac{0.095}{0.545} = 0.175$ is the proportion of teachers who are college teachers given they are satisfied. Restated: this is the proportion of satisfied teachers that are college teachers.
  - $P(S \mid C) = \frac{P(S \cap C)}{P(C)} = \frac{P(C \cap S)}{P(C)} = \frac{0.095}{0.150} = 0.632$ is the proportion of teachers who are satisfied given they are college teachers. Restated: this is the proportion of college teachers that are satisfied.
  - Both results match the unrounded counts (74/424 and 74/117), which is why they differ slightly in the third decimal place from the ratios of the rounded table entries.
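The joint, marginal, and conditional probabilities can all be recomputed from the raw survey counts. The following Python sketch does that. The helper names (`p_joint`, `p_level`, and so on) are made up for this illustration.

```python
# Survey counts: rows are teaching level, columns are job satisfaction.
counts = {
    "College":     {"Satisfied": 74,  "Unsatisfied": 43},
    "High School": {"Satisfied": 224, "Unsatisfied": 171},
    "Elementary":  {"Satisfied": 126, "Unsatisfied": 140},
}
total = sum(sum(row.values()) for row in counts.values())  # 778 teachers

def p_joint(level, status):
    """P(level and status): one cell divided by the grand total."""
    return counts[level][status] / total

def p_level(level):
    """Marginal P(level): a row total divided by the grand total."""
    return sum(counts[level].values()) / total

def p_status(status):
    """Marginal P(status): a column total divided by the grand total."""
    return sum(row[status] for row in counts.values()) / total

def p_level_given_status(level, status):
    return p_joint(level, status) / p_status(status)

def p_status_given_level(status, level):
    return p_joint(level, status) / p_level(level)

print(f"{p_level('College'):.3f}")                            # 0.150
print(f"{p_status('Satisfied'):.3f}")                         # 0.545
print(f"{p_joint('College', 'Satisfied'):.3f}")               # 0.095
print(f"{p_level_given_status('College', 'Satisfied'):.3f}")  # 0.175
print(f"{p_status_given_level('Satisfied', 'College'):.3f}")  # 0.632
```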
- 22. Example: Are C and S independent events?
  - $P(C) = 0.150$ and $P(C \mid S) = \frac{P(C \cap S)}{P(S)} = \frac{0.095}{0.545} = 0.175$
  - $P(C \mid S) \neq P(C)$, so C and S are dependent events.
- 23. Example: What is $P(C \cup S)$? $P(C) = 0.150$, $P(S) = 0.545$ and $P(C \cap S) = 0.095$, so $P(C \cup S) = P(C) + P(S) - P(C \cap S) = 0.150 + 0.545 - 0.095 = 0.600$
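An equivalent way to test independence is to compare P(C ∩ S) with the product P(C) P(S). The sketch below does this in Python. The function name `is_independent` and the tolerance of 0.001 are assumptions for illustration, chosen to absorb the three-decimal rounding of the table.

```python
import math

# Rounded probabilities from the job-satisfaction table.
p_c = 0.150        # P(C): college teacher
p_s = 0.545        # P(S): satisfied
p_c_and_s = 0.095  # P(C and S)

def is_independent(p_a, p_b, p_a_and_b, tol=1e-3):
    """True when P(A and B) matches P(A) * P(B) within the given tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

print(round(p_c * p_s, 3))                  # 0.082
print(is_independent(p_c, p_s, p_c_and_s))  # False
```

The product 0.082 is what the joint probability would be if teaching level and satisfaction were unrelated. The observed 0.095 is larger, in line with P(C|S) exceeding P(C).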
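The addition rule can be checked against the raw survey counts. This short Python sketch compares the inclusion-exclusion count with a direct count of teachers who are college teachers, satisfied, or both. The variable names are illustrative only.

```python
# Counts from the survey table.
total = 778
college = 117               # all college teachers
satisfied = 424             # all satisfied teachers
college_and_satisfied = 74  # counted in both groups above

# Inclusion-exclusion: subtract the overlap so it is not counted twice.
college_or_satisfied = college + satisfied - college_and_satisfied

# Direct count: every college teacher, plus satisfied teachers at other levels.
direct_count = (74 + 43) + 224 + 126
assert college_or_satisfied == direct_count

p_union = college_or_satisfied / total
print(f"{p_union:.3f}")  # 0.600
```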
- 24. Example: Tom and Dick are going to take a driver's test at the nearest DMV office. Tom estimates that his chances of passing the test are 70%, and Dick estimates his as 80%. Tom and Dick take their tests independently.
  - Define D = {Dick passes the driving test} and T = {Tom passes the driving test}.
  - T and D are independent, with P(T) = 0.7 and P(D) = 0.8.
- 25. Example: What is the probability that at most one of the two friends will pass the test?
  - $P(\text{at most one person passes}) = P(D^c \cap T^c) + P(D^c \cap T) + P(D \cap T^c) = (1 - 0.8)(1 - 0.7) + (0.7)(1 - 0.8) + (0.8)(1 - 0.7) = .44$
  - $P(\text{at most one person passes}) = 1 - P(\text{both pass}) = 1 - 0.8 \times 0.7 = .44$
- 26. Example: What is the probability that at least one of the two friends will pass the test?
  - $P(\text{at least one person passes}) = P(D \cup T) = 0.8 + 0.7 - 0.8 \times 0.7 = .94$
  - $P(\text{at least one person passes}) = 1 - P(\text{neither passes}) = 1 - (1 - 0.8)(1 - 0.7) = .94$
- 27. Example: Suppose we know that only one of the two friends passed the test. What is the probability that it was Dick?
  - $P(D \mid \text{exactly one person passed}) = \frac{P(D \cap \text{exactly one person passed})}{P(\text{exactly one person passed})} = \frac{P(D \cap T^c)}{P(D \cap T^c) + P(D^c \cap T)} = \frac{0.8\,(1 - 0.7)}{0.8\,(1 - 0.7) + (1 - 0.8)\,0.7} = .63$
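Both questions can be answered by listing the four possible outcomes and adding up the ones that qualify. The sketch below is one way to do that in Python. The `joint` table and the `prob` helper are names invented for this illustration, and the multiplication of probabilities relies on the stated independence of the two tests.

```python
from itertools import product

p_pass = {"Dick": 0.8, "Tom": 0.7}

# Joint distribution over (Dick passes?, Tom passes?).
# Independence lets us multiply the individual probabilities.
joint = {}
for dick, tom in product([True, False], repeat=2):
    p_dick = p_pass["Dick"] if dick else 1 - p_pass["Dick"]
    p_tom = p_pass["Tom"] if tom else 1 - p_pass["Tom"]
    joint[(dick, tom)] = p_dick * p_tom

def prob(event):
    """Add up the probability of every outcome for which event(outcome) is true."""
    return sum(p for outcome, p in joint.items() if event(outcome))

# sum(outcome) counts how many of the two friends passed.
p_at_most_one = prob(lambda outcome: sum(outcome) <= 1)
p_at_least_one = prob(lambda outcome: sum(outcome) >= 1)
print(round(p_at_most_one, 2), round(p_at_least_one, 2))  # 0.44 0.94
```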
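The conditional probability can also be computed exactly with fractions, which avoids rounding until the final step. This is a minimal sketch with illustrative variable names. As in the example, it assumes the two tests are independent.

```python
from fractions import Fraction

p_d = Fraction(8, 10)  # P(D): Dick passes
p_t = Fraction(7, 10)  # P(T): Tom passes

# Exactly one passes: either Dick alone or Tom alone.
p_only_dick = p_d * (1 - p_t)
p_only_tom = (1 - p_d) * p_t
p_exactly_one = p_only_dick + p_only_tom

# Condition on "exactly one passed" by dividing by its probability.
p_dick_given_exactly_one = p_only_dick / p_exactly_one
print(p_dick_given_exactly_one, f"{float(p_dick_given_exactly_one):.2f}")  # 12/19 0.63
```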