Lecture9 - Bayesian-Decision-Theory
Published in Education, Technology
  • 1. Introduction to Machine Learning. Lecture 9: Bayesian decision theory – An introduction. Albert Orriols i Puig. Artificial Intelligence – Machine Learning, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull
  • 2. Recap of Lectures 5-8. LET’S START WITH DATA CLASSIFICATION
  • 3. Recap of Lectures 5-8. We want to build decision trees. How can I automatically generate these types of trees? Decide which attribute we should put in each node; decide a split point; rely on information theory. We also saw many other improvements.
  • 4. Recap of Lectures 5-8. From kNN to CBR. [Figure panels: 15-NN, 1-NN] Key aspects: value of k; distance functions.
  • 5. Today’s Agenda. Could we use probability to classify? Where it all began. Some anecdotes on the correct use of probabilities.
  • 6. Why Bother about Prob.? The world is a very uncertain place. Almost 40 years of AI and ML dealing with uncertain domains. Some researchers decided to employ ideas from probability to model concepts. Before saying more… let’s go to the beginning.
  • 7. Meeting the Reverend Thomas Bayes. Two main works: “Divine Benevolence, or an Attempt to Prove That the Principal End of the Divine Providence and Government is the Happiness of His Creatures” (1731); “An Introduction to the Doctrine of Fluxions, and a Defence of the Mathematicians Against the Objections of the Author of the Analyst” (published anonymously in 1736). But we are especially interested in “Essay Towards Solving a Problem in the Doctrine of Chances” (1764), which was actually published posthumously by Richard Price.
  • 8. Where Did These Ideas Come From? Bayes built his theory upon several ideas. Immanuel Kant (1724-1804): Copernican revolution; our understanding of the external world had its foundations not merely in experience, but in both experience and a priori concepts, thus offering a non-empiricist critique of rationalist philosophy. Isaac Newton (1643-1727): universal gravitation and three laws of motion, which dominated the scientific view of the physical universe for the next three centuries.
  • 9. What Was Bayes’ Point? Bayesian probability: the notion of probability interpreted as partial belief rather than as frequency. Bayesian estimation: calculate the validity of a proposition on the basis of a prior estimate of its probability and new relevant evidence. E.g.: before Bayes, forward probability: given a specified number of white and black balls in an urn, what is the probability of drawing a black ball? Bayes turned his attention to the converse problem: given that one or more balls have been drawn, what can be said about the number of white and black balls in the urn?
  • 10. Bayes’ Theorem. Outputs the most probable hypothesis h ∈ H, given the data D plus knowledge about the prior probabilities of the hypotheses in H. Terminology: P(h|D): probability that h holds given data D; the posterior probability of h, i.e. the confidence that h holds given D. P(h): prior probability of h (the background knowledge we have about h being a correct hypothesis). P(D): prior probability that training data D will be observed. P(D|h): probability of observing D given that h holds.
    $$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$
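A minimal sketch of the converse urn problem from slide 9, worked with Bayes' theorem. The urn size (4 balls), the uniform prior over compositions, and drawing with replacement are assumptions chosen only for illustration; none of them comes from the lecture.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Apply Bayes' theorem, P(h|D) = P(D|h) P(h) / P(D), to every hypothesis h."""
    # P(D) is obtained by summing P(D|h) P(h) over all hypotheses.
    evidence = sum(likelihoods[h] * priors[h] for h in priors)
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}


# Assumed setup: the urn holds 4 balls; hypothesis h = number of black balls.
N_BALLS = 4
priors = {h: Fraction(1, N_BALLS + 1) for h in range(N_BALLS + 1)}

# Data D: two draws with replacement, both black, so P(D|h) = (h / 4)^2.
likelihoods = {h: Fraction(h, N_BALLS) ** 2 for h in priors}

post = posterior(priors, likelihoods)
for h, p in post.items():
    print(f"P({h} black balls | two black draws) = {p}")
```

Exact fractions are used so that the posterior values can be checked by hand; the all-black urn ends up most probable, and the urn with no black balls is ruled out.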
  • 11. Bayes’ Theorem. Given H, the space of possible hypotheses, the most probable hypothesis is the one that maximizes P(h|D):
    $$h_{MAP} \equiv \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\,P(h)$$
    The last step drops P(D) because it does not depend on h.
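A short sketch of picking the maximum a posteriori hypothesis. The hypothesis names and all probabilities below are made up for illustration; the point is only that dividing by P(D) does not change which hypothesis wins.

```python
def h_map(priors, likelihoods):
    """Return the hypothesis maximising P(D|h) P(h); P(D) is constant in h."""
    return max(priors, key=lambda h: likelihoods[h] * priors[h])


# Hypothetical numbers, not taken from the lecture.
priors = {"h1": 0.7, "h2": 0.2, "h3": 0.1}        # P(h)
likelihoods = {"h1": 0.1, "h2": 0.6, "h3": 0.9}   # P(D|h)

best = h_map(priors, likelihoods)

# Dividing every score by P(D) rescales them equally, so the winner is unchanged.
p_d = sum(likelihoods[h] * priors[h] for h in priors)
best_normalised = max(priors, key=lambda h: likelihoods[h] * priors[h] / p_d)

print(best, best_normalised)
```

Note that the hypothesis with the largest prior and the one with the largest likelihood both lose here; the product of the two decides.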
  • 12. Is the Pope the Pope? The chances that a randomly chosen human being is the Pope are about 1 in 6 billion. Benedict XVI is the Pope. What are the chances that Benedict XVI is human? (Beck-Bornholdt and Dubben, 1996) Analogy to syllogistic reasoning: 1 in 6 billion.
  • 13. So, Is the Pope an Alien? Where is the trick? Probability of the data given a hypothesis: P(D|H). Probability of the hypothesis given the data: P(H|D). P(D|H) is different from P(H|D). So, is the Pope an alien? Probability of being an alien: P(Alien). Probability of being human: P(Human). Probability that the Pope is an alien:
    $$P(\text{Alien} \mid \text{Pope}) = \frac{P(\text{Pope} \mid \text{Alien})\,P(\text{Alien})}{P(\text{Pope} \mid \text{Human})\,P(\text{Human}) + P(\text{Pope} \mid \text{Alien})\,P(\text{Alien})}$$
  • 14. So, Is the Pope an Alien? What’s missing? P(Pope|Alien), P(Human), P(Alien). Considering low values of P(Alien) and P(Pope|Alien), and large values of P(Human), we could “probably” say that the Pope is not an alien!
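To see how the missing quantities drive the answer, here is a sketch that plugs numbers into the formula for P(Alien | Pope). Only P(Pope | Human), about 1 in 6 billion, comes from the lecture; the values for P(Alien) and P(Pope | Alien) are invented placeholders, and Alien and Human are assumed to be the only two possibilities.

```python
def p_alien_given_pope(p_pope_alien, p_alien, p_pope_human):
    """Bayes' theorem with two exhaustive hypotheses: Alien and Human."""
    p_human = 1.0 - p_alien
    numerator = p_pope_alien * p_alien
    return numerator / (p_pope_human * p_human + numerator)


P_POPE_HUMAN = 1 / 6e9   # from the lecture: one Pope among about 6 billion humans
P_ALIEN = 1e-9           # assumption: aliens are extremely rare
P_POPE_ALIEN = 1e-12     # assumption: an alien is very unlikely to become Pope

p = p_alien_given_pope(P_POPE_ALIEN, P_ALIEN, P_POPE_HUMAN)
print(f"P(Alien | Pope) = {p:.3e}")
print(f"P(Human | Pope) = {1 - p:.12f}")
```

With any small values for the two assumed quantities, P(Human | Pope) stays very close to 1, far from the 1 in 6 billion suggested by the faulty syllogism.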
  • 15. More examples: Monty Hall. Stick or switch?
  • 16. Stick or Switch. I chose door number 3. Door 2 is uncovered and contains a sheep. They give me the chance to change the door. Should I? Use probability, not faith, to give an answer!
  • 17. Stick or Switch. I should switch!
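One way to check the "switch" answer with probability rather than faith is to enumerate every outcome exactly. This sketch assumes the usual rules, which the slides do not spell out: the prize is equally likely to be behind any door, the host knows where it is, always opens a sheep door other than the chosen one, and picks at random when two such doors are available.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
PICK = 3  # the door chosen on the slide


def win_probabilities(pick=PICK):
    """Enumerate every (prize position, opened door) outcome exactly."""
    stick = Fraction(0)
    switch = Fraction(0)
    for prize in DOORS:
        p_prize = Fraction(1, len(DOORS))  # prize is equally likely anywhere
        # The host may only open a door that is neither the pick nor the prize.
        host_options = [d for d in DOORS if d != pick and d != prize]
        for opened in host_options:
            p = p_prize * Fraction(1, len(host_options))
            # Switching means taking the one door that is still closed.
            other = next(d for d in DOORS if d != pick and d != opened)
            stick += p if prize == pick else 0
            switch += p if prize == other else 0
    return stick, switch


stick, switch = win_probabilities()
print(f"P(win | stick)  = {stick}")
print(f"P(win | switch) = {switch}")
```

Sticking wins only when the first pick was already right, which happens one time in three; switching wins in the remaining cases.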
  • 18. Yet Another Example: The Defendant’s Fallacy. The history of a murder: a suspect was caught; the DNA test was positive; the DNA test fails only 1 in 1 million times. So, my suspect must be guilty, right? More specifically, he will be guilty with p = 0.999999. Agree?
  • 19. The Defendant’s Fallacy. Where is the trick now? P(coincides | innocent) as opposed to P(innocent | coincides). P(coincides | innocent) is commonly misused as the probability of being innocent. P(innocent | coincides) is the probability of being innocent given that the test was positive! Does this really matter? Let’s assume a city of 10 million inhabitants. We apply the test to all the 10 million inhabitants. How many of them will be positive? About 10 (10 million × 1 in 1 million) through test error alone.
  • 20. The Defendant’s Fallacy. Two arguments. The prosecutor: there is a probability of 0.000001 that the suspect is innocent. The defendant: in this city of 10M people, the probability of the suspect being innocent is approximately 90%. Who is right? The defendant. The proof? You do the math.
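A sketch of "the math" behind the defendant's 90% figure. The population and error rate come from the lecture; the rest are simplifying assumptions made here: exactly one inhabitant is guilty, the guilty person always tests positive, and before the test every inhabitant is equally likely to be the guilty one.

```python
POPULATION = 10_000_000      # inhabitants of the city (from the lecture)
FALSE_POSITIVE_RATE = 1e-6   # the test fails 1 time in 1 million (from the lecture)

# Assumptions: one guilty inhabitant, who always tests positive.
innocents = POPULATION - 1
expected_false_positives = innocents * FALSE_POSITIVE_RATE  # roughly 10 people
expected_true_positives = 1.0

# Among everyone who tests positive, the fraction who are innocent.
p_innocent_given_positive = expected_false_positives / (
    expected_false_positives + expected_true_positives
)

print(f"Expected innocent positives: {expected_false_positives:.2f}")
print(f"P(innocent | positive) = {p_innocent_given_positive:.3f}")
```

Roughly ten innocent people match alongside the one guilty person, so a positive test on its own leaves the suspect innocent with probability close to 10/11. The prosecutor's 0.000001 is P(positive | innocent), a different quantity.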
  • 21. Next Class. How we can use these concepts in machine learning.