Lecture 2 - Probability


This is the 2nd of an 8-lecture series that I presented at the University of Strathclyde in 2011/2012 as part of the final year AI course.

This lecture covers the fundamentals of probability theory, and is kept relatively basic to ensure that all students have a good grasp of the concepts.

Published in: Technology, News & Politics

  1. 1. Intro to Probability
  2. 2. History of Probability • Until the 16th Century, nobody put together a systematic analysis of probability. • Cardano, an eminent mathematician (and compulsive gambler), wrote “A Book on Games of Chance” in 1526. • He also included chapters on effective cheating strategies.
  3. 3. Basics • If you have five things to choose from, and only one of them is right, you have a 1-in-5 chance of getting it right. ‣ Also 1/5 ‣ 20% ‣ 0.2 • If X represents “choosing right” we can say ‣ P(X) = 0.2 (or 20%, 1/5, etc.)
  4. 4. Monty Hall • Last lecture we talked about the “Monty Hall” problem. • There are 3 possible doors - behind one of them is a car, and behind two are donkeys. • The aim is to win the car.
  5. 5. The Twist • After you pick a door, the gameshow host opens one of the other doors to show a donkey. • You are offered the opportunity to change to the other door. • Should you?
  6. 6. Proof of Monty Hall • Like we said yesterday, yes you should. • And here’s why.
  7. 7. Proof 1 - Simple • Initially you had a 1/3 chance of being right. • That means a 2/3 chance of being wrong. • If you were wrong, the car is behind one of the other two doors, and the host has just eliminated one of them, so you now know which door to switch to.
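  8. 8. Proof 2 - Enumerate • The car is at door 1, 2, or 3, each with probability 1/3 • The player picks door 1 • Car at 1: the host opens door 2 or door 3; switching loses either way • Car at 2: the host must open door 3; switching wins • Car at 3: the host must open door 2; switching wins • Each winning case is twice as likely as either individual losing case, so switching has a 2/3 chance of winning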
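The enumeration on this slide can be checked mechanically. The sketch below is only an illustration, not the code used in class: the function name `switch_win_probability` is made up here, and it assumes the host chooses uniformly at random when two donkey doors are available.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def switch_win_probability(pick=1):
    """Exact probability that switching wins, enumerating every car position."""
    win = Fraction(0)
    for car in DOORS:
        p_car = Fraction(1, 3)  # the car is equally likely to be behind any door
        # The host may only open a door that is neither the pick nor the car.
        host_options = [d for d in DOORS if d != pick and d != car]
        for host in host_options:
            # Assumption: the host picks uniformly among the allowed doors.
            p_host = Fraction(1, len(host_options))
            # Switching means taking the one door that is neither picked nor opened.
            switched = next(d for d in DOORS if d != pick and d != host)
            if switched == car:
                win += p_car * p_host
    return win

print(switch_win_probability())  # 2/3
```

The result does not depend on which door the player picks first, since the three doors are symmetric.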
  9. 9. Conditional Probability • Bayes’ Theorem of Conditional Probability • Hinges on the concept of dependent variables. • What is the chance that X happens given that Y has happened? • If X and Y are unrelated, it’s just the probability of X happening.
  10. 10. Example • What is the likelihood of flipping a coin and getting heads, if we have just flipped a coin and got heads? • One flip can’t affect the other. ‣ Probability of X given Y = P(X) = 1/2 • What is the likelihood that the next train will be late if the last train was late? (Here the events are related, although this one is really more a matter of queueing theory...)
  11. 11. Bayes’ Theorem • P(A|B) is the probability of A given that B has happened. • P(A|B) = ( P(B|A) * P(A) ) / P(B) • In AI we make a lot of use of this theorem ‣ “Bayesian Classification” ‣ What is the likelihood that this is a particular kind of thing, given the data we have observed?
  12. 12. Bayesian Monty Hall • What is the probability that Door 1 wins, given we have seen that Door 2 does not? • 3 variables Car, Selection, Host - drawn from {1,2,3} • P(C = 1 | S = 1, H = 2)
  13. 13. Proof • See Wikipedia entry on “Monty Hall Problem” for recap of maths shown in class
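As a rough recap of the Bayesian calculation, the sketch below applies Bayes’ Theorem to the three variables Car, Selection and Host. It is not the derivation shown in class: the function name `posterior_car` is invented for illustration, and it assumes a uniform prior over the car's position and a host who chooses uniformly when two donkey doors are available.

```python
from fractions import Fraction

def posterior_car(selection=1, host=2):
    """Return P(C = c | S = selection, H = host) for each door c in {1, 2, 3}."""
    doors = (1, 2, 3)
    prior = Fraction(1, 3)  # assumed uniform prior P(C = c)
    likelihood = {}
    for car in doors:
        # The host may open any door that is neither the selection nor the car.
        allowed = [d for d in doors if d != selection and d != car]
        # P(H = host | C = car, S = selection), assuming a uniform choice.
        likelihood[car] = Fraction(1, len(allowed)) if host in allowed else Fraction(0)
    # P(H = host | S = selection), summed over every possible car position.
    evidence = sum(likelihood[c] * prior for c in doors)
    return {c: likelihood[c] * prior / evidence for c in doors}

posterior = posterior_car(selection=1, host=2)
print(posterior[1], posterior[3])  # 1/3 2/3
```

So P(C = 1 | S = 1, H = 2) stays at 1/3, and the remaining 2/3 moves to Door 3, which agrees with the enumeration proof.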
  14. 14. Spam Filtering • Spam detection can be done with Bayes’ Theorem • What is the likelihood that this message is spam given it has these characteristics? • Characteristics are typically keywords, origin, header info, etc.
  15. 15. Spam Filtering • Variables Spam, Characteristics • P( S | C ) = ( P( C | S ) P( S ) ) / P( C ) • We can learn all the values of the RHS of this from “training data”. • Bayes’ Theorem then allows us to generalise to items that aren’t in the training data. (Note that actual spam filters are much more sophisticated, but still use Bayes)
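To make the spam calculation concrete, here is a minimal sketch that applies Bayes’ Theorem as stated on slide 11, with A = Spam and B = Characteristics, for a single keyword. The function name and the message counts are entirely made up for illustration; a real filter would combine many characteristics.

```python
def p_spam_given_keyword(n_spam, n_ham, n_spam_with_kw, n_ham_with_kw):
    """Estimate P(S | C) from hand-classified training counts (one keyword only)."""
    total = n_spam + n_ham
    p_s = n_spam / total                           # P(S): fraction of mail that is spam
    p_c_given_s = n_spam_with_kw / n_spam          # P(C | S): spam containing the keyword
    p_c = (n_spam_with_kw + n_ham_with_kw) / total # P(C): any mail containing the keyword
    return p_c_given_s * p_s / p_c                 # Bayes' Theorem

# Hypothetical training data: 400 spam and 600 legitimate messages,
# with the keyword appearing in 120 of the spam and 6 of the legitimate ones.
print(round(p_spam_given_keyword(400, 600, 120, 6), 3))  # 0.952
```

With these invented counts, a message containing the keyword would be classified as spam with probability of roughly 95%.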
  16. 16. Training Data • Big data set • Pre-classified (by hand) • Statistical analysis builds up a picture of what spam looks like ‣ E.g. Emails that include “viagra” are typically spam • Future emails can be classified using the stats we learnt from the training data • Refine analysis by “Report Spam” and “Not Spam”
  17. 17. Using Bayesian Classifiers • We’ll see next week how we can use Bayes’ Theorem in games to classify players into “stereotypes” • And we can use Utility Theory from last lecture to exploit these stereotypes
  18. 18. Expected Value • Expected Value is another statistical measure. • “How much do I expect to win on average?” • Yesterday we talked about an example ‣ Guaranteed £1 or even chance at £3 • P(X) = 1/2, Payout is £3 ‣ E(X) = 1/2 × £3 = £1.50
  19. 19. Using Expected Value • Expected Value can be used to make informed choices. • If we get to play the £1/£3 game repeatedly, over time we will do better picking £3. • Note that if we play only once, we may win nothing. ‣ Which explains the result in the £1,000,000/£3,000,000 game • Expected Value can be deceptive, but it can also be helpful.
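The £1/£3 comparison can be sketched in a few lines. The helper names below (`expected_value`, `average_winnings`) are illustrative only; the second one simulates repeated play of the gamble with a seeded random generator, to show the average drifting towards the expected value.

```python
import random

def expected_value(outcomes):
    """Expected value of a list of (probability, payout) pairs."""
    return sum(p * payout for p, payout in outcomes)

def average_winnings(plays, seed=0):
    """Average payout per play when taking the even chance at £3 every time."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same result
    total = sum(3.0 if rng.random() < 0.5 else 0.0 for _ in range(plays))
    return total / plays

print(expected_value([(1.0, 1.0)]))              # 1.0 (the guaranteed £1)
print(expected_value([(0.5, 3.0), (0.5, 0.0)]))  # 1.5 (the even chance at £3)
print(average_winnings(100_000))                 # close to 1.5 over many plays
```

A single play of the gamble still pays either £0 or £3; only the long-run average approaches £1.50.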
  20. 20. The St Petersburg Paradox • You pay a fee to enter a game where a coin is flipped repeatedly. The game ends when the first tails is shown. • The payout starts at £1 and doubles for every head that is shown. • When the game ends, you win whatever the payout has reached.
  21. 21. The St Petersburg Paradox • What is a sensible entry fee? • Would you pay £1 to play? • Would you pay £10 to play?
  22. 22. The St Petersburg Paradox • See Wikipedia entry on St Petersburg Paradox for recap of maths shown in class
  23. 23. The St Petersburg Paradox • The Expected Value of this game is infinite. • Therefore it “makes sense” to pay any price to play. • But of course it doesn’t. ‣ The high payout cases are vanishingly unlikely. • We’ll talk next week about how we can work around this.
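One way to see why the Expected Value is infinite is to add up the contributions one outcome at a time. In the sketch below (the function name `truncated_expected_value` is just for illustration), a game with k heads followed by a tail has probability (1/2)^(k+1) and pays £2^k, so every possible outcome contributes exactly £0.50 and the running total never stops growing.

```python
from fractions import Fraction

def truncated_expected_value(max_heads):
    """Sum the expected-value contributions of games with fewer than max_heads heads."""
    total = Fraction(0)
    for heads in range(max_heads):
        probability = Fraction(1, 2) ** (heads + 1)  # `heads` heads, then one tail
        payout = 2 ** heads                          # starts at £1, doubles per head
        total += probability * payout                # each term is exactly 1/2
    return total

for n in (10, 100, 1000):
    print(n, truncated_expected_value(n))
# 10 5
# 100 50
# 1000 500
```

Counting only the first n outcomes gives n/2, so no finite entry fee ever exceeds the full sum, even though the outcomes that drive it up are extremely rare.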
  24. 24. Iterated Games • If you repeatedly play a game we call it “Iterated”. • Iterating opens up a whole host of other options. • In games with equilibrium points, iterating doesn’t change much. • But in games without equilibrium points, it makes a massive difference. • In the same way we saw with Expected Values, we can “average out” equilibrium points for the game.
  25. 25. Mixed Strategies • When a player has a choice of A, B, C etc. these are “Pure” strategies • When we are playing the same game repeatedly, we can also choose a “Mixed” strategy. • This is a probability distribution across two or more of the Pure strategies. ‣ E.g. P(A) = 2/3, P(C) = 1/3
  26. 26. Games Without Equilibria • Odds/Evens payoffs to Player 1 (rows: Player 1's choice, columns: Player 2's choice)
             Odd   Even
      Odd    -1     1
      Even    1    -1
  27. 27. Equilibria • Remember the definition of an equilibrium point • If Player 1 changes strategy, they can only do worse (assuming Player 2 does not change) • Likewise Player 2 cannot change their strategy unilaterally and do any better either. • For both players, this is the best they can hope to achieve
  28. 28. The Odds/Evens Game • But this does not hold in Odds/Evens • Player 1 chooses Odd and Player 2 chooses Even ‣ Player 2 would do better to unilaterally change to Odd. • Player 1 chooses Even and Player 2 chooses Even ‣ Player 1 would do better to unilaterally change to Odd. • This game has no equilibria in pure strategies!
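The case-by-case argument on this slide can be automated by testing every pair of pure strategies against the equilibrium definition. This is a sketch under two assumptions that the slides do not state explicitly: the payoffs from the Odds/Evens table are Player 1's, and the game is zero-sum, so Player 2 receives the negative. The function name `pure_equilibria` is illustrative.

```python
from itertools import product

CHOICES = ("Odd", "Even")

# Payoff to Player 1 for (Player 1 choice, Player 2 choice).
# Assumption: the game is zero-sum, so Player 2's payoff is the negative.
PAYOFF_P1 = {
    ("Odd", "Odd"): -1, ("Odd", "Even"): 1,
    ("Even", "Odd"): 1, ("Even", "Even"): -1,
}

def pure_equilibria():
    """Return every pure strategy pair where neither player gains by deviating alone."""
    found = []
    for p1, p2 in product(CHOICES, CHOICES):
        current = PAYOFF_P1[(p1, p2)]
        # Player 1 cannot do better by changing only their own choice.
        p1_content = all(current >= PAYOFF_P1[(alt, p2)] for alt in CHOICES)
        # Player 2 cannot do better either (their payoff is the negative).
        p2_content = all(-current >= -PAYOFF_P1[(p1, alt)] for alt in CHOICES)
        if p1_content and p2_content:
            found.append((p1, p2))
    return found

print(pure_equilibria())  # []
```

In each of the four cells, one of the two players would gain by switching, so the list comes back empty.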
  29. 29. Pseudo-Equilibria • Calculating appropriate mixed strategies is tough. • It’s not important to know how to do it for this course, just that it can be done. • However, there is an easy approach that sometimes works: ‣ Delete all dominated strategies (consider that a strategy may be dominated by a mixed strategy...) ‣ Find a combination that will give the same average payoff regardless of what your opponent does.
  30. 30. Iterated Odd/Even • We talked previously about how best to play the Odd/Even game, and how to vary your strategy. • What works best is not to think or reason or plot or scheme. • A simple mixed strategy works best ‣ P(Odd) = 0.5, P(Even) = 0.5 • Regardless of your opponent, you will get the value of the game, which is 0.
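A short calculation backs up the claim that the 50/50 mix earns the value of the game whatever the opponent does. The sketch below assumes the payoffs in the Odds/Evens table are Player 1's; `expected_payoff` is a hypothetical helper, and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction

# Payoff to Player 1 for (Player 1 choice, Player 2 choice); assumed from the table.
PAYOFF_P1 = {
    ("Odd", "Odd"): -1, ("Odd", "Even"): 1,
    ("Even", "Odd"): 1, ("Even", "Even"): -1,
}

def expected_payoff(p1_odd, p2_odd):
    """Expected payoff to Player 1 when each player plays Odd with the given probability."""
    p1 = {"Odd": p1_odd, "Even": 1 - p1_odd}
    p2 = {"Odd": p2_odd, "Even": 1 - p2_odd}
    return sum(p1[a] * p2[b] * PAYOFF_P1[(a, b)] for a in p1 for b in p2)

half = Fraction(1, 2)
for opponent_odd in (Fraction(0), Fraction(3, 10), Fraction(1)):
    print(opponent_odd, expected_payoff(half, opponent_odd))
# 0 0
# 3/10 0
# 1 0
```

Any other mix can be exploited: for example, a Player 1 who always plays Odd loses 1 every round against an opponent who always plays Odd.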
  31. 31. Iteration For Communication • In non-zero sum games, it may be to our advantage to telegraph our intentions to the other player. • But we have no way of communicating directly. • In an iterated game, we can signal our intentions through our choice of strategy. ‣ Our previous plays become a transcript of the message we are sending
  32. 32. Optimal Prisoner’s Dilemma • One of the most successful known strategies for Iterated Prisoner’s Dilemma is tit-for-tat (it won Axelrod’s computer tournaments). • Signal initially to your opponent that you are willing to cooperate. • Subsequently, play the strategy that the opponent played last time. • Punishes betrayal, rewards cooperation.
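Tit-for-tat is simple enough to sketch in a few lines. The payoff numbers below are an assumption (a commonly used set of Prisoner's Dilemma values, not taken from these slides), and the function names are illustrative.

```python
# (my score, their score) for (my move, their move); "C" = cooperate, "D" = defect.
# These payoff values are assumed for illustration only.
PAYOFFS = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last time."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """Betray on every round, regardless of history."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total score of each player."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each strategy sees only the opponent's past moves
        move_b = strategy_b(history_a)
        gain_a, gain_b = PAYOFFS[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (30, 30)
print(play(tit_for_tat, always_defect))  # (9, 14)
```

Against itself, tit-for-tat settles into mutual cooperation; against a constant defector it is exploited only in the first round and then defects in return.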
  33. 33. Iterated Prisoner’s Dilemma • Why is this a good thing? • Consider the Prisoner’s Dilemma • We can signal to the other player that we are willing to cooperate with them. ‣ We gain the best mutual payout. ‣ Removes a lot of the risk.
  34. 34. The Hangman Paradox • The Hangman Paradox is something to be wary of. • A prisoner has been sentenced to be executed. • He has been told that it will take place next week. • He has also been told that it will be a surprise.
  35. 35. The Hangman Paradox • It can’t happen on Friday ‣ As that’s the last day of the week, if it did it would not be a surprise. • And if it can’t happen on Friday, equally it can’t happen on Thursday by the same logic. • By induction, he can’t be executed!
  36. 36. The Hangman Paradox • Having realised that he can’t be executed, he now feels safe. • On Wednesday, the hangman arrives to execute him. • He is, as predicted, very surprised.
  37. 37. Hangman Paradox for Iterated Games • It’s easy to fall into the same reasoning for iterated games. ‣ In the final iteration, there is no consequence to betrayal ‣ By induction, the case for cooperating at all falls apart • This might be true for a determinate number of iterations. • What about an indeterminate number?
  38. 38. Summary • Lots of Probability ‣ Bayesian Probability ‣ Expected Value • Iterated Games • Mixed Strategies • Cooperation
  39. 39. Next Week • Covering Poker in detail • Designing agents to play games • Mathematical models of players