Making Better Decisions - Opponent Modelling




             1
Monte Carlo in Poker (Recap)

    • Yesterday we saw that Monte Carlo could be used to
     estimate the expected reward of an action by evaluating the
     delayed reward
    • We do this by simulating or "rolling out" games to their end
     state.
    • Assess the amount we won or lost
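
As a minimal sketch of the recap above: average the final reward over many simulated games. The `toy_call_rollout` function and its numbers (win probability, pot size, call cost) are hypothetical stand-ins for a real poker simulator, not anything from the lecture.

```python
import random

def estimate_action_value(rollout, n_rollouts, rng):
    """Average the end-of-game reward over n_rollouts simulated games."""
    total = 0.0
    for _ in range(n_rollouts):
        total += rollout(rng)  # play one game to its end state
    return total / n_rollouts

def toy_call_rollout(rng, win_prob=0.4, pot=30.0, call_cost=10.0):
    """Hypothetical rollout: calling wins the pot with win_prob, else loses call_cost."""
    return pot if rng.random() < win_prob else -call_cost

rng = random.Random(0)
value = estimate_action_value(toy_call_rollout, 20_000, rng)
# The exact expectation for this toy is 0.4 * 30 - 0.6 * 10 = 6.
```

The estimate only approaches the true value as the number of rollouts grows, which is why the cost of each rollout matters later on.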


2
Game Tree and Monte Carlo

    [Figure: game tree. From node i, branches are labelled F, C and R; the lower levels are labelled Opponent and Chance.]




3
Random Walks in the Game Tree


    • When we walk the Game Tree at random, we pick nodes to
     follow at random.
    • We assume (for now) that this is an unbiased choice
    • This means every choice has the same probability of being
     chosen
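
A rough sketch of an unbiased walk, assuming the game tree is stored as nested dictionaries whose leaves are rewards. `TOY_TREE` and its payoffs are invented for illustration.

```python
import random

# Hypothetical toy tree: inner nodes map action labels to children, leaves are rewards.
TOY_TREE = {
    "fold": -1.0,
    "call": {"opp_fold": 2.0, "opp_call": {"win": 4.0, "lose": -4.0}},
    "raise": {"opp_fold": 2.0, "opp_call": {"win": 8.0, "lose": -8.0}},
}

def random_walk(node, rng):
    """Follow uniformly random branches until a leaf (a reward) is reached."""
    path = []
    while isinstance(node, dict):
        action = rng.choice(sorted(node))  # every branch is equally likely
        path.append(action)
        node = node[action]
    return path, node

path, reward = random_walk(TOY_TREE, random.Random(1))
```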



4
Can We Do Better?



    • Random walks are all well and good
    • But a uniform distribution across action choices isn't
     accurate
       ‣ Certain situations will make sensible players more likely to use
         certain actions
    • How can we bring this bias into play in the walk?



5
Classifying Opponents


    • The way we do this is to work out what type of player
     someone is.
    • We observe them to get a better understanding of how they
     operate.
    • In Poker and other games, we can use all sorts of statistical
     measures to quantify a player's type.


6
Action Prediction

    • Once we know what kind of player someone is, we can flip
     things on their head.
    • We answered "what is the likelihood this player is type X
     given we have seen this type of play"
    • We can now answer "what is the likelihood this player will
     make action Y given they are of type X"
    • Remember from Bayes' Theorem last week that these questions
     are closely linked
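
To make the link concrete, here is a small sketch of both questions. The two player types, the prior and the likelihood table are made-up numbers, not measured statistics.

```python
def posterior_over_types(prior, likelihood, observed_action):
    """P(type | action) is proportional to P(action | type) * P(type)."""
    unnormalised = {t: prior[t] * likelihood[t][observed_action] for t in prior}
    evidence = sum(unnormalised.values())
    return {t: p / evidence for t, p in unnormalised.items()}

def predict_action(posterior, likelihood):
    """P(action) = sum over types of P(action | type) * P(type | evidence)."""
    actions = next(iter(likelihood.values())).keys()
    return {a: sum(posterior[t] * likelihood[t][a] for t in posterior) for a in actions}

# Hypothetical numbers for illustration only.
prior = {"tight": 0.5, "loose": 0.5}
likelihood = {
    "tight": {"fold": 0.7, "call": 0.2, "raise": 0.1},
    "loose": {"fold": 0.2, "call": 0.5, "raise": 0.3},
}
posterior = posterior_over_types(prior, likelihood, "raise")  # tight 0.25, loose 0.75
prediction = predict_action(posterior, likelihood)
```

Seeing one raise shifts belief towards the "loose" type, and that belief in turn reweights the predicted next action.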

7
Simple (Human) Classification


    • Pro Poker players try to quantify their opponents into one of
     several classes based on 3 measures
       ‣ Voluntarily Put money in Pot (VPiP)
       ‣ Won at Showdown (WSD)
       ‣ Pre-flop Raise (PFR)




8
Player Stereotypes




    • Players can be
       ‣ Tight / Loose (how likely they are to play hands)
       ‣ Passive / Aggressive pre-flop
       ‣ Passive / Aggressive post-flop
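
A sketch of how two of these measures might map onto the stereotypes. The thresholds (`loose_threshold`, `aggressive_ratio`) are assumptions picked for illustration, and only pre-flop aggression is covered.

```python
def classify_player(vpip, pfr, loose_threshold=0.25, aggressive_ratio=0.5):
    """Label a player from VPiP and PFR, both given as fractions of hands dealt.

    The default thresholds are hypothetical, not accepted poker values.
    """
    looseness = "loose" if vpip > loose_threshold else "tight"
    # PFR / VPiP: share of voluntarily played hands that were raised pre-flop.
    ratio = pfr / vpip if vpip > 0 else 0.0
    aggression = "aggressive" if ratio >= aggressive_ratio else "passive"
    return f"{looseness}-{aggression}"

label = classify_player(vpip=0.15, pfr=0.12)
```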




9
Utilising Stereotypes

     • If we can classify players we can use this against them
     • For instance, we might discover that passive players can be
      chased off by aggressive play
     • Or we understand that when a super-conservative player
      decides to raise, we need to be careful
     • We can build heuristic rule bases around this like we saw
      before.
     • Or we can be much smarter

10
Better Classifications




     • Humans are getting by on 3 dimensions
     • But Poker has waaaay more statistics available than this
     • We can make a lot of use of this extra data.




11
Poker Tracker



     • Poker Tracker is a stats package specifically for Poker
     • Analyses play at online casinos
     • Real-time access to stats about opponents
     • Allows players to review hands later




12
Stats in Poker



     • As noted a few slides ago, Poker has many statistics
     • Poker Tracker keeps tabs on around 150 metrics
     • Some of these are somewhat similar, some relate more to
      the games than the players




13
Problem of Dimensionality




     • The problem now is that we have too much information!
     • Trying to learn on cluttered data can be problematic,
      assuming it works at all.




14
Dimensionality Reduction


     • Somehow we have to reduce the number of dimensions that
      our data points are using.
     • In many ways, getting the right data into a learning algorithm
      is the biggest challenge.
     • As much art as it is engineering.
     • Two options
        ‣ Feature selection
        ‣ Feature extraction

15
Selection vs Extraction



     • In Selection, you pick the dimensions you believe to be most
      relevant
        ‣ The human players did this to get their 3 dimensional
          representations
     • In Extraction, you come up with new dimensions that can
      represent your datapoint



16
Principal Components Analysis

     • PCA is a common strategy for this.
     • Recasts the dimensions of the datapoint into another set of
      "basis vectors".
     • Smushes together dimensions that have a strong correlation
        ‣ Some stats measures are looking at fundamentally the same thing, in
          different ways
        ‣ E.g. various raise frequency metrics might be treated as a single
          “aggression” dimension after PCA


17
Principal Components Analysis

     • This was going to be a worked example.
     • Honestly, that’s way too painful.
     • For N observations in M dimensions, X is an M × N matrix
      where each column is an observation.
     • Calculate the mean and std. dev. for each row in the
      matrix (each dimension)

18
Principal Components Analysis
     • Calculate the covariance matrix, the amount that
      the dimensions vary with respect to each other.
     • Calculate the eigenvectors and eigenvalues of the
      covariance matrix
       ‣ The eigenvectors are the new basis vectors of the
         reduced-dimension datapoints
       ‣ The eigenvalues represent how significant the
         eigenvector is. Large value = significant
19
Principal Components Analysis



     • Pick the most significant K of the eigenvectors.
     • Project the original datapoint in X onto the new
      basis vectors.
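
The steps above can be sketched with NumPy in place of Matlab. This is a minimal illustration under the slide's convention (rows are dimensions, columns are observations). The function name `pca_project` and the synthetic data are assumptions, not the lecture's code.

```python
import numpy as np

def pca_project(X, k):
    """Project X (M dimensions x N observations) onto its top k principal components."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)
    std[std == 0] = 1.0                      # leave constant dimensions unscaled
    Z = (X - mean) / std                     # standardise each dimension (row)
    cov = np.cov(Z)                          # M x M covariance; rows are variables
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices, ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    basis = eigvecs[:, order]                # M x k matrix of new basis vectors
    return basis.T @ Z, eigvals[order]       # k x N projected data, and their eigenvalues

# Synthetic example: two noisy copies of one "aggression" signal plus an unrelated stat.
rng = np.random.default_rng(0)
aggression = rng.normal(size=200)
X = np.vstack([
    aggression + 0.05 * rng.normal(size=200),
    aggression + 0.05 * rng.normal(size=200),
    rng.normal(size=200),
])
projected, top_eigenvalues = pca_project(X, 2)
```

In this toy data the two correlated rows collapse into one strong component, which is the "smushing together" described earlier.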




20
Principal Components Analysis

     • Honestly, if anyone ever asks you to do this
        ‣ Get a textbook
        ‣ Use Matlab
        ‣ Be really careful because it’s kind of complicated
     • It is possible to do it by hand.
        ‣ I can’t anymore...


21
Principal Components Analysis
     • Assuming that you finish the calculations without
      mucking up.
       ‣ Or, you find something to work it out for you (Matlab
         functions for this exist)
     • What you have now is a new datapoint that carries
      approximately the same information.
     • Recast into fewer dimensions.
       ‣ Note that the new dimensions will not correspond to any
         single original statistic
22
PCA in Action




23
Clustering Algorithms


     • Having performed PCA, we have a much more manageable
      set of datapoints, and we’ve eliminated extraneous
      dimensions
     • Now we need to group them together.
     • Clustering algorithms are one approach.
     • Tries to find a set of “clusters” of points that are grouped
      together.


24
Clustering
     [Figure: scatter plot of clustered datapoints; x-axis from 0 to 30, y-axis from 0 to 50.]

     Blue Peter style example - real data is rarely so neat
25
Clustering


     • k-means is one of the most popular algorithms
        ‣ Others exist, fuzzy c-means, FLAME clustering and more
     • Pick a value for k
        ‣ You can play around a bit to find good values or use
         some tricks
        ‣ Accepted “rule of thumb” :


26
K-Means Algorithm

     • Typically, we run the k-means algorithm as an
      “iterative refinement” process
        ‣ Guess at some initial values, keep running the process
         round and round until it stabilises
     • Randomly assign datapoints to one of the k clusters
     • Step 1 - Calculate centroids of the clusters
     • Step 2 - Update assignment based on new centroids
     • Rinse and repeat 1 and 2 until convergence.
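
A plain-Python sketch of the loop described above. How to handle a cluster that ends up empty is not covered on the slide; re-seeding it with a random datapoint is an assumption made here.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, rng, max_iterations=100):
    # Randomly assign every datapoint to one of the k clusters.
    assignment = [rng.randrange(k) for _ in points]
    centroids = []
    for _ in range(max_iterations):
        # Step 1: calculate the centroid of each cluster.
        centroids = []
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            # Assumption: an empty cluster is re-seeded with a random datapoint.
            centroids.append(centroid(members) if members else rng.choice(points))
        # Step 2: reassign each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_assignment == assignment:  # nothing moved, so we have converged
            break
        assignment = new_assignment
    return assignment, centroids

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
assignment, centroids = k_means(points, 2, random.Random(0))
```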
27
K-Means Algorithm

     • Calculating Centroids of clusters
        ‣ x_j denotes the datapoints being sampled
        ‣ m_i(t+1) denotes the mean of cluster i at iteration t+1
        ‣ S_i(t) denotes the set of datapoints assigned to cluster i at
         iteration t
     • Effectively, the average of the datapoints
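
Written out using the symbols defined above, the update is the standard k-means centroid formula. This is reconstructed from the definitions, since the equation itself did not survive in this transcript.

```latex
m_i^{(t+1)} = \frac{1}{\left| S_i^{(t)} \right|} \sum_{x_j \in S_i^{(t)}} x_j
```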




28
K-Means Algorithm



     • Assigning Datapoints to Clusters
     • The set of points S_i is all datapoints for which the
      centroid of cluster i (m_i) is the nearest centroid.
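
In the same notation, the assignment step can be written as below. This assumes Euclidean distance and k clusters indexed by c, which is the usual choice but is not stated on the slide.

```latex
S_i^{(t)} = \left\{ x_j : \lVert x_j - m_i^{(t)} \rVert^2 \le \lVert x_j - m_c^{(t)} \rVert^2 \;\; \forall c,\ 1 \le c \le k \right\}
```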




29
K-Means Worked Example




     • Board work




30
From Classification to Prediction
     • Once we have our clusters defined, we know what
      datapoints constitute the type of player we are analysing
     • We can use this to predict what the player will do
        ‣ We have a collection of “similar” players, we can use
          their history.
        ‣ We may be able to use the raw data from the
          observations directly.
     • In either case, we can use the classification to predict actions
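
One way this might look in practice: find the nearest cluster for a new player, then use the action frequencies of that cluster's history as the prediction. The centroids and histories below are invented for illustration.

```python
from collections import Counter

def nearest_cluster(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )

def action_distribution(cluster_history):
    """Empirical action frequencies from the actions recorded for one cluster."""
    counts = Counter(cluster_history)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}

# Hypothetical clusters in a 2-dimensional reduced space, with recorded actions.
centroids = [(0.15, 0.8), (0.45, 0.1)]
history = {
    0: ["fold", "fold", "raise", "fold"],
    1: ["call", "call", "raise", "call", "fold"],
}
cluster = nearest_cluster((0.4, 0.2), centroids)
predicted = action_distribution(history[cluster])
```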

31
Back to Monte Carlo


     • So, back to the game tree.
     • We now have an idea of what type of player we are dealing
      with.
     • We have an idea of what actions the players are going to
      take in given situations.
     • Can we plug this back into the Monte Carlo simulation?


32
Informed Walks in the Game Tree

     • We talked earlier about Opponent nodes in the game tree
     • Specifically, when we hit an Opponent node, we would use a
      uniform distribution to randomly pick between the options
      available.
     • Now, we can bias that distribution towards selecting the
      action we expect the player to take.
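
A sketch of the biased choice at an Opponent node, assuming the opponent model hands us a dictionary of predicted action probabilities. The function name and the example probabilities are hypothetical.

```python
import random

def informed_choice(actions, predicted, rng):
    """Sample an opponent action using predicted probabilities as weights."""
    weights = [predicted.get(a, 0.0) for a in actions]
    if sum(weights) <= 0:
        return rng.choice(actions)  # no usable prediction: fall back to uniform
    return rng.choices(actions, weights=weights, k=1)[0]

rng = random.Random(0)
actions = ["fold", "call", "raise"]
predicted = {"fold": 0.7, "call": 0.2, "raise": 0.1}  # made-up model output
samples = [informed_choice(actions, predicted, rng) for _ in range(5000)]
```

The walk still explores every action, just not equally often, so unlikely opponent moves keep some influence on the estimate.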


33
Does This Work?


     • Intuitively, it should
     • The more accurate we make the simulation, the
      more accurate the results should be.
     • Concern is that the prediction process will slow
      things down too much
        ‣ Monte Carlo relies on large numbers of samples. If each
          sample takes too long, the extra accuracy isn’t helping.

34
Does This Work?


     • We don’t know.
     • It’s been proven to aid Monte Carlo for Poker when
      k=1
        ‣ All players are treated as a generic “player”
     • This is ongoing research right now in SAIG.
     • Look for papers next year. :)

35
What We Do Know

     • We’ve previously attempted Machine Learning for
      Opponent Modelling.
     • Using 32 different statistical measures (reduced
      down to 8 significant dimensions by PCA)
     • Training data of 700,000 hands of Poker
     • Successfully extracted around 28 different player
      stereotypes.
36
The Aim of the Game

     • We aren’t going to be able to make an AI that
      always wins at Poker
     • There’s too much chance involved
       ‣ Bad hands come up
       ‣ Mis-interpreting players
     • What we want to do is make an AI that performs
      better than the other players under the same
      circumstances
37
Evaluation


     • Any time we do research we are testing some sort
      of scientific hypothesis.
     • We need to design experiments to test whether the
      hypothesis is true or not
     • Science doesn't care if we're right - unbiased. Even if
      we're wrong, we have learnt something.

38
Evaluation

     • Consider a pro Poker player
     • Will win some games and lose others
       ‣ In fact, a fundamental rule of good poker play is not even
         taking part in about 80% of the games you sit through
     • Measuring in terms of a single game doesn't work
       ‣ Need to look at the forest, not the trees
     • What counts is how much money the player wins at
      the end.
39
Measuring the Strength of an AI
     • What we need is a measure of how successful a bot
      is on average.
     • Poker gives a metric for this - big blinds won per 100
      hands (BB/100)
        ‣ Metric is in terms of the table limit - normalised
     • Note that even for a large number of games, the
      variance on this measure can be really big.
        ‣ Recall Black Swan events - low likelihood, high
         impact. Large wins are Black Swans here.
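
The metric itself is simple arithmetic. A small sketch, with made-up winnings and stakes:

```python
def bb_per_100(total_winnings, big_blind, hands_played):
    """Winnings expressed in big blinds, scaled to a rate per 100 hands."""
    return 100.0 * total_winnings / (big_blind * hands_played)

# Hypothetical session: 150 units won over 1000 hands at a big blind of 2.
rate = bb_per_100(total_winnings=150.0, big_blind=2.0, hands_played=1000)  # 7.5
```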
40
Stable Experimentation

     • We really need a way to remove the variance from
      the problem.
     • Ordinarily we might repeat the experimentation, take
      a large number of samples, use the law of large numbers
      to our advantage.
     • We talked yesterday about the state space of just the
      card dealing component of Poker
        ‣ We know it's too large for this to be an option
41
Experimentation

     • What if we generate experimental scenarios?
     • A large number of games, with the deck already
      configured.
     • We can play the scenario with player A
     • Then replay the exact same scenario with player B
     • The results that player A and B generate are now
      comparable.
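
A toy illustration of why replaying the same scenario helps. `play_scenario` is a hypothetical stand-in for a real game engine: the seed fixes the "deck", and a player's result is card luck plus a small skill edge. Real players interact with the cards in messier ways, so the cancellation will not be this perfect.

```python
import random
import statistics

def play_scenario(seed, skill):
    """Hypothetical pre-dealt game: the seed fixes the luck, skill adds a small edge."""
    rng = random.Random(seed)      # same seed => same "deck"
    luck = rng.gauss(0.0, 50.0)    # large card-driven swing
    return luck + skill

seeds = range(200)
a_results = [play_scenario(s, skill=1.0) for s in seeds]
b_results = [play_scenario(s, skill=0.0) for s in seeds]

# Same scenarios for both players: the luck cancels out of the difference.
paired_diff = [a - b for a, b in zip(a_results, b_results)]

# Different scenarios for player B: the luck swamps the skill edge.
b_independent = [play_scenario(s + 10_000, skill=0.0) for s in seeds]
unpaired_diff = [a - b for a, b in zip(a_results, b_independent)]

paired_spread = statistics.pstdev(paired_diff)
unpaired_spread = statistics.pstdev(unpaired_diff)
```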
42
Experimental Design


     • Designing good experiments is really important
     • Not just for AI but for all kinds of things
     • Understanding sources of uncertainty means we can
      find ways to factor them out
     • Design fair unbiased experiments
     • For Science!

43
Summary


     • More detail on Monte Carlo in Poker
     • Explanation of Opponent Modelling in Poker
       ‣ Dimensionality Reduction
       ‣ Clustering algorithms
     • Exploiting Opponent Models
     • Experimental Design

44
Next Week




     • Other uses for Opponent Models
     • Procedural Content Generation
     • AI in Video Games




45

 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 

Lecture 4 - Opponent Modelling

  • 1. Making Better Decisions - Opponent Modelling
  • 2. Monte Carlo in Poker (Recap) • Yesterday we saw that Monte Carlo could be used to estimate the expected reward of an action by evaluating the delayed reward • We do this by simulating or "rolling out" games to their end state. • Assess the amount we won or lost
  • 3. Game Tree and Monte Carlo • [Figure: game tree with branches labelled F, C and R, and with Opponent and Chance nodes marked]
  • 4. Random Walks in the Game Tree • When we walk the Game Tree at random, we pick nodes to follow at random. • We assume (for now) that this is an unbiased choice • This means every choice has the same probability of being chosen
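The unbiased random walk described on slides 2 to 4 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the lecture's own implementation: the three-action toy game, its payoffs and the 50/50 showdown are all made up for illustration, and `rollout` and `estimate_reward` are hypothetical names.

```python
import random

ACTIONS = ["fold", "call", "raise"]

def rollout(first_action, rng):
    """Play one simulated toy hand to its end state and return our reward.

    All payoffs here are invented for illustration only.
    """
    if first_action == "fold":
        return -1.0                         # we give up what we already put in
    stake = 2.0 if first_action == "call" else 4.0
    # Opponent node: unbiased choice, every action equally likely.
    opponent_action = rng.choice(ACTIONS)
    if opponent_action == "fold":
        return 1.0                          # opponent gives up, we win a small pot
    # Chance node: a made-up 50/50 showdown for the current stake.
    return stake if rng.random() < 0.5 else -stake

def estimate_reward(first_action, n_rollouts=10_000, seed=0):
    """Average the delayed reward over many random walks of the game tree."""
    rng = random.Random(seed)
    total = sum(rollout(first_action, rng) for _ in range(n_rollouts))
    return total / n_rollouts

if __name__ == "__main__":
    for action in ACTIONS:
        print(action, round(estimate_reward(action), 3))
```

The estimate for each first action is simply the mean reward over its rollouts; the rest of the lecture is about replacing the uniform `rng.choice` at the Opponent node with something better informed.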
  • 5. Can We Do Better? • Random walks are all well and good • But a uniform distribution across action choices isn't accurate ‣ Certain situations will make sensible players more likely to use certain actions • How can we bring this bias into play in the walk?
  • 6. Classifying Opponents • The way we do this is to work out what type of player someone is. • We observe them to get a better understanding of how they operate. • In Poker and other games, we can use all sorts of statistical measures to quantify a player's type.
  • 7. Action Prediction • Once we know what kind of player someone is, we can flip things on their head. • We answered "what is the likelihood this player is type X given we have seen this type of play" • We can now answer "what is the likelihood this player will make action Y given they are of type X" • Remember from Bayes' Theorem last week, these questions are closely linked
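The link between the two questions on slide 7 can be made concrete with a small Bayes' Theorem calculation. The sketch below assumes just two player types and invented probabilities; `PRIOR`, `ACTION_GIVEN_TYPE`, `posterior` and `predict_action` are illustrative names, not anything from the lecture.

```python
# Assumed prior belief about the player's type (made-up numbers).
PRIOR = {"tight": 0.5, "loose": 0.5}

# Assumed P(action | type), again made up for illustration.
ACTION_GIVEN_TYPE = {
    "tight": {"fold": 0.7, "call": 0.2, "raise": 0.1},
    "loose": {"fold": 0.2, "call": 0.5, "raise": 0.3},
}

def posterior(observed_actions, prior=PRIOR):
    """P(type | observed play), via Bayes' Theorem with independent observations."""
    weights = dict(prior)
    for action in observed_actions:
        for player_type in weights:
            weights[player_type] *= ACTION_GIVEN_TYPE[player_type][action]
    total = sum(weights.values())
    return {player_type: w / total for player_type, w in weights.items()}

def predict_action(type_probs):
    """P(next action), averaging P(action | type) over our belief about the type."""
    actions = ["fold", "call", "raise"]
    return {
        a: sum(type_probs[t] * ACTION_GIVEN_TYPE[t][a] for t in type_probs)
        for a in actions
    }

if __name__ == "__main__":
    belief = posterior(["raise"])   # one observed raise
    print(belief)                   # belief shifts towards "loose"
    print(predict_action(belief))
```

With these numbers, a single observed raise moves the belief from 50/50 to 25% tight and 75% loose, and the predicted action distribution follows from that belief.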
  • 8. Simple (Human) Classification • Pro Poker players try to sort their opponents into one of several classes based on 3 measures ‣ Voluntarily Put in Pot (VPiP) ‣ Won at Showdown (WSD) ‣ Pre-flop Raise (PFR)
  • 9. Player Stereotypes • Players can be ‣ Tight / Loose (how likely they are to play hands) ‣ Passive / Aggressive pre-flop ‣ Passive / Aggressive post-flop
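As a rough illustration of slides 8 and 9, two of the three measures can be computed from a hand history and turned into a stereotype label. The `HandRecord` layout and both thresholds below are assumptions chosen for the example; the slides do not give cut-off values.

```python
from dataclasses import dataclass

@dataclass
class HandRecord:
    put_money_in: bool     # player voluntarily put chips in the pot pre-flop
    raised_preflop: bool   # player raised pre-flop

def vpip(hands):
    """Fraction of hands in which the player voluntarily put chips in the pot."""
    return sum(h.put_money_in for h in hands) / len(hands)

def pfr(hands):
    """Fraction of hands in which the player raised pre-flop."""
    return sum(h.raised_preflop for h in hands) / len(hands)

def stereotype(hands, loose_threshold=0.25, aggressive_threshold=0.15):
    """Label a player tight/loose and passive/aggressive (thresholds are assumed)."""
    looseness = "loose" if vpip(hands) > loose_threshold else "tight"
    aggression = "aggressive" if pfr(hands) > aggressive_threshold else "passive"
    return f"{looseness}-{aggression}"

if __name__ == "__main__":
    history = (
        [HandRecord(True, True)] * 4
        + [HandRecord(True, False)] * 2
        + [HandRecord(False, False)] * 4
    )
    print(vpip(history), pfr(history), stereotype(history))
```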
  • 10. Utilising Stereotypes • If we can classify players we can use this against them • For instance, we might discover that passive players can be chased off by aggressive play • Or we understand that when a super-conservative player decides to raise, we need to be careful • We can build heuristic rule bases around this like we saw before. • Or we can be much smarter
  • 11. Better Classifications • Humans are getting by on 3 dimensions • But Poker has waaaay more statistics available than this • We can make a lot of use of this extra data.
  • 12. Poker Tracker • Poker Tracker is a stats package specifically for Poker • Analyses play at online casinos • Real-time access to stats about opponents • Allows players to review hands later
  • 13. Stats in Poker • A few slides ago - Poker has many statistics • Poker Tracker keeps tabs on around 150 metrics • Some of these are somewhat similar, some relate more to the games than the players
  • 14. Problem of Dimensionality • The problem now is that we have too much information! • Trying to learn on cluttered data can be problematic, assuming it works at all.
  • 15. Dimensionality Reduction • Somehow we have to reduce the number of dimensions that our data points are using. • In many ways, getting the right data into a learning algorithm is the biggest challenge. • As much art as it is engineering. • Two options ‣ Feature selection ‣ Feature extraction
  • 16. Selection vs Extraction • In Selection, you pick the dimensions you believe to be most relevant ‣ The human players did this to get their 3-dimensional representations • In Extraction, you come up with new dimensions that can represent your datapoint
  • 17. Principal Components Analysis • PCA is a common strategy for this. • Recasts the dimensions of the datapoint into another set of "basis vectors". • Smushes together dimensions that have a strong correlation ‣ Some stats measures are looking at fundamentally the same thing, in different ways ‣ E.g. various raise frequency metrics might be treated as a single “aggression” dimension after PCA
  • 18. Principal Components Analysis • This was going to be a worked example. • Honestly, that’s way too painful. • For N observations in M dimensions, X is an M × N matrix where each column is an observation. • Calculate the mean and std. dev. for each row in the matrix (each dimension)
  • 19. Principal Components Analysis • Calculate the covariance matrix, the amount that the dimensions vary with respect to each other. • Calculate the eigenvectors and eigenvalues of the covariance matrix ‣ The eigenvectors are the new basis vectors of the reduced-dimension datapoints ‣ The eigenvalues represent how significant the eigenvector is. Large value = significant
  • 20. Principal Components Analysis • Pick the most significant K of the eigenvectors. • Project the original datapoints in X onto the new basis vectors.
  • 21. Principal Components Analysis • Honestly, if anyone ever asks you to do this ‣ Get a textbook ‣ Use Matlab ‣ Be really careful because it’s kind of complicated • It is possible to do it by hand. ‣ I can’t anymore...
  • 22. Principal Components Analysis • Assuming that you finish the calculations without mucking up. ‣ Or, you find something to work it out for you (Matlab functions for this exist) • What you have now is a new datapoint that carries approximately the same information. • Recast into fewer dimensions. ‣ Note that the dimensions will not make sense
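The PCA steps on slides 18 to 20 can be followed with NumPy rather than Matlab. This is a sketch under assumptions: the function name `pca`, the synthetic three-metric data and the choice to divide by the standard deviation (which assumes no row is constant) are illustrative, not taken from the lecture.

```python
import numpy as np

def pca(X, k):
    """Reduce an M x N matrix (each column one observation) to K dimensions."""
    # Mean and std. dev. of each row (dimension), then standardise.
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, keepdims=True)      # assumes no dimension is constant
    Z = (X - mean) / std
    # Covariance matrix: how the dimensions vary with respect to each other.
    cov = np.cov(Z)                          # rows are treated as variables
    # Eigenvectors are the new basis vectors, eigenvalues their significance.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]    # most significant first
    eigenvalues = eigenvalues[order]
    basis = eigenvectors[:, order[:k]]       # keep the K most significant
    # Project the standardised datapoints onto the new basis: K x N result.
    return basis.T @ Z, eigenvalues

# Synthetic example: two raise-frequency metrics that measure almost the same
# thing, plus one unrelated metric (all invented for illustration).
rng = np.random.default_rng(0)
aggression = rng.normal(size=200)
X = np.vstack([
    aggression + 0.1 * rng.normal(size=200),
    aggression + 0.1 * rng.normal(size=200),
    rng.normal(size=200),
])
reduced, eigenvalues = pca(X, k=2)
```

In this synthetic data the two correlated metrics collapse into one dominant component, which is the "aggression" effect slide 17 describes, and the smallest eigenvalue is close to zero, so dropping that dimension loses very little.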
  • 24. Clustering Algorithms • Having performed PCA, we have a much more manageable set of datapoints, and we’ve eliminated extraneous dimensions • Now we need to group them together. • Clustering algorithms are one approach. • Tries to find a set of “clusters” of points that are grouped together.
  • 25. Clustering • [Figure: scatter plot of clustered datapoints] • Blue Peter style example - real data is rarely so neat
  • 26. Clustering • k-means is one of the most popular algorithms ‣ Others exist, fuzzy c-means, FLAME clustering and more • Pick a value for k ‣ You can play around a bit to find good values or use some tricks ‣ Accepted “rule of thumb”: [formula not captured in the transcript]
  • 27. K-Means Algorithm • Typically, we run the k-means algorithm as an “iterative refinement” process ‣ Guess at some initial values, keep running the process round and round until it stabilises • Randomly assign datapoints to one of the k clusters • Step 1 - Calculate centroids of the clusters • Step 2 - Update assignment based on new centroids • Rinse and repeat 1 and 2 until convergence.
  • 28. K-Means Algorithm • Calculating Centroids of clusters: $m_i^{(t+1)} = \frac{1}{|S_i^{(t)}|} \sum_{x_j \in S_i^{(t)}} x_j$ ‣ $x_j$ denotes the datapoints being sampled ‣ $m_i^{(t+1)}$ denotes mean of cluster i at iteration t+1 ‣ $S_i^{(t)}$ denotes the set of datapoints assigned to cluster i at iteration t • Effectively, the average of the datapoints
  • 29. K-Means Algorithm • Assigning Datapoints to Clusters • The set of points $S_i$ is all datapoints for which the centroid of cluster i ($m_i$) is the nearest centroid.
  • 30. K-Means Worked Example • Board work
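In place of the board work, the iterative refinement loop from slides 27 to 29 can be sketched with NumPy. The function name `k_means`, the `max_iterations` cap and the handling of an empty cluster (re-seeding it at a random datapoint) are assumptions added to keep the sketch runnable; the slides do not cover those details.

```python
import numpy as np

def k_means(points, k, max_iterations=100, seed=0):
    """Cluster an (n, d) array of datapoints; returns (assignment, centroids)."""
    rng = np.random.default_rng(seed)
    n = len(points)
    # Randomly assign each datapoint to one of the k clusters.
    assignment = rng.integers(0, k, size=n)
    for _ in range(max_iterations):
        # Step 1: the centroid of each cluster is the average of its datapoints.
        centroids = np.empty((k, points.shape[1]))
        for i in range(k):
            members = points[assignment == i]
            if len(members):
                centroids[i] = members.mean(axis=0)
            else:
                # Assumption: an empty cluster is re-seeded at a random datapoint.
                centroids[i] = points[rng.integers(n)]
        # Step 2: reassign every datapoint to its nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_assignment = distances.argmin(axis=1)
        # Converged once the assignment stops changing.
        if np.array_equal(new_assignment, assignment):
            break
        assignment = new_assignment
    return assignment, centroids

if __name__ == "__main__":
    data = np.array([[0, 0], [1, 1], [2, 2], [10, 10], [11, 11], [12, 12]], dtype=float)
    labels, centres = k_means(data, k=2)
    print(labels)
    print(centres)
```

Like any k-means run, the result on real data depends on the random starting assignment, so it is common to run it several times and keep the best clustering.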
  • 31. From Classification to Prediction • Once we have our clusters defined, we know what datapoints constitute the type of player we are analysing • We can use this to predict what the player will do ‣ We have a collection of “similar” players, we can use their history. ‣ We may be able to use the raw data from the observations directly. • In either case, we can use the classification to predict actions
  • 32. Back to Monte Carlo • So, back to the game tree. • We now have an idea of what type of player we are dealing with. • We have an idea of what actions the players are going to take in given situations. • Can we plug this back into the Monte Carlo simulation?
  • 33. Informed Walks in the Game Tree • We talked earlier about Opponent nodes in the game tree • Specifically, when we hit an Opponent node, we would use a uniform distribution to randomly pick between the options available. • Now, we can bias that distribution towards selecting the action we expect the player to take.
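The change from an unbiased walk to an informed walk comes down to how the action at an Opponent node is sampled. The sketch below assumes the opponent model supplies a probability for each action; `sample_opponent_action` and the example probabilities are hypothetical.

```python
import random
from collections import Counter

ACTIONS = ("fold", "call", "raise")

def sample_opponent_action(rng, predicted=None):
    """Pick the opponent's action at an Opponent node.

    With no prediction we fall back to the uniform (unbiased) choice;
    otherwise the choice is weighted by the predicted probabilities.
    """
    if predicted is None:
        return rng.choice(ACTIONS)
    weights = [predicted[action] for action in ACTIONS]
    return rng.choices(ACTIONS, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed output of an opponent model for an aggressive player type.
    model = {"fold": 0.1, "call": 0.2, "raise": 0.7}
    print(Counter(sample_opponent_action(rng, model) for _ in range(10_000)))
```

Everything else in the rollout stays the same, so the extra cost per sample is whatever it takes to produce the predicted distribution.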
  • 34. Does This Work? • Intuitively, it should • The more accurate we make the simulation, the more accurate the results should be. • Concern is that the prediction process will slow things down too much ‣ Monte Carlo relies on large numbers of samples; if each one takes too long, the extra accuracy doesn’t help.
  • 35. Does This Work? • We don’t know. • It’s been proven to aid Monte Carlo for Poker when k=1 ‣ All players are treated as a generic “player” • This is ongoing research right now in SAIG. • Look for papers next year. :)
  • 36. What We Do Know • We’ve previously attempted Machine Learning for Opponent Modelling. • Using 32 different statistical measures (reduced down to 8 significant dimensions by PCA) • Training data of 700,000 hands of Poker • Successfully extracted around 28 different player stereotypes.
  • 37. The Aim of the Game • We aren’t going to be able to make an AI that always wins at Poker • There’s too much chance involved ‣ Bad hands come up ‣ Misinterpreting players • What we want to do is make an AI that performs better than the other players under the same circumstances
  • 38. Evaluation • Any time we do research we are testing some sort of scientific hypothesis. • We need to design experiments to test whether the hypothesis is true or not • Science doesn't care if we're right - unbiased. Even if we're wrong, we have learnt something.
  • 39. Evaluation • Consider a pro Poker player • Will win some games and lose others ‣ In fact, a fundamental rule of good poker play is not even taking part in about 80% of the games you sit through • Measuring in terms of a single game doesn't work ‣ Need to look at the forest, not the trees • What counts is how much money the player wins at the end.
  • 40. Measuring the Strength of an AI • What we need is a measure of how successful a bot is on average. • Poker gives a metric for this - Big Blinds / 100 hands ‣ Metric is in terms of the table limit - normalised • Note that even for a large number of games, the variance on this measure can be really big. ‣ Recall Black Swan events - low likelihood, high impact. Large wins are Black Swans here.
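The Big Blinds / 100 hands metric is a simple normalised average. The sketch below assumes per-hand net results recorded in chips; both function names and the sample numbers are illustrative.

```python
import statistics

def big_blinds_per_100(winnings, big_blind):
    """Average result per 100 hands, expressed in big blinds.

    winnings: net chips won (positive) or lost (negative) in each hand.
    big_blind: size of the big blind in chips, which normalises for table limit.
    """
    return sum(winnings) / big_blind / len(winnings) * 100

def per_hand_std_dev(winnings, big_blind):
    """Sample standard deviation of the per-hand result, in big blinds."""
    return statistics.stdev(w / big_blind for w in winnings)

if __name__ == "__main__":
    results = [-2, -2, 10, 0]           # made-up results for four hands
    print(big_blinds_per_100(results, big_blind=2))
    print(per_hand_std_dev(results, big_blind=2))
```

Even in this tiny sample the spread per hand is large compared with the average, which is the variance problem the slide warns about.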
  • 41. Stable Experimentation • We really need a way to remove the variance from the problem. • Ordinarily we might repeat the experimentation, take a large number of samples, and use the law of averages to our advantage. • We talked yesterday about the state space of just the card dealing component of Poker ‣ We know it's too large for this to be an option
  • 42. Experimentation • What if we generate experimental scenarios? • A large number of games, with the deck already configured. • We can play the scenario with player A • Then replay the exact same scenario with player B • The results that player A and B generate are now comparable.
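The replay idea on slide 42 can be sketched as follows: fix the decks in advance, then evaluate each player on exactly the same deals. The toy one-card game, both policies and all names here are assumptions made purely to show the experimental set-up.

```python
import random

def make_scenarios(n, seed=0):
    """Pre-configure n decks so every player faces exactly the same deals."""
    rng = random.Random(seed)
    scenarios = []
    for _ in range(n):
        deck = list(range(52))
        rng.shuffle(deck)
        scenarios.append(deck)
    return scenarios

def play(deck, policy):
    """Toy game: we hold the first card, the opponent the second.

    The policy sees our rank (0-12) and decides whether to bet one unit;
    if we bet, the higher rank wins and equal ranks tie.
    """
    ours, theirs = deck[0] % 13, deck[1] % 13
    if not policy(ours):
        return 0
    if ours == theirs:
        return 0
    return 1 if ours > theirs else -1

def evaluate(policy, scenarios):
    """Average result of one player over the shared set of scenarios."""
    return sum(play(deck, policy) for deck in scenarios) / len(scenarios)

def always_bet(rank):          # hypothetical player A
    return True

def bet_high_cards(rank):      # hypothetical player B
    return rank >= 8

if __name__ == "__main__":
    scenarios = make_scenarios(1000)
    print("A:", evaluate(always_bet, scenarios))
    print("B:", evaluate(bet_high_cards, scenarios))
```

Because both players are scored on identical deals, any difference between the two averages comes from their decisions rather than from who happened to be dealt better cards.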
  • 43. Experimental Design • Designing good experiments is really important • Not just for AI but for all kinds of things • Understanding sources of uncertainty means we can find ways to factor them out • Design fair unbiased experiments • For Science!
  • 44. Summary • More detail on Monte Carlo in Poker • Explanation of Opponent Modelling in Poker ‣ Dimensionality Reduction ‣ Clustering algorithms • Exploiting Opponent Models • Experimental Design
  • 45. Next Week • Other uses for Opponent Models • Procedural Content Generation • AI in Video Games