Teaching Your Computer To Play
Video Games
A Presentation For The Bainbridge BARN
September 18, 2016
About Me
Tech enthusiast; hardware and software
hacker; particular interest in machine learning
Pros:
This presentation is free of charge!
Cons:
No training in computer science, embedded
systems design, electrical engineering,
software development
What Is Machine Learning?
● A way for computers to learn without being
explicitly programmed
● It allows machines to make predictions
about the future after studying examples
from the past
● Forms the basis of artificial intelligence
● One of the hottest areas of computer
science today!
Why Video Games?
● They are easy to set up and let you control
every aspect of the learning environment
● They are fun
● They can be directly compared against
human performance
Here’s one of my favorite examples
Consider Spam Filtering...
● It is impossible to predict every possible way a
spam email could be written...
● You could try programming a bunch of rules:
○ “Cheap Meds From Canada” -> SPAM
○ “Your Medication Has Shipped” -> NOT SPAM
● This rapidly becomes intractable - likely to get too
many false positives and false negatives
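To see why hand-written rules break down, here is a minimal sketch of a rule-based filter. The phrase list and the three sample subject lines are made up for illustration; only the first two subjects come from the slide.

```python
# Hypothetical hand-written rules: any subject containing one of these
# phrases is flagged as spam.
SPAM_PHRASES = ["cheap meds", "medication"]


def is_spam_by_rules(subject):
    """Return True if any hard-coded phrase appears in the subject line."""
    lowered = subject.lower()
    return any(phrase in lowered for phrase in SPAM_PHRASES)


subjects = [
    "Cheap Meds From Canada",        # spam, and caught
    "Your Medication Has Shipped",   # legitimate, but flagged (false positive)
    "Ch3ap M3ds From Canada",        # spam, but missed (false negative)
]
for subject in subjects:
    print(subject, "->", "SPAM" if is_spam_by_rules(subject) else "NOT SPAM")
```

Every new rule added to patch one of these mistakes tends to create another, which is the intractability the slide describes.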
Consider Spam Filtering...
● A better way is to show the machine a bunch of
human-labeled examples, and let it generalize a
way to identify spam from these
● This is called Supervised Learning, because we
train our system on examples that humans have
already labeled
● Basis for most spam-filtering systems today
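As a rough illustration of learning from labeled examples, the sketch below trains a tiny naive Bayes word-count classifier. The four training subjects and their labels are invented, and naive Bayes is just one convenient choice of learner here, not necessarily what any particular spam filter uses.

```python
import math
from collections import Counter

# Made-up, human-labeled training examples: (label, subject line).
TRAINING = [
    ("spam", "cheap meds from canada"),
    ("spam", "cheap watches buy now"),
    ("ham", "your medication has shipped"),
    ("ham", "your order has shipped"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for label, text in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest naive Bayes log-probability."""
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    best_label, best_score = None, -math.inf
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(TRAINING)
print(classify("cheap meds now", word_counts, label_counts))            # prints "spam"
print(classify("your package has shipped", word_counts, label_counts))  # prints "ham"
```

Neither test subject appears in the training data; the classifier generalizes from the words it has seen under each label.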
So How Does It Work?
● There are many types of algorithms that are used
for learning
● These have colorful names:
○ Naive Bayes Classifiers
○ Support Vector Machines
○ Random Forests
● But we’ll focus here on Neural Networks since they
are currently some of the most widely used and are
so cool
Neural Networks
● Neural networks were
inspired by studying how
our brains work
● These consist of multiple
layers of interconnected
nodes (like neurons)
● They take an input (like a
video image), pass it
through, and yield an output
(like a label)
Neural Networks
Each of these connections has a weight associated
with it.
[Figure: three connections with weights 0.3, 0.5, and 0.8]
Neural Networks
Information propagates through each layer of the
network, adjusted by the weights
[Figure: an input value of 100 arrives at connections weighted 0.3, 0.5, and 0.8]
[Figure: the input of 100 is multiplied by each weight to give 30, 50, and 80; each of those values then fans out three ways toward the next layer]
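The arithmetic in this figure can be sketched in a few lines. The function name `propagate` is made up for illustration, and the sketch leaves out two things a real node also does: summing all of its incoming signals and passing the total through a nonlinear function.

```python
def propagate(value, weights):
    """Multiply one input value by the weight on each outgoing connection."""
    return [value * w for w in weights]


weights = [0.3, 0.5, 0.8]            # connection weights from the slide
signals = propagate(100, weights)    # input value from the slide
print([round(s) for s in signals])   # prints [30, 50, 80]
```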
How Neural Networks Learn
At each step, you compare the actual output (e.g., 78%
chance it’s a cat) with the expected output (e.g., yes,
it’s a cat)
[Figure: a pixel value (183) enters the network, which outputs NOT CAT: 22, CAT: 78]
How Neural Networks Learn
The weights are then adjusted to bring the actual
output closer to the expected output. Rinse and repeat...
[Figure: after the weights are adjusted, the same input (183) gives NOT CAT: 20, CAT: 80]
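Here is a minimal sketch of that compare-and-adjust loop, shrunk down to a single neuron with one weight and one bias. The slides do not name the update rule; this sketch assumes plain gradient descent on a sigmoid output, and the learning rate and starting values are arbitrary. Real networks apply the same idea across many layers using backpropagation.

```python
import math


def predict(pixel, weight, bias):
    """One-neuron 'network': squash weight * pixel + bias into a 0..1 cat score."""
    return 1.0 / (1.0 + math.exp(-(weight * pixel + bias)))


def train_step(pixel, target, weight, bias, learning_rate=0.5):
    """Nudge weight and bias to shrink the gap between output and target."""
    output = predict(pixel, weight, bias)
    error = output - target          # negative when the output is too low
    weight -= learning_rate * error * pixel
    bias -= learning_rate * error
    return weight, bias


pixel = 183 / 255    # pixel value from the slide, scaled to the range 0..1
target = 1.0         # expected output: yes, it's a cat
weight, bias = 0.0, 0.0

before = predict(pixel, weight, bias)
for _ in range(10):  # rinse and repeat
    weight, bias = train_step(pixel, target, weight, bias)
after = predict(pixel, weight, bias)
print(f"before: {before:.2f}, after: {after:.2f}")
```

The score starts at 0.50 (no opinion) and climbs toward 1.0 with each repetition.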
Neural Network Learning
● Adjusting these weights is how the network learns
● Real life networks may have millions of weights
spread over many layers
● This process allows the network to learn complex
behaviors and, we hope, an ability to generalize
concepts beyond what it was explicitly taught
Neural Network Topology
● There are multiple ways of connecting the nodes in
a neural network
● All of these seek to minimize the number of weights
you need and to combat the central problem of
machine learning: overfitting
● Overfitting means your network performs great so
long as it’s working with data it’s already seen. But
it fails miserably when it needs to generalize to data
it hasn’t seen
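Overfitting is easy to demonstrate with an extreme case: a "model" that simply memorizes its training data. The task (is a number 50 or more?) and all the numbers below are made up for illustration.

```python
def train_memorizer(examples):
    """An extreme overfitter: it just stores every training example."""
    return dict(examples)


def memorizer_predict(memory, x):
    """Perfect on anything already seen; a blind guess of False otherwise."""
    return memory.get(x, False)


def accuracy(predict, examples):
    """Fraction of examples for which the prediction matches the label."""
    correct = sum(1 for x, label in examples if predict(x) == label)
    return correct / len(examples)


# Made-up task: is the number 50 or more?
train_set = [(x, x >= 50) for x in (3, 18, 42, 55, 71, 96)]
test_set = [(x, x >= 50) for x in (10, 30, 60, 80, 90)]   # unseen numbers

memory = train_memorizer(train_set)
train_acc = accuracy(lambda x: memorizer_predict(memory, x), train_set)
test_acc = accuracy(lambda x: memorizer_predict(memory, x), test_set)
print("seen data:  ", train_acc)   # prints 1.0
print("unseen data:", test_acc)    # prints 0.4
```

A model that had learned the actual rule would score well on both sets; the gap between the two numbers is the signature of overfitting.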
Neural Networks
● Recently, a type called convolutional neural
networks has been achieving amazing results,
particularly for problems that involve classifying
images or video
● Moreover, when you incorporate many layers (5, 6,
7, and more), the power of these networks is
astounding
● This is where the phrase deep learning comes
from, since these networks have many layers
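The core operation of a convolutional layer can be sketched without any libraries. The 4x4 image and the 2x2 edge-detecting kernel below are made up; in a trained network the kernel values are weights that are learned rather than written by hand.

```python
def convolve2d(image, kernel):
    """Slide a small kernel over the image and sum the element-wise products."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# A made-up 4x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# A 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
print(convolve2d(image, kernel))   # prints [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The same four kernel weights are reused at every position in the image, which is one way this kind of network keeps the number of weights small.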
Stanford’s Image Classifier
● Here’s a deep convolutional neural network in
action from a 2014 competition!
● This network was trained on 1.2 million images,
each labeled with one of 1000 categories
● It was then tested on images it had not seen
before…
● And it achieved an error rate of only 5.1% when
measured against how humans would classify the images
Check it out!
Text Generation
● Another type of neural network (called a recurrent
neural network) is great for sequential problems,
like predicting the next word in a sentence: “There
are so many clouds in the ____.”
● A fun trick with these is to train them on a body of
text (like the Bible or the complete works of
Shakespeare) and see what they spit out...
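To get a feel for next-word prediction without building a recurrent network, the sketch below swaps in a much simpler technique: counting which word follows which (a bigram model). It is not a neural network, and the three-sentence corpus is made up, but the train-then-generate workflow is the same.

```python
import random
from collections import Counter, defaultdict

# A made-up toy corpus; the talk's examples used the Bible and Shakespeare.
CORPUS = (
    "there are so many clouds in the sky . "
    "there are so many stars in the sky . "
    "there are so many fish in the sea ."
)


def build_bigrams(text):
    """Count which word follows which."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the word seen most often after `word`."""
    return follows[word].most_common(1)[0][0]


def generate(follows, start, length, rng):
    """Build a sequence by repeatedly sampling a plausible next word."""
    words = [start]
    for _ in range(length - 1):
        options = follows[words[-1]]
        if not options:
            break
        choices, counts = zip(*options.items())
        words.append(rng.choices(choices, weights=counts)[0])
    return " ".join(words)


follows = build_bigrams(CORPUS)
print(predict_next(follows, "the"))   # prints "sky"
sample = generate(follows, "there", 8, random.Random(0))
print(sample)
```

A recurrent network improves on this by remembering far more than the single previous word, which is why its output can hold together over whole verses.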
Computer-Generated Bible and
Shakespeare Verses
● 1 Chronicles 4:7 Then came them out of the
house of brass; and in the midst is to him, and was
done with him with the new moon: for in the city of
Jeshua ye shall put him speed, as the horn of me
plagued among them that hath need.
● Second Senator: They are away this miseries,
produced upon my soul, Breaking and strongly
should be buried, when I perish The earth and
thoughts of many states.
Source: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
Image Captioning
● An even more challenging machine learning task is
automatically generating captions for images
● This often combines deep convolutional networks
with recurrent neural networks
● Here’s an example of Google’s work on this
subject...
Image Captioning
Source: https://research.googleblog.com/2014/11/a-picture-is-worth-thousand-coherent.html
Image Captioning
This is really starting to approach how our own minds
seem to learn...
Unsupervised Learning
● Most examples so far have been of Supervised
Learning, where the machine is trained on a bunch
of human-labeled examples; Unsupervised
Learning is where the human does not provide any
guidance
● Reinforcement Learning, usually treated as a third
category of its own, also uses no labeled examples: it
simply provides an environment for the machine to play
in, and gives it rewards and penalties based on its
actions
● It’s up to the machine to figure out the best
strategy...
Learning To Play Video Games
● This is how we can teach a machine to play video
games:
○ The score is the reward
○ The machine gets to press any buttons it wants
Here’s a video demonstrating Google’s Atari project
(from 9:25)
How Reinforcement Learning
Works
● A neural network is at the heart of Reinforcement
Learning
● For video games, the input is the screen itself at
each frame, and the output is an estimate of the
value of each possible move (right, up, jump, etc.)
● The machine records its experiences at each point
in time: the screen, the action it took, the reward it
received, and the resulting screen afterwards
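The record-keeping described in the last bullet can be sketched as a small fixed-size memory. The class and field names are made up for illustration, and strings stand in for actual screen images.

```python
import random
from collections import deque, namedtuple

# One recorded moment: what was on screen, what the machine did, what happened.
Experience = namedtuple("Experience", ["screen", "action", "reward", "next_screen"])


class ReplayMemory:
    """Fixed-size store of past experiences; the oldest are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def record(self, screen, action, reward, next_screen):
        self.buffer.append(Experience(screen, action, reward, next_screen))

    def sample(self, batch_size, rng=random):
        """Draw a random batch of past experiences to learn from."""
        return rng.sample(list(self.buffer), batch_size)


memory = ReplayMemory(capacity=3)
for step in range(4):
    memory.record(f"frame{step}", "jump", 1.0, f"frame{step + 1}")

print(len(memory.buffer))        # prints 3: the oldest experience was dropped
print(memory.buffer[0].screen)   # prints "frame1"
```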
How Reinforcement Learning
Works (cont.)
● The machine then compares its prediction of the
reward it will get, given a screen and a particular
move, with the actual result it received
● The network’s weights are adjusted to bring the two
closer
● Rinse and repeat
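The predict, compare, and adjust cycle can be sketched with a lookup table standing in for the neural network. This is the classic tabular Q-learning update, used here as a simplification; the action names, screen labels, learning rate, and discount value are all assumptions for illustration.

```python
from collections import defaultdict

ACTIONS = ["left", "right", "jump"]   # hypothetical button set


def q_update(q_table, screen, action, reward, next_screen,
             learning_rate=0.1, discount=0.9):
    """Move the predicted value of (screen, action) toward what was observed."""
    predicted = q_table[(screen, action)]
    best_next = max(q_table[(next_screen, a)] for a in ACTIONS)
    observed = reward + discount * best_next
    q_table[(screen, action)] = predicted + learning_rate * (observed - predicted)
    return q_table[(screen, action)]


q_table = defaultdict(float)   # every predicted value starts at 0.0
values = []
for _ in range(3):             # rinse and repeat on the same experience
    values.append(q_update(q_table, "screen_a", "jump",
                           reward=1.0, next_screen="screen_b"))
print([round(v, 3) for v in values])   # prints [0.1, 0.19, 0.271]
```

Each repetition closes a tenth of the remaining gap between the prediction and the observed reward, so the estimate creeps toward 1.0.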
Reinforcement Learning
● Many concepts of Reinforcement Learning are
analogous to how our own minds work:
○ Learning Rate: how fast the network should
adapt to new information
○ Explore v. Exploit: how much to try new things
versus simply maxing out the best strategy
you’ve found so far
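One common way to balance exploring and exploiting is the epsilon-greedy rule, sketched below. The slides do not say which rule the project used, so treat this as an assumption; the action names and value estimates are made up.

```python
import random


def choose_action(action_values, epsilon, rng=random):
    """With probability epsilon explore at random; otherwise exploit the best."""
    if rng.random() < epsilon:
        return rng.choice(list(action_values))
    return max(action_values, key=action_values.get)


# Hypothetical value estimates for the moves available on one screen.
action_values = {"left": 0.1, "right": 0.7, "jump": 0.4}

print(choose_action(action_values, epsilon=0.0))   # always exploits: prints "right"
print(choose_action(action_values, epsilon=1.0))   # always explores: any of the three
```

A typical schedule starts with a high epsilon, so the machine tries everything, and lowers it as the value estimates become trustworthy.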
Reinforcement Learning (cont.)
● Many concepts of Reinforcement Learning are
analogous to how our own minds work:
○ Memory Size: how long should we maintain our
memory of past experiences
○ Discount Rate: how much should we discount
future rewards relative to immediate rewards
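The effect of the discount rate is easiest to see with numbers. In this sketch the reward sequence is made up: nothing happens for two steps, then a payoff of 10 arrives.

```python
def discounted_return(rewards, discount):
    """Sum future rewards, shrinking each by `discount` per step of delay."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + discount * total
    return total


rewards = [0, 0, 10]   # made-up: the payoff arrives two steps from now
print(discounted_return(rewards, discount=1.0))   # prints 10.0 (no discounting)
print(discounted_return(rewards, discount=0.5))   # prints 2.5 (halved twice)
```

A discount near 1 makes the machine patient; a small discount makes it chase whatever pays off soonest.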
Super Mario Bros.
● My own project was to apply Google’s methods to
play Super Mario Bros.
Here’s how it started…
And here’s how it was doing after about 72 hours...
The State Of The Art
● The next step for Mario: why run a single game
when you can run eight?
The State Of The Art
● Advances in machine learning are happening
extremely fast!
○ More powerful machines
○ The proliferation of open-source tools
○ The availability of tasks (like video games) we
can use to measure our progress
Where To Learn More
● Google Atari Project, and the paper in Nature
● My fork of this project to play Super Mario Bros.
● A text generator using recurrent neural nets
● The latest-and-greatest A3C algorithm for training
Atari
● The latest (free) tools of machine learning: Theano,
Torch, TensorFlow, and Chainer
