Deep Learning &
Reinforcement Learning MILA
Summer School Highlights
Natalia Díaz Rodríguez, PhD 26th June-5th July 2017
Montreal, Quebec
Learning to Learn - Nando de Freitas
• What is our intrinsic motivation for being
here? Learning, and the satisfaction of gaining
knowledge
• From the Bengio brothers (’92) to
github.com/deepmind/learning-to-learn
• 1 single network: optimiser & optimizee
• Generalize: learning to learn X by doing
Y (unsup. by super. learning)
(Task-oriented) Language grounding
Related: language grounding
Rich Sutton: TD-learning
NdF’17
Does not scale for large amounts of actions
Related: Satinder Singh’s RL Talk
Language grounding via
instruction-guided RL
Automatic differentiation: the new trend across all
DL frameworks
• Matt Johnson's great tutorial on Automatic
Differentiation
• IDEA: checkpointing and less config
boilerplate code
• Becoming standard:
• TensorFlow Eager
• PyTorch Taping
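To make the "taping" idea concrete, here is a minimal sketch of tape-style reverse-mode automatic differentiation on scalars. It is an illustration only, not how TensorFlow Eager or PyTorch are implemented; the `Var` class and `backward` function are hypothetical names introduced for this example.

```python
class Var:
    """Scalar value that remembers how it was computed (its tape entry)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Reverse pass: walk nodes in reverse topological order, apply the chain rule."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


x, y = Var(3.0), Var(4.0)
z = x * y + x   # dz/dx = y + 1, dz/dy = x
backward(z)
```

The forward pass records each operation as it runs, so ordinary Python control flow can be differentiated without declaring a graph up front.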
Graphical models and DL: a powerful
combination -Matt Johnson (Google)
GANs are sexy
CycleGAN Zhu’17
GANs state-of-the-art
• Applications: image generation, attribute morphing, image inpainting…
• State-of-the-art
• BEGAN*, CycleGAN (draw a bag and find a real one)
• Unsupervised Pixel-Level Domain Adaptation with Generative
Adversarial Networks, Bousmalis’16: an unsupervised GAN-based
architecture able to learn a transformation without using
corresponding pairs from the two domains (code to appear,
CVPR’17).
• The best state-of-the-art approach, improving on:
• Decoupling from the Task-Specific Architecture
• Generalization Across Label Spaces
• Achieve Training Stability
• Data Augmentation
* Fast and stable: a new boundary-equilibrium-enforcing method paired with a loss derived from the Wasserstein distance, for
training auto-encoder-based GANs
CycleGAN
kNN is still one of the most frequently used
quantitative measures for unsupervised
evaluation
Bousmalis’16
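As a rough illustration of kNN as a quantitative measure, the sketch below computes leave-one-out k-nearest-neighbour accuracy on a set of embeddings. The function name `knn_accuracy` and the toy data are assumptions made for this example, not taken from the cited paper.

```python
from collections import Counter


def knn_accuracy(embeddings, labels, k=3):
    """Leave-one-out kNN accuracy: classify each point from its k nearest others."""
    correct = 0
    for i, query in enumerate(embeddings):
        # Squared Euclidean distance to every other point, paired with its label.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(query, other)), labels[j])
            for j, other in enumerate(embeddings)
            if j != i
        )
        votes = Counter(label for _, label in neighbours[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(embeddings)


# Hypothetical embeddings forming two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
classes = ["a", "a", "a", "a", "b", "b", "b", "b"]
accuracy = knn_accuracy(points, classes, k=3)
```

A high score suggests that points with the same label sit close together in the learned space, which is why the measure is popular when no task-specific metric exists.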
GANs help semi-supervised and unsupervised
learning as well as domain randomisation
• CVAE-GAN fine-grained category image
generation.
CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17
GAN mode collapse: inability to generate a varied
distribution of data
CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17
One/Few-shot learning
• Extending Siamese networks to one-shot learning: Siamese
Neural Networks for One-shot Image Recognition.
One Shot Learning with Siamese Networks in PyTorch – Harshvardhan Gupta – Medium
• Black-Box Data-efficient Policy Search for Robotics,
Mouret’17 (Gaussian process regression for policy
optimisation using model-based policy search). 5
episodes are enough to learn the whole dynamics of
the arm from scratch.
• If you can’t predict
reward, predict a
relative ordering
rank (same vs
different)
• Siamese network:
optimize all rankings
simultaneously
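The ranking idea above can be sketched as a pairwise logistic loss over a shared scorer. This is a minimal illustration assuming a linear scorer and toy data; the function names and the data are hypothetical, and the talk itself may have used a different loss or architecture.

```python
import math


def score(w, x):
    """Shared scorer applied to every input (the Siamese weight sharing)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def pairwise_loss_and_grad(w, pairs):
    """Logistic ranking loss summed over all (better, worse) pairs at once."""
    loss, grad = 0.0, [0.0] * len(w)
    for better, worse in pairs:
        margin = score(w, better) - score(w, worse)
        loss += math.log1p(math.exp(-margin))
        coeff = -1.0 / (1.0 + math.exp(margin))  # d loss / d margin
        for k in range(len(w)):
            grad[k] += coeff * (better[k] - worse[k])
    return loss, grad


# Hypothetical toy data: the first item of each pair should be ranked higher.
pairs = [((1.0, 0.0), (0.0, 1.0)),
         ((2.0, 1.0), (1.0, 2.0)),
         ((0.5, 0.0), (0.0, 0.5))]

w = [0.0, 0.0]
initial_loss, _ = pairwise_loss_and_grad(w, pairs)
for _ in range(200):  # plain gradient descent
    _, grad = pairwise_loss_and_grad(w, pairs)
    w = [wi - 0.1 * gi for wi, gi in zip(w, grad)]
final_loss, _ = pairwise_loss_and_grad(w, pairs)
```

Only relative comparisons enter the loss, so no absolute reward value is ever needed.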
• Natural language embedding into
multidimensional space really helps learning
(humans ALWAYS learn language)
• Physics and bodies provide essential
consistency for understanding intelligence, and
facilitate transfer and continuous learning
• Solving many tasks helps: sometimes many
tasks are essential to learn at all [Learning more
things at once often helps performance in RL.
Intentional unintentional agents]
• Reporting failure cases is also important!
Take Home
Messages
[NdF]
• TD-learning is back & hot (dating from the first
game won by the TD-Gammon AI)*
• Only 1 reward at the end
• No feedback along the way
• New venue: Int'l conference on RL and
decision making
https://groups.google.com/forum/#!forum/rldm-list
* See unsupervised representation learning talk by R. Sutton and latest
DeepMind (Mnih’17 evolution of UNREAL)
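A minimal sketch of tabular TD(0) in the setting described above, where the only reward arrives at the end of the episode. The chain environment (always step right, reward 1 on reaching the terminal state) is a hypothetical toy chosen for illustration, not an example from the talk.

```python
def td0_chain(n_states=5, episodes=200, alpha=0.1, gamma=1.0):
    """Tabular TD(0) on a deterministic chain with a single terminal reward."""
    # The extra last entry is the terminal state; its value stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        state = 0
        while state < n_states:
            next_state = state + 1
            reward = 1.0 if next_state == n_states else 0.0
            # Bootstrapped target: reward plus discounted value of the next state.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[:n_states]


after_one_episode = td0_chain(episodes=1)
after_many_episodes = td0_chain(episodes=200)
```

After one episode only the state next to the goal has a non-zero value; with more episodes the estimate propagates backwards through bootstrapping, even though no intermediate feedback is ever given.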
Take Home
Messages
• Domain randomization: used to transfer
learning from simulation to real life without
domain adaptation (OpenAI, NVIDIA cube
pose estimation: distractors, different
backgrounds and lights; from virtual elements to real
images).
• Learning by demonstration and few-shot
learning: the most data-efficient learning
algorithms for semi-supervised learning
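The sampling side of domain randomization can be sketched as drawing fresh scene parameters for every simulated training image. All parameter names and ranges below are hypothetical and chosen only for illustration; they are not taken from the OpenAI or NVIDIA work.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    light_intensity: float
    background_id: int
    n_distractors: int
    camera_jitter: float


def sample_scene(rng):
    """Draw one randomized simulator configuration (ranges are assumptions)."""
    return SceneParams(
        light_intensity=rng.uniform(0.2, 2.0),
        background_id=rng.randrange(100),
        n_distractors=rng.randint(0, 10),
        camera_jitter=rng.uniform(-0.05, 0.05),
    )


rng = random.Random(0)
scenes = [sample_scene(rng) for _ in range(1000)]
```

Each sampled configuration would be rendered by the simulator to produce one training image, so the model sees enough visual variety that real images look like just another variation.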
Take Home
Messages
• Regularizing NNs by penalising confident
output distributions [Pereyra’17].
• Additional objectives (similar to UNREAL):
RL with Unsupervised Auxiliary Tasks
[Jaderberg’17]
• Generating grounded rewards
automatically [Littman, Topcu et al 17].
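The confidence penalty in the first bullet can be sketched as the usual negative log-likelihood minus a weighted entropy term, so that low-entropy (over-confident) predictions cost more. This is a minimal pure-Python illustration; the function names and the default `beta` are assumptions, not values from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confidence_penalized_loss(logits, target, beta=0.1):
    """Cross-entropy minus beta times the entropy of the predicted distribution."""
    probs = softmax(logits)
    nll = -math.log(probs[target])
    return nll - beta * entropy(probs)


uniform_loss = confidence_penalized_loss([0.0, 0.0, 0.0], target=0, beta=0.1)
plain_loss = confidence_penalized_loss([2.0, 0.5, -1.0], target=0, beta=0.0)
penalized_loss = confidence_penalized_loss([2.0, 0.5, -1.0], target=0, beta=0.1)
```

Setting `beta` to zero recovers plain cross-entropy, which makes the regularizer easy to switch off when comparing runs.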
Take Home Papers
*Reinforcement Learning with Unsupervised Auxiliary Tasks - Implementation: https://github.com/miyosuda/unreal
**Option: a generalisation of a single-step action that may span more than one timestep and can be used like a
standard action. We move to a policy mu over options o, selected with probability mu(s,o). We can derive a policy over options
Pi_omega that maximises the expected discounted (via regrets) sum of rewards.
•Two parallel DeepMind works: Relational Networks and Visual Interaction
Networks (philosophically similar works using abstract logic to reason
about the world).
•Dealing with sparse rewards:
•Reward shaping: Off-Policy Reward Shaping with Ensembles:
https://arxiv.org/abs/1502.03248 and Expressing Arbitrary Reward Functions
as Potential-Based Advice:
https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9893/9923
•http://papers.nips.cc/paper/6538-safe-and-efficient-off-policy-reinf
https://ai.vub.ac.be/sites/default/files/PID3130853.pdf
•Reinforcement Learning from Demonstration through Shaping
•Non-Markovian Rewards Expressed in LTL: Guiding Search Via
Reward Shaping. A. Camacho, et al. (RLDM), June 2017
•https://arxiv.org/pdf/1706.10295.pdf
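As a hedged illustration of the reward-shaping bullet, the sketch below uses the classic potential-based form, where the shaping term is gamma * Phi(s') - Phi(s). The corridor task, the distance-based potential and all names are hypothetical; the cited papers build more elaborate variants on this idea.

```python
def shaped_reward(reward, state, next_state, potential, gamma=0.99):
    """Environment reward plus the potential-based shaping term."""
    return reward + gamma * potential(next_state) - potential(state)


GOAL = 10  # hypothetical goal position in a 1-D corridor


def potential(state):
    """Higher (less negative) potential closer to the goal."""
    return -abs(GOAL - state)


trajectory = [0, 1, 2, 3, 2, 3, 4]
gamma = 0.9

# Discounted sum of shaping terms along the trajectory (environment reward is 0 here).
shaping_return = sum(
    (gamma ** t) * shaped_reward(0.0, s, s_next, potential, gamma)
    for t, (s, s_next) in enumerate(zip(trajectory, trajectory[1:]))
)

# The sum telescopes: it depends only on the first and last states.
steps = len(trajectory) - 1
telescoped = (gamma ** steps) * potential(trajectory[-1]) - potential(trajectory[0])
```

Because the shaping terms telescope, the extra signal densifies a sparse reward along the way without changing which policies are optimal.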
Take Home Papers
•GANS:
•Allan Ma (Guelph): state-of-the-art GAN implementation +
evaluation.
•GANs used to perform domain adaptation (useful
ideas for going from a simulated robot to a
real-world robot)
•LANGUAGE GROUNDING AND VISUAL/DIALOG
HYBRID SYSTEMS (Ideas for PARL.AI grant call):
End-to-end optimization of goal-driven and visually
grounded dialogue systems    
Take Home Papers
• Dex-Net Grasping dataset (10K 3D models to acquire force
closure grasps, for the ABB YuMi)
• ROS service for grasp planning. Dex-Net as a Service: Fall
2017. HTTP web API to create new databases with custom 3D
models and compute grasp robustness metrics.
• Google robot farm dataset: many robot arms for grasping,
pushing, etc. 800,000 grasp attempts (6-14 robotic
manipulators)
• Using Baxter:
• Pinto and Gupta Baxter dataset (40k grasping experiences).
CNNs predict lifting success or resistance to grasp perturbations
caused by an adversary*.
• Oberlin’15 Autonomously collecting object scans
Take Home Datasets
*Lerrel Pinto and Abhinav Gupta. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. In Proc. IEEE
Int. Conf. Robotics and Automation (ICRA), 2016.
Lerrel Pinto, James Davidson, and Abhinav Gupta. Supervision via competition: Robot adversaries for learning tasks. arXiv preprint
arXiv:1610.01685, 2016.
Food for thought
• Is AI = DL + RL? (Hado van Hasselt)
• Does the brain do backpropagation?
• Even if the brain is not doing back-propagation as
ANNs do, there is no mathematical handicap that
can prove otherwise
• CNNs and LSTMs: successful, ubiquitous AI
models inspired by the human brain
• :( Neuroscience is still far apart from the AI community
Keyword Summary
• GANS as data augmentation
(CycleGAN, BEGAN,…)
• Autoregressive models (PixelGAN)
• Embedding language and vision
representations
•End-to-end
•Self-supervision
•Learning by:
•Imitation*, cloning, demonstration and by predicting the
future (natural learning)
•One-shot learning
•Reward shaping and other myriad signals
•TD-learning
•Options framework
* E.g. Imitating Driver Behavior with Generative Adversarial Networks https://arxiv.org/pdf/1701.06699.pdf
Grants and competitions
• https://nips.cc/Conferences/2017/CompetitionTrack
Learning to run
Papers right out of the oven
[PDF] End-to-End Learning of Semantic Grasping
E Jang, S Vijaynarasimhan, P Pastor, J Ibarz, S Levine - arXiv preprint arXiv: …, 2017
Abstract: We consider the task of semantic robotic grasping, in which a robot picks up an
object of a user-specified class using only monocular images. Inspired by the two-stream
hypothesis of visual reasoning, we present a semantic grasping framework that learns object
[PDF] Imitation from Observation: Learning to Imitate Behaviors from Raw Video via
Context Translation
YX Liu, A Gupta, P Abbeel, S Levine - arXiv preprint arXiv:1707.03374, 2017
Abstract: Imitation learning is an effective approach for autonomous systems to acquire
control policies when an explicit reward function is unavailable, using supervision provided
as demonstrations from an expert, typically a human operator. However, standard imitation
[PDF] Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End
Learning from Demonstration
R Rahmatizadeh, P Abolghasemi, L Bölöni, S Levine - arXiv preprint arXiv: …, 2017
Papers right out of the oven
Limitations:
• Requires a substantial number of demonstrations to learn the
translation model.
• Requires observations of demonstrations from multiple
contexts in order to learn to translate between them.
Insights:
• Training an end-to-end model from scratch for each task may
be inefficient in practice
• Combining our method with higher level representations
proposed in prior work would likely lead to more efficient
training (Sermanet et al., 2017).
• Challenge: Domain shift: combine multiple tasks from different
contexts into a single model
Papers right out of the oven
• REINFORCEMENT LEARNING WITH
UNSUPERVISED AUXILIARY TASKS
(UNREAL and extension Mnih17)
• Auxiliary control and reward prediction
tasks in deep RL double data efficiency
& robustness to hyperparameter settings.
• Successor to A3C in learning speed and
robustness (over 87% of human scores)
• Slides
• TensorFlow Session
• Github Project Tutorial
• TensorFlow Installation Notes
• Theano Session Tutorial
RESOURCES
Thank you!
natalia.diaz@ensta-paristech.fr
@NataliaDiazRodr
www.linkedin.com/in/nataliadr
Appendix
AI safety
Using relational properties in our priors?
•Neural-symbolic (Knowledge Graph) learning
and reasoning
Relational Networks (Santoro’17) and Visual Interaction Networks (Watters’17)
Philosophically similar models using abstract logic to reason about the world
Interpreting unsupervised representations
•Understanding intermediate layers using linear
classifier probes. Alain and Bengio’16
https://arxiv.org/pdf/1610.01644.pdf
•Explaining the Unexplained: A CLass-Enhanced
Attentive Response (CLEAR) Approach to
Understanding Deep Neural Networks, Kumar et
al.’17. https://arxiv.org/pdf/1704.04133.pdf
MILA DL & RL summer school highlights

More Related Content

What's hot

Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)
Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)
Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)
Márton Miháltz
 
Creative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadCreative AI & multimodality: looking ahead
Creative AI & multimodality: looking ahead
Roelof Pieters
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas Mikolov
Bhaskar Mitra
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
BigDataCloud
 
Deep Learning for NLP Applications
Deep Learning for NLP ApplicationsDeep Learning for NLP Applications
Deep Learning for NLP Applications
Samiur Rahman
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
Roelof Pieters
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
Sujit Pal
 
Deep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItDeep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do It
Holberton School
 
What Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceWhat Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial Intelligence
Jonathan Mugan
 
Deeplearning NLP
Deeplearning NLPDeeplearning NLP
Deeplearning NLP
Francesco Gadaleta
 
Deep Learning for Information Retrieval
Deep Learning for Information RetrievalDeep Learning for Information Retrieval
Deep Learning for Information Retrieval
Roelof Pieters
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?
NAVER Engineering
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
Anuj Gupta
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
Roelof Pieters
 
Day 2 (Lecture 1): Introduction to Statistical Machine Learning and Applications
Day 2 (Lecture 1): Introduction to Statistical Machine Learning and ApplicationsDay 2 (Lecture 1): Introduction to Statistical Machine Learning and Applications
Day 2 (Lecture 1): Introduction to Statistical Machine Learning and Applications
Aseda Owusua Addai-Deseh
 
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
NAVER Engineering
 
The How and Why of Feature Engineering
The How and Why of Feature EngineeringThe How and Why of Feature Engineering
The How and Why of Feature Engineering
Alice Zheng
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
Akshay Hegde
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Saurabh Kaushik
 
Toward Continual Learning on the Edge
Toward Continual Learning on the EdgeToward Continual Learning on the Edge
Toward Continual Learning on the Edge
Vincenzo Lomonaco
 

What's hot (20)

Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)
Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)
Deep Learning Architectures for NLP (Hungarian NLP Meetup 2016-09-07)
 
Creative AI & multimodality: looking ahead
Creative AI & multimodality: looking aheadCreative AI & multimodality: looking ahead
Creative AI & multimodality: looking ahead
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas Mikolov
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
 
Deep Learning for NLP Applications
Deep Learning for NLP ApplicationsDeep Learning for NLP Applications
Deep Learning for NLP Applications
 
Multi modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed modelsMulti modal retrieval and generation with deep distributed models
Multi modal retrieval and generation with deep distributed models
 
Deep Learning Models for Question Answering
Deep Learning Models for Question AnsweringDeep Learning Models for Question Answering
Deep Learning Models for Question Answering
 
Deep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItDeep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do It
 
What Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial IntelligenceWhat Deep Learning Means for Artificial Intelligence
What Deep Learning Means for Artificial Intelligence
 
Deeplearning NLP
Deeplearning NLPDeeplearning NLP
Deeplearning NLP
 
Deep Learning for Information Retrieval
Deep Learning for Information RetrievalDeep Learning for Information Retrieval
Deep Learning for Information Retrieval
 
Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?Deep Learning, Where Are You Going?
Deep Learning, Where Are You Going?
 
Building Continuous Learning Systems
Building Continuous Learning SystemsBuilding Continuous Learning Systems
Building Continuous Learning Systems
 
Deep learning for natural language embeddings
Deep learning for natural language embeddingsDeep learning for natural language embeddings
Deep learning for natural language embeddings
 
Day 2 (Lecture 1): Introduction to Statistical Machine Learning and Applications
Day 2 (Lecture 1): Introduction to Statistical Machine Learning and ApplicationsDay 2 (Lecture 1): Introduction to Statistical Machine Learning and Applications
Day 2 (Lecture 1): Introduction to Statistical Machine Learning and Applications
 
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
[GAN by Hung-yi Lee]Part 2: The application of GAN to speech and text processing
 
The How and Why of Feature Engineering
The How and Why of Feature EngineeringThe How and Why of Feature Engineering
The How and Why of Feature Engineering
 
Deep Learning - A Literature survey
Deep Learning - A Literature surveyDeep Learning - A Literature survey
Deep Learning - A Literature survey
 
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2 Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
 
Toward Continual Learning on the Edge
Toward Continual Learning on the EdgeToward Continual Learning on the Edge
Toward Continual Learning on the Edge
 

Similar to MILA DL & RL summer school highlights

Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
Charmi Chokshi
 
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
gabrielesisinna
 
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
Alexander Borzunov
 
Open ai openpower
Open ai openpowerOpen ai openpower
Open ai openpower
Ganesan Narayanasamy
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
Roelof Pieters
 
Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.
Fernando Constantino
 
Deep Learning with CNTK
Deep Learning with CNTKDeep Learning with CNTK
Deep Learning with CNTK
Ashish Jaiman
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye view
Roelof Pieters
 
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Impetus Technologies
 
Deep Learning Jump Start
Deep Learning Jump StartDeep Learning Jump Start
Deep Learning Jump Start
Michele Toni
 
Data Science Accelerator Program
Data Science Accelerator ProgramData Science Accelerator Program
Data Science Accelerator Program
GoDataDriven
 
MLIP - Chapter 3 - Introduction to deep learning
MLIP - Chapter 3 - Introduction to deep learningMLIP - Chapter 3 - Introduction to deep learning
MLIP - Chapter 3 - Introduction to deep learning
Charles Deledalle
 
#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation
parlamind
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Databricks
 
Laird ibm-small
Laird ibm-smallLaird ibm-small
Laird ibm-small
diannepatricia
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
yang947066
 
Successes and Frontiers of Deep Learning
Successes and Frontiers of Deep LearningSuccesses and Frontiers of Deep Learning
Successes and Frontiers of Deep Learning
Sebastian Ruder
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Abhishek Bhandwaldar
 
Visual concept learning
Visual concept learningVisual concept learning
Visual concept learning
Vaibhav Singh
 
Learn Real World Machine Learning By Building Projects
Learn Real World Machine Learning By Building ProjectsLearn Real World Machine Learning By Building Projects
Learn Real World Machine Learning By Building Projects
John Alex
 

Similar to MILA DL & RL summer school highlights (20)

Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
 
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)How to do science in a large IT company (ICPC World Finals 2021, Moscow)
How to do science in a large IT company (ICPC World Finals 2021, Moscow)
 
Open ai openpower
Open ai openpowerOpen ai openpower
Open ai openpower
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.Transfer Learning: Breve introducción a modelos pre-entrenados.
Transfer Learning: Breve introducción a modelos pre-entrenados.
 
Deep Learning with CNTK
Deep Learning with CNTKDeep Learning with CNTK
Deep Learning with CNTK
 
Deep Learning: a birds eye view
Deep Learning: a birds eye viewDeep Learning: a birds eye view
Deep Learning: a birds eye view
 
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
Deep Learning: Evolution of ML from Statistical to Brain-like Computing- Data...
 
Deep Learning Jump Start
Deep Learning Jump StartDeep Learning Jump Start
Deep Learning Jump Start
 
Data Science Accelerator Program
Data Science Accelerator ProgramData Science Accelerator Program
Data Science Accelerator Program
 
MLIP - Chapter 3 - Introduction to deep learning
MLIP - Chapter 3 - Introduction to deep learningMLIP - Chapter 3 - Introduction to deep learning
MLIP - Chapter 3 - Introduction to deep learning
 
#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation#1 Berlin Students in AI, Machine Learning & NLP presentation
#1 Berlin Students in AI, Machine Learning & NLP presentation
 
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and PandasDistributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
Distributed Models Over Distributed Data with MLflow, Pyspark, and Pandas
 
Laird ibm-small
Laird ibm-smallLaird ibm-small
Laird ibm-small
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
Successes and Frontiers of Deep Learning
Successes and Frontiers of Deep LearningSuccesses and Frontiers of Deep Learning
Successes and Frontiers of Deep Learning
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Visual concept learning
Visual concept learningVisual concept learning
Visual concept learning
 
Learn Real World Machine Learning By Building Projects
Learn Real World Machine Learning By Building ProjectsLearn Real World Machine Learning By Building Projects
Learn Real World Machine Learning By Building Projects
 

More from Natalia Díaz Rodríguez

State representation learning for control: an overview
State representation learning for control: an overview State representation learning for control: an overview
State representation learning for control: an overview
Natalia Díaz Rodríguez
 
Continual learning and robotics
Continual learning and robotics   Continual learning and robotics
Continual learning and robotics
Natalia Díaz Rodríguez
 
PAISS (PRAIRIE AI Summer School) Digest July 2018
PAISS (PRAIRIE AI Summer School) Digest July 2018 PAISS (PRAIRIE AI Summer School) Digest July 2018
PAISS (PRAIRIE AI Summer School) Digest July 2018
Natalia Díaz Rodríguez
 
State Representation Learning for control: an overview
State Representation Learning for control: an overviewState Representation Learning for control: an overview
State Representation Learning for control: an overview
Natalia Díaz Rodríguez
 
A Folksonomy of styles, aka: other stylists also said and Subjective Influenc...
A Folksonomy of styles, aka: other stylists also said and Subjective Influenc...A Folksonomy of styles, aka: other stylists also said and Subjective Influenc...
A Folksonomy of styles, aka: other stylists also said and Subjective Influenc...
Natalia Díaz Rodríguez
 
How to write systematic literature reviews (ideally, your first PhD paper)
How to write systematic literature reviews (ideally, your first PhD paper)How to write systematic literature reviews (ideally, your first PhD paper)
How to write systematic literature reviews (ideally, your first PhD paper)
Natalia Díaz Rodríguez
 
Semantic security framework and context-aware role-based access control ontol...
Semantic security framework and context-aware role-based access control ontol...Semantic security framework and context-aware role-based access control ontol...
Semantic security framework and context-aware role-based access control ontol...
Natalia Díaz Rodríguez
 
An Ontology for Wearables Data Interoperability and Ambient Assisted Living A...
An Ontology for Wearables Data Interoperability and Ambient Assisted Living A...An Ontology for Wearables Data Interoperability and Ambient Assisted Living A...
An Ontology for Wearables Data Interoperability and Ambient Assisted Living A...
Natalia Díaz Rodríguez
 
Guest lecture @Stanford Aug 4th 2015
Guest lecture @Stanford Aug 4th 2015 Guest lecture @Stanford Aug 4th 2015
Guest lecture @Stanford Aug 4th 2015
Natalia Díaz Rodríguez
 
PhD Defense Natalia Díaz Rodríguez
PhD Defense Natalia Díaz RodríguezPhD Defense Natalia Díaz Rodríguez
PhD Defense Natalia Díaz Rodríguez
Natalia Díaz Rodríguez
 
Smart Dosing: A mobile application for tracking the medication tray-filling a...
Smart Dosing: A mobile application for tracking the medication tray-filling a...Smart Dosing: A mobile application for tracking the medication tray-filling a...
Smart Dosing: A mobile application for tracking the medication tray-filling a...
Natalia Díaz Rodríguez
 
UCAmI Presentation Dec.2013, Guanacaste, Costa Rica
UCAmI Presentation Dec.2013, Guanacaste, Costa RicaUCAmI Presentation Dec.2013, Guanacaste, Costa Rica
UCAmI Presentation Dec.2013, Guanacaste, Costa Rica
Natalia Díaz Rodríguez
 
IFSA World Congress -NAFIPS 2013 Edmonton, Alberta. Natalia Díaz
IFSA World Congress -NAFIPS 2013 Edmonton, Alberta. Natalia DíazIFSA World Congress -NAFIPS 2013 Edmonton, Alberta. Natalia Díaz
IFSA World Congress -NAFIPS 2013 Edmonton, Alberta. Natalia Díaz
Natalia Díaz Rodríguez
 
Extending Semantic Web Tools for Improving Smart Spaces Interoperability and ...
Extending Semantic Web Tools for Improving Smart Spaces Interoperability and ...Extending Semantic Web Tools for Improving Smart Spaces Interoperability and ...
Extending Semantic Web Tools for Improving Smart Spaces Interoperability and ...Natalia Díaz Rodríguez
 
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...
A Framework for Context-aware applications for Smart Spaces. ruSmart 2011 St ...Natalia Díaz Rodríguez
 

MILA DL & RL summer school highlights

  • 1. Deep Learning & Reinforcement Learning MILA Summer School Highlights. Natalia Díaz Rodríguez, PhD. 26th June-5th July 2017, Montreal, Quebec
  • 2. Learning to Learn - Nando de Freitas
      • What is our intrinsic motivation for being here? Learning, and the satisfaction of gaining knowledge
      • From the Bengio brothers ('92) to github.com/deepmind/learning-to-learn
      • 1 single network: optimiser & optimizee
      • Generalize: learning to learn X by doing Y (unsupervised by supervised learning)
  • 12. Does not scale to large numbers of actions
  • 16. Automatic differentiation: the new trend across all DL frameworks
      • Matt Johnson's great tutorial on Automatic Differentiation
      • IDEA: checkpointing and less config boilerplate code
      • Becoming standard:
          • TensorFlow eager
          • PyTorch taping
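The "taping" idea mentioned on this slide can be illustrated with a toy reverse-mode differentiator. This is a minimal sketch and not how TensorFlow eager or PyTorch are implemented: the `Var` class and `backward` function are hypothetical names, and only scalar addition and multiplication are supported.

```python
class Var:
    """A scalar that remembers how it was computed, so gradients can flow back."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every recorded node."""
    order, seen = [], set()

    def visit(node):
        # Post-order DFS: parents are appended before the nodes that use them.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk the record in reverse so each node is complete before its parents.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


x, y = Var(3.0), Var(4.0)
z = x * y + x  # forward pass records the graph as a side effect
backward(z)    # reverse pass: dz/dx = y + 1, dz/dy = x
```

The forward computation is ordinary Python, which is the appeal of the eager/taping style: there is no separate graph-definition step to configure.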
  • 17. Graphical models and DL: a powerful combination - Matt Johnson (Google)
  • 19. GANs state-of-the-art
      • Applications: image generation, attribute morphing, image inpainting…
      • State-of-the-art:
          • BEGAN*, CycleGAN (draw a bag and find a real one)
          • Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks, Bousmalis'16 (unsupervised GAN-based architecture able to learn a transformation without using corresponding pairs from the two domains; code to appear, CVPR17).
          • The best state-of-the-art approach, improving on:
              • Decoupling from the task-specific architecture
              • Generalization across label spaces
              • Achieving training stability
              • Data augmentation
      * BEGAN: fast and stable; a new boundary equilibrium enforcing method paired with a loss derived from the Wasserstein distance for training auto-encoder-based GANs
      Figure: CycleGAN
  • 21. kNN is still one of the most frequently used quantitative measures for unsupervised evaluation (Bousmalis'16)
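One common way to use kNN as an evaluation measure, assumed here since the slide does not give the protocol, is to classify held-out feature vectors by majority vote among their nearest labelled neighbours and report the accuracy. The function names and the toy 2-D "embeddings" below are illustrative only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def knn_accuracy(train_points, train_labels, test_points, test_labels, k=3):
    """Fraction of test points whose kNN prediction matches the true label."""
    hits = sum(
        knn_predict(train_points, train_labels, point, k) == label
        for point, label in zip(test_points, test_labels)
    )
    return hits / len(test_points)


# Toy 2-D feature vectors standing in for learned representations.
train_points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_labels = ["a", "a", "a", "b", "b", "b"]
test_points = [(0.5, 0.5), (5.5, 5.5)]
test_labels = ["a", "b"]

accuracy = knn_accuracy(train_points, train_labels, test_points, test_labels)
```

A high kNN accuracy suggests that points of the same class sit close together in the feature space, without training any extra classifier on top.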
  • 22. GANs help semi-supervised and unsupervised learning, as well as domain randomisation
  • 23. GANs mode collapse: inability to generate a varied distribution of data
      • CVAE-GAN: fine-grained category image generation. CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao'17
  • 24. CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17
  • 25. One/Few-shot learning
      • Extending siamese networks with one-shot learning: Siamese Neural Networks for One-shot Image Recognition. See also: One Shot Learning with Siamese Networks in PyTorch (Harshvardhan Gupta, Medium).
      • Black-Box Data-efficient Policy Search for Robotics, Mouret'17 (Gaussian process regression for policy optimisation using model-based policy search). 5 episodes are enough to learn the whole dynamics of the arm from scratch.
  • 27.
      • If you can't predict reward, predict a relative ordering rank (same vs different)
      • Siamese network: optimize all rankings simultaneously
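A minimal sketch of learning from relative orderings rather than absolute rewards. It assumes a linear scoring function whose weights are shared across both items of a pair (the siamese part) and a logistic pairwise loss summed over all pairs. These choices, the function names and the toy data are illustrative assumptions, not the method shown in the talk.

```python
import math


def score(w, x):
    """Shared scorer applied to both items of every pair."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_ranker(pairs, dim, lr=0.1, epochs=200):
    """Gradient descent on the sum over pairs of -log(sigmoid(s_better - s_worse))."""
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for better, worse in pairs:
            margin = score(w, better) - score(w, worse)
            # Derivative of -log(sigmoid(margin)) w.r.t. margin is -1 / (1 + e^margin).
            coeff = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                grad[i] += coeff * (better[i] - worse[i])
        # One update uses the gradient of all pairwise comparisons at once.
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


# Items listed from worst to best; only the ordering is used for training.
items = [(1.0, 0.0), (2.0, 1.0), (3.0, 0.5), (4.0, 2.0)]
pairs = [(items[j], items[i])
         for i in range(len(items)) for j in range(i + 1, len(items))]

weights = train_ranker(pairs, dim=2)
scores = [score(weights, x) for x in items]
```

No numeric reward is ever supplied: the scorer only sees which item of each pair is preferred, yet the learned scores recover the full ordering.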
  • 28. Take Home Messages [NdF]
      • Natural language embedding into multidimensional space really helps learning (humans ALWAYS learn language)
      • Physics and bodies provide essential consistency for understanding intelligence, and facilitate transfer and continuous learning
      • Solving many tasks helps: sometimes many tasks are essential to learn at all [Learning more things at once often helps performance in RL. Intentional unintentional agents]
      • Reporting failure cases is also important!
  • 31. Take Home Messages
      • TD-learning is back and hot (ever since the first game won by the TD-Gammon AI)*
          • Only 1 reward at the end
          • No feedback along the way
      • New venue: Int'l conference on RL and decision making, https://groups.google.com/forum/#!forum/rldm-list
      * See the unsupervised representation learning talk by R. Sutton and the latest DeepMind work (Mnih'17, an evolution of UNREAL)
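The setting on this slide (a single reward at the end, no feedback along the way) can be sketched with tabular TD(0) on a toy chain. The environment, the function name `td0_chain` and the hyperparameters are assumptions chosen for illustration: the agent always steps right and receives a reward of 1 only on reaching the terminal state.

```python
def td0_chain(n_states=5, episodes=200, alpha=0.5, gamma=1.0):
    """Tabular TD(0) value estimates for a deterministic left-to-right chain."""
    # One extra slot for the terminal state, whose value stays 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        state = 0
        while state < n_states:
            next_state = state + 1
            # The only non-zero reward arrives on the final transition.
            reward = 1.0 if next_state == n_states else 0.0
            # TD error: bootstrapped target minus current estimate.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[:n_states]


after_one_episode = td0_chain(episodes=1)
after_many_episodes = td0_chain(episodes=200)
```

After a single episode only the state next to the goal has a non-zero estimate. With more episodes the terminal reward propagates backwards one bootstrapped step at a time, which is how TD methods cope with delayed feedback.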
  • 32. Take Home Messages
      • Domain randomization: use it to transfer from simulation to real-life learning without domain adaptation (OpenAI, NVIDIA cube pose estimation: distractors and different backgrounds, lights, virtual elements to real images).
      • Learning by demonstration and few-shot learning: the most data-efficient learning algorithms for semi-supervised learning
  • 33. Take Home Papers
      • Regularizing NNs by penalising confident output distributions [Pereyra 17].
      • Additional objectives (similar to UNREAL): RL with Unsupervised Auxiliary Tasks [Jaderberg'17]
      • Generating grounded rewards automatically [Littman, Topcu et al 17].
      * Reinforcement Learning with Unsupervised Auxiliary Tasks - Implementation: https://github.com/miyosuda/unreal
      ** Option: a generalisation of a single-step action that may span more than 1 timestep and can be used as a standard action. A policy mu over options selects option o in state s with probability mu(s,o). We can derive a policy over options Pi_omega that maximises the expected discounted (via regrets) sum of rewards.
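The confidence penalty from the first paper above can be sketched for a single example as the usual negative log-likelihood minus `beta` times the entropy of the predicted distribution, so that over-confident (low-entropy) outputs are discouraged. The function names and the value of `beta` are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_penalty_loss(logits, target, beta=0.1):
    """Negative log-likelihood minus beta times the output entropy."""
    probs = softmax(logits)
    nll = -math.log(probs[target])
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return nll - beta * entropy


uniform_loss = confidence_penalty_loss([0.0, 0.0, 0.0], target=0)
plain_nll = confidence_penalty_loss([2.0, 0.5, 0.1], target=0, beta=0.0)
penalised = confidence_penalty_loss([2.0, 0.5, 0.1], target=0, beta=0.1)
```

For the same logits, a distribution with higher entropy earns a larger bonus, so the optimiser is nudged away from putting all probability mass on one class.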
  • 34. Take Home Papers
      • DeepMind, 2 parallel works: Relational Networks and Visual Interaction Networks (philosophically similar works using abstract logic to reason about the world).
      • Dealing with sparse rewards:
          • Reward shaping: Off-Policy Reward Shaping with Ensembles: https://arxiv.org/abs/1502.03248 and Expressing Arbitrary Reward Functions as Potential-Based Advice: https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9893/9923
          • http://papers.nips.cc/paper/6538-safe-and-efficient-off-policy-reinf
          • https://ai.vub.ac.be/sites/default/files/PID3130853.pdf
          • Reinforcement Learning from Demonstration through Shaping
          • Non-Markovian Rewards Expressed in LTL: Guiding Search Via Reward Shaping. A. Camacho, et al. (RLDM), June 2017
          • https://arxiv.org/pdf/1706.10295.pdf
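As background for the reward-shaping papers above, here is a minimal sketch of classic potential-based shaping, where the shaping term is gamma * potential(s') - potential(s). The 1-D task, the `potential` function and the trajectory are hypothetical and chosen only to show the telescoping property.

```python
def shaped_reward(reward, state, next_state, potential, gamma=1.0):
    """Environment reward plus the potential-based shaping term."""
    return reward + gamma * potential(next_state) - potential(state)


GOAL = 10


def potential(state):
    """Hypothetical potential: negative distance to the goal on a 1-D line."""
    return -abs(GOAL - state)


# A sparse-reward trajectory (all environment rewards are 0) with a detour.
trajectory = [0, 1, 2, 1, 2, 3]
shaped = [
    shaped_reward(0.0, s, s_next, potential)
    for s, s_next in zip(trajectory, trajectory[1:])
]
total_shaping = sum(shaped)
```

Steps towards the goal receive +1 and steps away receive -1, giving dense feedback where the environment gives none. With gamma = 1 the shaping terms telescope to potential(last state) - potential(first state), so detours cannot accumulate extra shaped reward.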
  • 35. Take Home Papers
      • GANs:
          • Allan Ma (Guelph): state-of-the-art GAN implementation + evaluation.
          • GANs used to perform domain adaptation (useful ideas for going from robot simulation to the real-world robot)
      • LANGUAGE GROUNDING AND VISUAL/DIALOG HYBRID SYSTEMS (ideas for PARL.AI grant call): End-to-end optimization of goal-driven and visually grounded dialogue systems
  • 36. Take Home Datasets
      • Dex-Net grasping dataset (10K 3D models to acquire force closure grasps, for the ABB YuMi)
      • ROS service for grasp planning. Dex-Net as a Service: Fall 2017. HTTP web API to create new databases with custom 3D models and compute grasp robustness metrics.
      • Google robot farm dataset: many robot arms for grasping, pushing, etc. 800,000 grasp attempts (6-14 robotic manipulators)
      • Using Baxter:
          • Pinto and Gupta Baxter dataset (40k grasping experiences). CNNs predict lifting successes or resistance to grasp perturbations caused by an adversary*.
          • Oberlin'15: autonomously collecting object scans
      * Lerrel Pinto and Abhinav Gupta. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. In Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2016.
      * Lerrel Pinto, James Davidson, and Abhinav Gupta. Supervision via competition: Robot adversaries for learning tasks. arXiv preprint arXiv:1610.01685, 2016.
  • 38. Food for thought
      • Is AI = DL + RL? (Hado van Hasselt)
      • Does the brain do backpropagation?
          • Even if the brain is not doing back-propagation as ANNs do, there is no mathematical handicap that can prove otherwise
      • CNNs and LSTMs: successful, ubiquitous AI models inspired by the human brain
      • :( Neuroscience is still far apart from the AI community
  • 39. Keyword Summary
      • GANs as data augmentation (CycleGAN, BEGAN,…)
      • Autoregressive models (PixelGAN)
      • Embedding language and vision representations
  • 40. Keyword Summary
      • End-to-end
      • Self-supervision
      • Learning by:
          • Imitation*, cloning, demonstration and by predicting the future (natural learning)
          • One-shot learning
          • Reward shaping and other myriad signals
      • TD-learning
      • Options framework
      * E.g. Imitating Driver Behavior with Generative Adversarial Networks https://arxiv.org/pdf/1701.06699.pdf
  • 41. Grants and competitions
      • https://nips.cc/Conferences/2017/CompetitionTrack (Learning to run)
  • 42. Papers right out of the oven
      • [PDF] End-to-End Learning of Semantic Grasping. E Jang, S Vijaynarasimhan, P Pastor, J Ibarz, S Levine - arXiv preprint arXiv: …, 2017. Abstract: We consider the task of semantic robotic grasping, in which a robot picks up an object of a user-specified class using only monocular images. Inspired by the two-stream hypothesis of visual reasoning, we present a semantic grasping framework that learns object …
      • [PDF] Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation. YX Liu, A Gupta, P Abbeel, S Levine - arXiv preprint arXiv:1707.03374, 2017. Abstract: Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator. However, standard imitation …
      • [PDF] Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration. R Rahmatizadeh, P Abolghasemi, L Bölöni, S Levine - arXiv preprint arXiv: …, 2017
  • 43. Papers right out of the oven
  • 45. Papers right out of the oven
      Limitations:
      • Requires a substantial number of demonstrations to learn the translation model.
      • Requires observations of demonstrations from multiple contexts in order to learn to translate between them.
      Insights:
      • Training an end-to-end model from scratch for each task may be inefficient in practice
      • Combining our method with higher level representations proposed in prior work would likely lead to more efficient training (Sermanet et al., 2017).
      • Challenge: domain shift: combine multiple tasks from different contexts into a single model
  • 46. Papers right out of the oven
  • 47. Papers right out of the oven
      • REINFORCEMENT LEARNING WITH UNSUPERVISED AUXILIARY TASKS (UNREAL and its extension, Mnih'17)
      • Auxiliary control and reward prediction tasks in deep RL double data efficiency and robustness to hyperparameter settings.
      • Successor to A3C in learning speed and robustness (over 87% of human scores)
  • 48. RESOURCES
      • Slides
      • TensorFlow Session
      • Github Project Tutorial
      • TensorFlow Installation Notes
      • Theano Session Tutorial
  • 62. Using relational properties in our priors?
      • Neural-symbolic (Knowledge Graph) learning and reasoning
      • Relational Networks (Santoro'17) and Visual Interaction Networks (Watters'17): philosophically similar models using abstract logic to reason about the world
  • 63. Interpreting unsupervised representations
      • Understanding intermediate layers using linear classifier probes. Alain and Bengio'16. https://arxiv.org/pdf/1610.01644.pdf
      • Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks, Kumar et al 17. https://arxiv.org/pdf/1704.04133.pdf