SlideShare a Scribd company logo
1 of 33
Download to read offline
Making robots learn
for the real world
Author : Tomi Silander
Team : Morgan Funtowicz, Arnaud Sors, Julien Perez & NAVER LABS robotics team
CONTENT
1. Introducing Reality Gap
2. Learning in Robotics
2.1 Other than Reinforcement Learning
2.2 Unsupervised and Self-supervised Learning
2.3 Reinforcement Learning
2.4 Simulations
3. Active Localization
4. Summary
Superhuman AI – Alpha Go (Zero)
“Google’s AI AlphaGo Is Beating Humanity At Its Own Games”
VIDEO HERE
20 million self-play games
200 000 games per day
Superhuman AI – Dota2
AI bots trained for 180 years
a day to beat humans at
Dota-2
Robot Soccer – subhuman AI
Soccer over chess as a paradigmatic robotic task (Sahota et al. AI-94)
Thus a GAP
and its been widening lately, but why?
Computational power has made it possible to scale RL i.e.,
to make the software agents able to learn end-to-end by trial
and error in simulations that can be run fast and in parallel
producing huge amounts of training data.
CONTENT
1. Introducing Reality Gap
2. Learning in Robotics
2.1 Other than Reinforcement Learning
2.2 Unsupervised and Self-supervised Learning
2.3 Reinforcement Learning
2.4 Simulations
3. Active Localization
4. Summary
Should we make robots learn?
First Conference on Robot Learning 2017, keynote by
Rodney Brooks:
1. Well, maybe – just for fun, to see if that is possible
2. To put correct pressure to machine learning
methods - my favorite argument
3. To make robots more practical – maybe
So if the Godfather of robotics is not too enthusiastic,
we might also want to be skeptical
Learning in robotics is a hot topic
1. Conference on Robot Learning 2018 (CORL)
2. Robotics: Science and Systems 2018 (RSS)
- learning to grasp
- learning to localize
- transferring from simulations to real life
3. International Conference on Robotics and Automation 2018
- lot of deep learning
4. International conference on Intelligent Robots and Systems
- last week in Madrid
Reinforcement learning
Target is to learn a behavioural policy:
- “In this situation S it is usually best to do
action A in order to accomplish the task”
- accomplishing the task is signaled giving
agent a reward for achieving the goal.
The agent learns to optimize it behaviour to
get maximal reward
- all this can be formalized using decision
processes.
Improving behaviour using evaluative feedback via trial and error.
Problems when using RL in real robots
1. Trying all kind of things is dangerous.
- it breaks robots and robot lab.
2. Real robot cannot be run in hyper-speed
- so it will take “forever” before the robot
learns.
3. Many tasks are not game–like
- so robots can not play against each other.
4. Reward signals hard to give.
- this often requires human interaction.
From Yann LeCun How Could Machines Learn as Efficeintly
as Animals and Humans?
Before we jump on the RL bandwagon
other type of robotic learning
(not by trial and error)
Instruction
Josh Tenenbaum in his ICML keynote 2018:
“after 18 months, human children mostly learn via language”
Should we make robots understand us so we can just tell them what to do?
- but Tenenbaum talked about instruction from human to human.
- by 18 months old, children have enough “common ground” for instruction by
“being told”.
Thomas Nagel in The Philosophical Review 1974:
- “What is it like to be a bat?”
- robots “lifeworld” is so different that language communication will be difficult.
- many competencies (e.g., grasping) are also difficult to “explain”.
Imitation learning: mimic the expert
One of the favorite methods in robotics, because of its safety and efficiency!
How:
- supervised learning from state to actions (lot of algorithmic variants)
Why not:
- expert cannot demonstrate behaviour on all possible situations
- so what to do in new situations?
Why traditionally a favorite method:
- robots have been used in controlled environments.
If surprise then HALT!
Playing around: trial, no error
Playing around, recording what happened
From playing to goal oriented actions
Collecting ("#, %#, "#&') allows
us to train the
regression/classification
model:
)("#, "#&') = %#.
If I am in situation "#, and I
want to get to situation "#&',
I should do action %#.
Self-supervised learning
Learning general tasks that are useful
for many kind of other tasks:
- grasping, obstacle avoidance, etc.
Curiosity (Pathak et al. ICML 2017):
- trying things consequences of
which we cannot yet predict
- learning to explore efficiently
So if we really jump on RL bandwagon
Thinking, Fast and Slow
System 1:
Fast, automatic, frequent,
unconscious
- see if an object is further than
another
- localize the source of a sound
- display disgust when seeing a
gruesome image
- drive a car on an empty road
System 2:
Slow, infrequent, logical, conscious
- look out for a woman with the gray hear
- dig into your memory to recognize a
song
- determine the appropriateness of a
behavior in a social context
- count a number of A’s in a certain text
Similar dichotomy in RL: System 1 = model-free RL, System 2 = model-based RL
Model-free RL for thinking fast
Learn the gut-feeling:
- search in policy space to learn !(#) = &(#;().
- after fitting parameters (∗, just do &(#;(∗).
- can use expert examples
Suitable when task is “simple”
Deep Q-learning (DQN):
- learn to estimate expected reward of
doing action a in situation s, i.e. *(#,!).
- in each s just pick !,-.!/
0
*(#,!).
Model-based RL for thinking slow
Learn the model of the world, i.e. consequences of your actions:
- a “forward model” ! ",$,"%
, i.e.,
- the probability that action a changes the situation s to s’.
Same model can be used for many tasks.
Schmidhuber (2018):
- “World Models” => policies are simple
Dreaming:
- forward models can be used to simulate the world.
- in hyper-speed, thus allowing methods that require big data.
- not realistic, but it’s the versatility that matters (S. Levine).
Our adversarially trained forward model
Combines adversarial training and ideas from Info GANs
Model based or model-free RL?
NB! People have both System 1 and System 2
- rarely done in current RL, but maybe a key to robotics, since
versus
Example of closing the gap
Gilhyun Ryou, Youngwoo Sim, Seong Ho Yeon and Sangok Seok (ICRA 2018)
Applying Asynchronous Deep Classification Networks and Gaming Reinforcement Learning-Based Motion Planners to Mobile Robots
Cognitive architecture for a robot
Processes working asynchronously at
different speeds:
- fast for motor control
- slower for object classification
- connected by a central representation
that can also be trained via simulation
Thus enter the simulations
Huge bulk of current “robotic” RL is
conducted simulations (it’s faster, safer)
- and then, sometimes(!),
tried to transfer to a real robot
- maybe with some domain adaptation
Works OK for
- simple spaces and controlled
environments where similarity of
simulation and reality is easier to
establish
CONTENT
1. Introducing Reality Gap
2. Learning in Robotics
2.1 Other than Reinforcement Learning
2.2 Unsupervised and Self-supervised Learning
2.3 Reinforcement Learning
2.4 Simulations
3. Active Localization
4. Summary
Active Neural Localization
Perception is active, information
seeking action (Gibson 1976)
If you are lost, try to look around
D.S. Chaplot et al. (NIPS 2017)
- grid presentations
- discrete angles
- noiseless operations
- noiseless directions
- noiseless images
In active localization project we had to
1. Use particle filters instead of grid
2. Continuous, noisy directions
3. Noisy moves
4. Uncertainty in observations
After all that, yes, an active policy (still)
performs better than a passive one!
Summary
There is a big GAP
Community is working hard to close it
- but no evidence yet for the gap diminishing
Some of the promising ways to diminish the gap:
- let the robot play to find out how the world
works
- build the simulations to match the reality, not the
other way round
- let the robot dream to play in alternative realities.
Q & A
질문은 Slido에 남겨주세요.
sli.do
#deview
TRACK 4
Thank you

More Related Content

What's hot

Artificial intelligence(02)
Artificial intelligence(02)Artificial intelligence(02)
Artificial intelligence(02)Nazir Ahmed
 
Simplified Introduction to AI
Simplified Introduction to AISimplified Introduction to AI
Simplified Introduction to AIDeepu S Nath
 
Introduction to Artificial Intelligence - Cybernetics Robo Academy
Introduction to Artificial Intelligence - Cybernetics Robo AcademyIntroduction to Artificial Intelligence - Cybernetics Robo Academy
Introduction to Artificial Intelligence - Cybernetics Robo AcademyTutulAhmed3
 
Artificial intelligence - (A seminar on Emerging Trends of Technology)
Artificial intelligence - (A seminar on Emerging Trends of Technology) Artificial intelligence - (A seminar on Emerging Trends of Technology)
Artificial intelligence - (A seminar on Emerging Trends of Technology) ileomax
 
Lect#1 (Artificial Intelligence )
Lect#1 (Artificial Intelligence )Lect#1 (Artificial Intelligence )
Lect#1 (Artificial Intelligence )Zeeshan_Jadoon
 
Turing Test : From A.I. to Beyond !
Turing Test : From A.I. to Beyond !Turing Test : From A.I. to Beyond !
Turing Test : From A.I. to Beyond !Abhishek Singh
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceDataminingTools Inc
 
Artificial Intelligent Humanoid Robot (AIHRO) An Overview
Artificial Intelligent Humanoid Robot (AIHRO) An OverviewArtificial Intelligent Humanoid Robot (AIHRO) An Overview
Artificial Intelligent Humanoid Robot (AIHRO) An OverviewAKHIL JOY
 
The Shape of Robots to Come - Robolift - March 2011
The Shape of Robots to Come - Robolift - March 2011The Shape of Robots to Come - Robolift - March 2011
The Shape of Robots to Come - Robolift - March 2011Dominique Sciamma
 
A survey on AI in computer games
A survey on AI in computer gamesA survey on AI in computer games
A survey on AI in computer gamesRedwanIslam12
 
Lecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceLecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceAlbert Orriols-Puig
 
Learning Machine Learning
Learning Machine LearningLearning Machine Learning
Learning Machine LearningJoel Lord
 

What's hot (18)

Artificial intelligence(02)
Artificial intelligence(02)Artificial intelligence(02)
Artificial intelligence(02)
 
Simplified Introduction to AI
Simplified Introduction to AISimplified Introduction to AI
Simplified Introduction to AI
 
AI Introduction
AI Introduction AI Introduction
AI Introduction
 
Introduction to Artificial Intelligence - Cybernetics Robo Academy
Introduction to Artificial Intelligence - Cybernetics Robo AcademyIntroduction to Artificial Intelligence - Cybernetics Robo Academy
Introduction to Artificial Intelligence - Cybernetics Robo Academy
 
Artificial intelligence - (A seminar on Emerging Trends of Technology)
Artificial intelligence - (A seminar on Emerging Trends of Technology) Artificial intelligence - (A seminar on Emerging Trends of Technology)
Artificial intelligence - (A seminar on Emerging Trends of Technology)
 
Lect#1 (Artificial Intelligence )
Lect#1 (Artificial Intelligence )Lect#1 (Artificial Intelligence )
Lect#1 (Artificial Intelligence )
 
Ai
AiAi
Ai
 
Turing test
Turing testTuring test
Turing test
 
Turing Test : From A.I. to Beyond !
Turing Test : From A.I. to Beyond !Turing Test : From A.I. to Beyond !
Turing Test : From A.I. to Beyond !
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligenceAI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
 
mesca
mescamesca
mesca
 
Artificial Intelligent Humanoid Robot (AIHRO) An Overview
Artificial Intelligent Humanoid Robot (AIHRO) An OverviewArtificial Intelligent Humanoid Robot (AIHRO) An Overview
Artificial Intelligent Humanoid Robot (AIHRO) An Overview
 
The Shape of Robots to Come - Robolift - March 2011
The Shape of Robots to Come - Robolift - March 2011The Shape of Robots to Come - Robolift - March 2011
The Shape of Robots to Come - Robolift - March 2011
 
A survey on AI in computer games
A survey on AI in computer gamesA survey on AI in computer games
A survey on AI in computer games
 
Startup 101
Startup 101Startup 101
Startup 101
 
Lecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligenceLecture1 AI1 Introduction to artificial intelligence
Lecture1 AI1 Introduction to artificial intelligence
 
Learning Machine Learning
Learning Machine LearningLearning Machine Learning
Learning Machine Learning
 
What is AI Anyway?
What is AI Anyway?What is AI Anyway?
What is AI Anyway?
 

Similar to [244]로봇이 현실 세계에 대해 학습하도록 만들기

HPAI Class 2 - human aspects and computing systems in ai - 012920
HPAI  Class 2 - human aspects and computing systems in ai - 012920HPAI  Class 2 - human aspects and computing systems in ai - 012920
HPAI Class 2 - human aspects and computing systems in ai - 012920melendez321
 
Y conf talk - Andrej Karpathy
Y conf talk - Andrej KarpathyY conf talk - Andrej Karpathy
Y conf talk - Andrej KarpathySze Siong Teo
 
Deep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItDeep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItHolberton School
 
DL Classe 0 - You can do it
DL Classe 0 - You can do itDL Classe 0 - You can do it
DL Classe 0 - You can do itGregory Renard
 
AI INTRODUCTION.pptx,INFORMATION TECHNOLOGY
AI INTRODUCTION.pptx,INFORMATION TECHNOLOGYAI INTRODUCTION.pptx,INFORMATION TECHNOLOGY
AI INTRODUCTION.pptx,INFORMATION TECHNOLOGYsantoshverma90
 
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor Acts
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor ActsTo Ask or To Sense? Planning to Integrate Speech and Sensorimotor Acts
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor Actstoukaigi
 
Artificial Intelligence
Artificial Intelligence Artificial Intelligence
Artificial Intelligence NIKHILMALPURE3
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learningcolleges
 
Unit I Introduction to AI K.Sundar,AP/CSE,VEC
Unit I Introduction to AI K.Sundar,AP/CSE,VECUnit I Introduction to AI K.Sundar,AP/CSE,VEC
Unit I Introduction to AI K.Sundar,AP/CSE,VECsundarKanagaraj1
 
Selected topics in Computer Science
Selected topics in Computer Science Selected topics in Computer Science
Selected topics in Computer Science Melaku Bayih Demessie
 
RMV Artificial Intelligence
RMV Artificial IntelligenceRMV Artificial Intelligence
RMV Artificial Intelligenceanand hd
 
Artificial Intteligence-unit 1.pptx
Artificial Intteligence-unit 1.pptxArtificial Intteligence-unit 1.pptx
Artificial Intteligence-unit 1.pptxhoneydv1979
 
ARTIFICIAL INTELLIGENCE-New.pptx
ARTIFICIAL INTELLIGENCE-New.pptxARTIFICIAL INTELLIGENCE-New.pptx
ARTIFICIAL INTELLIGENCE-New.pptxParveshSachdev
 
Brave machine's tomorrow nazli temur
Brave machine's tomorrow nazli temurBrave machine's tomorrow nazli temur
Brave machine's tomorrow nazli temurnazlitemu
 
Python AI tutorial
Python AI tutorialPython AI tutorial
Python AI tutorialgrinu
 

Similar to [244]로봇이 현실 세계에 대해 학습하도록 만들기 (20)

Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
HPAI Class 2 - human aspects and computing systems in ai - 012920
HPAI  Class 2 - human aspects and computing systems in ai - 012920HPAI  Class 2 - human aspects and computing systems in ai - 012920
HPAI Class 2 - human aspects and computing systems in ai - 012920
 
Y conf talk - Andrej Karpathy
Y conf talk - Andrej KarpathyY conf talk - Andrej Karpathy
Y conf talk - Andrej Karpathy
 
Machine Learning and Robotic Vision
Machine Learning and Robotic VisionMachine Learning and Robotic Vision
Machine Learning and Robotic Vision
 
Deep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItDeep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do It
 
DL Classe 0 - You can do it
DL Classe 0 - You can do itDL Classe 0 - You can do it
DL Classe 0 - You can do it
 
Ai lecture1 final
Ai lecture1 finalAi lecture1 final
Ai lecture1 final
 
AI INTRODUCTION.pptx,INFORMATION TECHNOLOGY
AI INTRODUCTION.pptx,INFORMATION TECHNOLOGYAI INTRODUCTION.pptx,INFORMATION TECHNOLOGY
AI INTRODUCTION.pptx,INFORMATION TECHNOLOGY
 
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor Acts
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor ActsTo Ask or To Sense? Planning to Integrate Speech and Sensorimotor Acts
To Ask or To Sense? Planning to Integrate Speech and Sensorimotor Acts
 
Introduction to ai
Introduction to aiIntroduction to ai
Introduction to ai
 
Artificial Intelligence
Artificial Intelligence Artificial Intelligence
Artificial Intelligence
 
introduction to machine learning
introduction to machine learningintroduction to machine learning
introduction to machine learning
 
Unit I Introduction to AI K.Sundar,AP/CSE,VEC
Unit I Introduction to AI K.Sundar,AP/CSE,VECUnit I Introduction to AI K.Sundar,AP/CSE,VEC
Unit I Introduction to AI K.Sundar,AP/CSE,VEC
 
Selected topics in Computer Science
Selected topics in Computer Science Selected topics in Computer Science
Selected topics in Computer Science
 
RMV Artificial Intelligence
RMV Artificial IntelligenceRMV Artificial Intelligence
RMV Artificial Intelligence
 
Artificial intelligence
Artificial intelligenceArtificial intelligence
Artificial intelligence
 
Artificial Intteligence-unit 1.pptx
Artificial Intteligence-unit 1.pptxArtificial Intteligence-unit 1.pptx
Artificial Intteligence-unit 1.pptx
 
ARTIFICIAL INTELLIGENCE-New.pptx
ARTIFICIAL INTELLIGENCE-New.pptxARTIFICIAL INTELLIGENCE-New.pptx
ARTIFICIAL INTELLIGENCE-New.pptx
 
Brave machine's tomorrow nazli temur
Brave machine's tomorrow nazli temurBrave machine's tomorrow nazli temur
Brave machine's tomorrow nazli temur
 
Python AI tutorial
Python AI tutorialPython AI tutorial
Python AI tutorial
 

More from NAVER D2

[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다NAVER D2
 
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...NAVER D2
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기NAVER D2
 
[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발NAVER D2
 
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈NAVER D2
 
[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&ANAVER D2
 
[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep LearningNAVER D2
 
[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applicationsNAVER D2
 
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load BalancingOld version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load BalancingNAVER D2
 
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지NAVER D2
 
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기NAVER D2
 
[224]네이버 검색과 개인화
[224]네이버 검색과 개인화[224]네이버 검색과 개인화
[224]네이버 검색과 개인화NAVER D2
 
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)NAVER D2
 
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기NAVER D2
 
[213] Fashion Visual Search
[213] Fashion Visual Search[213] Fashion Visual Search
[213] Fashion Visual SearchNAVER D2
 
[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화NAVER D2
 
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지NAVER D2
 
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터NAVER D2
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?NAVER D2
 
[231] Clova 화자인식
[231] Clova 화자인식[231] Clova 화자인식
[231] Clova 화자인식NAVER D2
 

More from NAVER D2 (20)

[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다[211] 인공지능이 인공지능 챗봇을 만든다
[211] 인공지능이 인공지능 챗봇을 만든다
 
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
[233] 대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing: Maglev Hashing Scheduler i...
 
[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기[215] Druid로 쉽고 빠르게 데이터 분석하기
[215] Druid로 쉽고 빠르게 데이터 분석하기
 
[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발[245]Papago Internals: 모델분석과 응용기술 개발
[245]Papago Internals: 모델분석과 응용기술 개발
 
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
[236] 스트림 저장소 최적화 이야기: 아파치 드루이드로부터 얻은 교훈
 
[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A[235]Wikipedia-scale Q&A
[235]Wikipedia-scale Q&A
 
[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning[243] Deep Learning to help student’s Deep Learning
[243] Deep Learning to help student’s Deep Learning
 
[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications[234]Fast & Accurate Data Annotation Pipeline for AI applications
[234]Fast & Accurate Data Annotation Pipeline for AI applications
 
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load BalancingOld version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
Old version: [233]대형 컨테이너 클러스터에서의 고가용성 Network Load Balancing
 
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
[226]NAVER 광고 deep click prediction: 모델링부터 서빙까지
 
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
[225]NSML: 머신러닝 플랫폼 서비스하기 & 모델 튜닝 자동화하기
 
[224]네이버 검색과 개인화
[224]네이버 검색과 개인화[224]네이버 검색과 개인화
[224]네이버 검색과 개인화
 
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
[216]Search Reliability Engineering (부제: 지진에도 흔들리지 않는 네이버 검색시스템)
 
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
[214] Ai Serving Platform: 하루 수 억 건의 인퍼런스를 처리하기 위한 고군분투기
 
[213] Fashion Visual Search
[213] Fashion Visual Search[213] Fashion Visual Search
[213] Fashion Visual Search
 
[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화[232] TensorRT를 활용한 딥러닝 Inference 최적화
[232] TensorRT를 활용한 딥러닝 Inference 최적화
 
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
[242]컴퓨터 비전을 이용한 실내 지도 자동 업데이트 방법: 딥러닝을 통한 POI 변화 탐지
 
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
[212]C3, 데이터 처리에서 서빙까지 가능한 하둡 클러스터
 
[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?[223]기계독해 QA: 검색인가, NLP인가?
[223]기계독해 QA: 검색인가, NLP인가?
 
[231] Clova 화자인식
[231] Clova 화자인식[231] Clova 화자인식
[231] Clova 화자인식
 

Recently uploaded

AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekCzechDreamin
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutesconfluent
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfFIDO Alliance
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessUXDXConf
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaCzechDreamin
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...CzechDreamin
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...CzechDreamin
 
THE BEST IPTV in GERMANY for 2024: IPTVreel
THE BEST IPTV in  GERMANY for 2024: IPTVreelTHE BEST IPTV in  GERMANY for 2024: IPTVreel
THE BEST IPTV in GERMANY for 2024: IPTVreelreely ones
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024Stephanie Beckett
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?Mark Billinghurst
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxDavid Michel
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfFIDO Alliance
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101vincent683379
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCzechDreamin
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...marcuskenyatta275
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyJohn Staveley
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024TopCSSGallery
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftshyamraj55
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoUXDXConf
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Julian Hyde
 

Recently uploaded (20)

AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Structuring Teams and Portfolios for Success
Structuring Teams and Portfolios for SuccessStructuring Teams and Portfolios for Success
Structuring Teams and Portfolios for Success
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
SOQL 201 for Admins & Developers: Slice & Dice Your Org’s Data With Aggregate...
 
THE BEST IPTV in GERMANY for 2024: IPTVreel
THE BEST IPTV in  GERMANY for 2024: IPTVreelTHE BEST IPTV in  GERMANY for 2024: IPTVreel
THE BEST IPTV in GERMANY for 2024: IPTVreel
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
The Metaverse: Are We There Yet?
The  Metaverse:    Are   We  There  Yet?The  Metaverse:    Are   We  There  Yet?
The Metaverse: Are We There Yet?
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101AI presentation and introduction - Retrieval Augmented Generation RAG 101
AI presentation and introduction - Retrieval Augmented Generation RAG 101
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024Top 10 Symfony Development Companies 2024
Top 10 Symfony Development Companies 2024
 
Oauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoftOauth 2.0 Introduction and Flows with MuleSoft
Oauth 2.0 Introduction and Flows with MuleSoft
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
 
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 

[244]로봇이 현실 세계에 대해 학습하도록 만들기

  • 1. Making robots learn for the real world Author : Tomi Silander Team : Morgan Funtowicz, Arnaud Sors, Julien Perez & NAVER LABS robotics team
  • 2. CONTENT 1. Introducing Reality Gap 2. Learning in Robotics 2.1 Other than Reinforcement Learning 2.2 Unsupervised and Self-supervised Learning 2.3 Reinforcement Learning 2.4 Simulations 3. Active Localization 4. Summary
  • 3. Superhuman AI – Alpha Go (Zero) “Google’s AI AlphaGo Is Beating Humanity At Its Own Games” VIDEO HERE 20 million self-play games 200 000 games per day
  • 4. Superhuman AI – Dota2 AI bots trained for 180 years a day to beat humans at Dota-2
  • 5. Robot Soccer – subhuman AI Soccer over chess as a paradigmatic robotic task (Sahota et al. AI-94)
  • 6. Thus a GAP and its been widening lately, but why? Computational power has made it possible to scale RL i.e., to make the software agents able to learn end-to-end by trial and error in simulations that can be run fast and in parallel producing huge amounts of training data.
  • 7. CONTENT 1. Introducing Reality Gap 2. Learning in Robotics 2.1 Other than Reinforcement Learning 2.2 Unsupervised and Self-supervised Learning 2.3 Reinforcement Learning 2.4 Simulations 3. Active Localization 4. Summary
  • 8. Should we make robots learn? First Conference on Robot Learning 2017, keynote by Rodney Brooks: 1. Well, maybe – just for fun, to see if that is possible 2. To put correct pressure to machine learning methods - my favorite argument 3. To make robots more practical – maybe So if the Godfather of robotics is not too enthusiastic, we might also want to be skeptical
  • 9. Learning in robotics is a hot topic 1. Conference on Robot Learning 2018 (CORL) 2. Robotics: Science and Systems 2018 (RSS) - learning to grasp - learning to localize - transferring from simulations to real life 3. International Conference on Robotics and Automation 2018 - lot of deep learning 4. International conference on Intelligent Robots and Systems - last week in Madrid
  • 10. Reinforcement learning Target is to learn a behavioural policy: - “In this situation S it is usually best to do action A in order to accomplish the task” - accomplishing the task is signaled giving agent a reward for achieving the goal. The agent learns to optimize it behaviour to get maximal reward - all this can be formalized using decision processes. Improving behaviour using evaluative feedback via trial and error.
  • 11. Problems when using RL in real robots 1. Trying all kind of things is dangerous. - it breaks robots and robot lab. 2. Real robot cannot be run in hyper-speed - so it will take “forever” before the robot learns. 3. Many tasks are not game–like - so robots can not play against each other. 4. Reward signals hard to give. - this often requires human interaction. From Yann LeCun How Could Machines Learn as Efficeintly as Animals and Humans?
  • 12. Before we jump on the RL bandwagon other type of robotic learning (not by trial and error)
  • 13. Instruction Josh Tenenbaum in his ICML keynote 2018: “after 18 months, human children mostly learn via language” Should we make robots understand us so we can just tell them what to do? - but Tenenbaum talked about instruction from human to human. - by 18 months old, children have enough “common ground” for instruction by “being told”. Thomas Nagel in The Philosophical Review 1974: - “What is it like to be a bat?” - robots “lifeworld” is so different that language communication will be difficult. - many competencies (e.g., grasping) are also difficult to “explain”.
  • 14. Imitation learning: mimic the expert One of the favorite methods in robotics, because of its safety and efficiency! How: - supervised learning from state to actions (lot of algorithmic variants) Why not: - expert cannot demonstrate behaviour on all possible situations - so what to do in new situations? Why traditionally a favorite method: - robots have been used in controlled environments. If surprise then HALT!
  • 16. Playing around, recording what happened
  • 17. From playing to goal oriented actions Collecting ("#, %#, "#&') allows us to train the regression/classification model: )("#, "#&') = %#. If I am in situation "#, and I want to get to situation "#&', I should do action %#.
  • 18. Self-supervised learning Learning general tasks that are useful for many kind of other tasks: - grasping, obstacle avoidance, etc. Curiosity (Pathak et al. ICML 2017): - trying things consequences of which we cannot yet predict - learning to explore efficiently
  • 19. So if we really jump on RL bandwagon
  • 20. Thinking, Fast and Slow System 1: Fast, automatic, frequent, unconscious - see if an object is further than another - localize the source of a sound - display disgust when seeing a gruesome image - drive a car on an empty road System 2: Slow, infrequent, logical, conscious - look out for a woman with the gray hear - dig into your memory to recognize a song - determine the appropriateness of a behavior in a social context - count a number of A’s in a certain text Similar dichotomy in RL: System 1 = model-free RL, System 2 = model-based RL
  • 21. Model-free RL for thinking fast Learn the gut-feeling: - search in policy space to learn !(#) = &(#;(). - after fitting parameters (∗, just do &(#;(∗). - can use expert examples Suitable when task is “simple” Deep Q-learning (DQN): - learn to estimate expected reward of doing action a in situation s, i.e. *(#,!). - in each s just pick !,-.!/ 0 *(#,!).
  • 22. Model-based RL for thinking slow Learn the model of the world, i.e. consequences of your actions: - a “forward model” ! ",$,"% , i.e., - the probability that action a changes the situation s to s’. Same model can be used for many tasks. Schmidhuber (2018): - “World Models” => policies are simple Dreaming: - forward models can be used to simulate the world. - in hyper-speed, thus allowing methods that require big data. - not realistic, but it’s the versatility that matters (S. Levine).
  • 23. Our adversarially trained forward model Combines adversarial training and ideas from Info GANs
  • 24. Model based or model-free RL? NB! People have both System 1 and System 2 - rarely done in current RL, but maybe a key to robotics, since versus
  • 25. Example of closing the gap Gilhyun Ryou, Youngwoo Sim, Seong Ho Yeon and Sangok Seok (ICRA 2018) Applying Asynchronous Deep Classification Networks and Gaming Reinforcement Learning-Based Motion Planners to Mobile Robots Cognitive architecture for a robot Processes working asynchronously at different speeds: - fast for motor control - slower for object classification - connected by a central representation that can also be trained via simulation
  • 26. Thus enter the simulations Huge bulk of current “robotic” RL is conducted simulations (it’s faster, safer) - and then, sometimes(!), tried to transfer to a real robot - maybe with some domain adaptation Works OK for - simple spaces and controlled environments where similarity of simulation and reality is easier to establish
  • 27. CONTENT 1. Introducing Reality Gap 2. Learning in Robotics 2.1 Other than Reinforcement Learning 2.2 Unsupervised and Self-supervised Learning 2.3 Reinforcement Learning 2.4 Simulations 3. Active Localization 4. Summary
  • 28. Active Neural Localization Perception is active, information seeking action (Gibson 1976) If you are lost, try to look around D.S. Chaplot et al. (NIPS 2017) - grid presentations - discrete angles - noiseless operations - noiseless directions - noiseless images
  • 29. In active localization project we had to 1. Use particle filters instead of grid 2. Continuous, noisy directions 3. Noisy moves 4. Uncertainty in observations After all that, yes, an active policy (still) performs better than a passive one!
  • 30. Summary There is a big GAP Community is working hard to close it - but no evidence yet for the gap diminishing Some of the promising ways to diminish the gap: - let the robot play to find out how the world works - build the simulations to match the reality, not the other way round - let the robot dream to play in alternative realities.
  • 31. Q & A