Wei Xu at AI Frontiers: Language Learning in an Interactive and Embodied Setting

AI Frontiers
Nov. 13, 2018

Editor's Notes

  1. Good afternoon, everyone. I am Wei Xu from Horizon Robotics. Today I am going to talk about our recent work on language learning in an interactive and embodied setting.
  2. In 1950, in the same article where the famous Turing test was proposed, Turing also proposed a solution: "Instead of trying to produce a program to simulate the adult mind, why not rather try to produce one which simulates the child's? If this were then subjected to an appropriate course of education one would obtain the adult brain." There are several advantages to this approach. First, there are so many things a human adult can do that it would be too expensive and difficult to solve each of them individually. Second, insisting that all of the machine's skills and knowledge be acquired through its own learning ensures that the machine will be able to learn new skills and new knowledge unspecified at design time. Third, learning in a developmental way lets the machine gradually proceed from easier tasks to more difficult ones, which makes learning easier. This is like curriculum learning, which has been found to be effective in many difficult learning problems.
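
As a rough illustration of the curriculum idea mentioned here, the sketch below advances an agent from easier to harder tasks once a running success rate passes a threshold. It is a toy Python sketch; the agent interface, task ordering, and threshold are assumptions for illustration, not anything from the talk.

    def curriculum_train(agent, tasks_easy_to_hard, threshold=0.9, max_episodes=1000):
        """Train on tasks ordered from easiest to hardest (hypothetical agent API)."""
        for task in tasks_easy_to_hard:
            success_rate = 0.0
            for _ in range(max_episodes):
                # agent.train_episode is assumed to return 1.0 on success, 0.0 on failure
                outcome = agent.train_episode(task)
                success_rate = 0.95 * success_rate + 0.05 * outcome
                if success_rate >= threshold:
                    break  # task mastered: move on to the next, harder one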
  3. For embodied learning, the learning experiences come from the machine's physical interactions with its environment. By actually doing things and observing the effects, the machine can learn a lot of commonsense knowledge about the environment. This kind of knowledge is typically very hard to capture with rules or a static dataset. Self-driving cars are a great example. Waymo recently announced that the total mileage of their cars exceeds 10 million miles, yet they are still not fully ready for deployment. On the other hand, we all know from experience that a human can learn to drive very well with a few hundred miles of practice. A key difference between self-driving cars and humans is that humans have a lot of commonsense knowledge about the world. For example, even without learning to drive, a person knows which situations are unsafe, which obstacles should be avoided, and so on. But for self-driving, all of this commonsense knowledge has to be either coded as rules or obtained from a huge amount of driving data. Embodied learning is also very helpful for understanding language: in order for the machine to understand and use language, it needs to connect word sequences with the actual objects and events in the environment.
  4. Why should the machine learn in an interactive way? There are several reasons. First, a useful robot needs to interact with humans, so it should be able to understand them and communicate with them effectively. Second, it is easier for humans to teach machines directly using language than by writing code, and humans are great teachers because they are good at adjusting their teaching to the state of the learner. Also, in order to be able to use language, the machine needs to learn the effects of speaking by observing feedback from its conversational partner. Finally, through interaction with humans, the machine can learn human values, which is very important for making sure it will do things consistent with them. So I've talked about our motivation for learning language in an interactive and embodied setting. In the rest of the talk, I will present our recent work in this direction.
  5. The first one is about learning to answer questions and follow commands. This work was published at this year's ICLR conference. There are two problems we want to study in this paper.
  6. Here is the problem setup. We developed a 2D simulator. For each session, we generate a random map, a question, and an instruction. The answer to the question is provided as direct supervision, and the agent is given a reward based on whether it successfully executes the instruction. At test time, the agent is given commands containing words or word combinations never seen in the training commands or questions.
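
To make this setup concrete, here is a minimal Python sketch of one training session. The simulator interface (world_cls.random_map, sample_question, sample_command, and so on) is a hypothetical stand-in for illustration, not the actual simulator API.

    import random

    def run_session(agent, world_cls, train=True):
        """One session: a fresh random map plus either a question or a command."""
        world = world_cls.random_map()                  # new random 2D map per session
        if random.random() < 0.5:
            question, answer = world.sample_question()  # e.g. "what is in the north?"
            prediction = agent.answer(world.observe(), question)
            if train:
                agent.supervise(prediction, answer)     # answer given as direct supervision
        else:
            command, goal = world.sample_command()      # e.g. "please go to the apple"
            for _ in range(world.max_steps):
                action = agent.act(world.observe(), command)
                world.step(action)
                if world.reached(goal):
                    agent.give_reward(1.0)              # reward for successful execution
                    return
            agent.give_reward(0.0)                      # no reward if the command fails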
  7. This is the high-level structure of our model. I won't go into the details; what I want to say here is that we designed the structure with a focus on its ability to generalize.
  8. This is a short video demo showing how the agent navigates by following commands. The current command is "please move to the object that is in front of the basketball." The agent needs to approach the toilet paper from the direction in which it is in front of the basketball. After it finishes a task, a new map and a new command are generated.
  9. So far our agent is able to understand some language. In this work, we want the agent to learn to use language through conversation.
  10. Here is the problem setup. Initially, the agent has zero language ability; it can neither understand language nor use it, just like a newborn baby.
  11. This is the high-level structure of our model. First, it needs a memory module, because it has to remember information coming from the teacher's utterances and from images. The vision module generates the visual representation. The interpreter module understands the teacher's utterance and decides whether to store information in memory. The speaker module is responsible for generating responses based on the understanding of the teacher's utterance and on information retrieved from memory. The whole system is trained by predicting the teacher's word sequences and from rewards indicating the appropriateness of the agent's responses.
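
To give a feel for how these four modules might fit together, below is a minimal PyTorch sketch assuming a batch size of one. The layer sizes, memory scheme, and exact wiring are illustrative assumptions, not the model from the paper.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConversationalAgent(nn.Module):
        """Sketch of the vision / interpreter / memory / speaker design."""

        def __init__(self, vocab_size, hidden=256):
            super().__init__()
            # Vision module: image -> visual representation.
            self.vision = nn.Sequential(
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, hidden))
            self.embed = nn.Embedding(vocab_size, hidden)
            # Interpreter module: encodes the teacher's utterance.
            self.interpreter = nn.GRU(hidden, hidden, batch_first=True)
            # Speaker module: decodes the agent's response word by word.
            self.speaker = nn.GRU(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab_size)
            self.memory_keys, self.memory_vals = [], []   # external memory

        def forward(self, image, teacher_tokens):
            v = self.vision(image)                        # (1, hidden)
            emb = self.embed(teacher_tokens)              # (1, T, hidden)
            _, h = self.interpreter(emb)
            h = h[0]                                      # (1, hidden)
            # The interpreter writes an (utterance, visual) pair into memory.
            self.memory_keys.append(h.detach())
            self.memory_vals.append(v.detach())
            # The speaker reads memory by attention before responding.
            keys = torch.cat(self.memory_keys)            # (M, hidden)
            vals = torch.cat(self.memory_vals)            # (M, hidden)
            attn = F.softmax(keys @ h.t(), dim=0)         # (M, 1)
            context = (attn * vals).sum(0, keepdim=True)  # (1, hidden)
            h0 = (h + context).unsqueeze(0)               # initial speaker state
            logits, _ = self.speaker(emb, h0)
            return self.out(logits)                       # per-step word logits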
  12. I will skip the details of the model. Just note that it's trained end-to-end with gradient descent on the sum of an imitation cost and a REINFORCE cost.
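
Concretely, that combined objective might look like the following sketch; the tensor names and shapes are illustrative assumptions, not taken from the paper.

    import torch.nn.functional as F

    def total_cost(teacher_logits, teacher_tokens, response_logprobs, reward):
        """Imitation cost (cross-entropy on the teacher's words) plus a
        REINFORCE cost that scales the log-probability of the agent's
        sampled response by the received reward."""
        vocab = teacher_logits.size(-1)
        imitation = F.cross_entropy(teacher_logits.view(-1, vocab),
                                    teacher_tokens.view(-1))
        reinforce = -reward * response_logprobs.sum()   # REINFORCE policy-gradient term
        return imitation + reinforce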
  13. Here I show some dialog examples. This is a dialog before learning: the agent just generates garbage responses, like a newborn baby. Then come dialogs after learning. I want to mention that the machine never saw these types of objects during training. From these dialogs we can see that the machine has learned several things. It can confirm statements from the teacher. It can actively seek information by asking questions. It can remember relevant information provided by the teacher and use it later to answer questions. And it can answer the teacher's questions when it knows the answer. It also seems to have learned to use shape as the major cue for differentiating objects. I want to emphasize that, unlike in most chatbots, where the behavior of the bot is largely designed by humans, none of these behaviors is programmed here. The machine learned all of them through its interaction with the teacher, in a similar way to how a baby learns from its parents.
  14. In this final slide, I am going to say a little bit about Horizon Robotics. It's a leading technology powerhouse for edge AI platforms. Its current focus is providing algorithms, processors, and hardware jointly optimized for high-performance, low-power, and low-cost edge AI capabilities. I also want to share some good news with you: we just received the CES 2019 Innovation Award for Vehicle Intelligence and Self-Driving Technology. We have two AI labs in Silicon Valley. One is the General AI Lab, which does the kind of research I just talked about: building machines that can learn skills and knowledge unspecified at design time. The other is the Applied AI Lab, which does applied research focused on the near-term needs of the company, developing novel AI technologies that are critical to our current products. We are actively hiring, so if you are interested, please visit either of these two websites.