Teacher-Aware Active Robot Learning


  1. 1. Teacher-Aware Active Robot Learning. Mattia Racca, Antti Oulasvirta and Ville Kyrki. ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2019. mattia.racca@aalto.fi
  2. 2. Why (active) learning robots? Programming robots is hard; pre-programming them for each task is impossible.
  3. 3. Why (active) learning robots? Robots should learn by interacting with humans! M. Racca and V. Kyrki, Active Robot Learning for Temporal Task models, HRI '18
  4. 4. The idea behind Active Learning
  5. 5. The idea behind Active Learning
  6. 6. The idea behind Active Learning: The agent can efficiently choose what to learn next.
  7. 7. The idea behind Active Learning: … and improve its model faster!
  8. 8. Important aspects of Active Learning for HRI: 1. Interactive Nature (transparency, design of questions, control over interaction, timing of questions)
  9. 9. Important aspects of Active Learning for HRI: 1. Interactive Nature (transparency, design of questions, control over interaction, timing of questions) 2. Query Efficiency: learning faster (with less data)
  10. 10. Important aspects of Active Learning for HRI: 1. Interactive Nature (transparency, design of questions, control over interaction, timing of questions) 2. Query Efficiency: learning faster (with less data). But what about REAL users?
  11. 11. What if efficient query selection is not best for the interaction?
  12. 12. Can efficiency indirectly counter its own benefits? Query efficiency → complex questions, questions out of context → harder for the teacher: ● slower interaction ● more effort ● more errors!
  13. 13. Different types of Active Learning: 1. CLASSIC AL STRATEGY (LEARNER C) 2. TEACHER-AWARE AL STRATEGY (LEARNER M) 3. HYBRID AL STRATEGY (LEARNER H)
  14. 14. Problem statement & Evaluation scenario: An agent has to learn the value of a certain attribute a for a set E of entities by making queries. We used the Animals with Attributes 2* dataset with 50 animals (entities) and 85 semantic attributes. * Y. Xian et al., Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly, T-PAMI
  15. 15. Problem statement & Evaluation scenario: An agent has to learn the value of a certain attribute a for a set E of entities by making queries. We used the Animals with Attributes 2* dataset with 50 animals (entities) and 85 semantic attributes. Example query: "Do giraffes have patches?" Answer: "YES". * Y. Xian et al., Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly, T-PAMI
  16. 16. Problem statement & Evaluation scenario: ● Categories C over entities, built using WordNet ● Learner assumption: entities in the same category are more likely to share the same attribute value.
  17. 17. Problem statement & Evaluation scenario: ● Categories C over entities, built using WordNet ● Learner assumption: entities in the same category are more likely to share the same attribute value.
  18. 18. Classic AL: Uncertainty Sampling ● Learner C: ○ uses Uncertainty Sampling ○ selects the most uncertain query, given the current model ○ As expected, efficient!
  19. 19. Classic AL: Uncertainty Sampling ● Learner C drawbacks ○ Some questions are difficult! ○ Topic or context switches!
  20. 20. In response to the drawbacks: ● Teacher-Aware strategy (Learner M) ○ Inspired by the ACT-R declarative memory model, which says “Information associated with recently retrieved information is easier to retrieve” ○ minimizes the distance between consecutive queries
  21. 21. In response to the drawbacks: ● Teacher-Aware strategy (Learner M) ○ Inspired by the ACT-R declarative memory model, which says “Information associated with recently retrieved information is easier to retrieve” ○ minimizes the distance between consecutive queries ● Hybrid strategy (Learner H) ○ a tradeoff between Learner C and Learner M
  22. 22. Teacher-Aware AL: Memory Effort strategy
  23. 23. Performance in Simulation: Simulation on the entire dataset: ● Perfect users (no errors, no distraction) ● Baseline: asks random questions and cannot leverage our model to make predictions
  24. 24. Performance in Simulation
  25. 25. What about real users? User study: 26 participants, the 3 strategies as conditions (within-subject).
  26. 26. What about real users? User study: 26 participants, the 3 strategies as conditions (within-subject). Data logged: ● NASA TLX ● Q&A, response times, prediction power ● Overall preferences
  27. 27. What about real users? User study: 26 participants, the 3 strategies as conditions (within-subject). Data logged: ● NASA TLX ● Q&A, response times, prediction power ● Overall preferences. Our hypotheses: Learner M makes the participants reply (a) faster and (b) with fewer errors compared to Learner C, with Learner H achieving intermediate results.
  28. 28. Results
  29. 29. (Unexpected) Results
  30. 30. (Unexpected) Results
  31. 31. (Unexpected) Results
  32. 32. Discussion: ● Higher response time and more errors for Learner C.
  33. 33. Discussion: ● Higher response time and more errors for Learner C. ○ stressful, unpredictable and requiring more thinking
  34. 34. Discussion: ● Higher response time and more errors for Learner C. ○ stressful, unpredictable and requiring more thinking ● Higher response time and more errors for Learner M.
  35. 35. Discussion: ● Higher response time and more errors for Learner C. ○ stressful, unpredictable and requiring more thinking ● Higher response time and more errors for Learner M. ○ easy, natural and predictable
  36. 36. Discussion: ● Higher response time and more errors for Learner C. ○ stressful, unpredictable and requiring more thinking ● Higher response time and more errors for Learner M. ○ easy, natural and predictable ○ too easy? lowering attention or causing boredom
  37. 37. Discussion: ● Higher response time and more errors for Learner C. ○ stressful, unpredictable and requiring more thinking ● Higher response time and more errors for Learner M. ○ easy, natural and predictable ○ too easy? lowering attention or causing boredom ○ too predictable? using the same (maybe wrong) answer
  38. 38. Discussion: ● Overall preferences:
  39. 39. Discussion: ● Overall preferences: ● Learner C perceived as efficient: mitigating difficulty!
  40. 40. Discussion: ● Overall preferences: ● Learner C perceived as efficient: mitigating difficulty! ● Learner M perceived as useless: frustration and boredom!
  41. 41. Discussion: ● Overall preferences: ● Learner C perceived as efficient: mitigating difficulty! ● Learner M perceived as useless: frustration and boredom! ● AVOID USELESS QUESTIONS!
  42. 42. Conclusions: Can efficiency-driven Active Learning counter its own benefits?
  43. 43. Conclusions: Can efficiency-driven Active Learning counter its own benefits? If we factor non-oracle users into the equation, yes! But we just scratched the surface... ● We need a better understanding of the interaction aspects that can affect learning ● We need strategies that can adapt to the specific user
  44. 44. Thank you for the attention! Mattia Racca, Antti Oulasvirta and Ville Kyrki, mattia.racca@aalto.fi. Code available at github.com/MattiaRacca. Can efficiency-driven Active Learning counter its own benefits? If we factor non-oracle users and the interaction into the equation, yes!
  45. 45. Tree building algorithm
  46. 46. Attribute-Category Model: We model the probability of attribute a applying to category c as [equation not preserved in this transcript], and we maintain a prior over these distributions. We can then compute the probability of a applying to entity e as [equation not preserved in this transcript] and therefore predict attribute-entity pairs, given our current model. The update step of the model is the computation of the posterior distributions, given the user's answer r as an observation.
  47. 47. Scores for each active learner: Learner C, Learner M, Learner H
  48. 48. Assumption choice
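
The three query strategies on slides 18 to 21 can be illustrated with a minimal sketch. It is not the authors' implementation (their code is at github.com/MattiaRacca). Everything here is an assumption made for illustration: answer entropy as the uncertainty measure, edge count in a category tree as the "distance between consecutive queries", a weighted sum as the hybrid tradeoff, and the toy category paths and probabilities.

```python
import math


def answer_entropy(p_yes):
    """Entropy (in bits) of a yes/no answer whose predicted P(yes) is p_yes."""
    if p_yes <= 0.0 or p_yes >= 1.0:
        return 0.0
    return -(p_yes * math.log2(p_yes) + (1 - p_yes) * math.log2(1 - p_yes))


def tree_distance(path_a, path_b):
    """Edges between two leaves, given their root-to-leaf category paths."""
    common = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        common += 1
    return (len(path_a) - common) + (len(path_b) - common)


def select_query(candidates, p_yes, paths, last_entity, weight):
    """Pick the next entity to ask about.

    weight = 1.0 scores by uncertainty only (in the spirit of Learner C),
    weight = 0.0 scores by closeness to the previous query only (Learner M),
    values in between mix the two (Learner H). The weighted sum is an
    assumption of this sketch, not the scoring used in the paper.
    """
    distances = {e: tree_distance(paths[e], paths[last_entity]) for e in candidates}
    max_dist = max(distances.values()) or 1  # avoid division by zero

    def score(entity):
        uncertainty = answer_entropy(p_yes[entity])
        closeness = 1.0 - distances[entity] / max_dist
        return weight * uncertainty + (1 - weight) * closeness

    return max(candidates, key=score)


# Hypothetical toy data: category paths and current predictions for one attribute.
paths = {
    "giraffe": ["mammal", "ungulate", "giraffe"],
    "zebra": ["mammal", "ungulate", "zebra"],
    "dolphin": ["mammal", "cetacean", "dolphin"],
    "tiger": ["mammal", "carnivore", "tiger"],
}
p_yes = {"zebra": 0.9, "dolphin": 0.5, "tiger": 0.7}
candidates = ["zebra", "dolphin", "tiger"]

classic_choice = select_query(candidates, p_yes, paths, "giraffe", weight=1.0)
memory_choice = select_query(candidates, p_yes, paths, "giraffe", weight=0.0)
print(classic_choice, memory_choice)
```

With this toy data, the uncertainty-only setting jumps to the entity whose answer is least predictable, while the distance-only setting stays in the same category as the previous question. That is the topic-switch tradeoff the slides discuss.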
