
An Intelligent Assistant for High-Level Task Understanding


Published in Proceedings of the 21st International Conference on Intelligent User Interfaces (IUI 2016), pages 169-174, Sonoma, CA, March 7-10, 2016. ACM.



  1. An Intelligent Assistant for High-Level Task Understanding. Ming Sun, Yun-Nung Chen, Alexander I. Rudnicky. School of Computer Science, Carnegie Mellon University.
  2. Many devices, many apps… Rich functionality is now available. How well can it support user activities?
  3. Book a restaurant! I can do Yelp!
  4. Show me bus to China Palace. I know Maps!
  5. Tell kids we will have Chinese tonight. I will Text!
  6. Multi-domain Dialog, Current System. System: What can I do for you? User: Could you arrange a dinner for me and my friends? System: Sorry, I don't understand that. What can I do for you? User: Can I book a table for three at Täkō downtown for this Friday? … System: OK. What can I do for you next? User: Show me the bus from here. System: Where is your destination, please? User: Täkō, downtown Pittsburgh. … System: What should I do next? User: Send the bus route to Carrie and Peter. System: OK. Sending the following message to Carrie and Peter: … Limitations: cannot handle complex intentions, only passively supports cross-domain dialogs, no shared context.
  7. Multi-domain Dialog, Human Assistant. Assistant: What can I do for you? User: Could you arrange a dinner for me and my friends? Assistant: What kind of food do you prefer? User: Mexican? Assistant: How about Täkō? I can book a table for you. User: Sounds good! Can I take a bus there? Assistant: 61 A/B/C/D can take you there. Do you want to send this to your friends? User: Great! Send it to Carrie and Peter. Assistant: OK. The bus route 61 has been sent. Strengths: understands complex intentions, actively supports cross-domain dialogs, maintains context.
  8. Intention Understanding: Plan a dinner! Yes Dad! Special Team 1!
  9. Intention Understanding: Find a restaurant! Yelp!
  10. Intention Understanding: Find a bus route! Maps!
  11. Intention Understanding: Message Agnes & Edith! Messenger!
  12. Intention Understanding: Set up a meeting! Yes Dad! Special Team 2!
  13. Approach. Step 1: Observe human users performing multi-domain tasks. Step 2: Learn to assist at the task level: map an activity description to a set of domain apps, and interact at the task level.
  14. Data Collection 1, Smart Phone. Example episode: Wednesday 17:08-17:14, CMU; Messenger, Gmail, Browser, Calendar; task: schedule a visit to the CMU lab. Log app invocations plus time/date/location. Split the log into episodes wherever there is a 3-minute gap of inactivity (see the segmentation sketch after the slide list).
  15. Data Collection 2, Wizard-of-Oz. User: Find me an Indian place near CMU. Wizard: Yuva India is nearby. (Episode: Monday 10:08-10:15, Home; Yelp, Maps, Messenger, Music; task: schedule a lunch with David.)
  16. Data Collection 2, Wizard-of-Oz. User: When is the next bus to school? Wizard: In 10 min, 61C. (Same episode: Monday 10:08-10:15, Home; Yelp, Maps, Messenger, Music; schedule a lunch with David.)
  17. Data Collection 2, Wizard-of-Oz. User: Tell David to meet me there in 15 min. Wizard: Message sent. (Same episode: Monday 10:08-10:15, Home; Yelp, Maps, Messenger, Music; schedule a lunch with David.)
  18. Corpus. 533 real-life multi-domain interactions from 14 real users (12 native English speakers, 2 non-native; 4 males, 10 females; mean age 31). Total unique apps: 130 (mean 19 per user). Resources collected: app sequences (e.g., Yelp -> Maps -> Messenger) capture structure/arrangement; task descriptions ("Schedule a lunch with David") capture the nature of the intention and language reference; user utterances ("Find me an Indian place near CMU.") capture language reference; metadata (Monday, 10:08-10:15, Home) captures the contexts of the tasks.
  19. Intention Realization Model. New task: "Plan a weekend in Virginia". Past experiences: [Yelp, Maps, Uber], November, weekday, afternoon, office, "Try to arrange evening out"; [United Airlines, AirBnB, Calendar], September, weekend, morning, home, "Planning a trip to California"; [TripAdvisor, United Airlines], July, weekend, morning, home, "I was planning a trip to Oregon"; […, …, …], …, "Shared picture to Alexis". Infer: 1) supportive domains: United Airlines, TripAdvisor, AirBnB; 2) summarization: "plan trip".
  20. Find similar past experiences. Cluster-based: k-means clustering on user-generated language. Neighbor-based: k-nearest neighbors (KNN). [figure: cluster-based vs. neighbor-based retrieval] (See the retrieval sketch after the slide list.)
  21. Realize domains from past experience. Representative sequence: derive one sequence from the app sequences of similar experiences (ROVER). Multi-label classification. [figure: aligned app sequences, e.g., Yelp/OpenTable -> Maps -> Messenger/Email] (See the classification sketch after the slide list.)
  22. Some Obstacles to Remove. Language mismatch. Solution: Query Enrichment (QryEnr), e.g., ["shoot", "photo"] -> ["shoot", "take", "photo", "picture"], using word2vec (GoogleNews model). App mismatch. Solution: App Similarity (AppSim), a functionality space (derived from app descriptions) to identify apps. Data-driven: doc2vec on app store texts. Rule-based: app package name. Knowledge-driven: Google Play similar-app suggestions. (See the enrichment sketch after the slide list.)
  23. Gap between Generic and Personalized Models: QryEnr, AppSim, and QryEnr+AppSim reduce the F1 gap. [bar chart: F1 gap per method]
  24. Compare different AppSim approaches. [bar chart: Precision, Recall, F1 for Baseline, Data, Knowledge, Rule, Combine]
  25. Compare different AppSim approaches. Combining the three approaches performs best. Knowledge-driven and data-driven approaches have low coverage of (manufacturer) apps. Rule-based is better than the other two individual approaches.
  26. Learning to talk at the task level. Techniques: (extractive/abstractive) summarization; key-phrase extraction [RAKE]. User study: key-phrase extraction over user-generated language; ranked list of key phrases plus the user's binary judgment. Example ranked phrases: 1. solutions online, 2. project file, 3. Google drive, 4. math problems, 5. physics homework, 6. answers online, … Example descriptions: "Looking up math problems." "Now open a browser. Go to slader.com." "Doing physics homework." … Example utterances: "Go to my Google drive." "Look up kinematic equations." "Now open my calculator so I can plug in numbers." … (See the key-phrase/MRR sketch after the slide list.)
  27. Learning to talk at the task level. Metric: Mean Reciprocal Rank (MRR). Result: MRR ≈ 0.6, i.e., an understandable verbal reference shows up in the top 2 of the ranked list. (Ranked phrases, descriptions, and utterances as on the previous slide.)
  28. Summary. Collected real-life cross-domain interactions from real users. HELPR: a framework to learn assistance at the task level; it suggests a set of supportive domains to accomplish the task. Personalized model > generic model; the gap can be reduced by QryEnr + AppSim. Generates language references to communicate verbally at the task level.
  29. HELPR demo. Interface: HELPR display, Google ASR, Android TTS. HELPR server: user models.
  30. Thank you. Questions?
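
Sketch 1 (slide 14): a minimal illustration of the episode segmentation used in Data Collection 1, assuming the log is a simple list of (timestamp, app) records. The records, field layout, and app names below are made up for illustration; only the 3-minute inactivity rule comes from the slides.

```python
from datetime import datetime, timedelta

# Hypothetical app-invocation log: (timestamp, app name). Illustrative only.
log = [
    (datetime(2015, 4, 1, 17, 8), "Messenger"),
    (datetime(2015, 4, 1, 17, 10), "Gmail"),
    (datetime(2015, 4, 1, 17, 12), "Browser"),
    (datetime(2015, 4, 1, 17, 14), "Calendar"),
    (datetime(2015, 4, 1, 18, 30), "Yelp"),
]

def split_into_episodes(records, gap=timedelta(minutes=3)):
    """Start a new episode whenever two consecutive invocations are
    separated by more than the inactivity gap (3 minutes, per the slides)."""
    episodes, current, prev_time = [], [], None
    for ts, app in sorted(records):
        if prev_time is not None and ts - prev_time > gap:
            episodes.append(current)
            current = []
        current.append((ts, app))
        prev_time = ts
    if current:
        episodes.append(current)
    return episodes

for episode in split_into_episodes(log):
    print([app for _, app in episode])
# -> ['Messenger', 'Gmail', 'Browser', 'Calendar'] and ['Yelp']
```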
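Sketch 2 (slide 20): one way the neighbor-based and cluster-based retrieval of similar past experiences could look, using TF-IDF features over user-generated language with scikit-learn. The example task descriptions, TF-IDF representation, and cluster count are assumptions, not the paper's exact setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import KMeans

# Hypothetical past task descriptions (illustrative strings).
past_tasks = [
    "try to arrange evening out",
    "planning a trip to California",
    "I was planning a trip to Oregon",
    "shared picture to Alexis",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(past_tasks)
query = vectorizer.transform(["plan a weekend in Virginia"])

# Neighbor-based: k nearest past experiences for the new description.
knn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(X)
_, idx = knn.kneighbors(query)
print("nearest past tasks:", [past_tasks[i] for i in idx[0]])

# Cluster-based: group past experiences, then match the query to a cluster.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster = kmeans.predict(query)[0]
print("tasks in matched cluster:",
      [t for t, c in zip(past_tasks, kmeans.labels_) if c == cluster])
```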
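Sketch 3 (slide 21): a plausible form of the multi-label classification step that maps a task description to a set of supportive apps. The training pairs, TF-IDF features, and one-vs-rest logistic regression are illustrative assumptions; the slides only name "multi-label classification".

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.linear_model import LogisticRegression

# Hypothetical training pairs: task description -> apps used (illustrative).
descriptions = [
    "arrange a dinner with friends",
    "plan a trip to California",
    "schedule a lunch with David",
    "look up bus to the museum",
]
app_sets = [
    ["Yelp", "Maps", "Messenger"],
    ["United Airlines", "AirBnB", "Calendar"],
    ["Yelp", "Maps", "Messenger"],
    ["Maps"],
]

vec = TfidfVectorizer()
X = vec.fit_transform(descriptions)
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(app_sets)

# One binary classifier per app; each app's probability is predicted independently.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)

probs = clf.predict_proba(vec.transform(["plan a dinner with my friends"]))[0]
ranked = sorted(zip(mlb.classes_, probs), key=lambda p: -p[1])
print(ranked[:3])  # top candidate supportive apps
```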
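Sketch 4 (slide 22): a minimal sketch of Query Enrichment (QryEnr) using word2vec neighbors from the pretrained GoogleNews vectors, loaded here through gensim-data (a large download). The expansion size is an arbitrary choice, not the paper's setting. App Similarity (AppSim) could be sketched analogously with gensim's Doc2Vec over app-store descriptions, but it is omitted here.

```python
import gensim.downloader as api

# Pretrained GoogleNews word2vec model (about 1.6 GB download).
w2v = api.load("word2vec-google-news-300")

def enrich(query_words, topn=2):
    """Expand user vocabulary with nearest neighbors in the word2vec space."""
    enriched = list(query_words)
    for word in query_words:
        if word in w2v:
            enriched += [w for w, _ in w2v.most_similar(word, topn=topn)]
    return enriched

print(enrich(["shoot", "photo"]))
# e.g. something like ["shoot", "photo", "shooting", ..., "picture", ...]
```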
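Sketch 5 (slides 26-27): key-phrase extraction with a RAKE implementation (the third-party rake-nltk package, which needs NLTK stopword/punkt data) and a Mean Reciprocal Rank computation over accepted phrases. The text, the accepted-phrase set, and the use of rake-nltk specifically are illustrative assumptions.

```python
from rake_nltk import Rake  # pip install rake-nltk; requires NLTK stopwords/punkt

# Hypothetical descriptions and utterances pooled for one task (illustrative).
text = (
    "Looking up math problems. Now open a browser. Go to slader.com. "
    "Doing physics homework. Go to my Google drive. "
    "Look up kinematic equations. Now open my calculator."
)

rake = Rake()
rake.extract_keywords_from_text(text)
ranked_phrases = rake.get_ranked_phrases()
print(ranked_phrases[:5])

def mean_reciprocal_rank(ranked_lists, judgments):
    """judgments[i] is the set of phrases the user accepted for task i;
    the score per task is the reciprocal rank of the first accepted phrase."""
    total = 0.0
    for phrases, accepted in zip(ranked_lists, judgments):
        rr = 0.0
        for rank, phrase in enumerate(phrases, start=1):
            if phrase in accepted:
                rr = 1.0 / rank
                break
        total += rr
    return total / len(ranked_lists)

# An MRR around 0.6 means an acceptable phrase typically appears in the top 2.
print(mean_reciprocal_rank([ranked_phrases], [{"physics homework"}]))
```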
