"Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent

  1. “Is there anything else I can help you with?”: Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent • Ting-Hao K. Huang, Walter S. Lasecki, Amos Azaria, Jeffrey P. Bigham
  2. Challenges of Open Conversation • Goal: A system that users can converse with • General Purpose Dialog System – Combining multiple dialog systems • DialPort (Zhao et al., 2016) – Adapting a model to many other domains • Walker et al., 2007; Sun, 2016 – Chit-chat system • Hold social conversations (Banchs & Li, 2012) • It is still a very hard problem… – Alexa Prize: $2.5 Million • “… achieves the grand challenge of conversing coherently and engagingly with humans on popular topics for 20 minutes.”
  3. What birthday gift should I get for Laila? • Sorry I can not understand your question. • Kenneth’s apartment.
  4. Chorus: A Crowd-powered Conversation Assistant • Crowd workers collectively hold a conversation by: 1. Proposing responses 2. Voting on responses 3. Taking notes • Lasecki, W. S.; Wesley, R.; Nichols, J.; Kulkarni, A.; Allen, J. F.; and Bigham, J. P. 2013. Chorus: A crowd-powered conversational assistant. In UIST ’13, 151–162.
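The propose / vote / note loop on slide 4 can be sketched roughly as follows. This is a minimal illustration, not Chorus' actual implementation: the class names, the `votes_to_accept` threshold, and the no-self-vote rule are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Proposal:
    worker_id: str
    text: str
    voters: set = field(default_factory=set)


class ConversationRound:
    """Collects proposed responses, votes on them, and shared notes."""

    def __init__(self, votes_to_accept=2):
        # votes_to_accept is an assumed parameter, not a documented Chorus setting.
        self.votes_to_accept = votes_to_accept
        self.proposals = []
        self.notes = []

    def propose(self, worker_id, text):
        self.proposals.append(Proposal(worker_id, text))
        return len(self.proposals) - 1  # list index doubles as the proposal id

    def vote(self, worker_id, proposal_id):
        proposal = self.proposals[proposal_id]
        if worker_id != proposal.worker_id:  # assumption: no self-votes
            proposal.voters.add(worker_id)   # a set counts each voter once

    def take_note(self, worker_id, text):
        self.notes.append((worker_id, text))

    def accepted(self):
        """Return the texts of proposals that reached the vote threshold."""
        return [p.text for p in self.proposals
                if len(p.voters) >= self.votes_to_accept]
```

With the assumed threshold of two, a proposal is returned by `accepted()` only after two distinct workers other than its author have voted for it.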
  5. Chorus’ Worker Interface [figure: interface areas labeled Propose, Vote, and Note]
  6. Research Questions • How hard is it to deploy such a system? – Real-time crowdsourcing + – Conversational interface + – Intelligent agent • How will users use it? • Will workers be able to handle all the tasks?
  7. We deployed Chorus • Launched on May 20th, 2016. • 113 users used it during 937 conversational sessions
  8. How to recruit workers fast on-demand? • Two Common Practices – Start recruiting on-demand (Bigham et al., 2010) • Pros: Workers are engaged when waiting • Cons: Expensive to have workers wait longer – Keep workers on-call (Retainer) (Bernstein et al., 2011) • Pros: Quick response • Cons: A retainer runs on money / “Cold start” • Both are designed for short tasks
  9. Retainer Model [timeline figure; labels: Wait in Retainer, Conv. Starts, Conversation, Conv. Ends, Wait in Retainer, Time] • Workers’ waiting time costs money.
  10. Chorus’ Recruiting Method [timeline figure; labels: Conversation 1, Conversation 2, Post HIT, Fully Occupied, Conv. 1 Ends, Post HIT, Wait in Retainer, Time]
  11. Is this recruiting method fast enough? • Average time to first crowd response: 88.351 sec • 21.55% of first crowd responses arrived within 30 sec • 56.08% within 1 min • 81.77% within 2 min • 90.06% within 3 min
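Figures like those on slide 11 (a mean plus cumulative shares at fixed cutoffs) could be computed from per-conversation first-response latencies along these lines. This is a sketch only: the function name is hypothetical and the sample latencies are made up for illustration, not taken from the deployment.

```python
from statistics import mean


def latency_summary(latencies_sec, thresholds_sec=(30, 60, 120, 180)):
    """Return the mean latency and the share of latencies at or below each threshold."""
    if not latencies_sec:
        raise ValueError("need at least one latency")
    n = len(latencies_sec)
    shares = {
        t: sum(1 for x in latencies_sec if x <= t) / n
        for t in thresholds_sec
    }
    return mean(latencies_sec), shares


# Made-up latencies in seconds, for illustration only.
example = [12.0, 45.0, 58.0, 95.0, 170.0, 240.0]
avg, shares = latency_summary(example)
for threshold, share in shares.items():
    print(f"within {threshold:>3} sec: {share:.2%}")
print(f"average: {avg:.3f} sec")
```

The shares are cumulative, so each cutoff includes every response counted at the shorter cutoffs, which matches how the slide's percentages increase from 30 seconds to 3 minutes.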
  12. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough
  13. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough
  14. Inappropriate Workers • [User asks how to back up a MySQL database] • Try that [YouTube link to “Bryan Cranston’s Super Sweet 60” from “Jimmy Kimmel Live”] • come on...... This is a YouTube link... Not how to backup my MySQL database but it’s funny • what up b****h
  15. Flirter Workers • You mean username? • we need to verify your name • whats your name user? • what ? • Or my name? • real name • both …
  16. Spammer Workers • We know they exist • 3 Main Actions – Message • “how are you”, “yeah”, “yes (or no)”, “Sure you can”, or “It suits you best.” – Note • “user is dumb” • “like all the answers.” – Vote • Upvote on almost all messages
  17. How does Chorus detect abusive language? • Word Matching
  18. Malicious Users • Abusive Language – Sexual content – Profanity – Hate speech – Threats of criminal acts • Solutions – Word detection
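Slides 17 and 18 name word matching as the detection approach without giving details. A minimal sketch of whole-word matching against a blocklist might look like this; the function name and the placeholder terms are hypothetical, and the deployed word list and matching rules are not described in the slides.

```python
import re

# Hypothetical placeholder terms; a real deployment would use a curated list.
BLOCKLIST = {"badword", "slur"}

# Lowercase letters, digits, and apostrophes form a token.
_TOKEN = re.compile(r"[a-z0-9']+")


def flag_abusive(message, blocklist=BLOCKLIST):
    """Return the sorted blocklisted words that appear as whole tokens in the message."""
    tokens = set(_TOKEN.findall(message.lower()))
    return sorted(tokens & set(blocklist))


print(flag_abusive("You are a BADWORD!"))  # matches regardless of case
print(flag_abusive("nothing to see here"))
```

Matching whole tokens avoids flagging harmless words that merely contain a listed term, but plain word matching of this kind would not catch masked spellings such as the asterisked word on slide 14.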
  19. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough
  20. “Is there anything else I can help you with?” • [User asks what he/she can eat or drink after dental surgery] • Can I have pumpkin congee? The cold ones • Maybe not now.. Why keep asking? • That should be fine • Is there anything else I can help you with? • That would be great actually. :) • Is there anything else? …
  21. The Dynamics of User Intent • Nope • Are you sure? • Any other question? • to confirm exit please type EXIT or if you want funny cat jokes type CATS • CATS …
  22. User Timeout • [User] Is there an easy way to check traffic status between Miami and Key West? • I see... so I will need to check the traffic at different times of the day • Did you try Google traffic alerts? • Are you there? • Please wait for a few minutes... • [User didn’t respond for 2 minutes]
  23. How does Chorus know a conversation is over? • User & Crowd Timeout • Crowd Voting – Once 2 workers click “Conversation is over”
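The two end-of-conversation signals on slide 23 can be combined as in the sketch below. Only the two-worker vote count comes from the slide. The class name, the single merged inactivity timer (the slide lists user and crowd timeouts separately), the 120-second value (which mirrors the two-minute example on slide 22), and the rule that new activity clears earlier votes are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class EndDetector:
    """Tracks inactivity and 'Conversation is over' clicks for one conversation."""
    timeout_sec: float = 120.0  # assumed value; the slides give no timeout setting
    votes_needed: int = 2       # from the slide: two workers click "Conversation is over"
    last_activity: float = 0.0
    end_votes: set = field(default_factory=set)

    def on_message(self, now):
        self.last_activity = now
        self.end_votes.clear()  # assumption: new activity resets earlier end votes

    def on_end_vote(self, worker_id):
        self.end_votes.add(worker_id)  # a set ignores repeat clicks by one worker

    def is_over(self, now):
        timed_out = now - self.last_activity >= self.timeout_sec
        voted_out = len(self.end_votes) >= self.votes_needed
        return timed_out or voted_out
```

Timestamps are passed in as plain numbers of seconds rather than read from a clock, which keeps the sketch easy to test. A deployment would more likely track user and crowd inactivity with separate timers.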
  24. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough
  25. Collective Identity & Personality • I was wondering about your name. Why is it Chorus Bot? • How long has it been for you here? • Is there anything I can help you with? • About 3 minutes • I am not sure. I’m new to this.
  26. Subjective Questions • [User] Do you believe Bible is God’s word? • This worker’s opinion is that God does not exist. • I believe in a God, but not necessarily all of the things in the Bible • Evolution can’t be disproven, but neither can creationism in a sense. • Is that all? • [Few messages later]
  27. Requests For Action • Chorus Bot can’t reserve tables :( ? • I can reserve a table for you if you prefer • [Suggested that the user call a restaurant’s number to make a reservation.] • what time and how many people?
  28. To conclude… 1. Malicious Workers & Users – Content 2. Identifying the End of a Conversation – Boundary 3. When Consensus Is Not Enough – Scope
  29. What’s next? • Why can Chorus talk? – Decompose human workers’ tasks into sub-tasks – Which sub-tasks can be automated? • What can we learn from the data? – Users’ questions – Crowd responses – Votes
  30. What birthday gift should I get for Laila?
  31. What birthday gift should I get for Laila? • TalkingToTheCrowd.org
  32. References • Zhao, T., Lee, K., & Eskenazi, M. (2016). DialPort: Connecting the Spoken Dialog Research Community to Real User Data. arXiv preprint arXiv:1606.02562. • Banchs, R. E., & Li, H. (2012, July). IRIS: a chat-oriented dialogue system based on the vector space model. In Proceedings of the ACL 2012 System Demonstrations (pp. 37-42). Association for Computational Linguistics. • Walker, M. A., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456. • Sun, M. (2016). Adapting Spoken Dialog Systems Towards Domains and Users (Doctoral dissertation, YAHOO! Research). • Bigham, J. P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R. C., ... & Yeh, T. (2010, October). VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23rd annual ACM symposium on User interface software and technology (pp. 333-342). ACM. • Bernstein, M. S., Brandt, J., Miller, R. C., & Karger, D. R. (2011, October). Crowds in two seconds: Enabling realtime crowd-powered interfaces. In Proceedings of the 24th annual ACM symposium on User interface software and technology (pp. 33-42). ACM.