
"Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent

"Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent
Ting-Hao K. Huang, Walter S. Lasecki, Amos Azaria, Jeffrey P. Bigham.
In Proceedings of Conference on Human Computation & Crowdsourcing (HCOMP 2016), 2016, Austin, TX, USA.

  1. "Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent. Ting-Hao K. Huang, Walter S. Lasecki, Amos Azaria, Jeffrey P. Bigham.
  2. Challenges of Open Conversation
     • Goal: A system that users can converse with
     • General Purpose Dialog System
       – Combining multiple dialog systems: DialPort (Zhao, et al., 2016)
       – Adapting a model to many other domains: Walker, et al., 2007; Sun, et al., 2016
       – Chit-chat system: hold social conversations (Banchs, et al., 2012)
     • It is still a very hard problem…
       – Alexa Prize: $2.5 Million
         “… achieves the grand challenge of conversing coherently and engagingly with humans on popular topics for 20 minutes.”
  3. What birthday gift should I get for Laila? Sorry I can not understand your question. Kenneth’s apartment.
  4. Chorus: A Crowd-powered Conversation Assistant
     • Crowd workers collectively hold a conversation by:
       1. Proposing responses
       2. Voting on responses
       3. Taking notes
     Lasecki, W. S.; Wesley, R.; Nichols, J.; Kulkarni, A.; Allen, J. F.; and Bigham, J. P. 2013. Chorus: A crowd-powered conversational assistant. In UIST 2013, UIST ’13, 151–162.
  5. Chorus’ Worker Interface (screenshot labels: Vote, Propose, Note)
  6. Research Questions
     • How hard is it to deploy such a system?
       – Real-time crowdsourcing +
       – Conversational interface +
       – Intelligent agent
     • How will users use it?
     • Will workers be capable of handling all the tasks?
  7. We deployed Chorus
     • Launched on May 20th, 2016.
     • 113 users used it in 937 conversational sessions.
  8. How to recruit workers fast on-demand?
     • Two Common Practices
       – Start recruiting on-demand (Bigham, et al., 2010)
         • Pros: Workers are engaged when waiting
         • Cons: Expensive to have workers wait longer
       – Keep workers on-call (Retainer) (Bernstein, et al., 2011)
         • Pros: Quick response
         • Cons: A retainer runs on money / “Cold start”
     • Both are designed for short tasks
  9. Retainer Model (timeline figure; labels: Conversation, Conv. Ends, Wait in Retainer, Time, Conv. Starts, Wait in Retainer). Workers’ waiting time costs money.
  10. Chorus’ Recruiting Method (timeline figure; labels: Conversation 1, Conversation 2, Post HIT, Fully Occupied, Conv. 1 Ends, Post HIT, Wait in Retainer, Time)
  11. Is this recruiting method fast enough?
      • Avg first crowd response time = 88.351 sec
        – 21.55% of first crowd responses arrived within 30 sec
        – 56.08% within 1 min
        – 81.77% within 2 min
        – 90.06% within 3 min
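The figures on slide 11 read as an average latency plus cumulative shares of first crowd responses that arrived within each cutoff. A minimal sketch of how such a summary could be computed is given here; the function name `response_time_summary` and the sample latencies are hypothetical and are not data from the Chorus deployment.

```python
from statistics import mean


def response_time_summary(latencies_sec, cutoffs_sec=(30, 60, 120, 180)):
    """Return the mean latency and the share of latencies at or below each cutoff."""
    if not latencies_sec:
        raise ValueError("need at least one latency")
    total = len(latencies_sec)
    within = {
        cutoff: sum(1 for t in latencies_sec if t <= cutoff) / total
        for cutoff in cutoffs_sec
    }
    return mean(latencies_sec), within


# Made-up first-response latencies in seconds, for illustration only.
sample = [12.0, 45.0, 58.0, 95.0, 200.0]
avg, within = response_time_summary(sample)
print(f"avg = {avg:.1f} sec")
for cutoff, share in within.items():
    print(f"{share:.0%} within {cutoff} sec")
```

Because the shares are cumulative, each cutoff's value can only stay the same or grow as the cutoff increases, which matches the shape of the numbers on the slide.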
  12. Challenges
      1. Malicious Workers & Users
      2. Identifying the End of a Conversation
      3. When Consensus Is Not Enough
  13. Challenges
      1. Malicious Workers & Users
      2. Identifying the End of a Conversation
      3. When Consensus Is Not Enough
  14. Inappropriate Workers. Chat excerpt: U: [Asks how to back up a MySQL database] Try that [The YouTube link of “Bryan Cranston’s Super Sweet 60” of “Jimmy Kimmel Live”] come on...... This is a YouTube link... Not how to backup my MySQL database but it’s funny what up b****h
  15. Flirter Workers. Chat excerpt: You mean username? we need to verify your name whats your name user? what ? Or my name? real name both …
  16. Spammer Workers
      • We know they exist
      • 3 Main Actions
        – Message
          • “how are you”, “yeah”, “yes (or no)”, “Sure you can”, or “It suits you best.”
        – Note
          • “user is dumb”
          • “like all the answers.”
        – Vote
          • Upvote on almost all messages
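The voting pattern on slide 16 ("Upvote on almost all messages") suggests one simple signal: the fraction of shown messages a worker upvoted. The sketch below is an assumption for illustration, not a method the slides attribute to Chorus; the function name, the `threshold` of 0.9, and the `min_messages` floor are all hypothetical.

```python
def flag_indiscriminate_voters(stats, threshold=0.9, min_messages=10):
    """Return sorted IDs of workers who upvoted at least `threshold` of the messages shown.

    `stats` maps a worker ID to a pair (upvotes_cast, messages_shown).
    Workers who saw fewer than `min_messages` messages are skipped, since
    a ratio computed from a handful of messages says very little.
    """
    flagged = []
    for worker_id, (upvotes, shown) in stats.items():
        if shown >= min_messages and upvotes / shown >= threshold:
            flagged.append(worker_id)
    return sorted(flagged)


# Hypothetical per-worker counts.
stats = {"w1": (19, 20), "w2": (5, 20), "w3": (3, 3)}
flagged = flag_indiscriminate_voters(stats)
print(flagged)
```

A ratio like this only surfaces candidates for review. A worker may legitimately upvote most messages in a short, high-quality conversation.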
  17. How does Chorus detect abusive language?
      • Word Matching
  18. Malicious Users
      • Abusive Language
        – Sexual content
        – Profanity
        – Hate speech
        – Threats of criminal acts
      • Solutions
        – Word detection
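Slides 17 and 18 name word matching as the detection approach without giving details. A minimal sketch of whole-word matching against a blocklist might look like the following; the `BLOCKLIST` contents and the function name are placeholders, since the slides do not list the words Chorus actually matches.

```python
import re

# Hypothetical blocklist; the actual word list is not given in the slides.
BLOCKLIST = {"badword", "slur"}


def contains_blocked_word(message, blocklist=BLOCKLIST):
    """Return True if any whole word in the message appears in the blocklist."""
    # Lowercase the message and split it into runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", message.lower())
    return any(word in blocklist for word in words)


print(contains_blocked_word("You are a BADWORD!"))
print(contains_blocked_word("How do I back up a MySQL database?"))
```

Exact whole-word lookup is easy to audit but misses obfuscated spellings, such as the masked word in the slide 14 excerpt, and inflected forms that are not listed explicitly.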
  19. Challenges
      1. Malicious Workers & Users
      2. Identifying the End of a Conversation
      3. When Consensus Is Not Enough
  20. “Is there anything else I can help you with?” Chat excerpt: U: [Asks what he/she can eat or drink after a dental surgery] Can I have pumpkin congee? The cold ones Maybe not now.. Why keep asking? That should be fine Is there anything else I can help you with? That would be great actually. :) Is there anything else? …
  21. The Dynamics of User Intent. Chat excerpt: Nope Are you sure? Any other question? to confirm exit please type EXIT or if you want funny cat jokes type CATS CATS …
  22. User Timeout. Chat excerpt: U: Is there an easy way to check traffic status between Miami and Key West? I see... so I will need to check the traffic at different times of the day Did you try Google traffic alerts? Are you there? Please wait for a few minutes... [User didn’t respond for 2 minutes]
  23. How does Chorus know a conversation is over?
      • User & Crowd Timeout
      • Crowd Voting
        – Once 2 workers click “Conversation is over”
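Slide 23 lists two end-of-conversation signals: user and crowd timeouts, and a vote by 2 workers. One way these rules could be combined is sketched below. The two-vote requirement comes from the slide; the `Session` class, its field names, and the 600-second timeout defaults are assumptions, since the slides do not state the timeout values.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    last_user_message_at: float   # timestamp in seconds
    last_crowd_message_at: float  # timestamp in seconds
    # IDs of workers who clicked "Conversation is over"; a set counts each worker once.
    end_votes: set = field(default_factory=set)

    def vote_to_end(self, worker_id):
        self.end_votes.add(worker_id)

    def is_over(self, now, user_timeout_sec=600.0, crowd_timeout_sec=600.0,
                votes_needed=2):
        """True once enough workers voted to end, or either side has been silent too long."""
        if len(self.end_votes) >= votes_needed:
            return True
        if now - self.last_user_message_at >= user_timeout_sec:
            return True
        return now - self.last_crowd_message_at >= crowd_timeout_sec


session = Session(last_user_message_at=0.0, last_crowd_message_at=0.0)
session.vote_to_end("w1")
session.vote_to_end("w1")  # a repeated click by the same worker is not a second vote
print(session.is_over(now=10.0))
session.vote_to_end("w2")
print(session.is_over(now=10.0))
```

Storing votes as a set of worker IDs means a single worker cannot end the conversation alone by clicking twice, which matters given the malicious-worker behavior described earlier in the deck.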
  24. Challenges
      1. Malicious Workers & Users
      2. Identifying the End of a Conversation
      3. When Consensus Is Not Enough
  25. Collective Identity & Personality. Chat excerpt: I was wondering about your name. Why is it Chorus Bot? How long has it been for you here? Is there anything I can help you with? About 3 minutes I am not sure. I’m new to this.
  26. Subjective Questions. Chat excerpt: U: Do you believe Bible is God’s word? This worker’s opinion is that God does not exist. I believe in a God, but not necessarily all of the things in the Bible Evolution can’t be disproven, but neither can creationism in a sense. [Few messages later] Is that all?
  27. Requests For Action. Chat excerpt: Chorus Bot can’t reserve tables :( ? I can reserve a table for you if you prefer [Suggested the user to call a restaurant’s number to make a reservation.] what time and how many people?
  28. To conclude…
      1. Malicious Workers & Users: Content
      2. Identifying the End of a Conversation: Boundary
      3. When Consensus Is Not Enough: Scope
  29. What’s next?
      • Why can Chorus talk?
        – Decompose human workers’ tasks into sub-tasks
        – Which sub-tasks can be automated?
      • What can we learn from the data?
        – Users’ questions
        – Crowd responses
        – Votes
  30. What birthday gift should I get for Laila?
  31. What birthday gift should I get for Laila? TalkingToTheCrowd.org
  32. Reference
      • Zhao, T., Lee, K., & Eskenazi, M. (2016). DialPort: Connecting the Spoken Dialog Research Community to Real User Data. arXiv preprint arXiv:1606.02562.
      • Banchs, R. E., & Li, H. (2012, July). IRIS: a chat-oriented dialogue system based on the vector space model. In Proceedings of the ACL 2012 System Demonstrations (pp. 37-42). Association for Computational Linguistics.
      • Walker, M. A., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456.
      • Sun, M. (2016). Adapting Spoken Dialog Systems Towards Domains and Users (Doctoral dissertation, YAHOO! Research).
      • Bigham, J. P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R. C., ... & Yeh, T. (2010, October). VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23rd annual ACM symposium on User interface software and technology (pp. 333-342). ACM.
      • Bernstein, M. S., Brandt, J., Miller, R. C., & Karger, D. R. (2011, October). Crowds in two seconds: Enabling realtime crowd-powered interfaces. In Proceedings of the 24th annual ACM symposium on User interface software and technology (pp. 33-42). ACM.
