“Is there anything else I can help you with?”
Challenges in Deploying an
On-Demand Crowd-Powered
Conversational Agent
Ting-Hao K. Huang
Walter S. Lasecki
Amos Azaria
Jeffrey P. Bigham
1 / 31
Challenges of Open Conversation
• Goal: A system that users can converse with
• General Purpose Dialog System
– Combining multiple dialog systems
• DialPort (Zhao, et al., 2016)
– Adapting a model to many other domains
• Walker, et al., 2007; Sun, et al., 2016
– Chit-chat system
• Hold social conversations (Banchs, et al., 2012)
• It is still a very hard problem…
– Alexa Prize: $2.5 Million
• “… achieves the grand challenge of conversing coherently and engagingly
with humans on popular topics for 20 minutes.”
2 / 31
What birthday gift
should I get for Laila?
Sorry I can not understand
your question.
Kenneth’s apartment.
3 / 31
Chorus: A Crowd-powered
Conversation Assistant
• Crowd workers collectively hold a
conversation by:
1. Propose Responses
2. Vote on Responses
3. Take Notes
Lasecki, W. S.; Wesley, R.; Nichols, J.; Kulkarni, A.; Allen, J. F.; and Bigham, J. P. 2013.
Chorus: A crowd-powered conversational assistant. In UIST 2013, UIST ’13, 151–162.
4 / 31
Chorus’ Worker Interface
[Figure: worker interface, labeled Vote, Propose, and Note]
5 / 31
Research Questions
• How hard is it to deploy such a system?
– Real-time crowdsourcing +
– Conversational interface +
– Intelligent agent
• How will users use it?
• Will workers be able to handle all the tasks?
6 / 31
We deployed Chorus
• Launched on May 20th, 2016.
• 113 users used it during 937 conversational sessions
7 / 31
How to recruit workers
fast on-demand?
• Two Common Practices
– Start recruiting on-demand (Bigham, et al., 2010)
• Pros: Workers are engaged when waiting
• Cons: Expensive to have workers wait longer
– Keep workers on-call (Retainer) (Bernstein, et al., 2011)
• Pros: Quick response
• Cons: A retainer runs on money / “Cold start”
• Both are designed for short tasks
8 / 31
Retainer Model
[Figure: timeline (axis: Time) with labels “Wait in Retainer”, “Conv. Starts”, “Conversation”, “Conv. Ends”, “Wait in Retainer”]
Workers’ waiting time costs money.
9 / 31
Chorus’ Recruiting Method
[Figure: timeline (axis: Time) with labels “Conversation 1”, “Conversation 2”, “Post HIT”, “Fully Occupied”, “Conv. 1 Ends”, “Post HIT”, “Wait in Retainer”]
10 / 31
Is this recruiting method
fast enough?
• Avg. first crowd response time = 88.351 sec
21.55% of first crowd responses arrive within 30 sec
56.08% within 1 min
81.77% within 2 min
90.06% within 3 min
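The percentages above read as cumulative shares of first crowd responses that arrived within a time limit. A minimal Python sketch of that calculation follows. The function name `share_within`, the inclusive comparison, and the sample latencies are all hypothetical; none of them come from the deployment data.

```python
from typing import Sequence


def share_within(latencies_sec: Sequence[float], limit_sec: float) -> float:
    """Return the fraction of first-response latencies at or below limit_sec."""
    if not latencies_sec:
        raise ValueError("need at least one latency")
    hits = sum(1 for t in latencies_sec if t <= limit_sec)
    return hits / len(latencies_sec)


# Hypothetical latencies in seconds, for illustration only.
sample = [10, 25, 45, 70, 100, 150, 200, 400]

for limit in (30, 60, 120, 180):
    print(f"within {limit:>3} sec: {share_within(sample, limit):.2%}")
```

Reporting several limits side by side, as the slide does, shows the tail of slow responses that a single average hides.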
11 / 31
Challenges
1. Malicious Workers & Users
2. Identifying the End of a Conversation
3. When Consensus Is Not Enough
12 / 31
come on......
This is a YouTube link...
Not how to backup my
MySQL database
but it’s funny
what up b****h
U
U
Inappropriate
Workers
Try that
[The YouTube link of “Bryan
Cranston’s Super Sweet 60”
of “Jimmy Kimmel Live”]
U [Asks how to back up a
MySQL database]
14 / 31
You mean username?
we need to verify your
name
U
Flirter Workers
whats your name user?
what ?
Or my name?
real name
both
…
15 / 31
Spammer Workers
• We know they exist
• 3 Main Actions
– Message
• “how are you”, “yeah”, “yes (or no)”, “Sure you can”,
or “It suits you best.”
– Note
• “user is dumb”
• “like all the answers.”
– Vote
• Upvote on almost all messages
16 / 31
How does Chorus detect
abusive language?
• Word Matching
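Word matching can be sketched as a lookup of each message token against a blocklist. The Python below is a minimal illustration and not Chorus' actual filter; the blocklist entries, the tokenizer, and the function name `contains_blocked_word` are assumptions made for this example.

```python
import re

# Placeholder blocklist (assumption); a real deployment would maintain its own list.
BLOCKLIST = frozenset({"badword", "slur"})

# Lowercase letters and apostrophes form a token; everything else separates tokens.
_TOKEN = re.compile(r"[a-z']+")


def contains_blocked_word(message: str, blocklist=BLOCKLIST) -> bool:
    """Return True if any whole token of the message is in the blocklist."""
    tokens = _TOKEN.findall(message.lower())
    return any(tok in blocklist for tok in tokens)


print(contains_blocked_word("You are a BADWORD!"))  # matches a whole token
print(contains_blocked_word("nothing to see here"))
```

Exact token matching is cheap enough to run on every message, but it misses obfuscated spellings such as words with masked characters.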
17 / 31
Malicious Users
• Abusive Language
– Sexual content
– Profanity
– Hate speech
– Threats of criminal acts
• Solutions
– Word detection
18 / 31
Challenges
1. Malicious Workers & Users
2. Identifying the End of a Conversation
3. When Consensus Is Not Enough
19 / 31
Can I have pumpkin
congee? The cold ones
Maybe not now..
Why keep asking?
U
U
“Is there
anything else
I can help
you with?”
That should be fine
Is there anything else I
can help you with?
That would be great
actually. :)
Is there anything else?
…
U
[Asks what he/she can eat or
drink after dental surgery]
20 / 31
Nope
U
U
The Dynamics
of User Intent
Are you sure?
Any other question?
to confirm exit please
type EXIT
or if you want funny cat
jokes type CATS
CATS
…
21 / 31
I see... so I will need to
check the traffic at
different times of the day
Did you try Google
traffic alerts?
Are you there?
U
User
Timeout
Please wait for a
few minutes...
U Is there an easy way to
check traffic status between
Miami and Key West?
[User didn’t respond for 2 minutes]
22 / 31
How does Chorus know
a conversation is over?
• User & Crowd Timeout
• Crowd Voting
– Once 2 workers click “Conversation is over”
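The two signals above, timeouts and crowd voting, can be combined in a small state object. This is a minimal sketch under stated assumptions: only the threshold of 2 workers comes from the slide, while the class name `ConversationState`, the timeout durations, the either-side timeout rule, and counting distinct worker IDs are hypothetical choices.

```python
from dataclasses import dataclass, field
from typing import Set

OVER_VOTES_NEEDED = 2  # from the slide: 2 workers click "Conversation is over"


@dataclass
class ConversationState:
    last_user_message_at: float   # timestamp in seconds
    last_crowd_message_at: float  # timestamp in seconds
    over_votes: Set[str] = field(default_factory=set)  # IDs of workers who voted "over"

    def vote_over(self, worker_id: str) -> None:
        """Record a vote; a set ensures each worker counts once (assumption)."""
        self.over_votes.add(worker_id)

    def is_over(self, now: float,
                user_timeout: float = 900.0,    # hypothetical duration
                crowd_timeout: float = 900.0    # hypothetical duration
                ) -> bool:
        """End the conversation on enough votes or when either side has gone silent."""
        if len(self.over_votes) >= OVER_VOTES_NEEDED:
            return True
        if now - self.last_user_message_at >= user_timeout:
            return True
        return now - self.last_crowd_message_at >= crowd_timeout
```

Keeping the vote threshold and the timeouts as separate checks makes it easy to tune one without touching the other.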
23 / 31
Challenges
1. Malicious Workers & Users
2. Identifying the End of a Conversation
3. When Consensus Is Not Enough
24 / 31
I was wondering about
your name. Why is it
Chorus Bot?
How long has it been
for you here?
Is there anything I can help
you with?
About 3 minutes
U
U
Collective
Identity &
Personality
I am not sure.
I’m new to this.
25 / 31
This worker’s opinion is
that God does not exist.
I believe in a God, but
not necessarily all of
the things in the Bible
Subjective
Questions
Evolution can’t be
disproven, but neither can
creationism in a sense.
Is that all?
U Do you believe Bible
is God’s word?
[ Few messages later ]
26 / 31
Chorus Bot can’t
reserve tables :( ?
U
Requests
For
Action
I can reserve a table
for you if you prefer
[Suggested that the user call
the restaurant’s number to
make a reservation.]
what time and how
many people?
27 / 31
To conclude…
1. Malicious Workers & Users
– Content
2. Identifying the End of a Conversation
– Boundary
3. When Consensus Is Not Enough
– Scope
28 / 31
What’s next?
• Why can Chorus talk?
– Decompose human workers’ tasks into sub-tasks
– Which sub-tasks can be automated?
• What can we learn from the data?
– User’s questions
– Crowd Response
– Votes
29 / 31
What birthday gift
should I get for Laila?
30 / 31
What birthday gift
should I get for Laila?
TalkingToTheCrowd.org
31 / 31
References
• Zhao, T., Lee, K., & Eskenazi, M. (2016). DialPort: Connecting the Spoken
Dialog Research Community to Real User Data. arXiv preprint
arXiv:1606.02562.
• Banchs, R. E., & Li, H. (2012, July). IRIS: a chat-oriented dialogue system
based on the vector space model. In Proceedings of the ACL 2012 System
Demonstrations (pp. 37-42). Association for Computational Linguistics.
• Walker, M. A., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and
domain adaptation in sentence planning for dialogue. Journal of Artificial
Intelligence Research, 30, 413-456.
• Sun, M. (2016). Adapting Spoken Dialog Systems Towards Domains and Users
(Doctoral dissertation, YAHOO! Research).
• Bigham, J. P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R. C., ... & Yeh, T.
(2010, October). VizWiz: nearly real-time answers to visual questions. In
Proceedings of the 23rd annual ACM symposium on User interface software
and technology (pp. 333-342). ACM.
• Bernstein, M. S., Brandt, J., Miller, R. C., & Karger, D. R. (2011, October).
Crowds in two seconds: Enabling realtime crowd-powered interfaces. In
Proceedings of the 24th annual ACM symposium on User interface software and
technology (pp. 33-42). ACM.

More Related Content

Similar to "Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent

Meeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your CommunityMeeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your Community
Everyday Democracy
 
Big Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrandBig Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrand
IntoTheMinds
 
Adv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - IntroductionAdv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - Introduction
Tanzila Kehkashan
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
krisztianbalog
 
UX For Digital Humanists: A Primer
UX For Digital Humanists: A PrimerUX For Digital Humanists: A Primer
UX For Digital Humanists: A Primer
craigmmacdonald
 
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
Peter Windle
 
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Joe Marquez
 
Social Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language TeachingSocial Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language Teaching
Shannon Sauro
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
Denis Parra Santander
 
A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017
davinia.hl
 
SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2
compapp 4 orang
 
Corpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussionsCorpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussions
Shi Min CHUA
 
Team Chat: A Technology for Learning
Team Chat: A Technology for LearningTeam Chat: A Technology for Learning
Team Chat: A Technology for Learning
BCcampus
 
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Jennifer Romano Bergstrom
 
Presentation on collaboration
Presentation on collaborationPresentation on collaboration
Presentation on collaboration
groupVision | optimizing group collaboration
 
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data ExpeditionASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
A Scuola di OpenCoesione
 
Classroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_ENClassroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_EN
A Scuola di OpenCoesione
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
Everyday Democracy
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
Everyday Democracy
 
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Fabrizio Fornara
 

Similar to "Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent (20)

Meeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your CommunityMeeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your Community
 
Big Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrandBig Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrand
 
Adv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - IntroductionAdv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - Introduction
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
 
UX For Digital Humanists: A Primer
UX For Digital Humanists: A PrimerUX For Digital Humanists: A Primer
UX For Digital Humanists: A Primer
 
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
 
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
 
Social Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language TeachingSocial Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language Teaching
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
 
A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017
 
SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2
 
Corpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussionsCorpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussions
 
Team Chat: A Technology for Learning
Team Chat: A Technology for LearningTeam Chat: A Technology for Learning
Team Chat: A Technology for Learning
 
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
 
Presentation on collaboration
Presentation on collaborationPresentation on collaboration
Presentation on collaboration
 
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data ExpeditionASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
 
Classroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_ENClassroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_EN
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
 
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
 

More from Ting-Hao Huang

A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
Ting-Hao Huang
 
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Ting-Hao Huang
 
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
Ting-Hao Huang
 
Real-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity ExtractionReal-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity Extraction
Ting-Hao Huang
 
Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)
Ting-Hao Huang
 
Social Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical AnalysisSocial Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical Analysis
Ting-Hao Huang
 
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIsGuardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Ting-Hao Huang
 

More from Ting-Hao Huang (7)

A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
 
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
 
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
 
Real-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity ExtractionReal-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity Extraction
 
Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)
 
Social Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical AnalysisSocial Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical Analysis
 
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIsGuardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
 

Recently uploaded

Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
saastr
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
saastr
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
ssuserfac0301
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Jeffrey Haguewood
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Alpen-Adria-Universität
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 

Recently uploaded (20)

Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
Overcoming the PLG Trap: Lessons from Canva's Head of Sales & Head of EMEA Da...
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
Deep Dive: AI-Powered Marketing to Get More Leads and Customers with HyperGro...
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
Salesforce Integration for Bonterra Impact Management (fka Social Solutions A...
 
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing InstancesEnergy Efficient Video Encoding for Cloud and Edge Computing Instances
Energy Efficient Video Encoding for Cloud and Edge Computing Instances
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 

"Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent

  • 1. s“ Is there anything else I can help you with? ” Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent Ting-Hao K. Huang Walter S. Lasecki Amos Azaria Jeffrey P. Bigham 1 / 31
  • 2. Challenges of Open Conversation • Goal: A system that users can converse with • General Purpose Dialog System – Combining multiple dialog systems • DialPort (Zhao, et al., 2016) – Adapting a model to many other domains • Walker, et al., 2007; Sun, et al., 2016 – Chit-chat system • Hold social conversations (Banchs, et al., 2012) • It is still a very hard problem… – Alexa Prize: $2.5 Million • “… achieves the grand challenge of conversing coherently and engagingly with humans on popular topics for 20 minutes.” 2 / 31
  • 3. What birthday gift should I get for Laila? Sorry I can not understand your question. Kenneth’s apartment. 3 / 31
  • 4. • Crowd workers collectively hold a conversation by: 1. Propose Responses 2. Vote Responses 3. Take Notes Chorus: A Crowd-powered Conversation Assistant Lasecki, W. S.; Wesley, R.; Nichols, J.; Kulkarni, A.; Allen, J. F.; and Bigham, J. P. 2013. Chorus: A crowd-powered conversational assistant. In UIST 2013, UIST ’13, 151–162. 4 / 31
  • 6. Research Questions • How hard it is to deploy such a system? – Real-time crowdsourcing + – Conversational interface + – Intelligent agent • How will users use it? • Will workers be capable to handle all the tasks? 6 / 31
  • 7. We deployed Chorus • Launched on May 20th, 2016. • 113 users used it during 937 conversational sessions 7 / 31
  • 8. How to recruit workers fast on-demand? • Two Common Practices – Start recruiting on-demand (Bigham, et al., 2010) • Pros: Workers are engaged when waiting • Cons: Expensive to have workers wait longer – Keep workers on-call (Retainer) (Bernstein, et al., 2011) • Pros: Quick response • Cons: A retainer runs on money / “Cold start” • Both are designed for short tasks 8 / 31
  • 9. Retainer Model Conversation Conv. Ends Wait in Retainer Time Conv. Starts Wait in Retainer Workers’ waiting time cost money. 9 / 31
  • 10. Chorus’ Recruiting Method Conversation 1 Conversation 2 Post HIT Fully Occupied Conv. 1 Ends Post HIT Wait in Retainer Time 10 / 31
  • 11. Is this recruiting method fast enough? • Average time to first crowd response = 88.351 sec – 21.55% of first crowd responses arrived within 30 sec – 56.08% of first crowd responses arrived within 1 min – 81.77% of first crowd responses arrived within 2 min – 90.06% of first crowd responses arrived within 3 min 11 / 31
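The figures on this slide are a mean plus cumulative shares at four thresholds. A minimal sketch of how such a summary could be computed from per-session latencies follows. The function name and the sample latencies are hypothetical and are not the deployment data; only the 30 sec, 1 min, 2 min, and 3 min thresholds come from the slide.

```python
from statistics import mean


def response_time_summary(latencies_sec, thresholds_sec=(30, 60, 120, 180)):
    """Return (mean latency, {threshold: percent of sessions answered within it})."""
    if not latencies_sec:
        raise ValueError("need at least one latency")
    total = len(latencies_sec)
    within = {
        t: 100.0 * sum(1 for x in latencies_sec if x <= t) / total
        for t in thresholds_sec
    }
    return mean(latencies_sec), within


# Hypothetical first-response latencies in seconds (illustration only).
sample = [12.0, 25.0, 45.0, 58.0, 75.0, 110.0, 150.0, 200.0]
avg, within = response_time_summary(sample)
print(f"avg = {avg:.3f} sec")
for threshold, pct in within.items():
    print(f"{pct:.2f}% within {threshold} sec")
```

Reporting cumulative shares alongside the mean is useful here because a few very slow sessions can pull the mean well above the typical wait.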
  • 12. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough 12 / 31
  • 14. come on...... This is a YouTube link... Not how to backup my MySQL database but it’s funny what up b****h U U Inappropriate Workers Try that [The YouTube link of “Bryan Cranston’s Super Sweet 60” of “Jimmy Kimmel Live”] U [Asks how to back up a MySQL database] 14 / 31
  • 15. You mean username? we need to verify your name U Flirter Workers whats your name user? what ? Or my name? real name both … 15 / 31
  • 16. Spammer Workers • We know they exist • 3 Main Actions – Message • “how are you”, “yeah”, “yes (or no)”, “Sure you can”, or “It suits you best.” – Note • “user is dumb” • “like all the answers.” – Vote • Upvote on almost all messages 16 / 31
  • 17. How does Chorus detect abusive language? • Word Matching 17 / 31
  • 18. Malicious Users • Abusive Language – Sexual content – Profanity – Hate speech – Threats of criminal acts • Solutions – Word detection 18 / 31
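The slides name word matching as the detection approach without giving details. A minimal sketch of whole-word matching against a blocklist is shown below. The blocklist entries, the tokenizer pattern, and the function names are all assumptions for illustration, not the deployed Chorus implementation.

```python
import re

# Placeholder blocklist (assumption): a real deployment would load a curated list.
BLOCKLIST = {"darn", "heck"}

# Lowercase word tokens; apostrophes are kept so "don't" stays one token.
_TOKEN = re.compile(r"[a-z0-9']+")


def flagged_words(message, blocklist=BLOCKLIST):
    """Return the sorted blocklisted words that appear as whole tokens."""
    tokens = _TOKEN.findall(message.lower())
    return sorted({t for t in tokens if t in blocklist})


def is_abusive(message, blocklist=BLOCKLIST):
    """True if at least one whole token is on the blocklist."""
    return bool(flagged_words(message, blocklist))


print(flagged_words("Well, DARN it!"))
print(is_abusive("Don't heckle me"))
```

Matching whole tokens rather than substrings avoids flagging harmless words that merely contain a blocked word, at the cost of missing obfuscated spellings such as masked or spaced-out letters.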
  • 19. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough 19 / 31
  • 20. Can I have pumpkin congee? The cold ones Maybe not now.. Why keep asking? U U “Is there anything else I can help you with?” That should be fine Is there anything else I can help you with? That would be great actually. :) Is there anything else? … U [Ask what can he/she eat or drink after a dental surgery] 20 / 31
  • 21. Nope U U The Dynamics of User Intent Are you sure? Any other question? to confirm exit please type EXIT or if you want funny cat jokes type CATS CATS … 21 / 31
  • 22. I see... so I will need to check the traffic at different times of the day Did you try Google traffic alerts? Are you there? U User Timeout Please wait for a few minutes... U Is there an easy way to check traffic status between Miami and Key West? [User didn’t respond for 2 minutes] 22 / 31
  • 23. How does Chorus know a conversation is over? • User & Crowd Timeout • Crowd Voting – Once 2 workers click “Conversation is over” 23 / 31
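The two signals on this slide can be combined into one check, sketched below. The two-vote threshold comes from the slide. The timeout durations, the rule that either party's silence ends the session, and every name in the code are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    last_user_msg_at: float   # timestamp of the latest user message, in seconds
    last_crowd_msg_at: float  # timestamp of the latest crowd message, in seconds
    over_votes: set = field(default_factory=set)  # IDs of workers who voted "over"

    def vote_over(self, worker_id):
        """Record a "Conversation is over" click; repeat clicks count once."""
        self.over_votes.add(worker_id)


def end_reason(session, now, user_timeout_sec=900.0, crowd_timeout_sec=900.0,
               votes_needed=2):
    """Return why the session should end, or None if it should continue.

    The timeout defaults are hypothetical; the slides do not state them.
    """
    if len(session.over_votes) >= votes_needed:
        return "crowd_vote"
    if now - session.last_user_msg_at >= user_timeout_sec:
        return "user_timeout"
    if now - session.last_crowd_msg_at >= crowd_timeout_sec:
        return "crowd_timeout"
    return None


session = Session(last_user_msg_at=0.0, last_crowd_msg_at=0.0)
session.vote_over("worker-1")
session.vote_over("worker-2")
print(end_reason(session, now=10.0))
```

Storing votes as a set of worker IDs means one worker clicking repeatedly cannot end a conversation alone.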
  • 24. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough 24 / 31
  • 25. I was wondering about your name. Why is it Chorus Bot? How long has it been for you here? Is there anything I can help you with? About 3 minutes U U Collective Identity & Personality I am not sure. I’m new to this. 25 / 31
  • 26. This worker’s opinion is that God does not exist. I believe in a God, but not necessarily all of the things in the Bible Subjective Questions Evolution can’t be disproven, but neither can creationism in a sense. Is that all? U Do you believe Bible is God’s word? [ Few messages later ] 26 / 31
  • 27. Chorus Bot can’t reserve tables :( ? U Requests For Action I can reserve a table for you if you prefer [Suggested that the user call the restaurant’s number to make a reservation.] what time and how many people? 27 / 31
  • 28. To conclude… 1. Malicious Workers & Users – Content 2. Identifying the End of a Conversation – Boundary 3. When Consensus Is Not Enough – Scope 28 / 31
  • 29. What’s next? • Why can Chorus talk? – Decompose human workers’ tasks into sub-tasks – Which sub-tasks can be automated? • What can we learn from the data? – Users’ questions – Crowd responses – Votes 29 / 31
  • 30. What birthday gift should I get for Laila? 30 / 31
  • 31. What birthday gift should I get for Laila? TalkingToTheCrowd.org 31 / 31
  • 32. References • Zhao, T., Lee, K., & Eskenazi, M. (2016). DialPort: Connecting the Spoken Dialog Research Community to Real User Data. arXiv preprint arXiv:1606.02562. • Banchs, R. E., & Li, H. (2012, July). IRIS: a chat-oriented dialogue system based on the vector space model. In Proceedings of the ACL 2012 System Demonstrations (pp. 37-42). Association for Computational Linguistics. • Walker, M. A., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456. • Sun, M. (2016). Adapting Spoken Dialog Systems Towards Domains and Users (Doctoral dissertation, YAHOO! Research). • Bigham, J. P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R. C., ... & Yeh, T. (2010, October). VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23nd annual ACM symposium on User interface software and technology (pp. 333-342). ACM. • Bernstein, M. S., Brandt, J., Miller, R. C., & Karger, D. R. (2011, October). Crowds in two seconds: Enabling realtime crowd-powered interfaces. In Proceedings of the 24th annual ACM symposium on User interface software and technology (pp. 33-42). ACM.