“Is there anything else I can help you with?”
Challenges in Deploying an
On-Demand Crowd-Powered
Conversational Agent
Ting-Hao K. Huang
Walter S. Lasecki
Amos Azaria
Jeffrey P. Bigham
1 / 31
Challenges of Open Conversation
• Goal: A system that users can converse with
• General Purpose Dialog System
– Combining multiple dialog systems
• DialPort (Zhao, et al., 2016)
– Adapting a model to many other domains
• Walker, et al., 2007; Sun, et al., 2016
– Chit-chat system
• Hold social conversations (Banchs, et al., 2012)
• It is still a very hard problem…
– Alexa Prize: $2.5 Million
• “… achieves the grand challenge of conversing coherently and engagingly
with humans on popular topics for 20 minutes.”
2 / 31
What birthday gift
should I get for Laila?
Sorry I can not understand
your question.
Kenneth’s apartment.
3 / 31
• Crowd workers collectively hold a
conversation by:
1. Proposing responses
2. Voting on responses
3. Taking notes
Chorus: A Crowd-powered
Conversation Assistant
Lasecki, W. S.; Wesley, R.; Nichols, J.; Kulkarni, A.; Allen, J. F.; and Bigham, J. P. 2013.
Chorus: A crowd-powered conversational assistant. In UIST 2013, UIST ’13, 151–162.
4 / 31
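The propose / vote / take-notes loop on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ChorusRound`, the `vote_threshold` of 2, and counting the proposer as a supporter of their own response are hypothetical choices, not details taken from the slides or the Chorus paper.

```python
from collections import defaultdict


class ChorusRound:
    """Toy model of one user turn handled by several crowd workers."""

    def __init__(self, vote_threshold=2):
        # Assumed value: how many distinct workers must support a response.
        self.vote_threshold = vote_threshold
        self.votes = defaultdict(set)  # response text -> ids of supporting workers
        self.notes = []                # shared notes as (worker_id, note) pairs

    def propose(self, worker_id, text):
        # Assumption: proposing a response counts as one vote for it.
        self.votes[text].add(worker_id)

    def vote(self, worker_id, text):
        # Votes for responses nobody proposed are ignored.
        if text in self.votes:
            self.votes[text].add(worker_id)

    def take_note(self, worker_id, note):
        self.notes.append((worker_id, note))

    def accepted(self):
        # Responses whose distinct supporters reach the threshold, in proposal order.
        return [text for text, voters in self.votes.items()
                if len(voters) >= self.vote_threshold]
```

Storing voters as a set means that a worker who clicks the same response twice is still counted once.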
Chorus’ Worker Interface
[Figure: screenshot of the worker interface, with its Propose, Vote, and Note areas labeled]
5 / 31
Research Questions
• How hard is it to deploy such a system?
– Real-time crowdsourcing +
– Conversational interface +
– Intelligent agent
• How will users use it?
• Will workers be able to handle all the tasks?
6 / 31
We deployed Chorus
• Launched on May 20th, 2016.
• 113 users used it during 937 conversational sessions
7 / 31
How to recruit workers
fast on-demand?
• Two Common Practices
– Start recruiting on-demand (Bigham, et al., 2010)
• Pros: Workers are engaged when waiting
• Cons: Expensive to have workers wait longer
– Keep workers on-call (Retainer) (Bernstein, et al., 2011)
• Pros: Quick response
• Cons: A retainer runs on money / “Cold start”
• Both are designed for short tasks
8 / 31
Retainer Model
[Timeline figure: workers wait in the retainer until the conversation starts, and wait in the retainer again after it ends.]
Workers’ waiting time costs money.
9 / 31
Chorus’ Recruiting Method
[Timeline figure spanning Conversation 1 and Conversation 2, with the labels: Post HIT, Fully Occupied, Conv. 1 Ends, Post HIT, Wait in Retainer. Horizontal axis: Time.]
10 / 31
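One plausible reading of the recruiting timeline on this slide, sketched in Python: a HIT is posted whenever a conversation starts, and workers whose conversation has ended wait in a retainer so that the next conversation can pick them up. Everything here is hypothetical (the class `ChorusRecruiter`, its method names, and the rule that all waiting workers join the next conversation); it does not call any real crowdsourcing API and is not the authors' implementation.

```python
class ChorusRecruiter:
    """Toy bookkeeping for on-demand recruiting plus a post-conversation retainer."""

    def __init__(self):
        self.retainer = []     # workers waiting after their conversation ended
        self.active = {}       # conversation id -> list of worker ids
        self.hits_posted = 0   # stand-in counter for HITs posted to the platform

    def start_conversation(self, conv_id):
        # Assumption: a HIT is posted at the start of every conversation.
        self.hits_posted += 1
        # Assumption: workers already waiting in the retainer join immediately.
        self.active[conv_id] = self.retainer
        self.retainer = []
        return list(self.active[conv_id])

    def worker_accepts_hit(self, conv_id, worker_id):
        # A newly recruited worker arrives through the posted HIT.
        self.active[conv_id].append(worker_id)

    def end_conversation(self, conv_id):
        # Workers are not dismissed; they wait in the retainer for the next conversation.
        self.retainer.extend(self.active.pop(conv_id))
```

In this sketch the first conversation has to wait for workers to accept the HIT, while a conversation that starts soon after another one ends can begin with workers who are already waiting.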
Is this recruiting method
fast enough?
• Average time to first crowd response = 88.351 sec
– 21.55% of first crowd responses arrived within 30 sec
– 56.08% within 1 min
– 81.77% within 2 min
– 90.06% within 3 min
11 / 31
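Figures like the ones on this slide could be computed from a log of first-response latencies roughly as follows. This is a minimal sketch: the function names and the sample latencies are made up for illustration, and treating "within 30 sec" as "less than or equal to 30 seconds" is an assumption.

```python
from statistics import mean


def share_within(latencies_sec, thresholds_sec=(30, 60, 120, 180)):
    """Percentage of first responses that arrived within each threshold (inclusive)."""
    n = len(latencies_sec)
    if n == 0:
        raise ValueError("need at least one latency")
    return {t: 100.0 * sum(1 for x in latencies_sec if x <= t) / n
            for t in thresholds_sec}


def mean_latency(latencies_sec):
    """Average time to first crowd response, in seconds."""
    return mean(latencies_sec)


# Hypothetical latencies in seconds, not data from the deployment.
sample = [12, 45, 50, 95, 200]
shares = share_within(sample)
average = mean_latency(sample)
```

The percentages are cumulative, so each value is at least as large as the one for the previous threshold.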
Challenges
1. Malicious Workers & Users
2. Identifying the End of a Conversation
3. When Consensus Is Not Enough
12 / 31
Inappropriate Workers
U: [Asks how to back up a MySQL database]
Worker: Try that [The YouTube link to “Bryan Cranston’s Super Sweet 60” from “Jimmy Kimmel Live”]
U: come on...... This is a YouTube link... Not how to backup my MySQL database
Worker: but it’s funny
Worker: what up b****h
14 / 31
You mean username?
we need to verify your
name
U
Flirter Workers
whats your name user?
what ?
Or my name?
real name
both
…
15 / 31
Spammer Workers
• We know they exist
• 3 Main Actions
– Message
• “how are you”, “yeah”, “yes (or no)”, “Sure you can”,
or “It suits you best.”
– Note
• “user is dumb”
• “like all the answers.”
– Vote
• Upvote on almost all messages
16 / 31
How does Chorus detect
abusive language?
• Word Matching
17 / 31
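Word matching can be as simple as checking message tokens against a blocklist. The sketch below is one minimal way to do that; the function name `flag_abusive`, the tokenizer, and the placeholder blocklist entries are all assumptions, since the slides do not say how the matching is implemented.

```python
import re

# Placeholder entries only; a real deployment would load a curated word list.
BLOCKLIST = {"badword", "slur"}


def flag_abusive(message, blocklist=BLOCKLIST):
    """Return the blocklisted words that appear as whole tokens in the message."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return sorted(set(tokens) & set(blocklist))
```

Matching whole tokens avoids flagging innocent words that merely contain a blocked word, but it also misses obfuscated spellings such as words with letters replaced by asterisks.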
Malicious Users
• Abusive Language
– Sexual content
– Profanity
– Hate speech
– Threats of criminal acts
• Solutions
– Word detection
18 / 31
Challenges
1. Malicious Workers & Users
2. Identifying the End of a Conversation
3. When Consensus Is Not Enough
19 / 31
Can I have pumpkin
congee? The cold ones
Maybe not now..
Why keep asking?
U
U
“Is there
anything else
I can help
you with?”
That should be fine
Is there anything else I
can help you with?
That would be great
actually. :)
Is there anything else?
…
U
[Ask what can he/she eat or
drink after a dental surgery]
20 / 31
Nope U
U
The Dynamics
of User Intent
Are you sure?
Any other question?
to confirm exit please
type EXIT
or if you want funny cat
jokes type CATS
CATS
…
21 / 31
I see... so I will need to
check the traffic at
different times of the day
Did you try Google
traffic alerts?
Are you there?
U
User
Timeout
Please wait for a
few minutes...
U
Is there an easy way to
check traffic status between
Miami and Key West?
[User didn’t respond for 2 minutes]
22 / 31
How does Chorus know
a conversation is over?
• User & Crowd Timeout
• Crowd Voting
– Once 2 workers click “Conversation is over”
23 / 31
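The two end-of-conversation signals on this slide, timeouts and crowd votes, can be combined as in the sketch below. The two-worker vote rule comes from the slide; the class name `EndDetector`, the 120-second timeout, and the rule that a timeout on either side ends the conversation are assumptions made for illustration.

```python
class EndDetector:
    """Decides whether a conversation is over from timeouts and worker votes."""

    def __init__(self, timeout_sec=120.0, votes_needed=2):
        self.timeout_sec = timeout_sec    # assumed value, not stated on the slide
        self.votes_needed = votes_needed  # "once 2 workers click" per the slide
        self.last_user = None             # timestamp of the last user message
        self.last_crowd = None            # timestamp of the last crowd message
        self.over_votes = set()           # workers who clicked "Conversation is over"

    def on_user_message(self, now):
        self.last_user = now

    def on_crowd_message(self, now):
        self.last_crowd = now

    def on_over_click(self, worker_id):
        # A set ensures that repeated clicks by one worker count once.
        self.over_votes.add(worker_id)

    def is_over(self, now):
        if len(self.over_votes) >= self.votes_needed:
            return True
        # Assumption: silence from either the user or the crowd ends the conversation.
        for last in (self.last_user, self.last_crowd):
            if last is not None and now - last > self.timeout_sec:
                return True
        return False
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the logic easy to test.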
Challenges
1. Malicious Workers & Users
2. Identifying the End of a Conversation
3. When Consensus Is Not Enough
24 / 31
I was wondering about
your name. Why is it
Chorus Bot?
How long has it been
for you here?
Is there anything I can help
you with?
About 3 minutes
U
U
Collective
Identity &
Personality
I am not sure.
I’m new to this.
25 / 31
This worker’s opinion is
that God does not exist.
I believe in a God, but
not necessarily all of
the things in the Bible
Subjective
Questions
Evolution can’t be
disproven, but neither can
creationism in a sense.
Is that all?
U
Do you believe Bible
is God’s word?
[ Few messages later ]
26 / 31
Chorus Bot can’t
reserve tables :( ?
U
Requests
For
Action
I can reserve a table
for you if you prefer
[Suggested that the user call
a restaurant’s number to
make a reservation.]
what time and how
many people?
27 / 31
To conclude…
1. Malicious Workers & Users
– Content
2. Identifying the End of a Conversation
– Boundary
3. When Consensus Is Not Enough
– Scope
28 / 31
What’s next?
• Why can Chorus talk?
– Decompose human workers’ tasks into sub-tasks
– Which sub-tasks can be automated?
• What can we learn from the data?
– User’s questions
– Crowd Response
– Votes
29 / 31
What birthday gift
should I get for Laila?
TalkingToTheCrowd.org
31 / 31
References
• Zhao, T., Lee, K., & Eskenazi, M. (2016). DialPort: Connecting the Spoken
Dialog Research Community to Real User Data. arXiv preprint
arXiv:1606.02562.
• Banchs, R. E., & Li, H. (2012, July). IRIS: a chat-oriented dialogue system
based on the vector space model. In Proceedings of the ACL 2012 System
Demonstrations (pp. 37-42). Association for Computational Linguistics.
• Walker, M. A., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and
domain adaptation in sentence planning for dialogue. Journal of Artificial
Intelligence Research, 30, 413-456.
• Sun, M. (2016). Adapting Spoken Dialog Systems Towards Domains and Users
(Doctoral dissertation, YAHOO! Research).
• Bigham, J. P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R. C., ... & Yeh, T.
(2010, October). VizWiz: nearly real-time answers to visual questions. In
Proceedings of the 23rd annual ACM symposium on User interface software
and technology (pp. 333-342). ACM.
• Bernstein, M. S., Brandt, J., Miller, R. C., & Karger, D. R. (2011, October).
Crowds in two seconds: Enabling realtime crowd-powered interfaces. In
Proceedings of the 24th annual ACM symposium on User interface software and
technology (pp. 33-42). ACM.

More Related Content

Similar to "Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent

Meeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your CommunityMeeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your Community
Everyday Democracy
 
Big Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrandBig Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrand
IntoTheMinds
 
Adv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - IntroductionAdv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - Introduction
Tanzila Kehkashan
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
krisztianbalog
 
UX For Digital Humanists: A Primer
UX For Digital Humanists: A PrimerUX For Digital Humanists: A Primer
UX For Digital Humanists: A Primer
craigmmacdonald
 
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
Peter Windle
 
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Joe Marquez
 
Social Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language TeachingSocial Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language Teaching
Shannon Sauro
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
Denis Parra Santander
 
A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017
davinia.hl
 
SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2
compapp 4 orang
 
Corpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussionsCorpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussions
Shi Min CHUA
 
Team Chat: A Technology for Learning
Team Chat: A Technology for LearningTeam Chat: A Technology for Learning
Team Chat: A Technology for Learning
BCcampus
 
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Jennifer Romano Bergstrom
 
Presentation on collaboration
Presentation on collaborationPresentation on collaboration
Presentation on collaboration
groupVision | optimizing group collaboration
 
Classroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_ENClassroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_EN
A Scuola di OpenCoesione
 
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data ExpeditionASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
A Scuola di OpenCoesione
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
Everyday Democracy
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
Everyday Democracy
 
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Fabrizio Fornara
 

Similar to "Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent (20)

Meeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your CommunityMeeting Agenda to Identify the Issue in Your Community
Meeting Agenda to Identify the Issue in Your Community
 
Big Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrandBig Data and ethics meetup : slides presentation michael ekstrand
Big Data and ethics meetup : slides presentation michael ekstrand
 
Adv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - IntroductionAdv. HCI Lecture1 - Introduction
Adv. HCI Lecture1 - Introduction
 
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?What Does Conversational Information Access Exactly Mean and How to Evaluate It?
What Does Conversational Information Access Exactly Mean and How to Evaluate It?
 
UX For Digital Humanists: A Primer
UX For Digital Humanists: A PrimerUX For Digital Humanists: A Primer
UX For Digital Humanists: A Primer
 
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
EU-CONEXUS: Technology, Interaction and Community for online teaching and lea...
 
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...Understanding Understanding: Implementing Design-Focused Service Initiatives ...
Understanding Understanding: Implementing Design-Focused Service Initiatives ...
 
Social Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language TeachingSocial Media Needs Analysis for Language Teaching
Social Media Needs Analysis for Language Teaching
 
Twitter in Academic Conferences
Twitter in Academic ConferencesTwitter in Academic Conferences
Twitter in Academic Conferences
 
A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017A Social Learning Space Grid for MOOCs, EMOOCs2017
A Social Learning Space Grid for MOOCs, EMOOCs2017
 
SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2SLIDE ASSIGNMENT 2
SLIDE ASSIGNMENT 2
 
Corpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussionsCorpus Analysis of MOOC discussions
Corpus Analysis of MOOC discussions
 
Team Chat: A Technology for Learning
Team Chat: A Technology for LearningTeam Chat: A Technology for Learning
Team Chat: A Technology for Learning
 
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
Usable Government Forms and Surveys: Best Practices for Design (from MoDevGov)
 
Presentation on collaboration
Presentation on collaborationPresentation on collaboration
Presentation on collaboration
 
Classroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_ENClassroom Exercise 1: Data Expedition - ASOC2324_EN
Classroom Exercise 1: Data Expedition - ASOC2324_EN
 
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data ExpeditionASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
ASOCEU - Lesson 1- In-Class Exercise 1: Data Expedition
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
 
Initial Goal Setting Activity
Initial Goal Setting ActivityInitial Goal Setting Activity
Initial Goal Setting Activity
 
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...Exploring the Factors that Promote L2 Learner Participation and Interaction o...
Exploring the Factors that Promote L2 Learner Participation and Interaction o...
 

More from Ting-Hao Huang

A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
Ting-Hao Huang
 
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Ting-Hao Huang
 
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
Ting-Hao Huang
 
Real-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity ExtractionReal-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity Extraction
Ting-Hao Huang
 
Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)
Ting-Hao Huang
 
Social Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical AnalysisSocial Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical Analysis
Ting-Hao Huang
 
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIsGuardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Ting-Hao Huang
 

More from Ting-Hao Huang (7)

A Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over TimeA Crowd-Powered Conversational Assistant That Automates Itself Over Time
A Crowd-Powered Conversational Assistant That Automates Itself Over Time
 
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
Evorus: A Crowd-Powered Conversational Assistant Built to Automate Itself Ove...
 
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
A 10-Month-Long Deployment Study of On-Demand Recruiting for Low-Latency Crow...
 
Real-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity ExtractionReal-time On-Demand Crowd-powered Entity Extraction
Real-time On-Demand Crowd-powered Entity Extraction
 
Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)Visual Storytelling (NAACL 2016, Poster)
Visual Storytelling (NAACL 2016, Poster)
 
Social Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical AnalysisSocial Metaphor Detection via Topical Analysis
Social Metaphor Detection via Topical Analysis
 
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIsGuardian: A Crowd-Powered Spoken Dialog System for Web APIs
Guardian: A Crowd-Powered Spoken Dialog System for Web APIs
 

Recently uploaded

Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
CEPTES Software Inc
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Networks
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
Ivanti
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
RaminGhanbari2
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Kunal Gupta
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
aakash malhotra
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
shyamraj55
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
Priyanka Aash
 
Pigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending PlantPigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending Plant
LINUS PROJECTS (INDIA)
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
Matthias Neugebauer
 
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Nicolás Lopéz
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
Brian Pichman
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
aslasdfmkhan4750
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Torry Harris
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdfBT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
Neo4j
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
SynapseIndia
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
Jimmy Lai
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
alexjohnson7307
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
BrainSell Technologies
 

Recently uploaded (20)

Salesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot WorkshopSalesforce AI & Einstein Copilot Workshop
Salesforce AI & Einstein Copilot Workshop
 
IPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite SolutionIPLOOK Remote-Sensing Satellite Solution
IPLOOK Remote-Sensing Satellite Solution
 
July Patch Tuesday
July Patch TuesdayJuly Patch Tuesday
July Patch Tuesday
 
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyyActive Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
Active Inference is a veryyyyyyyyyyyyyyyyyyyyyyyy
 
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptxDublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
Dublin_mulesoft_meetup_Mulesoft_Salesforce_Integration (1).pptx
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
 
Integrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecaseIntegrating Kafka with MuleSoft 4 and usecase
Integrating Kafka with MuleSoft 4 and usecase
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
 
Pigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending PlantPigging Unit Lubricant Oil Blending Plant
Pigging Unit Lubricant Oil Blending Plant
 
Opencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of MünsterOpencast Summit 2024 — Opencast @ University of Münster
Opencast Summit 2024 — Opencast @ University of Münster
 
Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024Vertex AI Agent Builder - GDG Alicante - Julio 2024
Vertex AI Agent Builder - GDG Alicante - Julio 2024
 
Uncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in LibrariesUncharted Together- Navigating AI's New Frontiers in Libraries
Uncharted Together- Navigating AI's New Frontiers in Libraries
 
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
High Profile Girls Call ServiCe Hyderabad 0000000000 Tanisha Best High Class ...
 
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...Evolution of iPaaS - simplify IT workloads to provide a unified view of  data...
Evolution of iPaaS - simplify IT workloads to provide a unified view of data...
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdfBT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
BT & Neo4j: Knowledge Graphs for Critical Enterprise Systems.pptx.pdf
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python CodebaseEuroPython 2024 - Streamlining Testing in a Large Python Codebase
EuroPython 2024 - Streamlining Testing in a Large Python Codebase
 
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
leewayhertz.com-AI agents for healthcare Applications benefits and implementa...
 
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdfAcumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
Acumatica vs. Sage Intacct vs. NetSuite _ NOW CFO.pdf
 

"Is there anything else I can help you with?": Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent

  • 1. s“ Is there anything else I can help you with? ” Challenges in Deploying an On-Demand Crowd-Powered Conversational Agent Ting-Hao K. Huang Walter S. Lasecki Amos Azaria Jeffrey P. Bigham 1 / 31
  • 2. Challenges of Open Conversation • Goal: A system that users can converse with • General Purpose Dialog System – Combining multiple dialog systems • DialPort (Zhao, et al., 2016) – Adapting a model to many other domains • Walker, et al., 2007; Sun, et al., 2016 – Chit-chat system • Hold social conversations (Banchs, et al., 2012) • It is still a very hard problem… – Alexa Prize: $2.5 Million • “… achieves the grand challenge of conversing coherently and engagingly with humans on popular topics for 20 minutes.” 2 / 31
  • 3. What birthday gift should I get for Laila? Sorry I can not understand your question. Kenneth’s apartment. 3 / 31
  • 4. • Crowd workers collectively hold a conversation by: 1. Propose Responses 2. Vote Responses 3. Take Notes Chorus: A Crowd-powered Conversation Assistant Lasecki, W. S.; Wesley, R.; Nichols, J.; Kulkarni, A.; Allen, J. F.; and Bigham, J. P. 2013. Chorus: A crowd-powered conversational assistant. In UIST 2013, UIST ’13, 151–162. 4 / 31
  • 6. Research Questions • How hard it is to deploy such a system? – Real-time crowdsourcing + – Conversational interface + – Intelligent agent • How will users use it? • Will workers be capable to handle all the tasks? 6 / 31
  • 7. We deployed Chorus • Launched on May 20th, 2016. • 113 users used it during 937 conversational sessions 7 / 31
  • 8. How to recruit workers fast on-demand? • Two Common Practices – Start recruiting on-demand (Bigham, et al., 2010) • Pros: Workers are engaged when waiting • Cons: Expensive to have workers wait longer – Keep workers on-call (Retainer) (Bernstein, et al., 2011) • Pros: Quick response • Cons: A retainer runs on money / “Cold start” • Both are designed for short tasks 8 / 31
  • 9. Retainer Model Conversation Conv. Ends Wait in Retainer Time Conv. Starts Wait in Retainer Workers’ waiting time cost money. 9 / 31
  • 10. Chorus’ Recruiting Method Conversation 1 Conversation 2 Post HIT Fully Occupied Conv. 1 Ends Post HIT Wait in Retainer Time 10 / 31
  • 11. Is this recruiting method fast enough? • Avg first crowd response Time = 88.351 sec 21.55% first crowd respond within 30 sec 56.08% fist crowd respond within 1 min 81.77% crowd respond within 2 min 90.06% crowd respond within 3 min 11 / 31
  • 12. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough 12 / 31
  • 13. Challenges 1. Malicious Workers & Users 2. Identifying the End of a Conversation 3. When Consensus Is Not Enough 13 / 31
  • 14. Inappropriate Workers
      [Chat excerpt] User: [Asks how to back up a MySQL database] / “Try that [The YouTube link of “Bryan Cranston’s Super Sweet 60” of “Jimmy Kimmel Live”]” / “come on...... This is a YouTube link... Not how to backup my MySQL database but it’s funny” / “what up b****h”
  • 15. Flirter Workers
      [Chat excerpt] You mean username? we need to verify your name whats your name user? what ? Or my name? real name both …
  • 16. Spammer Workers
      • We know they exist
      • 3 main actions
          – Message
              • “how are you”, “yeah”, “yes (or no)”, “Sure you can”, or “It suits you best.”
          – Note
              • “user is dumb”
              • “like all the answers.”
          – Vote
              • Upvote on almost all messages
  • 17. How does Chorus detect abusive language?
      • Word matching
  • 18. Malicious Users
      • Abusive language
          – Sexual content
          – Profanity
          – Hate speech
          – Threats of criminal acts
      • Solutions
          – Word detection
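The slides name word matching as the detection method but give no detail. A minimal sketch of whole-word matching against a blocklist might look like the following. The blocklist entries are placeholders, and `flag_abusive` is a hypothetical helper, not the deployed Chorus code.

```python
import re

# Placeholder entries; a real deployment would load a curated word list.
BLOCKLIST = {"badword", "slur"}

# Lowercase letters, digits, and apostrophes form a token.
_TOKEN = re.compile(r"[a-z0-9']+")


def flag_abusive(message, blocklist=BLOCKLIST):
    """Return the blocklisted words that appear in the message, sorted."""
    tokens = set(_TOKEN.findall(message.lower()))
    return sorted(tokens & set(blocklist))


print(flag_abusive("You are a BADWORD!"))
print(flag_abusive("hello there"))
```

Matching on whole tokens avoids flagging harmless words that merely contain a listed word, but it also misses obfuscated spellings such as the starred-out insult on the "Inappropriate Workers" slide, so word matching alone is a coarse filter.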
  • 19. Challenges
      1. Malicious Workers & Users
      2. Identifying the End of a Conversation
      3. When Consensus Is Not Enough
  • 20. “Is there anything else I can help you with?”
      [Chat excerpt] User: [Asks what he/she can eat or drink after a dental surgery] Can I have pumpkin congee? The cold ones Maybe not now.. Why keep asking? That should be fine Is there anything else I can help you with? That would be great actually. :) Is there anything else? …
  • 21. The Dynamics of User Intent
      [Chat excerpt] Nope Are you sure? Any other question? to confirm exit please type EXIT or if you want funny cat jokes type CATS CATS …
  • 22. User Timeout
      [Chat excerpt] User: Is there an easy way to check traffic status between Miami and Key West? I see... so I will need to check the traffic at different times of the day Did you try Google traffic alerts? Are you there? Please wait for a few minutes... [User didn’t respond for 2 minutes]
  • 23. How does Chorus know a conversation is over?
      • User & crowd timeout
      • Crowd voting
          – Once 2 workers click “Conversation is over”
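The two end-of-conversation signals on this slide can be combined in a small state tracker. The sketch below is an illustration under stated assumptions, not the Chorus implementation. The two-vote threshold comes from the slide. The 120-second timeouts are assumed, since the slides show a 2-minute user silence as an example but never state the threshold. The class `EndDetector`, its method names, and the rule that a timeout is attributed to whichever side did not speak last are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EndDetector:
    user_timeout_sec: float = 120.0   # assumed value
    crowd_timeout_sec: float = 120.0  # assumed value
    votes_needed: int = 2             # from the slide: 2 workers click the button
    last_speaker: str = "user"
    last_message_at: float = 0.0
    end_votes: set = field(default_factory=set)

    def on_message(self, speaker, now):
        """Record who spoke ("user" or "crowd") and when, in seconds."""
        self.last_speaker = speaker
        self.last_message_at = now

    def on_end_vote(self, worker_id):
        """A worker clicked "Conversation is over"; repeat clicks count once."""
        self.end_votes.add(worker_id)

    def end_reason(self, now):
        """Return why the conversation is over, or None if it is still open."""
        if len(self.end_votes) >= self.votes_needed:
            return "crowd_vote"
        silence = now - self.last_message_at
        if self.last_speaker == "crowd" and silence >= self.user_timeout_sec:
            return "user_timeout"
        if self.last_speaker == "user" and silence >= self.crowd_timeout_sec:
            return "crowd_timeout"
        return None


detector = EndDetector()
detector.on_message("user", now=0.0)
detector.on_message("crowd", now=20.0)
print(detector.end_reason(now=60.0))   # still open
print(detector.end_reason(now=140.0))  # user has been silent for 120 sec
```

Storing votes in a set keyed by worker means a single worker clicking repeatedly cannot end the conversation alone.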
  • 24. Challenges
      1. Malicious Workers & Users
      2. Identifying the End of a Conversation
      3. When Consensus Is Not Enough
  • 25. Collective Identity & Personality
      [Chat excerpt] I was wondering about your name. Why is it Chorus Bot? How long has it been for you here? Is there anything I can help you with? About 3 minutes I am not sure. I’m new to this.
  • 26. Subjective Questions
      [Chat excerpt] User: Do you believe Bible is God’s word? [Few messages later] This worker’s opinion is that God does not exist. I believe in a God, but not necessarily all of the things in the Bible Evolution can’t be disproven, but neither can creationism in a sense. Is that all?
  • 27. Requests For Action
      [Chat excerpt] Chorus Bot can’t reserve tables :( ? I can reserve a table for you if you prefer [Suggested the user to call a restaurant’s number to make a reservation.] what time and how many people?
  • 28. To conclude…
      1. Malicious Workers & Users: Content
      2. Identifying the End of a Conversation: Boundary
      3. When Consensus Is Not Enough: Scope
  • 29. What’s next?
      • Why can Chorus talk?
          – Decompose human workers’ tasks into sub-tasks
          – Which sub-tasks can be automated?
      • What can we learn from the data?
          – Users’ questions
          – Crowd responses
          – Votes
  • 30. What birthday gift should I get for Laila?
  • 31. What birthday gift should I get for Laila? TalkingToTheCrowd.org
  • 32. References
      • Zhao, T., Lee, K., & Eskenazi, M. (2016). DialPort: Connecting the Spoken Dialog Research Community to Real User Data. arXiv preprint arXiv:1606.02562.
      • Banchs, R. E., & Li, H. (2012, July). IRIS: a chat-oriented dialogue system based on the vector space model. In Proceedings of the ACL 2012 System Demonstrations (pp. 37-42). Association for Computational Linguistics.
      • Walker, M. A., Stent, A., Mairesse, F., & Prasad, R. (2007). Individual and domain adaptation in sentence planning for dialogue. Journal of Artificial Intelligence Research, 30, 413-456.
      • Sun, M. (2016). Adapting Spoken Dialog Systems Towards Domains and Users (Doctoral dissertation, YAHOO! Research).
      • Bigham, J. P., Jayant, C., Ji, H., Little, G., Miller, A., Miller, R. C., ... & Yeh, T. (2010, October). VizWiz: nearly real-time answers to visual questions. In Proceedings of the 23rd annual ACM symposium on User interface software and technology (pp. 333-342). ACM.
      • Bernstein, M. S., Brandt, J., Miller, R. C., & Karger, D. R. (2011, October). Crowds in two seconds: Enabling realtime crowd-powered interfaces. In Proceedings of the 24th annual ACM symposium on User interface software and technology (pp. 33-42). ACM.