Teaching Computers to Chat
July 3rd, 2019
Ramot Menashe SW Techies Meetup
Avi Yaeli, aviy@il.ibm.com
Conversational Analytics
IBM Research
I propose to consider the question,
"Can machines think?"
Alan Turing, 1950
The Evolution of Chatbots
2019 – IBM
2018 Google Duplex
Source: wizu
AI passed the Turing Test
WE LIVE IN THE ERA
OF AI
Machines are becoming more and more capable of performing "intelligent", human-like tasks at very high computational scale
- Natural language processing, learning, planning, knowledge representation, computer vision, speech recognition, reasoning, robotics
Machine learning
Algorithms are trained to learn from data and solve "intelligent" tasks in specific contexts
AI is typically more successful in tasks that
- Have sufficient training data and features
- Are well understood (theoretical background, well scoped, clear goal)
- Can draw on historical data to inform future actions
We live in an era of instant …
… experience is everything.
81% of your customers will leave you after one poor experience.
Source: The Northridge Group
The QWERTY
Keyboard (1874)
For many years humans had to learn to talk to machines
Language is one of our most natural ways of communicating
CONVERSATION IN
AI-DRIVEN
CUSTOMER CARE
- To survive, companies
understand that they must
create an engaging customer
experience
- AI has the potential to scale
to every customer, on every
issue, anytime, on any
channel in every language.
Goodbye phone and IVRs;
Hello conversational interfaces !!
$55M calls/month
30+ languages
Where can AI help to generate smarter user experiences?
1. Let your customers self-serve faster
2. Help your employees self-serve
3. Assist your agents as they work
4. Help your experts find answers
(Diagram: channel → resolution via dialog, search, or agent; assistant skills powered by IBM Watson)
Architectural Patterns for Virtual Assistants
(Diagram: customers, employees, and experts interact via text, voice, touch, and cards through client devices and channels; a server hosts the assistant and live agent assist and integrates with systems of record, service desk, and CRM; knowledge modeling and discovery feed the knowledge services)
HOW MANY WAYS ARE THERE TO SAY "MY BICYCLE IS STOLEN"?
"My bicycle is stolen"
“My bike is stolen”
“Somebody took my tandem”
“My bike got nicked”
“My wheels are gone”
“F**#!, they took me iron horse”
“During the act of drinking a nice
beer a rascal took advantage of
me enjoying that particular beer
and relieved me of my dear
bicycle, god bless his soul”
"they took me iron horse"
Natural Language Understanding
Language: English (British)
Part-of-speech: "they" pronoun (plural), "took" verb (past), "me" possessive ("British" usage), "iron horse" noun (singular)
Resources: corpus, thesaurus, synonyms
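As a minimal sketch of the part-of-speech layer, here is what tagging this utterance could look like with spaCy; the model name is an assumption, and language/dialect detection is out of scope for this snippet:

# Minimal part-of-speech tagging sketch with spaCy.
# Assumes the en_core_web_sm model is installed: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("they took me iron horse")

for token in doc:
    # token.pos_ is the coarse tag (PRON, VERB, ...), token.tag_ the fine-grained one
    print(f"{token.text:12} {token.pos_:6} {token.tag_}")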
"they took me iron horse"
Natural Language Understanding
Utterance: "…"
Intent: FileClaim
(Analysis layers: language, part-of-speech, utterance, intent, entity)
"they took me iron horse"
Natural Language Understanding
Entity type: bicycle, brand: iron horse
Disambiguation using context and knowledge: insurance context, bicycle context, user context
(Analysis layers: language, part-of-speech, utterance, intent, entity, context, disambiguation, knowledge)
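Putting the layers together, the combined NLU result for this utterance could be represented roughly as below; every field name and value here is illustrative, not the schema of any particular product:

# Illustrative shape of a combined NLU result; all field names are assumptions.
nlu_result = {
    "utterance": "they took me iron horse",
    "language": "en-GB",
    "intent": {"name": "FileClaim", "confidence": 0.87},
    "entities": [
        {"type": "bicycle", "brand": "iron horse", "text": "iron horse"},
    ],
    "context": ["insurance", "bicycle", "user"],
}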
"they took me iron horse"  /  "F**K …"
Natural Language Understanding
Sentiment: negative
Tone: anger
(Analysis layers: language, part-of-speech, utterance, intent, entity, context, disambiguation, knowledge, sentiment and tone)
"they took me iron horse"
Natural Language Generation
Responses:
- "I'm so sorry to hear…" (empathy)
- "I found your policy, would you like to file a claim…"
- "May I offer you a coupon for a..." (action)
Conversation design: dialog nodes, conversation flow, navigation, dialog steps, actions, responses
(Builds on the NLU layers: language, part-of-speech, utterance, intent, entity, context, disambiguation, knowledge, sentiment and tone)
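A minimal sketch of template-based response generation that conditions on the detected intent and tone; the template text, intent names, and tone lexicon are illustrative assumptions, not how any particular NLG engine works:

# Minimal template-based NLG sketch: empathy prefix chosen from tone,
# response template chosen from intent. Purely illustrative.
TEMPLATES = {
    "FileClaim": "I found your policy, would you like to file a claim for your {entity}?",
}
EMPATHY = {
    "anger": "I'm so sorry to hear that. ",
    "neutral": "",
}

def generate_response(intent, tone, entity):
    prefix = EMPATHY.get(tone, "")
    body = TEMPLATES.get(intent, "Could you tell me a bit more?")
    return prefix + body.format(entity=entity)

print(generate_response("FileClaim", "anger", "bicycle"))
# -> "I'm so sorry to hear that. I found your policy, would you like to file a claim for your bicycle?"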
INTENTS
Represent the goal or
purpose of the user's input.
Provide sample utterances for
each intent
Train ML models to correctly
detect intent
ENTITIES
A portion of the user's input that you can use to provide a different response to a particular intent
User-defined entities, system-defined entities, synonyms, contextual entities
Import ontologies
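To make intents and entities concrete, here is a sketch of what their definitions might look like for the stolen-bicycle example; the intent names, sample utterances, and synonyms are illustrative assumptions:

# Illustrative intent and entity definitions for a "stolen bicycle" skill.
INTENTS = {
    "FileClaim": [
        "my bicycle is stolen",
        "my bike got nicked",
        "somebody took my tandem",
        "my wheels are gone",
    ],
    "CheckPolicy": [
        "what does my policy cover",
        "am I insured for theft",
    ],
}

ENTITIES = {
    "bicycle": {"synonyms": ["bike", "tandem", "wheels", "iron horse"]},
}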
DIALOGS
A dialog uses intents, entities,
and context from your
application to define a
response to each user's input.
Creating a dialog defines
how your virtual assistant will
respond to what your users
are saying.
Patterns (slot filling)
Branching (backtracking, digression)
Fall-backs
Turn taking (pause, wait)
Every turn executes some nodes in the dialog tree (see the traversal sketch after this list)
- Start where you stopped in the last turn
- Evaluate children first, then siblings
- If none are evaluated, execute fall-back logic
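A minimal sketch of that per-turn traversal; the Node structure and condition callables are assumptions, and real dialog engines add slot filling, digression, and jump-to handling on top of this:

# Minimal per-turn dialog-tree traversal sketch (illustrative, not a real engine).
class Node:
    def __init__(self, name, condition, response, children=None):
        self.name = name
        self.condition = condition      # callable(nlu_result) -> bool
        self.response = response
        self.children = children or []

def evaluate_turn(last_node, root, nlu, fallback="Sorry, I didn't get that."):
    # Start where we stopped in the last turn: evaluate that node's children first,
    # then the top-level nodes (approximating "then siblings" for this sketch).
    candidates = (last_node.children if last_node else []) + root.children
    for node in candidates:
        if node.condition(nlu):
            return node, node.response
    # No node matched: execute fall-back logic.
    return last_node, fallback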
AT RUNTIME…
Send a user utterance to
a skill and evaluate the
response
The response to a /message API call to Watson includes:
- Intents & entities (+ confidence)
- Dialog nodes
- Context
- Responses
The slot-filling "pattern" collects required/optional inputs and validates them
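As a rough sketch, sending an utterance to a /message-style endpoint and reading the pieces above could look like this; the URL, authentication, and field names are assumptions that vary by service and version, so consult the actual API reference rather than relying on them:

# Rough sketch of a /message-style call; endpoint, auth, and field names are assumptions.
import requests

resp = requests.post(
    "https://example.com/assistant/v1/workspaces/WORKSPACE_ID/message",  # hypothetical endpoint
    json={"input": {"text": "they took me iron horse"}, "context": {}},
    headers={"Authorization": "Bearer API_TOKEN"},                        # hypothetical auth
)
body = resp.json()

top_intent = body["intents"][0]        # e.g. {"intent": "FileClaim", "confidence": 0.87}
entities   = body["entities"]          # detected entities with confidences
context    = body["context"]           # carried over to the next turn
replies    = body["output"]["text"]    # responses selected by the dialog nodes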
USING MACHINE LEARNING FOR INTENT CLASSIFICATION
Problem formulation: given a user utterance, predict one or more intents (labels)
Utterance <text> → Intent(s) <categorical>
USING MACHINE LEARNING FOR INTENT CLASSIFICATION
Typical pipeline (supervised):
Manual !!
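The pipeline diagram is not reproduced here, but the "Manual !!" annotation presumably points at hand-engineered features feeding a standard classifier. A minimal sketch of such a pipeline with scikit-learn, on a tiny illustrative dataset:

# Classical supervised intent classification: TF-IDF features + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts  = ["my bicycle is stolen", "my bike got nicked", "somebody took my tandem",
          "what does my policy cover", "am I insured for theft"]
labels = ["FileClaim", "FileClaim", "FileClaim", "CheckPolicy", "CheckPolicy"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)

print(clf.predict(["they took me iron horse"]))        # most likely intent
print(clf.predict_proba(["they took me iron horse"]))  # confidence per intent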
USING MACHINE LEARNING FOR INTENT CLASSIFICATION
Typical Neural Network (Deep Learning) pipeline:
- Embeddings: ML-based (e.g., GloVe, Word2Vec) or knowledge-based (ontologies); general purpose + domain specific
- Model: Utterance <text> → Intent(s) <categorical>
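A minimal neural sketch under assumptions: the vocabulary size, sequence length, and toy dataset are illustrative, and in practice the embedding layer would usually be initialized from pretrained vectors such as GloVe or Word2Vec and trained on far more data:

# Minimal neural intent classifier sketch with tf.keras (illustrative sizes and data).
import numpy as np
import tensorflow as tf

texts  = np.array(["my bicycle is stolen", "my bike got nicked", "what does my policy cover"])
labels = np.array([0, 0, 1])  # 0 = FileClaim, 1 = CheckPolicy

vectorizer = tf.keras.layers.TextVectorization(max_tokens=1000, output_sequence_length=8)
vectorizer.adapt(texts)

model = tf.keras.Sequential([
    vectorizer,                                          # text -> token ids
    tf.keras.layers.Embedding(input_dim=1000, output_dim=32),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(2, activation="softmax"),      # one output per intent
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(texts, labels, epochs=30, verbose=0)

print(model.predict(np.array(["somebody took my tandem"])))  # intent probabilities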
USING MACHINE LEARNING FOR INTENT CLASSIFICATION
Measuring model performance
Machines DO NOT understand text like humans do !!
Common metrics used in ML:
- Accuracy, sensitivity, specificity
- Confusion matrix
- ROC
- …
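A small sketch of computing some of those metrics with scikit-learn, using hypothetical true and predicted labels:

# Evaluating an intent classifier with standard metrics (labels are hypothetical).
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

y_true = ["FileClaim", "FileClaim", "CheckPolicy", "CheckPolicy", "FileClaim"]
y_pred = ["FileClaim", "CheckPolicy", "CheckPolicy", "CheckPolicy", "FileClaim"]

print(accuracy_score(y_true, y_pred))
print(confusion_matrix(y_true, y_pred, labels=["FileClaim", "CheckPolicy"]))
print(classification_report(y_true, y_pred))  # per-intent precision and recall (sensitivity)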
Is it possible to learn from human-to-human conversational logs ???
Not that simple !!!
Post-Production Analytics
(Diagram: intents I1, I2, I3, each trained from sample utterances U1, U2, U3, feeding the dialogs)
Training intents on real-world utterances
‒ Not enough real world samples to train intents on
‒ Cold start problem
‒ As the bot goes to production, users get frustrated
that the bot doesn’t understand their intent
Building a rich conversational dialog
‒ Building a dialog script that captures the many
ways users will want to converse in an intent
scenario is difficult
‒ Enriching the dialogue with conversation
navigation (e.g. digression, backtrack, jump-to’s)
is still difficult to design and maintain.
‒ Adaptive interfaces
User input is open ended
‒ Unanticipated utterances
‒ High variance in dialog scenarios across user base
Integrating multi-skill bots
‒ Competing intents across skills (mishandled)
Testing high-variance scenarios
‒ Manual labor intensive to achieve coverage
‒ Coping with inherent variability in conversation design and personalization
• Measure
• Understand
• Evolve
Orchestration
PERSONAS
Conversation Analyst
Tom is leading a team of content managers/conversation
analysts in this project. He needs to communicate weekly
the status/recommendations of the chatbot to the main
stakeholders: Project managers and Customer Journey
manager.
Data Analyst
Emma is in charge of the analytics around the
performance of the virtual agent. She creates weekly and
monthly reports on particular performance measurements.
Customer Journey Manager
Based on the weekly reports from the CA and DA, and
the business requirements, Jane prioritizes the tasks
aiming to maximize the improvement for the next release.
Tom
Emma
Jane
BUSINESS UNDERSTANDING
How can I improve the containment level of flow X ?
We have modified and improved the flow Y - how can we measure the
difference in performance, now?
I wish I could prioritize the tasks based on the importance/impact of the flows – how?
Where and why are we losing engagement?
• We want a visualization that is self-explanatory - easy to understand (different layers of abstraction)
• It needs to show the impact/importance of each flow
• It will highlight problematic areas of the flows, helping the prioritization task
Key Capabilities:
• Advanced analysis microservices that can be consumed in data science and BI platforms
• Pipeline includes data ingestion, transformation, enrichment, filtering and conversation analysis
• Select the conversation paths with the most abandonment, and find frequent or new conversation topics
Using Visual Interfaces for Conversation Analytics
1. Select pathways of interest (e.g. drop-offs)
2. Explore conversation
characteristics, e.g. topics in
dropped-off conversations
3. Take action, e.g.
• Define new intents w/ samples
• Define new/update options
• Change the dialog
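A rough sketch of the kind of log analysis behind step 1, assuming a flat conversation-log table with one row per turn; the column names and the notion that the last node of a conversation marks its drop-off point are assumptions:

# Rough sketch: find dialog nodes where conversations most often end (possible drop-offs).
import pandas as pd

logs = pd.read_csv("conversation_logs.csv")  # hypothetical export with columns:
                                             # conversation_id, turn, dialog_node, intent

# Last turn of each conversation = where it ended.
last_turns = logs.sort_values("turn").groupby("conversation_id").tail(1)

# Nodes where the most conversations end - candidates for drop-off analysis.
drop_off_nodes = last_turns["dialog_node"].value_counts().head(10)
print(drop_off_nodes)

# Topics (intents) most common in conversations that ended at the worst node.
worst_node = drop_off_nodes.index[0]
ended_there = last_turns.loc[last_turns["dialog_node"] == worst_node, "conversation_id"]
print(logs[logs["conversation_id"].isin(ended_there)]["intent"].value_counts().head(10))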
Impressions from the industry (Chatbot Summit, June 24th, Tel Aviv)
• Most major brands have started the journey
• Starting simple (intents, slot filling, handoff) and improving iteratively
• Investing in conversation design, training conversation analysts
• Vendor landscape continues to evolve
• Bots can be built more quickly and easily
• Challenges:
• Building a good bot is still considered an art
• Pushback if realistic expectations are not set for users
• Don’t use call-center metrics
• Negative reactions to avatars; Positive to personality
What could happen when AI does not understand humans
Thank You !
BACKUP
Test yourself:
Why did this conversation fail?