Ai = your data | Rasa Summit 2021

Rasa Technologies
AI = Your Data
(And what that means for you
as you build a virtual assistant)
Why do you bother
working on chatbots?
Subtext: I think most conversational
AI projects are bad
What makes a bad
Conversational AI project?
It doesn't help people do
what they need to do.
How do you know what people need to do?
● You make an educated guess
○ Complete top-down design (state machines, dialog trees) can
be a good approach, but it's inflexible
● ✨You ask them (UX research)✨
○ Not covered today but 100%
do it if you can
● They tell you (You look at data)
○ More flexible
○ But don't assume that all you need is
more user-generated data!
○ Happy medium:
■ your system learns from data
■ you provide additional structure
and organization
https://www.theguardian.com/world/2021/jan/14/time-to-properly-socialise-hate-speech-ai-chatbot-pulled-from-facebook
What is “data” in conversational AI?
● The text data used to pretrain any models or
features you're using (e.g. language models,
word embeddings, etc.)
● User-generated text
● Patterns of conversations
● Examples:
○ Customer support logs (assuming data
collection & reuse is covered in your
privacy policy)
Two different assistants can have more
or less the same underlying ML code.
(That’s what makes building a
conversational AI framework possible!)
What makes your assistant work for
you and your users is your data & how
you structure it.
● Curate = decide what data
to use as you train &
retrain your assistant
● Annotate = apply (or
correct) labels for
individual pieces of data
● Intents
○ If you already
have data
○ If you don't
● Stories
● Checking if it works
Intent = something a
user wants to do
Quick test: is this a VERB
(inform, book_trip, confirm)?
If you have data
● Modified content analysis:
○ Go through data (or sample) by hand and assign each datapoint to a
group
○ If no existing group fits, add a new one
○ At given intervals, go through your groups and combine or separate
them as needed
○ Start with 2-3 passes through your dataset
● Can't you just automate this?
○ Maaaybe, but I wouldn't recommend it: no guarantee clusters will
map well to user needs
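The hand-sorting pass above needs very little tooling. Here is a minimal sketch, assuming a plain dictionary that maps group names to utterances. The helper names (`assign`, `merge_groups`), the group names, and the sample utterances are all hypothetical, and a person still makes every labeling decision.

```python
from collections import defaultdict


def assign(groups, utterance, group_name):
    """Record a human decision: put one utterance into a (possibly new) group."""
    groups[group_name].append(utterance)


def merge_groups(groups, source, target):
    """Combine two groups after a review pass decides they cover the same need."""
    groups[target].extend(groups.pop(source, []))


groups = defaultdict(list)
assign(groups, "where is my order", "check_order")
assign(groups, "track my package", "track_package")  # no existing group fit, so add one
assign(groups, "cancel my order", "cancel_order")

# Review pass: tracking and checking turn out to be the same user need.
merge_groups(groups, "track_package", "check_order")

sizes = {name: len(items) for name, items in groups.items()}
print(sizes)
```

Keeping the group sizes visible after each pass makes it easier to spot groups that are too small to be worth their own intent.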
(Even) If you don't have data
● Start with the most common intent
○ Most people want to do the same thing
○ Use the experts in your institution (e.g.
support staff)
● Start with the smallest possible number of
intents (that cover your core use case)
● Everything else goes in an out-of-scope intent
○ If your assistant can't handle something,
give users an escape hatch right away
● Additional intents will come from user data
Why fewer intents?
● Older style of conversational design:
○ You need an intent for everything your
user might want to do!
● Rasa style CDD:
○ You only need to start with the most
popular, important intents & a way to
handle things outside them
○ Continue to build from there if that’s
what users need
Why fewer intents?
● Human reasons
○ More intents = more training data,
maintenance, documentation
○ More intents = annotation more difficult
● ML reasons
○ Transformer classifiers scale linearly with the
# of classes*
○ Entity extraction (esp. with very lightweight
rule-based systems like Duckling) is often
faster
*Taming pretrained transformers for eXtreme multi-label text classification, Chang et al 2020 @ KDD
Paring down intents
● Don’t use intents as a way to store
information
○ Storing information = slots
● Do a lot of the same tokens show up
in training data for two intents?
Consider if they can be combined
● I would personally start with:
○ max 10 intents
○ min 20 training examples (for
each intent)
Two separate intents:
book_train:
● One train ticket
● Need to book a train ride
● A rail journey please
book_plane:
● One plane ticket
● Need to book a plane ride
Combined into one intent, with the mode of travel annotated:
make_booking:
● One [train](train) ticket
● Need to book a [train](train) ride
● A [rail](train) journey please
● One [plane](air) ticket
● Need to book a [plane](air) ride
● i'd like to book a trip
● Need a vacation
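The "same tokens in two intents" check can be roughed out in a few lines. This sketch uses Jaccard overlap on lowercase word tokens, which is one possible measure and not one the talk prescribes. The function names are hypothetical, and how much overlap counts as "a lot" is left to your judgment.

```python
import re


def tokens(examples):
    """Lowercase word tokens across all training examples for one intent."""
    return {tok for text in examples for tok in re.findall(r"[a-z']+", text.lower())}


def jaccard(a, b):
    """Shared tokens divided by all tokens; 1.0 means identical vocabularies."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


book_train = ["One train ticket", "Need to book a train ride", "A rail journey please"]
book_plane = ["One plane ticket", "Need to book a plane ride"]

overlap = jaccard(tokens(book_train), tokens(book_plane))
print(f"token overlap: {overlap:.2f}")
```

When the score is high and the tokens that differ are mostly the things you would store in a slot (train, plane), that is a hint the two intents could be one.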
Training data for an intent
● User-generated > synthetic
● Chat-based interactions tend to be
informal
● Each utterance should unambiguously
match to a single intent
○ You can verify this using human
sorting & inter-rater reliability
● Is an utterance ambiguous?
○ Use end-to-end instead (the raw text
as training data w/out classifying it)
Unambiguous:
● Hi there
● Hieeeeeeeeeeeee
● Hola
● I said, helllllloooooO!!!!
● What is up?
● ayyyy whaddup
● hello robot
● hello sara
● merhaba
● ola sara
Ambiguous (goes in end to end):
● good day
● ciao
● alhoa🤙
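Inter-rater reliability, mentioned above as a way to verify that utterances map to a single intent, is often reported as Cohen's kappa for two annotators. The sketch below computes it by hand; the annotator labels are made up for illustration, and kappa is only one of several agreement measures you could use.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two people sorting the same five utterances.
utterances = ["Hi there", "hello sara", "good day", "ciao", "merhaba"]
annotator_1 = ["greet", "greet", "greet", "goodbye", "greet"]
annotator_2 = ["greet", "greet", "goodbye", "goodbye", "greet"]

kappa = cohens_kappa(annotator_1, annotator_2)
disputed = [u for u, a, b in zip(utterances, annotator_1, annotator_2) if a != b]
print(round(kappa, 3), disputed)
```

The utterances the annotators disagree on are the candidates for the "ambiguous" pile.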
Stories = training data to
decide what your
assistant should do next
Older style: Most of the design work
Newer style: Provide examples &
system extrapolates from them
Stories
● If you have conversational data:
○ Start with the patterns you see in your conversations
○ Find a new intent? Add it to your intents
● Generating your own conversational patterns:
○ It's easiest to use interactive learning to create stories (in
command line or Rasa X)
○ Start with common flows, “happy paths”
○ Then add common errors/digressions
● Once your model is trained:
○ Add more user data ASAP
Do you need conditional logic in
your conversations? Like: "If
they're already signed in, don't ask
for a name" or "never show
account balance without a PIN"?
Add rules!
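To make the idea concrete, here is a sketch of what "rules on top of learned behavior" means, written as plain Python. This is not Rasa's own rule format (there, rules are written as training data); the function, intent, action, and slot names are all hypothetical. The point is only that fixed rules are checked before the learned prediction is trusted.

```python
def next_action(intent, slots, ranked_actions):
    """Apply fixed rules first; otherwise take the best allowed learned action."""
    # Rule: never show an account balance without a verified PIN.
    if intent == "check_balance" and not slots.get("pin_verified", False):
        return "ask_pin"
    for action in ranked_actions:
        # Rule: if the user is already signed in, don't ask for a name.
        if action == "ask_name" and slots.get("signed_in", False):
            continue
        return action
    return "fallback"


no_pin = next_action("check_balance", {"signed_in": True}, ["show_balance"])
with_pin = next_action(
    "check_balance", {"signed_in": True, "pin_verified": True}, ["show_balance"]
)
signed_in = next_action("greet", {"signed_in": True}, ["ask_name", "offer_help"])
print(no_pin, with_pin, signed_in)
```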
How do you know if it works?
● Reviewing user conversations!
● Tests
○ Sample conversations that should
always be handled the same
○ Good to shoot for 100% correct
● Validation
○ Checking that your model can guess
correctly at an acceptable rate
○ Be very suspicious of accuracy near
100% 🤔
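The difference between tests and validation can be sketched as two small checks. Everything here is a stand-in: `toy_predict` is a hypothetical keyword lookup in place of a trained model, the examples are made up, and the 0.98 cutoff for "suspiciously high" is an arbitrary assumption, not a figure from the talk.

```python
SUSPICIOUS_ACCURACY = 0.98  # assumed cutoff; pick one that suits your data


def run_tests(predict, test_cases):
    """Return the test cases the model gets wrong; the goal is an empty list."""
    return [(text, expected) for text, expected in test_cases if predict(text) != expected]


def accuracy(predict, held_out):
    """Share of held-out examples the model labels correctly."""
    return sum(predict(text) == label for text, label in held_out) / len(held_out)


def toy_predict(text):
    """Stand-in for a trained model: a hypothetical keyword lookup."""
    words = set(text.lower().split())
    return "greet" if words & {"hello", "hi"} else "out_of_scope"


# Tests: conversations that should always be handled the same way.
test_cases = [("hello robot", "greet"), ("Hi there", "greet")]
# Validation: held-out examples the model has not seen.
held_out = [
    ("hello sara", "greet"),
    ("merhaba", "greet"),
    ("what is the weather", "out_of_scope"),
    ("hi", "greet"),
]

failures = run_tests(toy_predict, test_cases)
acc = accuracy(toy_predict, held_out)
suspicious = acc >= SUSPICIOUS_ACCURACY
print(failures, acc, suspicious)
```

Tests should come back with no failures. Validation accuracy is expected to fall short of perfect, and a score near 100% is a reason to check for leakage between training and held-out data.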
Takeaways
● Language data is what makes
your Rasa assistant work
● Providing structure for language
data is the first step for building
NLP systems
● Start with the fewest possible,
most popular things
● Get your prototype in front of
users ASAP
Thanks! Questions?
@rctatman