How to “Effectively” “Test” your Chatbot
Soumya Mukherjee
Director QA, DevOps & AIML
Apty.IO
How are we doing our QA today
• Testing is a black box for testers
• Most testing in the organization is manual:
  • Conversational flow testing
  • Small talk
  • Fallback checks
  • Integrations
• Automation is done at the UI and API layers
• Testing is mostly done on the same data used for training
• Models are trained by engineers and are not monitored by QA
• Analytics tools for monitoring exist, but QA needs technical expertise to use them
• Result: more than 90% of the time the bot breaks (no one knows when it will break); most conversations hit the fallback and get stuck, and once the bot is stuck, it stays stuck
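One way to avoid testing on the training data itself is to hold out a slice of each intent's examples before training. The sketch below is only an illustration: `holdout_split`, the intent names and the utterances are hypothetical, and a real project would more likely use the split tooling of its own NLU framework.

```python
import random
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (utterance, intent)


def holdout_split(
    examples_by_intent: Dict[str, List[str]],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[Example], List[Example]]:
    """Split each intent's examples into train and test sets (hypothetical helper).

    Every intent with more than one example contributes at least one
    example to the test set, so no intent is evaluated only on data
    the model has already seen.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: List[Example] = []
    test: List[Example] = []
    for intent, examples in sorted(examples_by_intent.items()):
        shuffled = examples[:]
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * test_fraction)) if len(shuffled) > 1 else 0
        test += [(text, intent) for text in shuffled[:n_test]]
        train += [(text, intent) for text in shuffled[n_test:]]
    return train, test


# Made-up training data for illustration only.
data = {
    "greet": ["hi", "hello", "hey there", "good morning", "hiya"],
    "ask_bus_fee": ["bus fee?", "how much is the bus", "bus cost",
                    "price of bus pass", "what is the bus fee"],
}
train_set, test_set = holdout_split(data)
print(len(train_set), "training examples,", len(test_set), "held-out examples")
```

The held-out set is then used only for evaluation, never for training.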
What are the issues in QA?
• Bots keep evolving, so continuous story creation is a problem
• No tool manages story coverage
• Training data may not correspond to new stories, or vice versa (a mismatch); most organizations keep training on the same data
• Most automation tools offer record and playback, but the stories are already written; the open question is how to port them
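The mismatch between training data and stories can be made visible with a simple set comparison. This is a minimal sketch, assuming intents and stories are already loaded into plain Python structures; `story_coverage` and all intent and story names are hypothetical.

```python
from typing import Dict, List, Set


def story_coverage(training_intents: Set[str], stories: Dict[str, List[str]]) -> dict:
    """Compare intents in the training data with intents used in stories."""
    story_intents = {intent for steps in stories.values() for intent in steps}
    covered = training_intents & story_intents
    return {
        # Share of trained intents that appear in at least one story.
        "coverage": len(covered) / len(training_intents) if training_intents else 0.0,
        "intents_without_story": sorted(training_intents - story_intents),
        "story_intents_without_training_data": sorted(story_intents - training_intents),
    }


# Made-up data for illustration only.
training_intents = {"greet", "ask_bus_fee", "ask_tuition_fee", "goodbye"}
stories = {
    "happy_path_fee": ["greet", "ask_bus_fee", "goodbye"],
    "refund_request": ["greet", "ask_refund"],
}
report = story_coverage(training_intents, stories)
print(report)
```

Running such a report on every change would flag both new stories that lack training data and trained intents that no story exercises.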
What are the issues in QA?
• No unified, centralized dashboard where QA can check the following (everything is scattered):
  • Intent matching
  • Entity testing: slot identification
  • Entity testing: entity validation
  • Confidence score
  • Confusion matrix along with precision, recall and F1-score
• No easy way to reset a failed bot
• Bot versioning is a mess, which makes A/B testing difficult
• Multilingual bot QA is a challenge (two separate bots have to be built)
• A high confidence score can also be a problem, because the bot keeps predicting the same thing: if the data is the same for multiple intents, it predicts the one with the highest confidence score, which may be incorrect
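The confusion matrix and the precision, recall and F1 figures mentioned above can be derived from nothing more than pairs of expected and predicted intents. The following is a rough sketch using only the standard library; the function names and the sample intents are hypothetical, and libraries such as scikit-learn provide equivalent reports.

```python
from collections import Counter
from typing import Dict, List


def confusion_counts(expected: List[str], predicted: List[str]) -> Counter:
    """Count (expected, predicted) pairs; this is a sparse confusion matrix."""
    return Counter(zip(expected, predicted))


def per_intent_scores(expected: List[str], predicted: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall and F1 for every intent label."""
    matrix = confusion_counts(expected, predicted)
    labels = sorted(set(expected) | set(predicted))
    scores = {}
    for label in labels:
        tp = matrix[(label, label)]
        fp = sum(n for (e, p), n in matrix.items() if p == label and e != label)
        fn = sum(n for (e, p), n in matrix.items() if e == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Made-up evaluation results: one "bus_fee" message was confused with "tuition_fee".
expected = ["bus_fee", "bus_fee", "tuition_fee", "tuition_fee", "greet"]
predicted = ["bus_fee", "tuition_fee", "tuition_fee", "tuition_fee", "greet"]
scores = per_intent_scores(expected, predicted)
for intent, metrics in scores.items():
    print(intent, metrics)
```

Off-diagonal entries of the confusion matrix point directly at intent pairs whose training data overlaps.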
How to make sure your bot never breaks?
How to make your tests effective?
• Create scenarios for the happy path, contextual questions, digressions, domain-specific questions and stateless conversations
• Map the proper entities for common scenarios (for example, bus fee vs. tuition fee); the flow should change with the entities in the stories
• Automated tests should consume all stories and run them every time as part of regression testing
• Visualize story coverage
• For manual testing, use a bot emulation product (such as Rasa X or Botfront)
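A regression run that consumes every story could look roughly like the sketch below. It assumes each story is a list of (user message, expected bot action) pairs and that the bot is reachable through a callable taking a session id and a message; `run_regression`, `fake_bot` and the action names are all hypothetical stand-ins, not the API of any particular framework.

```python
from typing import Callable, Dict, List, Tuple

Story = List[Tuple[str, str]]  # (user message, expected bot action)


def run_regression(
    stories: Dict[str, Story],
    bot: Callable[[str, str], str],
) -> Dict[str, List[str]]:
    """Replay every story against the bot and collect the failures."""
    failures: Dict[str, List[str]] = {}
    for name, steps in stories.items():
        for turn, (message, expected) in enumerate(steps, start=1):
            actual = bot(name, message)  # the story name doubles as the session id
            if actual != expected:
                failures[name] = [f"turn {turn}: expected {expected!r}, got {actual!r}"]
                break  # later turns depend on this one, so stop the story here
    return failures


def fake_bot(session_id: str, message: str) -> str:
    """Keyword-based stand-in for a real bot endpoint (illustration only)."""
    text = message.lower()
    if "bus" in text:
        return "utter_bus_fee"
    if "hello" in text:
        return "utter_greet"
    return "utter_fallback"


stories = {
    "happy_path": [("Hello", "utter_greet"), ("What is the bus fee?", "utter_bus_fee")],
    "tuition": [("Hello", "utter_greet"), ("How much is tuition?", "utter_tuition_fee")],
}
failures = run_regression(stories, fake_bot)
print(failures)
```

An empty result means every story passed; wiring this into the build pipeline makes the check run on every change.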
How to make your tests effective?
• Central dashboarding, including:
  • Confusion matrix, precision, recall and F1-score
  • Cumulative accuracy profile
  • Cross-validation results
• Perform exhaustive testing (bot resiliency) and integration checks across platforms and webhooks
• Perform fault tolerance testing through performance testing (bot response, session management) and security testing (API interaction, typing speed check, punctuation, typos)
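Checks for punctuation and typos can be automated by perturbing known-good utterances and measuring how often the intent still comes out right. The sketch below is one possible approach, not a prescribed method: `perturb`, `robustness` and the keyword classifier are hypothetical, and a real test would call the bot's own NLU model in place of `keyword_classifier`.

```python
import string
from typing import Callable, List


def perturb(utterance: str) -> List[str]:
    """Generate noisy variants: shouting, no punctuation, extra punctuation, typos."""
    variants = [
        utterance.upper(),
        utterance.translate(str.maketrans("", "", string.punctuation)),
        utterance + "!!!",
    ]
    words = utterance.split()
    for i, word in enumerate(words):
        # Swap the second and third characters of longer words to mimic a typo.
        if len(word) > 3 and word[1] != word[2]:
            typo = word[0] + word[2] + word[1] + word[3:]
            variants.append(" ".join(words[:i] + [typo] + words[i + 1:]))
    return variants


def robustness(utterance: str, expected_intent: str, classify: Callable[[str], str]) -> float:
    """Fraction of perturbed variants still classified as the expected intent."""
    variants = perturb(utterance)
    hits = sum(1 for variant in variants if classify(variant) == expected_intent)
    return hits / len(variants)


def keyword_classifier(text: str) -> str:
    """Naive stand-in for a real intent classifier (illustration only)."""
    return "ask_tuition_fee" if "tuition" in text.lower() else "fallback"


score = robustness("what is the tuition fee?", "ask_tuition_fee", keyword_classifier)
print(f"robustness: {score:.2f}")
```

A low score for an utterance suggests the training data needs noisier examples for that intent.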
Other KPIs to track
• Activity Volume
• Bounce rate
• Retention rate
• Open sessions count
• Session times (conversation length)
• Goal completion rate
• User feedback (sentiments)
• Fallback rate (Confusion rate, reset rate & Human takeover rate)
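Several of these KPIs reduce to simple ratios over session logs. The sketch below assumes a hypothetical `Session` record and one possible set of definitions (fallback rate per turn, the other rates per session); the exact definitions should follow whatever the team's analytics tool reports.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Session:
    """Hypothetical per-session log record."""
    turns: int            # user messages in the session
    fallback_turns: int   # turns answered by the fallback
    goal_completed: bool  # did the user reach the intended outcome?
    human_takeover: bool  # was the session handed to a human agent?


def kpis(sessions: List[Session]) -> Dict[str, float]:
    """Compute a few conversation KPIs from session records."""
    total_sessions = len(sessions)
    total_turns = sum(s.turns for s in sessions)
    if total_sessions == 0 or total_turns == 0:
        return {}
    return {
        "fallback_rate": sum(s.fallback_turns for s in sessions) / total_turns,
        "goal_completion_rate": sum(s.goal_completed for s in sessions) / total_sessions,
        "human_takeover_rate": sum(s.human_takeover for s in sessions) / total_sessions,
        "avg_turns_per_session": total_turns / total_sessions,
    }


# Made-up sessions for illustration only.
sessions = [
    Session(turns=4, fallback_turns=0, goal_completed=True, human_takeover=False),
    Session(turns=6, fallback_turns=2, goal_completed=False, human_takeover=True),
    Session(turns=10, fallback_turns=3, goal_completed=True, human_takeover=False),
    Session(turns=5, fallback_turns=0, goal_completed=True, human_takeover=False),
]
metrics = kpis(sessions)
print(metrics)
```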
Thanks
@QASoumya
Linkedin.com/in/mukherjeesoumya

 

How to Effectively Test Your Chatbot | Rasa Summit

  • 1. How to "Effectively" "Test" Your Chatbot
      Soumya Mukherjee, Director QA, DevOps & AIML, Apty.IO
  • 2. How are we doing our QA today?
      • Testing is a black box for testers
      • Most testing in the organization is manual
      • Conversational flow testing
      • Small talk
      • Fallback checks
      • Integrations
      • Automation is done at the UI and API layers
      • Testing is mostly done on the same data used for training
      • Models are trained by engineers and are not monitored by QA
      • Analytics tools for monitoring exist, but QA needs technical expertise to use them
      • Result: more than 90% of the time the bot breaks (no one can tell when it will break); most failures hit the fallback and get stuck, and once a bot is stuck it stays stuck
  • 3. What are the issues in QA?
      • Bots keep evolving, so continuous story creation is a problem
      • No tool manages story coverage
      • Training data may not correspond to new stories, or vice versa (a mismatch); most organizations keep training on the same data
      • Most automation tools offer record and playback, but the stories are already written, so the question is how to port them
  • 4. What are the issues in QA? (continued)
      • No unified, centralized dashboard where QA can check the following (everything is scattered):
          • Intent matching
          • Entity testing: slot identification
          • Entity testing: entity validation
          • Confidence score
          • Confusion matrix along with precision, recall and F1-score
      • No easy way to reset a failed bot
      • Bot versioning is a mess, which makes A/B testing difficult
      • Multilingual bot QA is a challenge (you have to build two separate bots)
      • A high confidence score can also be a problem: if the same data appears under multiple intents, the bot always predicts the intent with the highest confidence score, which may be incorrect
      How to make sure your bot never breaks?
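The last issue on this slide (the same data appearing under multiple intents) can be checked before training. The following is a minimal sketch, assuming the training data is available as a plain dictionary of intent name to example utterances; the function name `find_conflicting_examples` and the intent names are hypothetical and not part of any Rasa API.

```python
from collections import defaultdict


def find_conflicting_examples(training_data):
    """Return utterances that are listed under more than one intent."""
    intents_by_text = defaultdict(set)
    for intent, examples in training_data.items():
        for text in examples:
            # Normalize case and whitespace so trivial variants still collide.
            normalized = " ".join(text.lower().split())
            intents_by_text[normalized].add(intent)
    return {
        text: sorted(intents)
        for text, intents in intents_by_text.items()
        if len(intents) > 1
    }


# Hypothetical training data for illustration only.
training_data = {
    "ask_bus_fee": ["What is the bus fee?", "how much is the fee"],
    "ask_tuition_fee": ["What is the tuition fee?", "How much is the fee"],
}

conflicts = find_conflicting_examples(training_data)
for text, intents in conflicts.items():
    print(f"{text!r} is listed under: {', '.join(intents)}")
```

Any utterance reported here is a candidate for the "always predicts the same intent" problem and is worth rewording or moving to a single intent.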
  • 5. How to make your test effective?
      • Create scenarios for the happy path, contextual questions, digressions, domain-specific questions and stateless conversations
      • Map the proper entities for common scenarios (for example bus fee, tuition fee); the flow should change with the entities in the stories
      • Automated tests should consume all stories and run them every time as part of regression testing
      • Visualize story coverage
      • For manual testing, use a bot emulation product (such as Rasa X or Botfront)
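The idea of consuming every story on each regression run can be sketched as follows. This is an illustration only: stories are assumed to be lists of (user message, expected intent) turns, and `keyword_classifier` is a hypothetical stand-in for a call to the real bot.

```python
from typing import Callable, Dict, List, Tuple

# One story is a list of (user message, expected intent) turns.
Story = List[Tuple[str, str]]


def run_regression(
    stories: Dict[str, Story], classify: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Run every story and collect the mismatch messages for each one."""
    results = {}
    for name, turns in stories.items():
        failures = []
        for message, expected in turns:
            actual = classify(message)
            if actual != expected:
                failures.append(f"{message!r}: expected {expected}, got {actual}")
        results[name] = failures
    return results


def keyword_classifier(message: str) -> str:
    """Hypothetical stand-in for the bot's intent prediction."""
    text = message.lower()
    if "bus" in text:
        return "ask_bus_fee"
    if "tuition" in text:
        return "ask_tuition_fee"
    return "fallback"


stories = {
    "happy_path_bus": [("What is the bus fee?", "ask_bus_fee")],
    "digression_tuition": [
        ("What is the bus fee?", "ask_bus_fee"),
        ("and tuition?", "ask_tuition_fee"),
    ],
    "small_talk": [("hello there", "greet")],
}

results = run_regression(stories, keyword_classifier)
passed = [name for name, failures in results.items() if not failures]
print(f"{len(passed)}/{len(stories)} stories passed")
for name, failures in results.items():
    for failure in failures:
        print(f"FAIL {name}: {failure}")
```

Because the runner iterates over the full story dictionary, a newly written story is picked up by the next regression run without extra wiring.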
  • 6. How to make your test effective? (continued)
      • Central dashboarding, including:
          • Confusion matrix, precision, recall and F1-score
          • Cumulative accuracy profile
          • Cross-validation results
      • Perform exhaustive testing (bot resiliency) and integration checks across platforms and webhooks
      • Test fault tolerance through performance testing (bot response, session management) and security testing (API interaction, typing-speed checks, punctuation, typos)
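The dashboard metrics named on this slide can be derived from pairs of expected and predicted intents. Below is a minimal sketch in plain Python, assuming the test run has already produced two parallel lists; the intent names and sample predictions are invented for illustration.

```python
from collections import Counter


def confusion_counts(expected, predicted):
    """Count how often each (expected, predicted) intent pair occurs."""
    return Counter(zip(expected, predicted))


def per_intent_scores(expected, predicted):
    """Compute precision, recall and F1 for every intent seen in the run."""
    counts = confusion_counts(expected, predicted)
    labels = sorted(set(expected) | set(predicted))
    scores = {}
    for label in labels:
        tp = counts[(label, label)]
        fp = sum(c for (e, p), c in counts.items() if p == label and e != label)
        fn = sum(c for (e, p), c in counts.items() if e == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Hypothetical results of a test run.
expected = ["greet", "greet", "ask_bus_fee", "ask_bus_fee", "ask_tuition_fee"]
predicted = ["greet", "greet", "ask_bus_fee", "ask_tuition_fee", "ask_tuition_fee"]

scores = per_intent_scores(expected, predicted)
for intent, s in scores.items():
    print(f"{intent}: P={s['precision']:.2f} R={s['recall']:.2f} F1={s['f1']:.2f}")
```

In this sample, one bus fee question is misread as a tuition fee question, which lowers recall for `ask_bus_fee` and precision for `ask_tuition_fee`; that is the kind of intent confusion the confusion matrix is meant to surface.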
  • 7. Other KPIs to track
      • Activity volume
      • Bounce rate
      • Retention rate
      • Open sessions count
      • Session times (conversation length)
      • Goal completion rate
      • User feedback (sentiment)
      • Fallback rate (confusion rate, reset rate and human takeover rate)
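Several of these KPIs can be computed from session logs. The sketch below assumes a simplified, hypothetical `Session` record; the field names and the exact definitions (for example, fallback rate as fallbacks per turn) are assumptions, since the slides do not define them.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Session:
    turns: int             # number of user turns in the conversation
    fallbacks: int         # turns answered by the fallback action
    goal_completed: bool   # the user reached the intended outcome
    handed_to_human: bool  # the conversation was escalated to a person


def compute_kpis(sessions: List[Session]) -> Dict[str, float]:
    """Aggregate a few conversation KPIs over a list of sessions."""
    n = len(sessions)
    total_turns = sum(s.turns for s in sessions)
    return {
        "fallback_rate": (
            sum(s.fallbacks for s in sessions) / total_turns if total_turns else 0.0
        ),
        "goal_completion_rate": (
            sum(s.goal_completed for s in sessions) / n if n else 0.0
        ),
        "human_takeover_rate": (
            sum(s.handed_to_human for s in sessions) / n if n else 0.0
        ),
        "avg_conversation_length": total_turns / n if n else 0.0,
    }


# Hypothetical session log for illustration only.
sessions = [
    Session(turns=8, fallbacks=1, goal_completed=True, handed_to_human=False),
    Session(turns=4, fallbacks=2, goal_completed=False, handed_to_human=True),
    Session(turns=6, fallbacks=1, goal_completed=True, handed_to_human=False),
    Session(turns=2, fallbacks=0, goal_completed=True, handed_to_human=False),
]

kpis = compute_kpis(sessions)
for name, value in kpis.items():
    print(f"{name}: {value:.2f}")
```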