… or how I learned to stop worrying and love the chatbot framework | Rasa Summit 2021

Rasa Technologies
PAGE1
… or how I learned to stop worrying and love the
chatbot framework
for my pals at Rasa Summit 2021
Heather Nolis
Machine Learning Engineer
AI @ T-Mobile – @heatherklus
PAGE2
Heather Nolis
machine learning engineer @ T-Mobile since 2017
former neuroscience PhD hopeful
very active on Twitter @heatherklus
for better response time, email me at work
heather.wensler1@t-mobile.com
PAGE3
The Team
Scope customer care
PAGE4
Fully Stacked Team for Real-Time AI
 Data:
 Data scientists
 Analysts
 Data engineers
 Machine learning engineers
 Software:
 Developers
 Architects
 Ops specialists
 Product:
 Product managers
 Delivery managers
from idea
to deployment
to support ✨
PAGE5
August 18, 2018
PAGE6
PAGE7
We serve over 2 million insights a day (and growing!)
I think I forgot to pay my bill 😅
Can I do that now?
Absolutely!
CUSTOMER
T-MOBILE
EXPERT
Flagship product: EXPERT ASSIST
Coach Assist
PAGE 8 | AI @ T-MOBILE
Neural networks with TensorFlow
A Convolutional Neural Network (CNN) processes the initial customer message
and customer data.
Models are deployed in containers using Kubernetes.
[Figure: “Unlock my phone.” plus “Recent order: YES” → CNN → topic scores. Topics shown: ACCOUNT, UNLOCK, ORDER. Scores shown: 0.80, 0.15, 0.05.]
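The three scores on this slide sum to 1.0, which is what a softmax output layer would produce, although the slide does not name the output activation. Below is a minimal sketch of that last step only, assuming softmax. The `logits` values and the order of `topics` are made up for illustration and are not taken from the T-Mobile model.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical topic order and logits; a real model would compute the logits
# from the message text and the customer data.
topics = ["UNLOCK", "ACCOUNT", "ORDER"]
logits = [2.0, 0.3, -0.8]

scores = softmax(logits)
best_topic = topics[scores.index(max(scores))]
print(best_topic, [round(s, 2) for s in scores])
```

The highest-scoring topic is the one surfaced downstream; the remaining scores show how close the runners-up were.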
PAGE9
February 12, 2021
Heather @ Rasa Summit ??
PAGE10
Some Customers Prefer Self-Service
One third of care calls opt in to a bot experience.
Messaging care volume continually increases.
More customers prefer messaging each year.
The only way we could truly be listening to our customers is to build a chatbot, for those who want it.
PAGE11
So let’s make a bot!
We have that great topic model…
Let’s just throw something on top of that!
Makes sense, right?
PAGE12
Real quick, can I have like 10 new
intents?
They’re super specific.
TensorFlow models take thousands and thousands of human-created labels.
PAGE13
A tale of 10 intents, a Self-Assist Bot Story
In-House TensorFlow topic model
88 intents
Hierarchical, defined taxonomy
General topics
Runs on a 10-message window
2,000 utterances per intent (at minimum)
Self-Assist Bot Ask
10 new intents
Overlapping
Highly specific
Runs on a single message
No labeled data
No data labeling support
PAGE14
Our topic: General Payment
Intent they wanted: Pay My Bill
Things that are the topic “general payment” but are not “pay my bill”:
I won’t pay my bill because I don’t understand it.
Checking to see if my payment has gone through.
I want to change my payment method.
If we treated our topics as intents, we risked showing nonsense
responses to customers.
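A small sketch of the mismatch described on this slide. The utterances come from the slides; the intent names, the reply text, and the `fits` set are hypothetical labels chosen for illustration.

```python
# (utterance, topic, intent) triples. The topic is the one named on the slide;
# the intent names are illustrative assumptions.
examples = [
    ("I think I forgot to pay my bill. Can I do that now?", "general_payment", "pay_my_bill"),
    ("I won't pay my bill because I don't understand it.", "general_payment", "explain_bill"),
    ("Checking to see if my payment has gone through.", "general_payment", "payment_status"),
    ("I want to change my payment method.", "general_payment", "change_payment_method"),
]

# A bot that answers per topic has a single reply for all of General Payment.
topic_reply = {"general_payment": "Sure, let's pay your bill now."}
fits = {"pay_my_bill"}  # intents that this reply actually answers

mismatched = [
    text for text, topic, intent in examples
    if topic in topic_reply and intent not in fits
]
print(f"{len(mismatched)} of {len(examples)} messages get the wrong reply")
```

Every message here is correctly classified at the topic level, yet most of them would receive a reply that does not answer the question, which is the risk the slide calls out.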
PAGE15
Shop for a device
vs
Add a line and get a new
device too
vs
Add a line but bring your
old phone
“I wanna buy a CoolPhone”
“Upgrade my phone to the CoolPhone”
“Buy a CoolPhone for my sister’s line”
“Buy a CoolPhone for my sister and add the line for her”
“I want to add my sister to my account but I want the CoolPhone BOGO”
“My sister needs to be added to my account but is bringing her own CoolPhone”
“We can’t wait months for you to add new intents.”
PAGE16
Idea:
Try Rasa
  Why Rasa?
  Solid machine learning
  Open source (we can check their code)
  Uses the same framework (TensorFlow) as our internal topic model
  Less training data
  Reuse our custom embeddings
  Extensible into further bot functionality
  Problem:
  Time-boxed to 4 hours of dev time
  (training time not included)
PAGE17
4-hour results
Accuracy:
83.1%
F1 Score:
82.1%
Precision:
83.6%
… so now we use Rasa
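For readers less familiar with the three metrics reported above, here is a minimal sketch of how they can be computed for a multi-class intent classifier. The slide does not say which averaging was used for precision and F1; macro averaging is an assumption here, and the labels below are toy data, not the evaluation set from the talk.

```python
from collections import Counter

def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision and F1 for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true label was different
            fn[t] += 1  # a true t was missed
    precisions, f1s = [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        f1s.append(f1)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / len(labels), sum(f1s) / len(labels)

# Toy labels with hypothetical intent names.
y_true = ["pay_bill", "pay_bill", "pay_bill", "change_method", "change_method", "payment_status"]
y_pred = ["pay_bill", "pay_bill", "change_method", "change_method", "change_method", "pay_bill"]

accuracy, precision, f1 = macro_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} f1={f1:.3f}")
```

Macro averaging weights every intent equally, so a small intent that is never predicted correctly pulls the average down even when overall accuracy looks healthy.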
PAGE18
So what’s different with Rasa?
PAGE19
The models are different…
Bespoke TensorFlow Topic Model
Runs on a window of messages
2,000 utterances (minimum) to bootstrap an
intent
About 80% accuracy
Rasa NLU Model
Runs on a single message
About 100 utterances to bootstrap
an intent
83.1% accuracy
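A quick worked calculation of what the bootstrap numbers above imply for the 10-intent request. The per-intent figures are from the slides; treating "about 100" as exactly 100 is a simplifying assumption.

```python
NEW_INTENTS = 10            # size of the Self-Assist Bot ask
UTTERANCES_BESPOKE = 2_000  # per-intent minimum for the in-house topic model
UTTERANCES_RASA = 100       # "about 100" per intent for the Rasa NLU model

bespoke_labels = NEW_INTENTS * UTTERANCES_BESPOKE
rasa_labels = NEW_INTENTS * UTTERANCES_RASA
reduction_factor = bespoke_labels // rasa_labels

print(bespoke_labels, rasa_labels, reduction_factor)
```

Under these assumptions the labeling effort drops from 20,000 utterances to 1,000, a factor of 20, which matters for a team with no data labeling support.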
PAGE20
…but so is the pace.
Bespoke TensorFlow Topic Model
2 years in market
Hundreds of production releases
<10 model releases
+2 intents
Rasa NLU Model
5 months in market
43 production releases
19 model releases
+28 intents
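The gap in pace is easier to see as monthly rates. This sketch uses only the figures on this slide, with two stated assumptions: "<10 model releases" is taken as an upper bound of 10, and "2 years" as 24 months, so the bespoke model-release rate is, if anything, overstated.

```python
# Figures from the slide; see the assumptions noted above for "<10" and "2 years".
bespoke = {"months": 24, "model_releases": 10, "new_intents": 2}
rasa = {"months": 5, "model_releases": 19, "new_intents": 28}

def per_month(stats, key):
    """Average count per month in market."""
    return stats[key] / stats["months"]

for key in ("model_releases", "new_intents"):
    print(key, round(per_month(bespoke, key), 2), round(per_month(rasa, key), 2))
```

That works out to at most roughly 0.4 model releases per month for the bespoke model against 3.8 for Rasa, and about 0.08 new intents per month against 5.6.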
PAGE21
Visibility leads to trust.
With Rasa X, visibility comes out of the box.
  Immediately review the impact of releases in real time.
  Allow stakeholders to review conversations, building trust in our
systems.
  Allow stakeholders to suggest improvements directly to my git
repo, without knowing git.
PAGE22
The burden of initial data is lessened.
Fast intent creation leads to
rapid experimentation.
Intent: Broken
Can release small intent “stubs” and quickly iterate with live
conversation reviews
Reporting available out of the
box.
Topic model audit… still ongoing.
Prod accuracy is king, but cross-validation metrics help target
areas for incremental improvement.
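One way cross-validation output can point at where to improve next is to count which intents get mistaken for each other. This is a minimal sketch with hypothetical intent names and made-up predictions; it does not reproduce any reporting format from Rasa or from the talk.

```python
from collections import Counter

# Hypothetical held-out predictions as (true intent, predicted intent) pairs,
# for example pooled from the folds of a cross-validation run.
cv_results = [
    ("pay_my_bill", "pay_my_bill"),
    ("pay_my_bill", "payment_status"),
    ("payment_status", "pay_my_bill"),
    ("payment_status", "pay_my_bill"),
    ("change_payment_method", "change_payment_method"),
    ("shop_device", "add_line_new_device"),
]

# Count each mistake, ignoring direction, so the most confused pair of
# intents surfaces first.
confusions = Counter(
    tuple(sorted(pair)) for pair in cv_results if pair[0] != pair[1]
)
for (a, b), count in confusions.most_common():
    print(f"{a} <-> {b}: {count}")
```

The pair at the top of the list is a natural candidate for more training examples or a sharper intent definition in the next weekly model upgrade.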
PAGE23
UX “tiger team” runs parallel scrum
to software.
  Smaller team allows for faster improvements.
  Product owner
  Conversation designer
  Data scientist/machine learning engineer
  Bot tuners: research new intents, implement weekly upgrades to
models
  Rotational software engineer
  Allows for cross-training on Rasa models
  Creates tight cohesion necessary for fun, personal bot responses
with lots of API integrations
PAGE24
So what’s the impact?
Customer Assist (aka “Cassie”) took 3.4 million dollars’ worth of care contacts since
our launch in July.
We have sold a few more chatbot projects, and have multiple chatbot teams.
Data scientists throughout T-Mobile are leveraging Rasa models to lessen the
burden of manually labeling data.
PAGE25
Thank you!
Heather Nolis – Machine Learning Engineer – AI @ T-Mobile – @heatherklus
(Special thank you to Team Kitt & SMPD!)