Conversational AI. Less Hard.
IVA IS THE NEW IVR
CMO & Head of Sales, SmartAction
Brian Morin
CX Solutions Consultant, SmartAction
Phillip Fisher
Confidential & Proprietary
©SmartAction | 2
About SmartAction
What We Do: AI-Powered Virtual Agents for Omnichannel Self-Service
Our Mission: To make life less hard® for enterprises and their customers
Achievements: #1 Rated Solution on Gartner Peer Insights
Company: Founded as AI research firm & headquartered in Los Angeles, CA
100+ Customers across 12 industries
Don’t Take Our
Word for it
18 Published Case Studies
“Best Return on Customer Service
Investment”
Finding the Conversational AI Fit
Complex
Human
CONVERSATIONAL AI
LIVE AGENTS
LIVE AGENTS
Simple
IVR & CHATBOTS
“Good afternoon Lindsay, are you
calling about your upcoming
appointment on Tuesday?”
“Yes, I need to reschedule it for the
same time next week.”
Imagine an automated
system that is…
intelligent, personalized,
and natural.
15%
of all customer service interactions
globally will be handled completely
with AI
Gartner Predicts by 2021
• Intent capture + authentication
• Proof of Insurance
• SMS integration
• Change address
• Add vehicle
LIVE DEMO
Insurance
• 25,000 Minutes / Month
Threshold for Change
Tip #1
Choose your
reference architecture
Scenario A – offshoot from DTMF
Phone
Chat
Text
Customer Contact Center
Platform
(Routing)
Data
Sources
Virtual
Agent
Live
Agent
Scenario B – Natural Language Front Door
Phone
Chat
Text
Customer Data
Sources
Virtual
Agent
Live
Agent
Virtual
Agent
Tip #2
Choose your
approach
DIY? Partner?
Tip #3:
Choose your CX
team
Conversational AI
Solution Design
Use Case
Mapping
Enhancements
& Upgrades
Reporting
& Analytics
Project
Management
Training
& Tuning
Technology
Technology
Services
Conversational AI
is not an “off the
shelf” product
It’s an iterative process
that requires care and
feeding
Best-of-Breed
CX Team
Automate more without
sacrificing one ounce of CX
Client
Advocacy
Marilyn
Customer Success
Analytics
& Tuning
Eli
Customer Insights
AI Automation
Assessment
Charles
CX Consultant
CX Design
& Strategy
Mark
CX Design
Quality
Assurance
Ryan
QA
CX TEAM
Solutioning
& ROI
Trent
Solutions
Expert
Project
Management
Kathy
Project Manager
Engineering
David
Engineer
Tip #4:
Choose your AI
approach
Why Speech
Rec Is So Hard
Hi-Def
Input
Lo-Def
8 kHz Telephony
Noise
SmartAction
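The "Hi-Def" versus "Lo-Def" contrast on this slide can be put into numbers. The sketch below is only an illustration, not something from the deck: it reads "8K" as an 8 kHz sampling rate and assumes a 16 kHz wideband rate for on-device capture (that second figure is an assumption). The only rule it applies is that a sampled signal cannot represent frequencies above half its sampling rate.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a sampled signal can represent: half the sample rate."""
    return sample_rate_hz / 2


# "8K" on the slide, read here as an 8 kHz narrowband telephony sampling rate.
TELEPHONY_RATE_HZ = 8_000
# Assumption for illustration only: a wideband on-device capture rate.
ASSUMED_DEVICE_RATE_HZ = 16_000

telephony_ceiling = nyquist_hz(TELEPHONY_RATE_HZ)
device_ceiling = nyquist_hz(ASSUMED_DEVICE_RATE_HZ)

print(f"Telephony keeps audio up to {telephony_ceiling:.0f} Hz")
print(f"Assumed device capture keeps audio up to {device_ceiling:.0f} Hz")
```

Under those assumptions the telephone path keeps half the frequency range of the device path, before any line noise is added.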
AAA Vehicle Capture
• Conditions – outside, cross talk,
wind, line noise, bad cell
connection, accent, speaker
phone
• Caller intended to say “Ford F250
truck”
• Speech-to-text heard “aboard
have to fifty truck”
• With confidence scoring, the NLU
engine correctly determined
“Ford F250 truck”
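The recovery described in the bullets above can be sketched in a few lines. This is a minimal illustration, not SmartAction's engine: the grammar of expected vehicles and their spoken forms is hypothetical, and plain text similarity from Python's `difflib` stands in for the acoustic-level matching and confidence scoring the talk refers to. The point is only that scoring a bad transcript against a short list of expected answers can still surface the right one.

```python
from difflib import SequenceMatcher

# Hypothetical grammar: canonical vehicle name -> how a caller would say it.
EXPECTED_VEHICLES = {
    "Ford F150": "ford f one fifty",
    "Ford F250": "ford f two fifty",
    "Ford F350": "ford f three fifty",
    "Honda Accord": "honda accord",
    "Toyota Tacoma": "toyota tacoma",
}


def rank_candidates(heard, grammar):
    """Score every expected answer against the raw transcript, best first."""
    heard = heard.lower()
    scored = [
        (name, SequenceMatcher(None, heard, spoken).ratio())
        for name, spoken in grammar.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# The transcript from the slide, exactly as the speech-to-text engine heard it.
ranking = rank_candidates("aboard have to fifty truck", EXPECTED_VEHICLES)
best_name, best_score = ranking[0]

# The top candidate is read back to the caller for confirmation.
print(f"To confirm, you said {best_name}?")
```

A real system would compare sounds rather than spellings, and would re-prompt when the best score is too low or too close to the runner-up.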
AI: Recognition + Cognition
Recognition: Automatic Speech Rec, Acoustic Models, Hypotheses, Confidence Scores
Cognition: Advanced NLU (Lexical Analysis, Topic Classification, Information Extraction, Entity Detection, Syntactical Analysis, Semantic Parsing), Intent Matching, Expected Intents
Data Capture, Data Labeling, Machine Learning
Example: “Aboard have to fifty” → NLU Output “Ford F250” → “To confirm, you said Ford F250?”
Defined Scope
or Pattern
“90292”
Address Capture
Admiralty Way
Anchorage Street
Bora Bora Way
Culver Boulevard
E Ketch Street
Glencoe Ave
Lighthouse Court
Eastwind Street
Maxella Avenue
S Esplanade
Topsail Court
Pacific Avenue
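The zip-code-scoped matching on this slide can be sketched as follows. This is an illustration under stated assumptions, not the production approach: the `STREETS_BY_ZIP` table is a hypothetical stand-in for the data dip (its street names are the ones listed on the slide), and `difflib.get_close_matches` compares spellings where the talk describes matching on sound.

```python
from difflib import get_close_matches

# Hypothetical lookup table standing in for the "data dip"; the street
# names for 90292 are the ones shown on the slide.
STREETS_BY_ZIP = {
    "90292": [
        "Admiralty Way", "Anchorage Street", "Bora Bora Way",
        "Culver Boulevard", "E Ketch Street", "Glencoe Ave",
        "Lighthouse Court", "Eastwind Street", "Maxella Avenue",
        "S Esplanade", "Topsail Court", "Pacific Avenue",
    ],
}


def match_street(zip_code, heard_street, cutoff=0.6):
    """Return the closest street within the caller's zip code, or None."""
    streets = STREETS_BY_ZIP.get(zip_code, [])
    by_lowercase = {street.lower(): street for street in streets}
    hits = get_close_matches(
        heard_street.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[hits[0]] if hits else None


# A slightly wrong transcript still resolves, because only a dozen
# streets are in scope once the zip code is known.
print(match_street("90292", "maxela avenue"))
```

Returning `None` when nothing clears the cutoff gives the dialog a clean signal to re-prompt or hand off instead of guessing.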
Next Steps
Request Demo or Free AI–Readiness Assessment
info@smartaction.com
What You’ll Get
o Identify:
1. Call/chat types perfect for automation
2. Expected completion rates
3. Expected call volume deflection
o ROI Calculation
By 2021, 15% of the customer experience will be managed without humans.

IVA is the New IVR Masterclass


Editor's Notes

  • #2 Brian
  • #3 We deliver AI-powered virtual agents as a service. That means we deliver the full conversational AI technology stack. It’s turnkey. It’s omnichannel. All of our clients use our voice self-service module. About half rely on us for more than voice and include their chat and SMS channels as well. But what makes us a little different is that we’re not just trying to sell a software tool set and wish you good luck on your journey. Conversations with machines are complex. They need experts. So we bundle end-to-end CX services with our technology. And when I say end-to-end, that means everything: the design, the build, and even the ongoing operation after go-live, because it requires care and feeding. So at the end of the day, we’re really stepping in more as a partner instead of just a technology provider. That makes us responsible for delivering the CX that was promised and the ROI that was promised. We’d like to think that approach is working for us. We operate the AI-powered CX for more than 100 brands and are currently the top-rated conversational solution on Gartner Peer Insights. So if you’re interested in what others have to say about us, those reviews are a good place to start.
  • #6 Onscreen is the experience we want you to imagine: an experience that is personalized and predictive, where everything happens in natural language with AI that sounds like a human, has the ability to read and record data like a human, and takes cognitive action like a human. This means you can go beyond just automating simple call types like status or balance and automate complex conversations that have multiple back-and-forth exchanges. We do a lot of scheduling, reservations, and complex things like emergency roadside assistance, warranties, returns, and claims. As long as the interaction is somewhat linear in progression, it can be automated. When a customer calls in, they are starting from a place of tension. You want them to self-serve, but they’re going to immediately zero out to wait on hold for an agent unless you’re able to showcase enough intelligence right away to win their confidence to attempt self-service. This means being personalized, predictive, sounding like a human, giving before taking, and accurately understanding their intent from the first open-ended question.
  • #7 Brian
  • #8 Mark is in the final stages of delivering this application to some auto dealerships. I think it’s one everyone can identify with: calling into a dealership to schedule an appointment for some kind of service. Mark, is there any setup you need to give here before kickstarting the demo?
  • #10 Brian
  • #13 Dan
  • #14 Dan
  • #15 Brian
  • #16 OK, so let’s step beyond the technology conversation and into the humans required to actually run it, and open up this black box a little bit. If you were to attempt voice automation on your own, particularly natural language automation, that outer ring is all the jobs or functions where you need experts in their field doing that role. Ultimately, this is why we deliver our technology as a service: because of the complexity involved in delivering a great voice experience. The big thing to understand is that conversational AI is not an off-the-shelf product; it’s merely a tool set. Contrast that with touchtone IVR. In a matter of a few short days, you design, build, and POOF, you're done. You may not even touch it again. Conversational AI is very different. It's a solution that requires ongoing care and feeding. In fact, once you "go live," you've only started. There is nothing easy about conversations with machines. To do voice well, you need real human bodies who are experts in AI-powered CX and committed to the ongoing process of perpetual improvement. This means obsessing over the CX and scrutinizing containment across every interaction to identify points of friction, then iterating and improving week by week. [OPTIONAL IF TIME ALLOWS] So what does that actually mean? It means a team of both developers and trained CX professionals who are experts at tuning and customizing the application and underlying technologies, like speech recognition and natural language processing, to your customer-specific criteria. This means daily monitoring and reporting to troubleshoot for friction and containment, expanding grammars, widening guardrails, finding new data sources, listening to call recordings, tweaking language acoustic models, and QA’ing any change. We are constantly looking for opportunities to tune the application or language models and improve the experience, and you need real human bodies committed to that process.
  • #18 Brian
  • #19 We’re talking about conversational AI that’s purpose-built for telephony and purpose-built for limited-grammar use cases. I’m going to explain in a minute why that’s so important to having a good voice experience. Speech rec over telephony is really hard to do well. That might not surprise anyone considering the poor experiences we’ve all had. But you may not have understood why, or what the bleeding edge of AI is doing to solve for it. It’s very different from speaking directly into your phone or home device, which is a high-def experience. The speech rec on your phone is so good because you are capturing all the highs and lows at the device level, which makes it easy to distinguish utterances and relate those to letters, syllables, and words. But the moment you call into a customer service line and those sound waves travel over outdated telephony infrastructure, the resolution is reduced to 8 kHz in most cases, cutting out the highs and lows by more than half. And if that wasn’t bad enough, it adds noise. This is why conversational AI over telephony is a really difficult challenge. It’s also why transcription-based engines like a Google or Amazon don’t deliver a good enough customer experience at the contact center level: they are now 50% less accurate. So you have to have AI that is purpose-built for this kind of challenge. I should note we have found that Google will perform the best in certain customer service use cases, and I’ll explain where and why in a minute.
  • #20 Due to these challenges, the transcription from speech-to-text engines is wrong more often than you think. Here’s an example from AAA. Our virtual agents handle their emergency roadside assistance calls, which often come in the worst conditions possible: callers are outside their vehicle, in the wind, with traffic noise, on speakerphone. There’s no way we could deliver the accuracy that we do for them if we relied only on speech recognition, because it is wrong so often. I’ll play this call, then we’ll come back to the play-by-play on what the speech rec heard and what our NLU engine did to make sure it was a successful call. [play call] If you were watching closely, you saw that the speech-to-text transcription was wrong. It transcribed very literally what it heard. If you recall, when he tried to say the word “Ford,” the “F” was not even audible. What we heard, and what the speech-to-text heard, was “Ord.” Since “Ord” isn’t a word, it transcribed the closest thing it could find, which was “Aboard.” Since “F” isn’t a word, it transcribed that as “have,” and the “250” was transcribed as “to” and the word “fifty.” If the speech-to-text was so far off, how were we able to get it right?
  • #21 Here’s where our secret sauce comes in, the thing we do uniquely to really raise the accuracy in voice. In customer service you can predict what the caller might say in response to a question. In this case, we knew we were listening for vehicle names, so we were able to program our NLU engine to listen only for vehicle names. And here’s what’s even more important: it listens for anything that sounds even remotely similar to a vehicle name, because speech rec is never 100% right. So even though the speech rec engine was wrong, the NLU engine was able to flag that the transcript didn’t match any vehicle names. And by pattern matching the language acoustic models from what it heard against what we were listening for, the NLU was able to determine that Ford F250 was the closest match. A lot of voice platforms rely on the accuracy of their speech rec and that’s it. And the reason isn’t necessarily a lack of know-how, but rather that they are trying to be a software platform or voice API that has to be all things to all people. Conversely, we’re of the position that speech rec by itself isn’t good enough. That’s why we bundle services with the technology, so we can tailor the AI and tailor the experience to your business, question by question, across every interaction. In our opinion, the experience over voice just isn’t good enough unless it’s augmented by this level of customization. The best way to think of this is as conversational AI that is purpose-built for contact centers, because only customer service asks questions that have a specific range of answers you know you’re going to get. So as long as you know what those grammars are, you can narrow the aperture of what you’re listening for and tune for only those grammars or anything that sounds even remotely similar to one of them. And that’s what really drives up the accuracy.
  • #22 So let’s talk about a use case where the expected response from a customer is much wider than yes/no. Address capture is a really good example. We do address capture for a lot of clients (like Designer Shoe Warehouse and Choice Hotels), but the only reason we can do it, and do it very well, is that we can match against street names as long as we know the zip code. As you can see onscreen, when we get the zip code from a customer, we’re able to do a data dip to pull up street names to match against. This is what gives that really high accuracy. If we didn’t have pattern-matching ability, we would default to a transcription-based approach like Google’s for something like this. We do alphanumeric capture for certain use cases, and we are the only company doing alphanumeric capture: model names and serial numbers for product registration, VINs, policy numbers for insurance. We don’t like alphanumeric capture unless there’s a defined scope or pattern we can match against. The cases I mentioned have that, which is why we’re the only player doing it. In any given sequence, we’re not looking for all letters of the alphabet, only a select few, so we can weight whatever we hear against what we know we’re listening for. We capture the make and model of vehicles for AAA emergency roadside assistance (we also do it for dealerships, which you just heard). That is not a narrow scope. The aperture of makes and models is so wide that you would typically need a transcription-based engine, which, frankly, wouldn’t work very well. But since we can pattern match against a database of makes and models, we’re able to do it far better than you can get from a Google or Nuance or similar engine.
  • #23 DAN