AWS offers a family of AI services that provide cloud-native Machine Learning and Deep Learning technologies, allowing developers to build an entirely new generation of apps that can hear, speak, understand, and converse with application users. When creating chat- and voice-enabled applications, developers can build with Amazon Lex and Amazon Polly, or with the Alexa Skills Kit, now available in Australia and New Zealand. With the Alexa Skills Kit, you can build engaging skills to reach customers through tens of millions of Alexa-enabled devices, like the Amazon Echo and Echo Dot.
Reimagining your user experience with Amazon Lex, Amazon Polly and Alexa Skills Kit - Dev Lounge
1. Re-imagine your user experience with Amazon Lex, Amazon Polly and Alexa Skills Kit
Dev Lounge
Adam Larter
Principal Solutions Architect, Developer Specialist, AWS AU/NZ
March 2018
2. Evolving HCI
1970s: Character Mode
1980s: GUI
1990s: Web
2000s: Mobile
Present: VUI
4. What are my options for VUI?
Amazon Alexa (for example, on an Amazon Echo Dot)
“I want to create skills that leverage the Alexa ecosystem (including AVS on my own hardware)”
Amazon Lex & Amazon Polly
“I want to integrate a voicebot or chatbot into my own website, application or hardware, and other channels such as social networks”
5. Learning outcomes
• Concepts to understand in Voice User Interface design
• Amazon Lex, Amazon Polly and Amazon Alexa Skills Kit controls and features
• Creating and testing your skills & handlers
• Using Lambda functions to control conversation flow
• Managing the conversation context using session-based attributes
• Integrating with a website, mobile app, social platform or device hardware
9. Business requirements
• Add a text-based chatbot to TravelBuddy to complement the existing GUI
• Extend my site’s reach to Facebook Messenger
• Install ‘kiosks’ in public places and use voice interaction
• Create an Alexa Skill to target voice-only and voice-and-video B2C products like Amazon Echo, Amazon Echo Dot and Amazon Echo Show
• All powered by the same dataset on all channels
11. VUI dialogue flow with Lex & Alexa
[Diagram labels: Request: audio (or text for chat); Speech Recognition; Machine Learning; Natural Language Understanding; Your Service; Text to speech; Response; Amazon Lex]
12. VUI dialogue flow with Lex & Alexa
[Diagram labels: Request: audio (or text for chat); Visual component; Speech Recognition; Machine Learning; Natural Language Understanding; Your Service; Text to speech; Response; Amazon Lex]
13. VUI design - concepts
Utterances: Spoken or typed phrases that invoke your intent
Intents: An intent performs an action in response to natural language user input. Custom and built-in intents are supported (example intent: BookHotel)
Slots: Slots are input data required to fulfill the intent
Fulfillment: Fulfillment mechanism for your intent
14. VUI design - concepts
UTTERANCES, INTENTS & SLOTS
Utterances
• I want to book a flight
• I would like to fly to {destinationCity}
• I want to go from {originCity} to {destinationCity}
• I need {numberOfPassengers} tickets to {destinationCity}
Intents
• CheckFlightsToCity
• AMAZON.CancelIntent
• AMAZON.HelpIntent
• AMAZON.StopIntent
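The relationship between utterances, intents and slots can be illustrated with a toy matcher. This is only a sketch: Amazon Lex and Alexa use machine-learned language understanding rather than pattern matching, and the function names templateToRegExp and matchUtterance are hypothetical helpers introduced for this example.

```javascript
// Sample utterances from the slide, grouped under the intent they invoke.
const intents = {
  CheckFlightsToCity: [
    'I want to book a flight',
    'I would like to fly to {destinationCity}',
    'I want to go from {originCity} to {destinationCity}',
    'I need {numberOfPassengers} tickets to {destinationCity}',
  ],
};

// Turn an utterance template into a regular expression in which every
// {slotName} placeholder becomes a named capture group.
function templateToRegExp(template) {
  const pattern = template
    .split(/(\{\w+\})/)
    .map((part) => {
      const slot = part.match(/^\{(\w+)\}$/);
      return slot
        ? `(?<${slot[1]}>.+?)`
        : part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // escape literal text
    })
    .join('');
  return new RegExp(`^${pattern}$`, 'i');
}

// Return the first intent whose template matches, plus the slot values found.
function matchUtterance(text) {
  for (const [intentName, templates] of Object.entries(intents)) {
    for (const template of templates) {
      const match = templateToRegExp(template).exec(text.trim());
      if (match) {
        return { intentName, slots: { ...(match.groups || {}) } };
      }
    }
  }
  return null; // no intent matched
}
```

Calling matchUtterance('I want to go from Sydney to Auckland') yields the CheckFlightsToCity intent with originCity and destinationCity filled in, which is the shape of data your fulfillment code receives from the real services.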
16. Custom slot types
“I want a hotel room that’s {roomType} please”
{roomType} values:
• Penthouse
• Suite
• Double
• Single
17. Custom slot types & synonyms (entity resolution)
“I want a hotel room that’s {roomType} please”
Double: the ID that will be sent as the slot value (or ID value)
Synonyms (Two people, Two beds, Medium): the synonyms the user can say that map to the ID value
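A minimal sketch of what entity resolution does with synonyms, assuming a plain in-memory slot type definition. The services perform this mapping for you; the roomType object layout and the resolveSlotValue helper are hypothetical and only illustrate the idea.

```javascript
// Custom slot type: each value has an ID and optional synonyms.
// Only 'Double' has synonyms on the slide; the other lists are left empty.
const roomType = {
  values: [
    { id: 'Penthouse', synonyms: [] },
    { id: 'Suite', synonyms: [] },
    { id: 'Double', synonyms: ['Two people', 'Two beds', 'Medium'] },
    { id: 'Single', synonyms: [] },
  ],
};

// Map what the user said to the ID value, ignoring case and outer whitespace.
function resolveSlotValue(slotType, spoken) {
  const needle = spoken.trim().toLowerCase();
  for (const { id, synonyms } of slotType.values) {
    const candidates = [id, ...synonyms].map((s) => s.toLowerCase());
    if (candidates.includes(needle)) {
      return id;
    }
  }
  return null; // nothing resolved; the bot should re-elicit the slot
}
```

With this definition, a user who says "two beds" and a user who says "Double" both reach your back-end as the single ID value Double.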
25. TravelBuddy chatbot
The TravelBuddy chatbot allows users to…
• Ask about the weather at a destination
• Ask about the latest news at a destination
• Create an itinerary to fly from an origin to a destination
• Add a hotel to the itinerary
• Commit the itinerary and make the booking
29. Code hooks: trigger Lambda functions
• Customise user interaction
  • Your application logic takes control of the decisions about what slot to elicit next and what prompts to respond with
• Validate user input
  • For each turn in the dialogue, your application logic can validate the user input against your dataset
• Fulfill user intent
  • At the conclusion of the dialogue, your application logic can fulfill the request and respond with a result
31. Dialog code hook response
• dialogAction
  • type
    • Close: no further response
    • ConfirmIntent: yes or no to confirm
    • Delegate: let Amazon Lex decide
    • ElicitIntent: “go to” an intent
    • ElicitSlot: “go to” a slot
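The dialogAction types above can be sketched as a Lambda code hook, assuming the Amazon Lex (V1) Lambda event and response format. The KNOWN_CITIES dataset, the helper names and the prompt wording are assumptions made for this example, not part of the original deck.

```javascript
// Hypothetical dataset of destinations (assumption for this sketch).
const KNOWN_CITIES = ['sydney', 'melbourne', 'auckland'];

// ElicitSlot: "go to" a slot and prompt the user for it.
function elicitSlot(sessionAttributes, intentName, slots, slotToElicit, text) {
  return {
    sessionAttributes,
    dialogAction: {
      type: 'ElicitSlot',
      intentName,
      slots,
      slotToElicit,
      message: { contentType: 'PlainText', content: text },
    },
  };
}

// Delegate: let Amazon Lex decide the next step.
function delegate(sessionAttributes, slots) {
  return { sessionAttributes, dialogAction: { type: 'Delegate', slots } };
}

// Close: no further response is expected from the user.
function close(sessionAttributes, text) {
  return {
    sessionAttributes,
    dialogAction: {
      type: 'Close',
      fulfillmentState: 'Fulfilled',
      message: { contentType: 'PlainText', content: text },
    },
  };
}

function handleCheckFlights(event) {
  const sessionAttributes = event.sessionAttributes || {};
  const { name, slots } = event.currentIntent;
  const city = slots.destinationCity;

  if (event.invocationSource === 'DialogCodeHook') {
    // Validation turn: reject destinations that are not in the dataset.
    if (city && !KNOWN_CITIES.includes(city.toLowerCase())) {
      const cleared = { ...slots, destinationCity: null };
      return elicitSlot(sessionAttributes, name, cleared, 'destinationCity',
        `Sorry, we do not fly to ${city}. Where would you like to go?`);
    }
    return delegate(sessionAttributes, slots);
  }
  // Fulfillment turn: all slots are filled, so finish the conversation.
  return close(sessionAttributes, `Okay, looking up flights to ${city}.`);
}

// In a Lambda deployment this could be wired up as:
// exports.handler = async (event) => handleCheckFlights(event);
```

The same function serves both the validation hook and the fulfillment hook; event.invocationSource tells it which role it is playing on a given turn.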
35. Keeping state
• During the conversation, you need to keep state at each turn in the dialogue:
  • The values of each of the elicited slots are retained between each turn
  • You can use session attributes to store additional metadata
  • Your back-end could serialise state to persistent storage if it makes sense for your application (the Alexa SDK makes this easy!)
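A minimal sketch of carrying state between turns in session attributes, assuming the Amazon Lex (V1) Lambda format, where session attributes are a map of string keys to string values. The itinerary structure and the loadItinerary, saveItinerary and addFlight helpers are hypothetical.

```javascript
// Read the itinerary back out of the session attributes (stored as JSON text).
function loadItinerary(sessionAttributes) {
  const raw = (sessionAttributes || {}).itinerary;
  return raw ? JSON.parse(raw) : { flights: [], hotels: [] };
}

// Return a copy of the session attributes with the itinerary serialised,
// because attribute values must be strings.
function saveItinerary(sessionAttributes, itinerary) {
  return { ...(sessionAttributes || {}), itinerary: JSON.stringify(itinerary) };
}

// Fulfillment for one turn: append a flight and hand the state back to Lex.
function addFlight(event) {
  const { originCity, destinationCity } = event.currentIntent.slots;
  const itinerary = loadItinerary(event.sessionAttributes);
  itinerary.flights.push({ from: originCity, to: destinationCity });
  return {
    sessionAttributes: saveItinerary(event.sessionAttributes, itinerary),
    dialogAction: {
      type: 'Close',
      fulfillmentState: 'Fulfilled',
      message: {
        contentType: 'PlainText',
        content: `Added a flight from ${originCity} to ${destinationCity}.`,
      },
    },
  };
}
```

Because the service echoes the session attributes back on the next request, a later intent such as adding a hotel can call loadItinerary and continue from where the previous turn stopped.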
39. What tools do I need?
• Amazon Lex
• The AWS SDK
• Development environment for Lambda functions (for example, AWS Cloud9)
• AWS CLI
• AWS Console
• Your application logic and code as Lambda functions
• Lambda Lex blueprints can help bootstrap your development
• Import/Export Lex - Alexa skill option
42. What tools do I need?
• Amazon Alexa
• The Alexa SDK (npm install -g alexa-sdk)
• Development environment for Lambda functions (for example, AWS Cloud9) or HTTPS hosting environment (e.g. EC2 or other)
• SMAPI & ASK CLI (npm install -g ask-cli)
• Amazon Developer account
• Alexa Skill Builder
• EchoSim or Echo device for testing (https://echosim.io)
• Import/Export Lex - Alexa skill option
43. Amazon Developer Console
Interactively create your Alexa Skill, design the dialogue flow, test, configure and publish, all from the browser.
44. Skill Management API (SMAPI) / ASK CLI
SMAPI is a set of API operations that allows you to programmatically manage and test Alexa skills and related resources, such as interaction models. The Alexa Skills Kit Command-line Interface (ASK CLI) is a command-line application that lets users create, update, test, and submit Alexa skills for publishing by calling SMAPI under the hood.
[Diagram: the ASK CLI calls SMAPI to create, update, test and submit the Alexa skill, and uploads, downloads and deploys the skill Lambda function]
46. Types of requests sent by Alexa
• LaunchRequest
  • Sent when your skill is invoked without a matching Intent
  • For example: “Alexa, open my simple calculator”
• IntentRequest
  • User speaks an utterance that maps to an Intent
  • May or may not contain elicited slot values
• SessionEndedRequest
  • Sent when the user completes an interaction and the Intent is fulfilled
  • Sent if the Intent is not fulfilled, and a timeout occurs
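The request types can be handled without any SDK by switching on the type field of the incoming JSON. The sketch below assumes the standard Alexa request and response envelope (where the session-end request type is spelled SessionEndedRequest); the speak and handleAlexaEvent helpers and the prompt wording are hypothetical.

```javascript
// Build a minimal Alexa response envelope with plain-text speech.
function speak(text, shouldEndSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text },
      shouldEndSession,
    },
  };
}

function handleAlexaEvent(event) {
  const request = event.request;
  switch (request.type) {
    case 'LaunchRequest':
      // Skill opened by name, with no intent: greet and ask a question.
      return speak('Welcome to TravelBuddy. Where do you want to fly to?', false);
    case 'IntentRequest': {
      // Slots may or may not be filled, so check before using the value.
      const slots = request.intent.slots || {};
      const city = slots.destinationCity && slots.destinationCity.value;
      return city
        ? speak(`Checking flights to ${city}.`, true)
        : speak('Where do you want to fly to?', false);
    }
    case 'SessionEndedRequest':
      // The session is over; Alexa does not play speech for this request,
      // so use it for clean-up only.
      return { version: '1.0', response: {} };
    default:
      throw new Error(`Unsupported request type: ${request.type}`);
  }
}
```

The Alexa SDK wraps exactly this dispatch step, so the same three cases reappear there as named handlers.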
47. Using the Alexa SDK to create skills
• Instantiate the Alexa.handler object to start
• Define and register Intent handlers with a call to registerHandlers()
• ‘Start’ the SDK processing with a call to execute()
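The three steps above can be sketched as follows, assuming version 1 of the alexa-sdk package for Node.js. The handler bodies and the buildLambdaHandler wrapper are hypothetical; the wrapper takes the SDK as a parameter only so that the sketch can be exercised without the package installed.

```javascript
// Intent handlers, keyed by request type or intent name.
const handlers = {
  LaunchRequest() {
    // ':ask' speaks, then keeps the session open with a reprompt.
    this.emit(':ask', 'Welcome to TravelBuddy. Where do you want to fly to?',
      'Where do you want to fly to?');
  },
  CheckFlightsToCity() {
    const city = this.event.request.intent.slots.destinationCity.value;
    // ':tell' speaks and closes the session.
    this.emit(':tell', `Checking flights to ${city}.`);
  },
  'AMAZON.HelpIntent'() {
    this.emit(':ask', 'You can ask me to check flights to a city.', 'Where to?');
  },
};

// Build the Lambda entry point from an SDK object.
function buildLambdaHandler(Alexa) {
  return function handler(event, context, callback) {
    const alexa = Alexa.handler(event, context, callback); // 1. instantiate
    alexa.registerHandlers(handlers);                      // 2. register handlers
    alexa.execute();                                       // 3. start processing
  };
}

// In a deployed skill this could be wired up as:
// exports.handler = buildLambdaHandler(require('alexa-sdk'));
```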
48. ‘States’ in the Alexa SDK
The Alexa Skill SDK supports the notion of ‘States’ to partition the handlers’ implementation, allowing multiple handlers for the same Intent. The SDK dispatches to the handler that matches the current ‘State’. This is an artificial partitioning by the SDK.
Change ‘States’ by setting this.handler.state, then call this.emitWithState() to dispatch to the handler registered for that ‘State’
49. Generating Responses with the Alexa SDK
The Alexa Skill SDK supports two ways to generate Response Objects:
1 Response syntax
this.emit(':${action}', 'responseContent');
For example, to say the provided text and close the session:
this.emit(':tell', 'This is something to say');
For example, to transfer to another Intent:
this.emit('OtherIntent');
• The ResponseBuilder approach is more flexible when creating rich response objects
• You could instead craft the JSON response payload yourself
50. Generating Responses with the Alexa SDK
The Alexa Skill SDK supports two ways to generate Response Objects:
2 ResponseBuilder syntax
For example, to say the provided text and then listen for a reply, with a reprompt:
this.response.speak('This is something to say')
    .listen('Reprompt speech');
this.emit(':responseReady');
• The ResponseBuilder approach is more flexible when creating rich response objects
• You could instead craft the JSON response payload yourself
54. Some differences between Lex & Alexa
Amazon Alexa skills…
• Are not chatbots
• Need an invocation name (with Lex you specify the bot in your code)
• Fire the LaunchRequest if your skill is launched without any Intent
• Call your back-end for every turn of the dialogue
• Can use an HTTPS endpoint (Lex only supports AWS Lambda)
• Use the developer console or the ASK CLI tool to create/test/publish
• Can support multiple directives, such as display and audio replay, depending on the device; your code must only return directives that the target device can handle
56. Lex integration with social platforms
• Simple configuration for social platforms integration
• Facebook Messenger, Slack, Kik and Twilio SMS native support
• Response cards allow the user to select from a set of responses
• Easily return response cards using the helper functions in the Lambda Blueprints
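A minimal sketch of returning a response card from a Lambda code hook, assuming the Amazon Lex (V1) response format with a generic card attachment. The elicitRoomTypeWithCard function, the card titles and the prompt text are hypothetical; the room types reuse the custom slot values shown earlier in the deck.

```javascript
// Ask for the roomType slot and offer the options as buttons on a card.
function elicitRoomTypeWithCard(event) {
  const options = ['Penthouse', 'Suite', 'Double', 'Single'];
  return {
    sessionAttributes: event.sessionAttributes || {},
    dialogAction: {
      type: 'ElicitSlot',
      intentName: event.currentIntent.name,
      slots: event.currentIntent.slots,
      slotToElicit: 'roomType',
      message: {
        contentType: 'PlainText',
        content: 'What type of room would you like?',
      },
      responseCard: {
        version: 1,
        contentType: 'application/vnd.amazonaws.card.generic',
        genericAttachments: [
          {
            title: 'Room type',
            subTitle: 'Choose one',
            // Each button shows its text and sends its value back as input.
            buttons: options.map((option) => ({ text: option, value: option })),
          },
        ],
      },
    },
  };
}
```

On channels that can render cards, such as Facebook Messenger, the user taps a button instead of typing; voice-only clients simply receive the message.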
59. What makes testing VUIs different?
1. Voice is a non-linear input → give up on 100% test coverage… there are 7 billion voices out there
[Diagram label: Automated speech recognition (ASR)]
60. What makes testing VUIs different?
2. You build on top of an always-learning cloud service that keeps changing its behavior → how do you simulate that?
[Diagram labels: ASR; Natural language understanding (NLU)]
61. What makes testing VUIs different?
3. Your skill deals with non-deterministic user interactions → how do you make sure tests cover the most common situations?
[Diagram labels: ASR; NLU; Skill Lambda; Request handlers]
62. What makes testing VUIs different?
4. There are no barriers for users to interact with your skill → your tests should cover the robustness of your skill using unexpected user input
User: “Alexa, open my skill”
Alexa: “Welcome to my skill. Where do you want to fly to?”
User: “Banana”
User: “I want to blah blah blah”
Lex: “Sorry, can you please repeat that?”
User: “Banana”
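A robustness check along these lines can be sketched as a small harness that feeds unexpected input to a handler and confirms it reprompts instead of failing. Everything here is an assumption for illustration: answerDestination is a stand-in for your real skill or bot logic, and the list of unexpected inputs is only a starting point.

```javascript
// Hypothetical dataset and stand-in handler for the "where to?" question.
const KNOWN_CITIES = ['sydney', 'auckland'];

function answerDestination(utterance) {
  const text = String(utterance || '').trim().toLowerCase();
  const city = KNOWN_CITIES.find((candidate) => text.includes(candidate));
  return city
    ? { speech: `Checking flights to ${city}.`, endSession: true }
    : { speech: 'Sorry, can you please repeat that?', endSession: false };
}

// Inputs a real user might produce that the happy path never sees.
const UNEXPECTED = [
  'Banana',
  'I want to blah blah blah',
  '',
  '   ',
  null,
  '12345',
  'x'.repeat(5000),
];

// Run every input through the handler and collect anything that throws
// or fails to keep the session open with a reprompt.
function findRobustnessFailures(handler, inputs) {
  const failures = [];
  for (const input of inputs) {
    try {
      const reply = handler(input);
      const reprompted = Boolean(reply)
        && typeof reply.speech === 'string'
        && reply.speech.length > 0
        && reply.endSession === false;
      if (!reprompted) {
        failures.push({ input, reason: 'did not reprompt' });
      }
    } catch (err) {
      failures.push({ input, reason: `threw: ${err.message}` });
    }
  }
  return failures;
}
```

An empty result from findRobustnessFailures(answerDestination, UNEXPECTED) means every unexpected input was met with a reprompt rather than an error or a premature end of session.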
69. ASK CLI to access testing APIs via SMAPI
[Diagram labels: Echo device; ASR; NLU; Skill Lambda; Request handlers; AVS API; EchoSim.io; Service simulator; Skill Simulation API; Skill Invocation API; ASK Skill Management API (SMAPI); Alexa Skills Kit Command-line Interface (ASK CLI)]
70. Custom extensions and OS tools
[Diagram labels: Echo device; ASR; NLU; Skill Lambda; Request handlers; Unit testing SDK; AVS API; EchoSim.io; Service simulator; Custom service simulator; Custom device simulator; Skill Simulation API; Skill Invocation API; ASK Skill Management API (SMAPI); Alexa Skills Kit Command-line Interface (ASK CLI)]
71. Multipath conversations
There are many ways for a user to move through your skill. The path varies for one of two reasons: either the skill responds differently in particular situations (the date is a Monday; the user is a first-timer; content is personalized or randomized, etc.) or the user makes one of many possible decisions.
The test client can take these factors into consideration and execute non-deterministic test paths.
[Diagram: a test client drives the skill through the Skill Invocation API along a test path of conversation steps 1 to 3]
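One way a test client could choose its test paths is to model the conversation as a graph of steps and user decisions, then either enumerate every path or pick one at random per run. This is a sketch under assumptions: the conversation graph, the step names and both helper functions are hypothetical, and the graph is assumed to contain no cycles.

```javascript
// Hypothetical conversation graph: each step has an utterance to send and
// the steps the user could choose next. Assumed to be acyclic.
const conversation = {
  start: { say: 'open travel buddy', next: ['askWeather', 'bookFlight'] },
  askWeather: { say: 'what is the weather in Sydney', next: ['bookFlight', 'stop'] },
  bookFlight: { say: 'I want to fly to Sydney', next: ['stop'] },
  stop: { say: 'stop', next: [] },
};

// List every complete path from the given step to a step with no successors.
function enumeratePaths(graph, step = 'start', path = []) {
  const current = [...path, step];
  const { next } = graph[step];
  if (next.length === 0) {
    return [current];
  }
  return next.flatMap((following) => enumeratePaths(graph, following, current));
}

// Walk one path, choosing each user decision with the supplied random source.
function pickRandomPath(graph, random = Math.random) {
  const path = ['start'];
  let step = 'start';
  while (graph[step].next.length > 0) {
    const options = graph[step].next;
    step = options[Math.floor(random() * options.length)];
    path.push(step);
  }
  return path;
}
```

A test client could send graph[step].say for each step of a chosen path to the skill and assert on each reply; passing a seeded random function to pickRandomPath makes a failing run repeatable.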
74. Amazon Polly
Converts text/SSML to life-like speech
52 voices
25 languages
Low latency, real time
Fully managed
• Automatic, accurate text processing
• Intelligible and easy to understand
• Add semantic meaning to text
• Customized pronunciation
• Supports plain text or SSML via SDKs, CLI & Console
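A minimal sketch of preparing an SSML request for Amazon Polly, assuming the parameter names of the SynthesizeSpeech operation (OutputFormat, TextType, Text, VoiceId). The escapeXml and buildSpeechRequest helpers, the sentence wording and the choice of the Nicole voice are assumptions for this example.

```javascript
// Escape characters that would otherwise break the SSML markup.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Build SynthesizeSpeech parameters for a short weather announcement.
function buildSpeechRequest(destination, temperatureC) {
  const ssml =
    `<speak>The weather in <emphasis>${escapeXml(destination)}</emphasis> is ` +
    `<say-as interpret-as="cardinal">${temperatureC}</say-as> degrees.` +
    '<break time="300ms"/></speak>';
  return {
    OutputFormat: 'mp3',
    TextType: 'ssml', // use 'text' for plain text input
    Text: ssml,
    VoiceId: 'Nicole',
  };
}

// With the AWS SDK for JavaScript (v2) the request could be sent as:
// const AWS = require('aws-sdk');
// const audio = await new AWS.Polly().synthesizeSpeech(params).promise();
```

Keeping the request construction separate from the SDK call makes the SSML easy to unit test without network access.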
75. TravelBuddy Kiosk: key features
• Hot word detection to get kiosk’s attention
  • Snowboy https://snowboy.kitt.ai/
• Silence detection during live speech capture for start/stop
  • SoX http://sox.sourceforge.net/
• Streaming of audio capture in real time to reduce latency
  • AWS IoT
• NLU powered by Amazon Lex
• Text to speech powered by Amazon Polly
81. Join the Amazon Alexa Community
Create your developer account and get registered for our Skills workshops:
www.developer.amazon.com/alexa-skills-kit/anz
@AlexaDevs alexa-anz-marketing@amazon.com
Alexa Skills Life Hacks Build-a-thon and Workshop
https://aws.amazon.com/summits/sydney/
83. Join us between 10 and 12 April 2018
Learn how cloud technology can help your business lower costs,
improve efficiency and innovate at scale.
Register Today!
#AWSOnAir
@AWSCloudANZ
#AWSSummit