3-in-1 talk on Serverless Chatbots, Alexa skills & Voice UI best practices (the Amazon way)


Slides for Serverless Toronto User Group meetup cover:
1. Creating Serverless Chatbots for Twilio SMS, Slack & Facebook in minutes!
2. Alexa Bot/Skill from the same Node.js codebase! Rework of the Alexa code for the "AWS Lambda purists".
3. Important (non-Serverless) Voice UI specific topics:
• An in-depth look at creating Alexa Skills
• Understanding Voice-First design & how it differs from designing mobile and web apps, even Interactive Voice Response (IVR) systems
• Best practices for designing Voice User Interfaces (VUI).

The session was not recorded, but "The Good, the Bad and the Ugly of the voice-first experience" demos & sample Alexa Skill Interaction Model were uploaded to for you to enjoy.

Published in: Science

  1. 3-in-1 Talk on Serverless Chatbots, Alexa Skills, and Voice UI Best Practices (The Amazon way). Author: Daniel Zivkovic, Cloud Solutions Architect, TriNimbus. Date: July 2018
  3. 3. SERVERLESS TORONTO USER GROUP 3 About this Meetup
  4. About my company, role… sponsors
  5. Around the room
  6. Serverless – Where we left off: Event driven compute as a Glue
  7. Chatbots are one of the Use Cases
      1. RESTful Microservices
      2. Web Applications
      3. Mobile Backends
      4. IoT
      5. Data Processing – Batch / Streaming / Big Data
      6. IT Automation – DevOps
      7. Chatbots & Amazon Alexa
      Refresher Serverless course at: or
      And again, since the only cloud I know is AWS, I’m limiting the presentation to this brand, but most concepts are portable to other clouds & other AI-assisted services/devices. Let’s dive in…
  8. Bots, Chatbots, Virtual Assistants, Voice Assistants are cool buzzwords! They are everywhere, and it’s getting harder and harder to understand what they really are and what they are not. The lines are getting blurry as the media uses the word “bot” to describe simple scripts as well as intelligent, conversational versions; even “messaging” and “chatbot” are intertwined for some. That’s why it’s difficult to explain the actual differences between a bot, chatbot, virtual assistant, and messaging, and to understand what customers really want.
  9. Do all these terms mean the same thing?
      • A bot (short for “robot”) is an automated program that runs over the Internet. Some bots run automatically, while others only execute commands when they receive specific input. Chat bots were one of the first types of automated programs to be called “bots” and became popular in the 1990s, with the rise of online chatrooms.
      • Chatbots have evolved since then. Some say that “virtual assistant” and “chatbot” are the same; some disagree.
      • Both chatbots and virtual assistants are more intelligent than a simple bot. Whereas a bot only follows the script, a chatbot or virtual assistant has more options for interpreting the command. Supported by artificial intelligence (AI), they understand the meaning of what was said or typed. They can look at the phrases but also understand what specific words mean in a certain context.
      Why Voice Assistants are in the same category as Chatbots
      Voice assistants are chatbots; it’s just that instead of written text we use voice. One difference is that they need a “wake word” to know you are addressing them (like a person’s name). That’s why the common name for voice assistants & text assistants is Virtual Assistants.
  10. Products Landscape: Conversational Systems
  11. Messaging Platforms Used for/by chatbots
      1. Facebook
      2. Slack
      3. Skype
      4. Viber
      5. WhatsApp
      6. Telegram
      7. Twilio (Phone call & SMS)
      8. Amazon Connect (cloud-based Contact Centre)
      …
  12. Evolution of Interaction with Technology
  13. The De-evolution of Mankind
  14. New Evolutionary Leap for Mankind. Notice how, throughout human history, “the tools” were always within hand's reach.
  15. • With voice, for the first time in human evolution, we can operate machines that are out of hand's reach!
      • That explains why setting a timer for N minutes is “the killer voice app” 😊
  16. I gave up on speech recognition over a decade ago (blaming my thick accent, hoarseness & pace of speech) … so How is this possible?
      Enabled by the advances in Artificial Intelligence (AI) technologies. Recent developments in machine learning (ML) algorithms have improved the performance of AI tasks such as:
      • Automatic Speech Recognition (ASR),
      • Natural Language Understanding (NLU),
      • Text to Speech Synthesis (TTS), and
      • Image Recognition.
  17. Voice Dialogue System Architecture
      1. Noise to words
      2. Words to meaning
      3. Meaning to task (command/intent/function) and/or send
      4. Responses and/or current dialogue context/state back to user, using
      5. Text-to-Speech (TTS)
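The five stages on this slide can be sketched as a chain of small functions. This is a toy illustration only: every function name below is hypothetical, speech recognition and speech synthesis are stubbed out with plain strings, and the "understanding" step is a single regular expression rather than real NLU.

```javascript
// Toy voice dialogue pipeline. All names are hypothetical; ASR and TTS are stubs.

// 1. Noise to words: a real system would run speech recognition on audio.
function recognizeSpeech(audio) {
  return audio.transcript; // stub: pretend the audio is already transcribed
}

// 2. Words to meaning: map the text to an intent plus its parameters.
function understand(text) {
  const match = /set (?:a )?timer for (\d+) minutes?/i.exec(text);
  if (match) {
    return { intent: 'SetTimer', minutes: Number(match[1]) };
  }
  return { intent: 'Unknown' };
}

// 3. Meaning to task, and 4. the response plus dialogue state for the user.
function fulfil(meaning) {
  if (meaning.intent === 'SetTimer') {
    return { text: 'Timer set for ' + meaning.minutes + ' minutes.', endSession: true };
  }
  // Unknown intent: keep the session open and ask again.
  return { text: 'Sorry, what would you like to do?', endSession: false };
}

// 5. Text-to-Speech: a real system would synthesize audio here.
function synthesize(response) {
  return { speech: response.text, endSession: response.endSession };
}

function handleUtterance(audio) {
  return synthesize(fulfil(understand(recognizeSpeech(audio))));
}

console.log(handleUtterance({ transcript: 'Set a timer for 5 minutes' }).speech);
```

On a platform like Alexa, stages 1, 2 and 5 are handled by the service, so the developer mostly writes the equivalent of `fulfil`.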
  18. Technology is moving toward Intelligent AI-based Conversations … that until recently were seen only in Sci-Fi movies: HAL from “2001: A Space Odyssey”, Samantha from the movie “Her”
  19. Ava from “Ex Machina”
  20. or Teddy and Dolores from “Westworld”
  21. When machines will pass the Turing Test:
  22. Is Artificial Intelligence a threat to Mankind?
      • Elon Musk thinks AI is our biggest existential threat
  23. • I believe that AI & ML are a threat only to white-collar workers: AI will be for us what machines were for blue-collar workers. But this still falls into Sci-Fi, much like Jules Verne’s Nautilus submarine did in “Twenty Thousand Leagues Under the Sea” back in 1870… computers still need us 😊
  24. 2018 Reality Check. Our users have started to expect intelligence, and if the voice experience does not go well, 97% will not come back to our app… Let’s round it down to 9 out of 10. So, we have to create a really good voice experience… Voice-first design differs… And we have to fake intelligence by coding intelligent-sounding responses, mimicking what humans would do.
  25. We’re still faking it. With chatbots we're faking a lot of intelligence, much like the short-statured chess player hidden inside the Mechanical Turk in the 18th century. You are now the “developer in a box”. Mechanical Turk: an 18th-century automaton that could beat human chess opponents seemingly marked the arrival of artificial intelligence.
  26. Fiction vs. Reality – Confusion or Marketing Opportunity
      Everybody uses the word chatbot, so it allows vendors to sell something very simple and narrow in function. Whoever buys into the vendor’s claims expects something more sophisticated and, when everything is deployed, is often disappointed in what they get.
      The explosion of new buzzwords: AI, Machine Learning (ML), Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) as a subset of the wider world of Natural Language Processing (NLP) – especially
      • when mixed with Bot, Chatbot, VA, CUI, VUI terms, and
      • multiplied by numerous Messaging Platforms,
      enables creative/pretentious marketing, and becomes the source of good PR:
  27. From Big Banks: like Wells Fargo, which “received industrywide attention as the first U.S. bank to pilot an artificial intelligence chatbot on Facebook Messenger”, or TD, promoting info about Alexa App availability to death with ~245,000 results for search
  28. and Non-profits: e.g. the WordPress community at a recent AWS Launchpad event titled “Boost Your WordPress Website with AI & ML”
  29. to Personal Branding: “Building a Twitter art bot with Python, AWS, and socialist realism art” is to me the best blog title ever!
  30. Will it Really Work? The AI industry is still young & error-prone, e.g. Microsoft shut down Tay after “Twitter taught Microsoft’s AI chatbot to be a racist asshole in less than a day” back in 2016.
  31. Coding Serverless Chatbots
      This mixture of buzzwords, technologies, vendors & errors produces too many options and risks… so it is easy to miss the big picture… and hard to find the needle in the “haystack of chatbots”.
      The Plan for Today is
      1. Find the tools that will help us cast the widest net
          • Create Serverless Chatbots for Twilio SMS, Slack & Facebook in minutes
          • Create Alexa Bot/Skill from the same codebase
          • Rework Alexa code for the “Serverless purists”
      2. Cover important (non-Serverless) Voice UI specific topics:
          • In-depth look at creating Alexa Skills
          • Understanding Voice-First design
          • Best practices for designing Voice User Interfaces
  32. Casting the Widest Net… Tooling-wise…
  33. Meet Claudia.js. Discovering Claudia in the “Serverless Applications with Node.js” book felt like discovering Spring Framework in Rod Johnson’s book “Professional Java Development” back in 2005!
  34. Claudia has a much narrower scope than Serverless Framework: just AWS Cloud, just the Node.js language. Also:
      • Claudia is a command-line deployment utility, not a framework.
      • It does not abstract away AWS services, but instead automates all the error-prone deployment and configuration tasks and sets everything up the way JavaScript developers expect out of the box.
      It comes with two very useful Node.js libraries to make getting started with AWS easier:
      • Claudia API Builder allows you to create the API on API Gateway, and
      • Claudia Bot Builder allows you to create chatbots for many messaging platforms.
  35. Claudia Bot Builder
  36. Supported messaging platforms as of July 2018:
      1. Facebook Messenger
      2. Slack (channel slash commands and apps with slash commands)
      3. Skype
      4. Viber
      5. Telegram
      6. Twilio (SMS service)
      7. Amazon Alexa
      8. Line
      9. Kik
      10. GroupMe
      Just Academic? Claudia.js & Claudia Bot Builder – not just academic, used in real products like (by the Claudia creator) and (by Montreal’s
  37. OK, Let’s Create Serverless Chatbots in Minutes
      It is so easy that I feel a little bit like that Mechanical Turk: with the right tools, there is no magic to it 😊
      Let’s code a fictitious Customer Support bot, over SMS, Slack, Facebook & Alexa… and then rework the Alexa example. Our bot will have dumbed-down back-end logic:
  38. Global Dependencies
      1. Own an AWS account and properly set up the AWS credentials file.
      2. Install Node.js and its package manager, NPM.
      3. Install Claudia from NPM as a global dependency.
          $ npm install claudia -g
      Local Dependencies
      Create an empty folder, and a new NPM project inside it:
          $ npm init -y
          $ npm install claudia-bot-builder -S
          $ npm install huh -S   # generate dynamic content with excuse generator
  39. $ cat bot.js
          var botBuilder = require('claudia-bot-builder'),
              excuse = require('huh');
          module.exports = botBuilder(function (message) {
            return 'Thanks for sending ' + message.text +
              '. Your message is very important to us, but ' + excuse.get();
          });
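One way to try the bot's reply logic without deploying anything is to pull it into a plain function and pass the excuse generator in as an argument. This is a sketch, not part of the original demo: `buildReply` and `fakeExcuse` are hypothetical names, and the only things assumed from the slide are that the incoming message has a `text` property and the excuse generator has a `get()` method.

```javascript
// Hypothetical refactoring: the reply logic as a pure function that can be
// exercised locally, with the excuse generator injected.
function buildReply(message, excuse) {
  return 'Thanks for sending ' + message.text +
    '. Your message is very important to us, but ' + excuse.get();
}

// Stand-in for the excuse generator so the function can run offline.
const fakeExcuse = { get: () => 'our servers are on a coffee break.' };

console.log(buildReply({ text: 'help' }, fakeExcuse));
```

In `bot.js` the same function could then be wired in as `botBuilder(message => buildReply(message, excuse))`, keeping the deployed behaviour unchanged.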
  40. Twilio SMS
          $ claudia create --region us-east-1 --api-module bot --configure-twilio-sms-bot
      Follow the prompts… each Messaging Platform has specific setup requirements, so you will need to check its documentation. E.g. Twilio SMS setup requires:
          Twilio Account ID: xyz
          Twilio Auth Token: zyx
          Twilio SMS Number: 647-931-6363
          …
          "twilio": " .com/latest/twilio",
      Finish Twilio SMS setup by specifying the webhook for the specified phone number:
  41. Try it, text anything to 647-931-6363 😊
  43. Slackbot
          $ claudia update --configure-slack-slash-command
  45. Facebook Messenger
          $ claudia update --configure-fb-bot
  46. Alexa
          $ claudia update --configure-alexa-skill
          …
          "alexa": "https:// .com/latest/alexa"
      Plus, some more configs… then
  47. Alexa Bot – Demo Time. Test it: “Alexa, ask Messenger to send support request”. Voice 539 - alexa demo.m4a
  48. Alexa – Take 2
      Claudia Bot Builder supports Alexa skills, but because Alexa can trigger a Lambda function directly, you can save money and decrease latency if you deploy the skill without an API Gateway.
      Code
      $ cat skill.js
          const alexaSkillKit = require('alexa-skill-kit')
          exports.handler = function(event, context) {
            alexaSkillKit(event, context, parsedMessage => {
              return 'Hello there! This is Alexa speaking.';
            })
          }
  49.     $ npm init -y
          $ npm install alexa-skill-kit --save
          $ claudia create --region us-east-1 --handler skill.handler --version skill
          $ claudia allow-alexa-skill-trigger --version skill
          {
            "Sid": "Alexa-1234567890321",
            "Effect": "Allow",
            "Principal": { "Service": "" },
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:lambda:us-east-1:987654321012:function:alexa:skill"
          }
          $ claudia update --region us-east-1 --handler skill.handler --version skill
  50. Point Alexa Skill Endpoint to AWS Lambda
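For comparison, here is roughly what a directly invoked Lambda hands back to Alexa when no helper library is involved. This is a minimal sketch under stated assumptions, not the code from the talk: `buildSpeechResponse` and `handler` are illustrative names, and only the plain-text speech part of the Alexa response format is shown.

```javascript
// Sketch of an Alexa skill handler without any helper library.
// In a deployed Lambda this function would be exported as `exports.handler`.

// Wrap a sentence in the JSON envelope Alexa expects for plain-text speech.
function buildSpeechResponse(text, shouldEndSession) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: shouldEndSession
    }
  };
}

async function handler(event) {
  const requestType = event.request && event.request.type;
  if (requestType === 'LaunchRequest') {
    // Skill opened without a specific request: greet and keep listening.
    return buildSpeechResponse('Hello there! How can I help?', false);
  }
  if (requestType === 'IntentRequest') {
    return buildSpeechResponse('Hello there! This is Alexa speaking.', true);
  }
  // SessionEndedRequest and anything else: nothing to say.
  return { version: '1.0', response: {} };
}
```

The helper library on the previous slide builds this envelope for you; writing it by hand mainly makes the request and response shapes visible.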
  51. Alexa – Take 3 begins
      The full story about Alexa Devices & Skills, Skill creation/deployment & Voice-first design follows…
      In Depth Look at Alexa Skills
      • What is Voice User Interface (VUI)?
      • High Level View
      • Creating an Alexa Skill
      • ASK Developer Console – Demo Time
      What is Voice User Interface (VUI)?
      Voice user interfaces (VUIs) allow people to use voice input to control computers and devices. They enable that “Sci-Fi Movie”-like experience.
      High Level View…
  52. Meet the Alexa Device Family
  53. Alexa Skills Kit (ASK). There’s also Alexa Voice Service (AVS) for creating Alexa-enabled products, but it is not in scope of this talk.
  54. Alexa Skill Invocation: “Alexa, tell messenger to send {support request}”
  55. Intents, Utterances, and Slots
  56. Variables (slots) in One-shot Invocation
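To make intents, sample utterances and slots concrete, the sketch below writes a tiny language model in the shape used by the Alexa interaction model JSON and pairs it with a toy matcher. The intent name, slot name and custom type are made up for this example, and the regular-expression matching is only an illustration: Alexa's actual NLU is trained on the samples and is far more forgiving than exact pattern matching.

```javascript
// Illustrative language model; names like SendRequestIntent and REQUEST_TYPE
// are hypothetical, not taken from the talk's skill.
const interactionModel = {
  interactionModel: {
    languageModel: {
      invocationName: 'messenger',
      intents: [
        {
          name: 'SendRequestIntent',
          slots: [{ name: 'requestType', type: 'REQUEST_TYPE' }],
          samples: ['send {requestType}', 'to send {requestType}']
        }
      ],
      types: [
        { name: 'REQUEST_TYPE', values: [{ name: { value: 'support request' } }] }
      ]
    }
  }
};

// Toy matcher: turn each sample into a regular expression in which every
// {slot} placeholder captures free text, and return the first intent that fits.
function matchUtterance(model, utterance) {
  for (const intent of model.interactionModel.languageModel.intents) {
    for (const sample of intent.samples) {
      const slotNames = [];
      const pattern = sample.replace(/\{(\w+)\}/g, (_, name) => {
        slotNames.push(name);
        return '(.+)';
      });
      const match = new RegExp('^' + pattern + '$', 'i').exec(utterance.trim());
      if (match) {
        const slots = {};
        slotNames.forEach((name, i) => { slots[name] = match[i + 1]; });
        return { intent: intent.name, slots: slots };
      }
    }
  }
  return null; // no sample matched
}

console.log(matchUtterance(interactionModel, 'send support request'));
```

The part of the spoken sentence before the sample ("Alexa, tell messenger to…") is the wake word plus invocation name, which the platform strips off before intent matching.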
  57. Problems with One-shot / Single-breath Invocations
      • Command stops at silence, so you cannot use it for dictation
      • Long commands cause cognitive overload
  58. Solution to cognitive overload: use conversation, it’s more natural… like when you call your travel agent.
  59. Creating an Alexa Skill
      Amazon Developer Portal
      • Invocation name
      • Language Model (Intents, Slots, Samples)
      • Publishing Details
      AWS Management Console – optionally
      • Web service code – Skill Fulfilment & Dialog Management
  60. Fulfilment Implementation Choices
  61. Voice Input / Request Path
  62. Automatic Speech Recognition
  63. Voice Output / Response Path
  64. ASK Developer Console – Demo Time. Alexa Skills Kit = ASK
  65. Understanding Voice First Design
      • How Building for Voice Differs from Traditional App Design
      • Problems with Hierarchical Graph UI
      • Frame-based UI design alternative
      • Dialog Management – Demo Time
      • Emerging Voice-first Design Patterns
  66. How Building for Voice Differs from Traditional App Design
      Wording choices: there are often too many ways to express the same intention, so the bot needs to be trained.
  67. Talk with Users, not at Users – but How? Multi-turn dialogs can reduce the cognitive overload caused by one-shot / single-breath commands.
  68. Problems with Hierarchical Graph UI
      Traditional Web sites, Mobile apps, and Interactive Voice Response (IVR) systems force users down a specific path in the hierarchical graph.
      Information Architecture – typical Website
      Website hierarchies are hard to navigate/skim via voice. E.g. you are looking for the routing number on the Banking website… it is buried deep down in the site taxonomy tree, because it’s rarely needed:
  69. In voice, the opposite is true. You'd never want the user to have to say, “Ask ABC Bank for: menu, menu, menu, routing number.” Instead, you’d want to enable the user to simply say, “Ask ABC Bank for my routing number.” In other words, the routing number intent is now presented at the top level.
      Turn the taxonomy tree upside down: make all options top-level; this allows finding the needed info quickly.
      While menus add depth to GUIs, they introduce friction to voice-first UIs. Voice interactions should instead offer their experience at the top level, without the user needing to learn the information architecture.
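The "turn the taxonomy tree upside down" idea can be sketched as flattening a nested menu into a list of top-level intents. The bank menu and intent names below are invented for illustration; the point is only that nesting depth disappears once every leaf becomes directly addressable.

```javascript
// Hypothetical site taxonomy for a bank; labels and intent names are illustrative.
const siteMenu = {
  label: 'Home',
  children: [
    {
      label: 'Accounts',
      children: [
        { label: 'Balance', intent: 'GetBalanceIntent' },
        {
          label: 'Details',
          children: [
            { label: 'Routing number', intent: 'GetRoutingNumberIntent' }
          ]
        }
      ]
    },
    { label: 'Branches', intent: 'FindBranchIntent' }
  ]
};

// Collect every leaf as a top-level intent, ignoring how deeply it was nested.
function flattenToTopLevel(node, intents = []) {
  if (node.intent) {
    intents.push({ intent: node.intent, phrase: node.label.toLowerCase() });
  }
  (node.children || []).forEach(child => flattenToTopLevel(child, intents));
  return intents;
}

console.log(flattenToTopLevel(siteMenu));
```

Here the routing number sat three levels down in the menu, yet it ends up as a peer of every other intent, which is what lets a user ask for it in a single sentence.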
  70. Finite State - typical IVR call flow
      • System completely controls the conversation with the user.
      • It asks the user a series of questions,
      • ignoring (or misinterpreting) anything the user says that is not a direct answer to the system’s questions.
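A finite-state call flow of this kind can be sketched in a few lines. The trip-booking questions and the `runIvr` name are assumptions made for the example; what matters is that the questions are asked in a fixed order and each answer is bound to the current question, whatever the caller actually said.

```javascript
// Finite-state flow: fixed questions in a fixed order (illustrative prompts).
const ivrStates = [
  { slot: 'fromCity', prompt: 'Where are you leaving from?' },
  { slot: 'toCity', prompt: 'Where are you going?' },
  { slot: 'travelDate', prompt: 'When are you travelling?' }
];

// `answers` is what the caller says at each step, in order.
function runIvr(answers) {
  const collected = {};
  const transcript = [];
  ivrStates.forEach((state, i) => {
    transcript.push(state.prompt);
    // Whatever the caller says is taken as the answer to the current question only.
    collected[state.slot] = answers[i];
  });
  return { collected: collected, transcript: transcript };
}

// The caller over-answers in the first turn, but the system still asks everything.
const call = runIvr(['Boston to Toronto on Sunday', 'Toronto', 'Sunday']);
console.log(call.transcript.length, call.collected.fromCity);
```

The first answer already contained all three facts, yet the flow stores the whole sentence as the departure city and still asks the remaining two questions.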
  71. Frame-based UI Design Alternative
  72. Frame-based UI…
      • Frames are mixed-initiative, because conversational initiative can shift between system & user.
      • User can answer multiple questions at once (over-answer).
      • System asks questions of user, filling any slots that user specifies.
      • If user answers more questions at once, system has to fill slots and not ask these questions again.
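The slot-filling behaviour described in these bullets can be sketched as follows. The frame, prompts and the `nextTurn` function are hypothetical, and `turnSlots` stands in for whatever the NLU layer extracted from the user's latest sentence.

```javascript
// A frame is a set of slots to fill; prompts are illustrative.
const tripFrame = [
  { slot: 'fromCity', prompt: 'Where are you leaving from?' },
  { slot: 'toCity', prompt: 'Where are you going?' },
  { slot: 'travelDate', prompt: 'When are you travelling?' }
];

// Merge whatever slots the user supplied in this turn, then ask only for
// the first slot that is still empty.
function nextTurn(filled, turnSlots) {
  const updated = Object.assign({}, filled, turnSlots);
  const missing = tripFrame.find(item => updated[item.slot] === undefined);
  return {
    filled: updated,
    prompt: missing ? missing.prompt : null // null means the frame is complete
  };
}

// The user over-answers: "I'm leaving Boston on Sunday" fills two slots at once.
let state = nextTurn({}, { fromCity: 'Boston', travelDate: 'Sunday' });
console.log(state.prompt); // only the destination is still missing
state = nextTurn(state.filled, { toCity: 'Toronto' });
console.log(state.prompt);
```

Because the next prompt is derived from what is still missing rather than from a fixed sequence, the user can volunteer information in any order and is never asked for it twice.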
  73. Dialog Management in Alexa
  74. Dialog Management – Demo Time
      Plan My Trip Demo: an Alexa cookbook “recipe” that demonstrates Alexa features related to dialog management (like the Dialog.Delegate directive) my-trip
      1. Alexa open “Plan My Trip Demo”
          • I’m leaving Boston
      2. Alexa tell “Plan My Trip Demo, I'm going skiing on Sunday”
      3. Demonstrate how a non-US city like Toronto does not mesh nicely with my accent.
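The core of delegating a dialog to Alexa looks roughly like the sketch below: while the request's `dialogState` is not `COMPLETED`, the skill answers with a `Dialog.Delegate` directive and lets Alexa prompt for the missing slots defined in the interaction model. The function name and the slot names (`fromCity`, `toCity`, `travelDate`) are assumptions for this example rather than a copy of the cookbook code.

```javascript
// Sketch of a handler for a multi-turn intent using Dialog.Delegate.
function handlePlanMyTrip(event) {
  const request = event.request;

  if (request.dialogState !== 'COMPLETED') {
    // Hand the conversation back to Alexa so it can prompt for missing slots.
    return {
      version: '1.0',
      response: {
        directives: [{ type: 'Dialog.Delegate' }],
        shouldEndSession: false
      }
    };
  }

  // All required slots are filled: read them and respond.
  const slots = request.intent.slots;
  const text = 'Planning your trip from ' + slots.fromCity.value +
    ' to ' + slots.toCity.value + ' on ' + slots.travelDate.value + '.';
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: true
    }
  };
}
```

The prompts themselves live in the interaction model, so the fulfilment code stays free of question-asking logic.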
  75. Demo Code Generator for the same Interaction Model
      Paste Interaction Model into then create AWS Lambda from the resulting code… and wire into another Alexa skill.
      1. Alexa open “code generator demo”
      2. Help
      3. I'm leaving Boston tomorrow
  76. Emerging Voice-first Design Patterns
      1. Be adaptable: Let users speak in their own words
      2. Be personal: Individualize your entire interaction – have context & history
      3. Be available: Collapse your menus; make all options top-level
      4. Be relatable: Talk with them, not at them – over-answers, ambiguities, implicit confirmations
  77. Closing Thoughts
      • Good Use Cases
      • Misuse Cases & Tips
      • Let’s go deeper
      • Serverless Toronto next steps
  78. Good Use Cases
      Focus on apps that remove friction points in life. E.g. every time you can make interaction with machines:
      1. Faster – break the “within hand's reach” rule (timers, alarms & switches are “the killer voice apps”)
      2. Simpler – e.g. reduce # of clicks to get to mobile app for driving instructions, playlists, or
      3. Bring delight – that’s why VA jokes are so popular (they make people happier).
      “Friction will always lose. Alexa and Voice is the future of the frictionless world. It is going to explode” – Gary Vaynerchuk
      Misuse Cases & Tips
      • Using Voice as an API wrapper!
      • Forcing voice everywhere. If you even think it’s easier to walk up to the machine, or use the Web Browser – don’t create a Voice app.
  79. Your app will become another friction point – a.k.a. unused, if you don’t:
      • understand the Voice-first vs. traditional Web/Mobile apps design differences,
      • approach Voice UI more like writing a conversational dialogue for a screenplay, than an API design.
  80. Voice apps have to be intuitive, because there’s no skimming like through web pages – so please don’t expect users to learn your API or Taxonomy… no more “Norman Doors” please 😊
  81. Let’s go deeper
      Pick your mentors carefully
      • Info overload – unlike 30 years ago, when we had a lack of information
      • Finding good info – nowadays feels like going through the junkyard.
      Here is my list so far – if you find more great sources, please share:
      Alexa & Voice-First Design
      - Head of Alexa Voice Design Education
      - Live coding with Paul’s team (check out early “Building a college coaching skill” episodes)
      - Stay on top of Alexa development, catch up with past best practices & tips!
      - Our local User Group about Voice-First design
      - The Stanford “Spoken Language Processing” CS224S course
  82. - ACM SIGCHI - Special Interest Group on Computer-Human Interaction
      ASK-CLI Command Line Interface
      Books
      conversational-ui-development - good book with wide chatbot platforms coverage & interesting projects (PaaS)
      - best-selling book (by Don Norman) on the topic of designing Intuitive User Interfaces
  83. Source Code Repos
      - Create chatbots for Facebook Messenger, Slack, Amazon Alexa, Skype, Telegram, Viber, Line, GroupMe, Kik and Twilio and deploy to AWS Lambda in minutes
      in-five-minutes/ - meetup demos & Alexa skill Interaction Model
      Amazon Tools
      - Alexa code samples & tips
      - ASK-SDK Code Generator for Alexa Skill Language Models
      - Skill design is more like writing a screenplay, than coding
  84. Food for Thought – Demo Time
      The Good, the Bad and the Ugly of the voice-first experience
      Play some of the recordings I ran into while preparing this talk:
      1. ACM SIGCHI video – Radar Pace - A Conversational System for Coaching. Good – strive towards this!
      2. Mobile Protection IVR recording – 2 minutes of friction while driving. Bad – rewrite this. IVRs can be & should be better nowadays!
      3. Connected Lab video about CNN Voice App – #ConversationsWorthHaving. Ugly – don’t do this! You know better now 😊
  85. Serverless Toronto Next Steps
      See how little Serverless can be in a Serverless Talk? That’s exactly because Serverless Computing moves infrastructure issues out of the way, and allows you – the developer – to focus more on the business domain at hand, e.g. the Conversational User Interfaces (CUI) we covered today!
      Any Volunteers for our Future Talks?
      Remaining Use Cases… Microservices, Websites, IoT, Big Data… and Security is always interesting
      • especially in the serverless world where clients can talk directly to back-end services via HTTPS,
      • without the need for traditional Application servers.