Voice User Interface Design - Big Design 2017

Amazon Skills for Alexa, Google Actions for Home – Should your company build a conversational voice interface for one of these systems, and if so, how? What are the differences between a voice user interface and other types of UIs? What types of skills does a VUI designer need? What are some best practices for these VUIs? This session will explore all these questions and more. You’ll walk away with answers to the questions “If, Why, and How” you might choose to explore this interesting new area of design.

Published in: Technology

  1. 1. © 2017 Versay Solutions Voice User Interface Design: Skills, Actions, And The Future Crispin Reedy, Versay Solutions @crispinTX crispinreedy.com #BigD17
  2. 2. © 2017 Versay Solutions Voice User Interface Design: Skills, Actions, And The Future Disclaimer: This session was NOT sponsored by Domino’s
  3. 3. © 2017 Versay Solutions • Voice User Interface Designer • 15+ years in the field • Former coder; got interested in UX • President of the Association for Voice Interaction Design • Consultant for Versay Solutions @crispinTX crispinreedy.com
  4. 4. © 2017 Versay Solutions Session Description • Amazon Skills for Alexa, Google Actions for Home – Should your company build a conversational voice interface for one of these systems, and if so, how? • What are the differences between a voice user interface and other types of UIs? • What types of skills does a VUI designer need? • What are some best practices for these VUIs? • You’ll walk away with answers to the questions “If, Why, and How” you might choose to explore this interesting new area of design.
  5. 5. © 2017 Versay Solutions Easy Answer To #1 • If your company is involved in home automation: • Most likely Yes, and Yesterday • Although how you do it will depend on your platform • More on that later! • Everyone else • Let’s keep talking!
  6. 6. © 2017 Versay Solutions Basic Terms
  7. 7. © 2017 Versay Solutions Terms & Technologies •Speech Recognition •Natural Language Understanding •Voice Verification (Biometrics) •Text to Speech
  8. 8. © 2017 Versay Solutions Speech Recognition “ASR” “See the cat.”
  9. 9. © 2017 Versay Solutions Natural Language Understanding • Extracting meaning from natural text • “Hello, yes, I’d like to pay my water bill. Can you help me with that?” • Intent = BillPay • Entity (Bill Type) = Water
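The intent and entity extraction on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the keyword tables `INTENT_PATTERNS` and `BILL_TYPES` and the function `understand` are hypothetical names, and a production NLU engine would use a trained model rather than regular expressions.

```python
import re

# Hypothetical keyword tables; a real NLU engine would use a trained model.
INTENT_PATTERNS = {
    "BillPay": re.compile(r"\bpay\b.*\bbill\b", re.IGNORECASE),
}
BILL_TYPES = ("water", "electric", "gas")


def understand(utterance):
    """Return (intent, entities) for an utterance, or (None, {}) if nothing matches."""
    lowered = utterance.lower()
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(utterance):
            entities = {}
            # Look for the first known bill type mentioned as a whole word.
            for bill_type in BILL_TYPES:
                if re.search(rf"\b{bill_type}\b", lowered):
                    entities["BillType"] = bill_type.capitalize()
                    break
            return intent, entities
    return None, {}
```

Passing the slide's sample utterance to `understand` yields the `BillPay` intent with the bill type `Water`; the filler words ("Hello, yes", "Can you help me with that?") are simply ignored.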
  10. 10. © 2017 Versay Solutions Voice Verification “My voice is my password.” “Authenticated. Welcome, Mr. Smith.” ✓
  11. 11. Text To Speech
  12. 12. © 2017 Versay Solutions Speech Recognition • Hands-free command / control • Dictation • Input text • Small form factor device, etc. Text To Speech • Output text dynamically • Respond to input • Useful when no display is available Natural Language Understanding • Necessary for all language-based input • Extract meaning • Parse large volumes of text Voice Verification • Security
  13. 13. ASR Application Data • Sign-In • Interaction • Request • Action • Meaning • Access Data • Output TTS NLU Voice prints Verification
  14. 14. © 2017 Versay Solutions Speech Technology Today
  15. 15. © 2017 Versay Solutions Speech Agents, Apps, and APIs Speech Agents: • Amazon Alexa • Echo, Dot, Echo Show • Google Assistant • Pixel, Android, Google Home, iPhone app • Apple’s Siri • iPhone, iPad, macOS (Sierra), Apple TV • Microsoft’s Cortana • Windows 10, Windows Phone, Xbox, iPhone app • Samsung’s Bixby • Galaxy S8, Family Hub 2.0 Fridge
  16. 16. © 2017 Versay Solutions Speech Agents, Apps, and APIs Speech Agents can be extended with “Voice Apps” • Alexa Skills • Google Actions • SiriKit • Cortana SDK
  17. 17. © 2017 Versay Solutions Speech Agents, Apps, and APIs Agent capabilities and apps are somewhat determined by: • Platform: Device • Screen, keyboard, phone, mics, etc. • Environment: Web site, apps that interact with the agent • Ecosystem: Underlying connections, technical partnerships
  18. 18. © 2017 Versay Solutions Platforms
  19. 19. © 2017 Versay Solutions Environment Google “Actions” or “Apps” • Curated • Direct vs. Conversational Siri - Works via apps Order Uber Order Lyft
  20. 20. © 2017 Versay Solutions New York Times
  21. 21. © 2017 Versay Solutions Speech Agents, Apps, and APIs APIs: Allow you access to the underlying technology • Amazon • AVS (Alexa Voice Service) Create an “Alexa” on your own device • Amazon Lex, Amazon Polly • Google • Cloud Speech API • API.ai • Apple • Apple Speech Framework • Microsoft • Bing Speech API Ecobee Smart Thermostat
  22. 22. © 2017 Versay Solutions Use Cases
  23. 23. Use Case “Bakeoff” from Tech Insider •Travel •Email •Messaging •Sports •Music •Weather •Calendar •Social • Translation • Basic tasks • General knowledge • Personality http://www.businessinsider.com/siri-vs-google-assistant-cortana-alexa-2016-11/
  24. 24. © 2017 Versay Solutions Use Case “Bakeoff” from Tech Insider • “wildly finicky when it comes to phrasing.” • “Each assistant still feels like a fragile, thinly veiled web of loosely connected services — because that's what they are.” • “incredibly uncomfortable to speak to an inanimate thing in public.” • “In Google Assistant's case, normalizing the need to call on a brand ("OK Google") whenever you need a hand is Orwellian.” • “None of these things are at a place I could comfortably call "good.””
  25. 25. © 2017 Versay Solutions Personal Assistant vs. Home Assistant The Google Pixel XL. Hollis Johnson/Business Insider Google.com
  26. 26. © 2017 Versay Solutions Personal Assistant vs. Home Assistant
  27. 27. © 2017 Versay Solutions Getting Specific With Alexa
  28. 28. © 2017 Versay Solutions “Layers” of Alexa •Alexa Native Capabilities •Alexa Skills •Alexa Voice Services
  29. 29. © 2017 Versay Solutions “Layers” of Alexa • Alexa Native Capabilities • Come out of the box • Require Alexa wake word (can be changed) • Alexa Skills • Alexa’s “Extensions” or “Add-Ons” • Designed for and deployed on Echo Device • Skills must be downloaded to Echo • Require Alexa wake word + Skill name • Alexa Voice Services • Add Alexa voice control to your own device
  30. 30. © 2017 Versay Solutions Alexa “Native” Capabilities Alexa, what’s 3 + 5? Alexa, set an alarm for 3 am. Alexa, set a thirty second timer. Alexa, what’s the weather? Note: Mix of TTS & Pre-Recorded Audio Note: “Hint”
  31. 31. © 2017 Versay Solutions Design Considerations •Proactive “Hints” • Similar to “Hover Help” or “Tool Tip” • But less avoidable! • Pro: Can teach user about other capabilities • Con: Can be annoying! • Guideline: If used, be sparing • Develop rules for when and how frequently to offer
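One possible way to "develop rules for when and how frequently to offer" hints is a small policy object. The sketch below is an assumption, not anything the platforms provide: the class name `HintPolicy` and both thresholds are hypothetical, and the right values would come from user testing.

```python
class HintPolicy:
    """Decide whether to append a proactive hint to a response."""

    def __init__(self, min_turns_between_hints=5, max_times_per_hint=2):
        self.min_turns_between_hints = min_turns_between_hints
        self.max_times_per_hint = max_times_per_hint
        # Start "due" so the very first turn is allowed to carry a hint.
        self.turns_since_hint = min_turns_between_hints
        self.times_offered = {}

    def next_hint(self, candidate_hints):
        """Return a hint to play this turn, or None to stay quiet."""
        self.turns_since_hint += 1
        if self.turns_since_hint <= self.min_turns_between_hints:
            return None  # too soon after the previous hint
        for hint in candidate_hints:
            if self.times_offered.get(hint, 0) < self.max_times_per_hint:
                self.times_offered[hint] = self.times_offered.get(hint, 0) + 1
                self.turns_since_hint = 0
                return hint
        return None  # every hint has already been offered often enough
```

The two limits encode the "be sparing" guideline: a minimum number of quiet turns between hints, and a cap on how many times any one hint is repeated.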
  32. 32. © 2017 Versay Solutions Amazon.com Native & Skill Skill Skill Skill Native & Skill Alexa Skills
  33. 33. © 2017 Versay Solutions Source: David Attwater, EIG Inc.
  34. 34. © 2017 Versay Solutions Amazon.com
  35. 35. Alexa Skills Amazon.com
  36. 36. © 2017 Versay Solutions Amazon.com
  37. 37. © 2017 Versay Solutions Design Considerations • Invoking Skills: • Alexa, open Oprah Magazine • Alexa, order a pizza from Domino’s • Alexa, ask Cook Reference what’s the safe temperature for chicken • Syntax: Open <skill> Ask <skill> for (about, to, with, etc.) <action> Ask <skill> <question> Also: Search, Tell, Talk to, Launch, Start, Resume, Run, Load, Begin Oprah Magazine
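To make the "Open <skill>" and "Ask <skill> ..." syntax concrete, here is a rough parser for those two forms. It is a sketch only: the platform does this resolution itself, and `KNOWN_SKILLS`, the verb lists, and `parse_invocation` are hypothetical names. Utterances that put the skill name last, such as "order a pizza from Domino's", are deliberately not handled here.

```python
import re

# Hypothetical list of enabled skill names; a real platform resolves these itself.
KNOWN_SKILLS = ["Oprah Magazine", "Cook Reference", "Domino's"]
LAUNCH_VERBS = ["open", "launch", "start", "resume", "run", "load", "begin", "talk to"]
REQUEST_VERBS = ["ask", "tell", "search"]
CONNECTORS = ["for", "about", "to", "with"]


def parse_invocation(utterance):
    """Split 'Alexa, <verb> <skill> [connector] <request>' into its parts."""
    # Remove the wake word and any comma or spaces that follow it.
    text = re.sub(r"^\s*alexa[,\s]+", "", utterance.strip(), flags=re.IGNORECASE)
    lowered = text.lower()
    for verb in LAUNCH_VERBS + REQUEST_VERBS:
        if not lowered.startswith(verb + " "):
            continue
        rest = text[len(verb) + 1:]
        for skill in KNOWN_SKILLS:
            if rest.lower().startswith(skill.lower()):
                request = rest[len(skill):].strip()
                # Drop a leading connector word such as "for" or "about".
                first, _, tail = request.partition(" ")
                if first.lower() in CONNECTORS:
                    request = tail
                return {"verb": verb, "skill": skill, "request": request or None}
    return None
```

Because the question form has no connector word, the parser needs the list of skill names to know where the skill name ends and the request begins, which is one reason skill naming matters for discoverability.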
  38. 38. © 2017 Versay Solutions Design Considerations •Skills can be “installed” on the fly •If the user knows the name of the skill •Skills that require account information will need extra steps Cook Reference Domino’s
  39. 39. © 2017 Versay Solutions Alexa App + Linking
  40. 40. © 2017 Versay Solutions Design Considerations •Managing access to skills may become difficult or confusing.
  41. 41. © 2017 Versay Solutions Design Considerations •Attention (or lack of attention!) to technical details can become a “deal-killing” part of the overall experience Domino’s
  42. 42. © 2017 Versay Solutions Really? Dominos.com
  43. 43. © 2017 Versay Solutions No Dominos.com
  44. 44. © 2017 Versay Solutions Design Considerations • Confirmation • What’s the phone number? • 214-555-1235 • You said 214-555-1235. Is that correct? • Yes • Note: System confirmed the phone number but not the address • Was the address really correct?
  45. 45. © 2017 Versay Solutions Dominos.com
  46. 46. © 2017 Versay Solutions Design Considerations • “Would you like to place your Easy Order, reorder your most recent order, or start a new order?” • If I’m not logged into my account on the Alexa app, options 1 and 2 don’t make much sense. • “Would you like” is ambiguous – could be used for Yes / No questions or for multi-item questions • First part of the sentence runs into the choices • Reuse of the word “order” just seems odd (but may be unavoidable). • Could have used more pauses (SSML) Domino’s
  47. 47. © 2017 Versay Solutions Design Considerations: SSML • Speech Synthesis Markup Language • Can control the way your TTS playback sounds • Very important if your output is mostly TTS • Which is true of almost all platforms • Should be supported by all types of TTS engines • Amazon has platform-specific options • Plan on using it to fine-tune your audio output
  48. 48. © 2017 Versay Solutions New Prompts & SSML Examples • Note: TTS samples with SSML were created with Amazon Polly, not Alexa • “You can: Place your easy order. Reorder your most recent order. Or, start a *new* order.” • <speak>You can: <break time="500ms"/>Place your easy order, <break time="500ms"/> Reorder your <emphasis level="moderate">most recent</emphasis> order, <break time="500ms"/> Or, start a <emphasis level="strong">new</emphasis> order.</speak> • “Placing an order, great! Choose from: My easy order. My most recent. Or, start a *new* order.” • <speak>Placing an order. <prosody pitch="high">Great!</prosody> Choose from: My easy order. My most recent. Or, start a <emphasis> <prosody pitch="high">new</prosody> </emphasis> order.</speak> Domino’s
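If prompts like these are assembled in code, a small helper can insert the pauses consistently. The following is a minimal sketch: `menu_prompt` is a hypothetical helper, it uses only the standard `<speak>` and `<break>` elements, and it assumes every option is a non-empty string.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def menu_prompt(intro, options, pause_ms=500):
    """Build an SSML prompt that pauses before each option (options must be non-empty strings)."""
    parts = [escape(intro)]
    last = len(options) - 1
    for index, option in enumerate(options):
        text = escape(option)
        # Signal the final choice with "Or," when there is more than one option.
        if index == last and last > 0:
            text = "Or, " + text[0].lower() + text[1:]
        parts.append(f'<break time="{pause_ms}ms"/>{text}.')
    ssml = "<speak>" + " ".join(parts) + "</speak>"
    ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
    return ssml
```

Calling `menu_prompt("You can:", ["Place your easy order", "Reorder your most recent order", "Start a new order"])` produces one `<break>` before each choice. Escaping the text keeps characters such as `&` in menu items from breaking the markup.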
  49. 49. © 2017 Versay Solutions Still Trying To Order That Pizza • Start of the interaction has changed! • Probably due to login • “Would you like to place an order, or track an order?” • What just happened!!!? •System was expecting me to say “Start a new order” and I only said “New Order.” Domino’s
  50. 50. © 2017 Versay Solutions Design Considerations • Make sure your input grammar covers all possible logical utterances (what user can say) • Don’t leave this stuff up to the programmers! • Provide examples of coverage • Coverage should match prompts • Use some kind of markup to show coverage • [] optional • () grouping • | or • “Would you like to place your Easy Order, reorder your most recent order, or start a new order?” • [place] [my | an] Easy Order • [reorder] [my] most recent [order] • [start a] new [order]
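The coverage markup above can also be expanded mechanically, which makes it easy to check whether a given utterance is covered. The sketch below assumes well-formed patterns (balanced brackets); `expand` is a hypothetical helper and not part of any platform's tooling.

```python
import re
from itertools import product


def expand(pattern):
    """Return every utterance covered by a pattern written with [], () and |."""
    tokens = re.findall(r"[\[\]()|]|[^\s\[\]()|]+", pattern)
    position = 0

    def parse_alternatives(closers):
        # alternatives := sequence ("|" sequence)*
        nonlocal position
        results = parse_sequence(closers)
        while position < len(tokens) and tokens[position] == "|":
            position += 1
            results += parse_sequence(closers)
        return results

    def parse_sequence(closers):
        # sequence := item*, ending at "|", a closing bracket, or end of input
        nonlocal position
        phrases = [[]]
        while position < len(tokens) and tokens[position] not in closers + ("|",):
            token = tokens[position]
            position += 1
            if token == "[":
                choices = parse_alternatives(("]",)) + [[]]  # optional: may be empty
                position += 1  # skip "]"
            elif token == "(":
                choices = parse_alternatives((")",))
                position += 1  # skip ")"
            else:
                choices = [[token]]
            # Combine every phrase so far with every choice for this item.
            phrases = [a + b for a, b in product(phrases, choices)]
        return phrases

    return sorted({" ".join(words) for words in parse_alternatives(())})
```

For example, `expand("[start a] new [order]")` returns `['new', 'new order', 'start a new', 'start a new order']`, so the short form "new order" is explicitly covered.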
  51. 51. © 2017 Versay Solutions Design Considerations •Reprompts: • What do you do when the system doesn’t understand what the user said? • Probably don’t want to say “Sorry” • This can be annoying • But you CAN rephrase the prompt to make it different • Using the same prompt gives the user a sense that something has gone wrong
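A simple way to implement rephrased reprompts is to keep a list of wordings and step through it after each failed recognition. The prompt texts and the name `prompt_for_attempt` below are illustrative assumptions, loosely based on the ordering prompts discussed earlier.

```python
# Hypothetical prompt wording, adapted from the ordering prompts discussed earlier.
PROMPTS = [
    "Would you like your Easy Order, your most recent order, or a new order?",
    "You can say: Easy Order, most recent order, or new order.",
    "Which one: Easy Order, most recent, or new?",
]


def prompt_for_attempt(failed_attempts):
    """Pick different wording after each failed recognition, without apologizing."""
    # Stay on the last wording once the list runs out.
    index = min(failed_attempts, len(PROMPTS) - 1)
    return PROMPTS[index]
```

In practice a design would usually also cap the number of retries and then exit or hand off, rather than repeating the final wording indefinitely.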
  52. 52. © 2017 Versay Solutions Pizza Pizza Pizza • Hey you didn’t really need to explain about the phone number since I saved it but OK…. • Address has been saved to profile, great! • And then boom Domino’s
  53. 53. © 2017 Versay Solutions With Speech, you need to spend a lot more time thinking about what happens when things go wrong.
  54. 54. © 2017 Versay Solutions I Didn’t Really Want to Order Pizza But By Now I Am Hungry And So Is Somebody Else • Note “Easy Order” and Credit Card cannot be set up on the website unless you’re actually placing an order. • Give people enough time to talk! • There’s that grammar coverage issue again • Bell pepper = Green pepper • What synonyms is your user likely to say? • At some point couldn’t you just give me a list? • Notice how they screwed up the article + the item “… adding a parmesan bread twists” Meow Domino’s
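Both the synonym problem and the article problem on this slide can be handled with small lookup tables. This is a rough sketch: `SYNONYMS`, `PLURAL_ITEMS`, and both function names are hypothetical, and choosing "a" or "an" from the first letter is only a heuristic (it gets words like "hour" wrong).

```python
# Hypothetical synonym table and menu items, for illustration only.
SYNONYMS = {"bell pepper": "green pepper", "bell peppers": "green pepper"}
PLURAL_ITEMS = {"parmesan bread twists", "black olives"}


def canonical_topping(spoken):
    """Map what the user said onto the menu's own name for the item."""
    key = spoken.strip().lower()
    return SYNONYMS.get(key, key)


def adding_phrase(item):
    """Build 'adding ...' with an article that agrees with the item."""
    if item in PLURAL_ITEMS:
        return f"adding {item}"  # plural items take no article
    article = "an" if item[0].lower() in "aeiou" else "a"
    return f"adding {article} {item}"
```

With this table, `adding_phrase("parmesan bread twists")` returns "adding parmesan bread twists" instead of "adding a parmesan bread twists".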
  55. 55. © 2017 Versay Solutions
  56. 56. © 2017 Versay Solutions
  57. 57. © 2017 Versay Solutions Design Considerations • Confirm and correct • “Do you want to add anything else?” • “Yes, I want to add peppers.” • Disambiguation • “Olives” • “Ok, we have two kinds of olives. Black olives, or green olives.” • A Voice User Interface design is a time-based interface • As a designer concerned with user experience you’re going to be involved in things (such as pauses) which may not occur to you
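The disambiguation step can be sketched as a lookup that either returns one item or a follow-up question. `MENU` and `resolve_topping` are hypothetical names, and the matching here is plain whole-word comparison rather than real language understanding.

```python
# Hypothetical menu; the olive options come from the example on this slide.
MENU = ["black olives", "green olives", "green peppers", "ham"]


def resolve_topping(spoken):
    """Return (item, prompt): a single match, or a question listing the choices."""
    words = spoken.lower().split()
    matches = [item for item in MENU
               if words and all(word in item.split() for word in words)]
    if len(matches) == 1:
        return matches[0], None
    if len(matches) > 1:
        options = ", or ".join(matches)
        prompt = f"Ok, we have {len(matches)} kinds of {words[-1]}. {options.capitalize()}."
        return None, prompt
    # No match: offer the list instead of a bare error.
    return None, "I didn't find that on the menu. You can choose from: " + ", ".join(MENU) + "."
```

Saying "olives" returns a question naming both kinds, while "black olives" resolves directly with no extra turn.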
  58. 58. © 2017 Versay Solutions How Did Google Home Do? •“OK Google, Order Dominos” • “There are stores at….” • Had to go find the right “App Name” online •“OK Google, Talk to Dominos” • “You can link to your Domino’s account…” • Had a terrible time finding the “Google Apps.”
  59. 59. © 2017 Versay Solutions How Did Google Home Do? •Menu worked! • System did not recognize “Ham” (Should offer list of ingredients) • System became very laggy
  60. 60. © 2017 Versay Solutions How Did Google Home Do? • Edited for time • Original was 3:35 • This is 2:15 • Use of “Dom” persona and male voice • “Hand off” • Playback of address: • Alexa: “Eighty seven twenty three” • Google: “Eight thousand seven hundred twenty three” • Same issue with “twists” • “Your day just got cheesier”
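If you do not want to rely on the platform's default number reading, the street number can be converted to words before it reaches TTS. The sketch below reads digits in pairs, the way "eighty seven twenty three" was spoken; the function names are hypothetical, and numbers such as 1000 would need their own rules.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def speak_group(group):
    """Speak a one- or two-character digit string, e.g. '8', '87' or '05'."""
    if group == "00":
        return "hundred"              # '8700' -> 'eighty seven hundred'
    value = int(group)
    if len(group) == 2 and group[0] == "0":
        return "oh " + ONES[value]    # '05' -> 'oh five'
    if value < 20:
        return ONES[value]
    tens, ones = divmod(value, 10)
    return TENS[tens] + (" " + ONES[ones] if ones else "")


def speak_street_number(number):
    """Read a street number in pairs of digits, the way people usually say addresses."""
    digits = str(number)
    start = len(digits) % 2           # an odd length leaves one leading digit on its own
    groups = ([digits[:start]] if start else []) + [
        digits[i:i + 2] for i in range(start, len(digits), 2)
    ]
    return " ".join(speak_group(group) for group in groups)
```

The resulting words can be passed to the TTS engine as plain text, so the reading no longer depends on how each platform interprets a bare number.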
  61. 61. © 2017 Versay Solutions Design Considerations •Discoverability • “OK Google, Order Dominos” •Persona • Google Home has more control over the voice • Branding considerations – “Dom” name and male TTS •Playback of Dynamic Data • Attention to detail – don’t trust the platform to do it the way you want it
  62. 62. © 2017 Versay Solutions Design Considerations Maintaining State: •Between dialogs • “Who is Seth MacFarlane?” • “Seth MacFarlane is…” • “When’s his birthday?” • “I’m not sure what you’re talking about.” •From session to session Oprah Magazine
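Keeping state between dialog turns can be as simple as remembering the last person the user asked about. The class below is a toy sketch under that assumption: `DialogState` is a hypothetical name, and real agents track far richer context than a single name.

```python
import re


class DialogState:
    """Remember the last person mentioned so follow-up questions can say 'his' or 'her'."""

    def __init__(self):
        self.last_person = None

    def resolve(self, utterance):
        """Replace a possessive pronoun with the remembered name, if there is one."""
        match = re.match(r"who is (.+?)\??$", utterance.strip(), re.IGNORECASE)
        if match:
            self.last_person = match.group(1)  # remember who the user asked about
            return utterance
        if self.last_person is None:
            return utterance  # nothing to resolve against
        possessive = self.last_person + "'s"
        return re.sub(r"\b(his|her|their)\b", lambda _: possessive,
                      utterance, flags=re.IGNORECASE)
```

After a "Who is ...?" question, a follow-up containing "his" is rewritten with the remembered name before it goes to the NLU step. Persisting `last_person` in user storage would extend the same idea from turn-to-turn to session-to-session.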
  63. 63. © 2017 Versay Solutions Home Automation •Onboarding issues are very similar to “Skills,” but there is an additional layer of complexity • Companies are working to improve the experience • After setup, you get a lot of bang for the buck
  64. 64. © 2017 Versay Solutions “Computer, turn on the library lights”
  65. 65. © 2017 Versay Solutions TP Link
  66. 66. © 2017 Versay Solutions Amazon
  67. 67. © 2017 Versay Solutions Design Considerations: Summary • Managing access to Skills (App, Store) • Managing the Onboarding Experience • Discoverability • Invoking Skills • Hints • Confirmation • Asking Yes/No Questions vs. Multi-Item Questions • SSML • Silences • Reprompting • Coverage (prompt vs. possible input) • Managing technical errors • Timing and Timeouts • Article matching the noun • Confirm and Correct • Disambiguation • Persona • Playback of Dynamic Data • Maintaining State
  68. 68. © 2017 Versay Solutions What Makes a Good VUI Designer? •Concern with the overall experience • All of the channels that go into making up how something happens •Attention to “small” technical details • Pauses • SSML •Writing skills! • Dialog, not tech doc • English majors, screenwriters
  69. 69. © 2017 Versay Solutions Session Description • Amazon Skills for Alexa, Google Actions for Home – Should your company build a conversational voice interface for one of these systems, and if so, how? • What are the differences between a voice user interface and other types of UIs? ✔ • What types of skills does a VUI designer need? ✔ • What are some best practices for these VUIs? ✔ • You’ll walk away with answers to the questions “If, Why, and How” you might choose to explore this interesting new area of design. ✔
  70. 70. © 2017 Versay Solutions If, Why, How •What are you trying to build? •Existing guidelines / research •User testing is key • Especially if you’re trying to do something complicated
  71. 71. © 2017 Versay Solutions If, Why, How: Beyond Skills Write an app (skill) for an agent such as Google Assistant / Alexa Use cloud APIs to add ASR / NLU to your app / device / page / gadget Download software and use full-featured capabilities for more robust recognition on a specific device Build your own
  72. 72. © 2017 Versay Solutions If, Why, How: What’s the Use Case? •Enabling application • User can’t do it any other way • New tasks •Enhancing application • User can do it now • But speech makes it better • Faster • Safer
  73. 73. © 2017 Versay Solutions API-Based Device-Based Roll Your Own / Open-Source •Flexibility •Power •Customization •Time •Difficulty
  74. 74. © 2017 Versay Solutions Existing Guidelines / Research • Caveat: Best practices evolved in one modality (e.g. voice-only) may not apply the same way in another (e.g. combined voice + touch) • But they could be adapted • Association for Voice Interaction Design (AVIxD.org) • Wiki • Peer-Reviewed Journal • Virtual “Brown Bags” • Academic Sources, Books
  75. 75. © 2017 Versay Solutions AVIxD.org CUI Working Group is actively recruiting!
  76. 76. © 2017 Versay Solutions @crispinTX Crispin Reedy Thank You!