AWS re:Invent 2016: Tips and Tricks on Bringing Alexa to Your Products (ALX304)

Ever wonder what it takes to add the power of Alexa to your own products? Are you curious about what Alexa partners have learned on their way to a successful product launch? In this session you will learn about the top tips and tricks on how to go from VUI newbie to an Alexa-enabled product launch. Key concepts around hardware selection, enabling far field voice interaction, building a robust Alexa Voice Service (AVS) client and more will be discussed along with customer and partner examples on how to plan for and avoid common challenges in product design, development and delivery.

Published in: Technology


  1. 1. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Matt Tavis, Principal Solutions Architect Alexa Voice Service (@alexadevs) December 2, 2016 Tips and Tricks on Bringing Alexa to Your Products ALX 304
  2. 2. What to Expect from the Session • Key concepts for using the Alexa Voice Service • Tips and Tricks for implementing an AVS client • Considerations for evolving your solution • Key components of a hands-free solution
  3. 3. The Alexa Ecosystem: supported by two powerful frameworks that leverage open APIs. Alexa Voice Service (AVS): open and extensible solution to add Alexa to any connected device, for free (Amazon Alexa Enabled). Alexa Skills Kit (ASK): open APIs and tools that make it fast & easy to build skills for Alexa products (Works With Alexa). Lives in the cloud: Automated Speech Recognition (ASR), Natural Language Understanding (NLU). Always getting smarter (AI).
  4. 4. Intelligent Cloud Service: optimized suite of on-device + cloud-based technologies and services that power a wide array of connected devices. Diagram labels: ON-DEVICE COMPONENTS DEVICE TYPES AMAZON SPEECH OS 3P CONTENT HW SW Mic Arrays Speaker Notification LEDs Mute Button SoC/DSP Audio Player AEC Beamforming State Machine HTTP Manager LWA Auth Speech Primitives Product Platform Platform Services ASR TTS NLU State Mgr Knowledge (Evi) Model Training Analytics Data Ingestion Auth Tools Personalization GUI Cards Domains Services VUI UX Speech orchestrator 3P Skills Smart Home Smart Things Wink Insteon SmartHome API Uber Dominos + 3000 more Dialog Mgr
  5. 5. Skills ASR NLU TTS Learning Alexa Voice Service – how it works? Your Product
  6. 6. Recent AVS Announcements “Omate Rise 3G smartwatch slaps Amazon Alexa on your wrist” – Engadget 9/1/16 Adding Alexa to the already-intriguing Pebble Core takes it from “Huh, that’s interesting” to “Did we just catch up to Star Trek?” – Forbes 6/3/16 “This smart watch puts Alexa on your wrist” – The Verge 4/20/16 “Amazon Alexa is now available on first device not made by Amazon” – TechCrunch 4/28/16 “Nucleus debuts first Alexa-enabled touchscreen video device” – Mashable 8/4/16 “Amazon Alexa support coming to LG's SmartThinQ hub” – Engadget 9/2/16 “Sonos Bringing Voice Control To Its Speakers With Amazon Partnership” – Forbes 8/30/16 “Beam me up, Alexa! Onyx communicator gets voice assistant integration” – CNET 9/14/16
  7. 7. Tip #1: Follow the Sample brick road • Get started working almost anywhere • PC, Linux, Mac, Raspberry Pi, CHIP, … • NEW! – Includes hands-free implementation • Application.log shows the proper message flow • 3 Sample companion apps for linking and tokens • Android, iOS, Web app • First stop for all debugging! https://github.com/alexa/alexa-avs-sample-app
  8. 8. Example AVS Client Architecture AVS Client Companion Apps Connection Management Messaging Layer Controller Audio Input (Mic) Audio Player Alert Management Wake Word Engine Web App iOS Android Native Media Player Native Timers and Alarms Wake Word Process Alexa model GUI / Attention System State Mgmnt Directive Queues Event Dispatch Audio Output HTTP/2 AVS Control Logic 3rd Party / Built-in Custom dev / Sample
  9. 9. Interacting with AVS Cloud Service • AVS is Amazon’s intelligent cloud service that allows you as a developer to voice-enable any connected product with a microphone and speaker • API endpoint (https://avs-alexa-na.amazon.com) • /events – for all speech, playback and alert events • /directives – the source path of AVS directives (read-only) • /ping – to keep connection open • Message bus for all Events and Directives • Response messages and a down channel • State machine to determine how to handle messages • Pause playback? Duck audio? Alert versus music?
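The endpoint and paths on this slide can be captured in a small lookup table. This is a minimal sketch, not the sample app's code: the names `AVS_BASE_URL`, `AVS_PATHS` and `avs_url` are hypothetical, and the `version_prefix` parameter is an assumption, since the slide lists bare paths only.

```python
# Base endpoint and paths exactly as listed on the slide.
AVS_BASE_URL = "https://avs-alexa-na.amazon.com"
AVS_PATHS = {
    "events": "/events",          # speech, playback and alert events (client to AVS)
    "directives": "/directives",  # read-only source of directives (AVS to client)
    "ping": "/ping",              # keeps the connection open
}


def avs_url(name, version_prefix=""):
    """Build the full URL for one of the AVS paths.

    version_prefix is a hypothetical parameter: the slide shows bare paths,
    so pass a prefix only if the API version you target requires one.
    """
    return AVS_BASE_URL + version_prefix + AVS_PATHS[name]


print(avs_url("events"))
```

Keeping the three paths in one table makes it easy to see that only `/events` carries client traffic, while `/directives` is opened once and read from.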
  10. 10. Tip #2: Take a Phased Approach Port sample: • Re-platform (e.g., Java to C++) • Swap 3rd party components (e.g., Jetty to OkHttp) • Integrate with native components (e.g., Android MediaPlayer, local buttons) Harden tap-to-talk solution: • Implement AVS functional design guidelines • Define device monitoring and management • Design an update and deployment process • Perform functional validation of core features and music Integrate hands-free: • Integrate hands-free components • Test and tune hands-free performance • Responsiveness • Distance testing • Testing with audio output • Testing with ambient noise AVS functional design guidelines: https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/alexa-voice-service-functional-design-guide
  11. 11. Tip #3: Love the logs – Events 21:22:55.064 [AWT-EventQueue-0] INFO com.amazon.alexa.avs.http.AVSClient - Request metadata: { "event" : { "header" : { "namespace" : "SpeechRecognizer", "name" : "Recognize", "messageId" : "b15376c6-6265-451c-acee-bc5b9168af8e", "dialogRequestId" : "919336ea-25d5-43d9-8af8-61d1344fbcb5" }, "payload" : { "profile" : "CLOSE_TALK", "format" : "AUDIO_L16_RATE_16000_CHANNELS_1" } … } Callouts: Thread Id, Event Name, Message Id
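A client produces this metadata itself, so it helps to see how little is involved. The sketch below builds only the fields visible in the log excerpt; the slide elides part of the message, and that part is left out here rather than guessed. The function name `build_recognize_event` is hypothetical, and generating the ids with `uuid4` is an assumption based on the UUID-shaped values in the log.

```python
import json
import uuid


def build_recognize_event(dialog_request_id=None):
    """Return the metadata for a SpeechRecognizer.Recognize event.

    Only the fields shown in the log excerpt are included.
    """
    return {
        "event": {
            "header": {
                "namespace": "SpeechRecognizer",
                "name": "Recognize",
                # Fresh id per message (assumed to be a random UUID).
                "messageId": str(uuid.uuid4()),
                # Shared by every message belonging to the same dialog turn.
                "dialogRequestId": dialog_request_id or str(uuid.uuid4()),
            },
            "payload": {
                "profile": "CLOSE_TALK",
                "format": "AUDIO_L16_RATE_16000_CHANNELS_1",
            },
        }
    }


event = build_recognize_event()
print(json.dumps(event, indent=2))
```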
  12. 12. Tip #3: Love the logs – Directives 21:23:00.827 [RequestThread] INFO com.amazon.alexa.avs.http.AVSClient - x-amzn-requestid: 0e8aaffffee24de5-000017a1-0008f272-94b39f8f1fc8f82d-50d324c7-5- 21:23:00.926 [RequestThread] INFO com.amazon.alexa.avs.http.MessageParser - Response metadata: { "directive" : { "header" : { "namespace" : "SpeechSynthesizer", "name" : "Speak", "messageId" : "65106c28-f005-4f5a-87d5-f38ccaa58e0a", "dialogRequestId" : "919336ea-25d5-43d9-8af8-61d1344fbcb5" }, … } Callouts: Request Id, Event Name, Message Id
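When debugging from these logs, the useful step is matching a directive back to the event that triggered it. In the two excerpts, the Recognize event and the Speak directive carry the same `dialogRequestId`. A minimal sketch of that check follows; the helper names `summarize` and `same_dialog` are hypothetical, and the JSON strings contain only the headers shown in the logs.

```python
import json

# Headers copied from the two log excerpts (payloads omitted).
EVENT_JSON = """{"event": {"header": {
    "namespace": "SpeechRecognizer", "name": "Recognize",
    "messageId": "b15376c6-6265-451c-acee-bc5b9168af8e",
    "dialogRequestId": "919336ea-25d5-43d9-8af8-61d1344fbcb5"}}}"""

DIRECTIVE_JSON = """{"directive": {"header": {
    "namespace": "SpeechSynthesizer", "name": "Speak",
    "messageId": "65106c28-f005-4f5a-87d5-f38ccaa58e0a",
    "dialogRequestId": "919336ea-25d5-43d9-8af8-61d1344fbcb5"}}}"""


def summarize(metadata_json):
    """Return (kind, 'Namespace.Name', messageId, dialogRequestId)."""
    message = json.loads(metadata_json)
    kind = "event" if "event" in message else "directive"
    header = message[kind]["header"]
    full_name = header["namespace"] + "." + header["name"]
    # Not every message has a dialogRequestId, so use .get() for it.
    return kind, full_name, header["messageId"], header.get("dialogRequestId")


def same_dialog(first_json, second_json):
    """True when both messages belong to the same dialog turn."""
    first_id = summarize(first_json)[3]
    return first_id is not None and first_id == summarize(second_json)[3]


print(summarize(DIRECTIVE_JSON)[1], same_dialog(EVENT_JSON, DIRECTIVE_JSON))
```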
  13. 13. Complex Sequences - Multi-turn Alexa, set a timer. Recognize event Speak directive For how long? ExpectSpeech directive SpeechStarted event SpeechFinished event Recognize event AVS Controller AudioPlayer Microphone 10 minutes. 10 minutes starting now. … PCM PCM
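The multi-turn flow can be replayed as a tiny simulation to check that a client sends events in the order the diagram shows. This is a sketch under assumptions: `handle_directive` and `run_turns` are hypothetical names, directive handling is reduced to appending event names, and no audio is captured or played.

```python
def handle_directive(name, sent_events):
    """Append the events a client would send in response to one directive."""
    if name == "Speak":
        # The audio player plays the PCM response and reports start and finish.
        sent_events.extend(["SpeechStarted", "SpeechFinished"])
    elif name == "ExpectSpeech":
        # Alexa asked a follow-up question: reopen the microphone for the answer.
        sent_events.append("Recognize")
    # Any other directive produces no events in this simplified sketch.


def run_turns(directives):
    """Replay a dialog that starts with the user's first utterance."""
    events = ["Recognize"]  # "Alexa, set a timer."
    for name in directives:
        handle_directive(name, events)
    return events


# "For how long?" (Speak + ExpectSpeech), then "10 minutes starting now." (Speak).
print(run_turns(["Speak", "ExpectSpeech", "Speak"]))
```

The second Recognize event is sent only after the first Speak has finished, which matches the order on the slide.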
  14. 14. Complex Sequences - Setting an Alarm Alexa, set a timer for 10 minutes. Recognize event Speak directive 10 minutes starting now. SpeechStarted event SpeechFinished event SetAlertSucceeded event AVS Controller AudioPlayer Alert Manager PCM SetAlert directive Alert Store AlertStarted event AlertEnteredForeground event Time passes…. Local management
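The "local management" part of this sequence means the device stores the alert and fires it itself once time passes. A minimal sketch of such an alert manager follows, assuming times are plain seconds and that event sending is just appending to a list. The class and method names are hypothetical and are not taken from the sample app.

```python
import heapq


class AlertManager:
    """Stores alerts locally and reports the events shown in the diagram."""

    def __init__(self):
        self._alerts = []  # min-heap of (due_time_seconds, token)
        self.events = []   # events that would be sent to AVS

    def on_set_alert(self, token, due_time):
        """Handle a SetAlert directive: persist it, then acknowledge."""
        heapq.heappush(self._alerts, (due_time, token))
        self.events.append(("SetAlertSucceeded", token))

    def tick(self, now):
        """Fire every alert that is due at time `now`."""
        while self._alerts and self._alerts[0][0] <= now:
            _, token = heapq.heappop(self._alerts)
            self.events.append(("AlertStarted", token))
            self.events.append(("AlertEnteredForeground", token))


manager = AlertManager()
manager.on_set_alert("timer-1", due_time=600)  # 10 minutes from t=0
manager.tick(now=599)                          # nothing fires yet
manager.tick(now=600)                          # the timer goes off
print(manager.events)
```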
  15. 15. Complex Sequences – Music Playback Alexa, play classical music. Playing classical music from Amazon Music. PlaybackStarted event AVS Controller AudioPlayer Play directive ProgressReportDelayElapsed event ProgressReportIntervalElapsed event PlaybackNearlyFinished event ProgressReportIntervalElapsed event … PlaybackFinished event Play directive …
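The playback events follow a timeline over one track, which a short sketch can make concrete. All parameters here are assumptions for illustration: the slide does not say when each event fires, so the delay, the interval and the point at which `PlaybackNearlyFinished` is sent are inputs to the hypothetical `playback_events` function, not documented values.

```python
def playback_events(duration_ms, delay_ms, interval_ms, nearly_finished_ms):
    """Return (offset_ms, event_name) pairs for one track, ordered by offset."""
    events = [(0, "PlaybackStarted")]
    if delay_ms < duration_ms:
        # Sent once, after the configured delay.
        events.append((delay_ms, "ProgressReportDelayElapsed"))
    # Sent repeatedly, at every full interval before the track ends.
    for offset in range(interval_ms, duration_ms, interval_ms):
        events.append((offset, "ProgressReportIntervalElapsed"))
    # Signals that the client is ready for the next Play directive.
    events.append((nearly_finished_ms, "PlaybackNearlyFinished"))
    events.append((duration_ms, "PlaybackFinished"))
    return sorted(events, key=lambda item: item[0])


timeline = playback_events(
    duration_ms=60000, delay_ms=5000, interval_ms=20000, nearly_finished_ms=30000
)
for offset, name in timeline:
    print(offset, name)
```

With these example values the names come out in the same order as on the slide, with the next Play directive expected after `PlaybackNearlyFinished`.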
  16. 16. Tip #4: Music comes in many formats - Need support for all current codecs - Need to handle playlists as well. Common formats: AAC/MP4 (Amazon Music, iHeartRadio, TuneIn); MP3 (Amazon Music, TuneIn); HLS (Amazon Music, iHeartRadio, TuneIn, Audible); PLS (iHeartRadio, TuneIn); m3u (TuneIn, Amazon Music); Shoutcast / ICY (iHeartRadio, TuneIn); ID3 Tags (iHeartRadio, TuneIn)
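Handling playlists means resolving a PLS or m3u document into stream URLs before handing them to the media player. The sketch below covers only the simplest form of both formats and is not a complete parser. The function names and the example.com URLs are hypothetical.

```python
def parse_m3u(text):
    """Return the stream URLs from an m3u playlist.

    Blank lines and lines starting with '#' (comments and #EXT tags) are skipped.
    """
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


def parse_pls(text):
    """Return the stream URLs from a PLS playlist, ordered by entry number."""
    entries = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Entries look like File1=<url>, File2=<url>, ...
        if sep and key.lower().startswith("file") and key[4:].isdigit():
            entries[int(key[4:])] = value.strip()
    return [entries[index] for index in sorted(entries)]


M3U_SAMPLE = "#EXTM3U\n#EXTINF:-1,Example\nhttp://example.com/stream.mp3\n"
PLS_SAMPLE = (
    "[playlist]\nFile2=http://example.com/b\n"
    "File1=http://example.com/a?x=1\nNumberOfEntries=2\n"
)

print(parse_m3u(M3U_SAMPLE))
print(parse_pls(PLS_SAMPLE))
```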
  17. 17. Audio Player State Machine Playing Stopped Idle Buffer Underrun Paused Finished
  18. 18. Audio Player State Machine Playing Stopped Idle Buffer Underrun Paused Finished Playback initiated via voice or companion app. - Directive: Play - Events: PlaybackStarted, Progress events Superseded by other channels: 1. Dialog 2. Alerts 3. Content Next Play directive comes after PlaybackNearlyFinished event.
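The channel ordering on this slide (Dialog, then Alerts, then Content) amounts to a priority lookup. A minimal sketch, assuming a client tracks the set of currently active channels; `foreground_channel` is a hypothetical helper, and what the client does with lower-priority channels (pause or duck) is left out.

```python
# Highest priority first, as listed on the slide.
CHANNEL_PRIORITY = ["Dialog", "Alerts", "Content"]


def foreground_channel(active_channels):
    """Return the highest-priority active channel, or None if none is active."""
    for channel in CHANNEL_PRIORITY:
        if channel in active_channels:
            return channel
    return None


# Music is playing and a timer goes off: the alert supersedes the content.
print(foreground_channel({"Content", "Alerts"}))
```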
  19. 19. Audio Player State Machine Playing Stopped Idle Buffer Underrun Paused Finished Playback paused by user action or other channels. - Directive: none - Events: PlaybackPaused, PlaybackResumed (back to Playing)
  20. 20. Audio Player State Machine Playing Stopped Idle Buffer Underrun Paused Finished Playback stopped via voice command or companion app. - Directive: Stop or ClearQueue.CLEAR_ALL - Events: PlaybackStopped Playback continues with a Play directive.
  21. 21. Audio Player State Machine Playing Stopped Idle Buffer Underrun Paused Finished Playback reaches end of content. - Directive: none - Events: PlaybackFinished Playback ends when no Play directives follow PlaybackNearlyFinished/ PlaybackFinished events. Playback continues with a new Play directive.
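The audio player states on these slides can be written as a transition table. This is a sketch rather than the sample app's implementation. The trigger names are hypothetical, and the two Buffer Underrun transitions are assumptions, because the slides name the state without describing how it is entered or left.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    PLAYING = "Playing"
    PAUSED = "Paused"
    STOPPED = "Stopped"
    BUFFER_UNDERRUN = "Buffer Underrun"
    FINISHED = "Finished"


TRANSITIONS = {
    # A Play directive starts or continues playback.
    (State.IDLE, "play"): State.PLAYING,
    (State.STOPPED, "play"): State.PLAYING,
    (State.FINISHED, "play"): State.PLAYING,
    # Paused by user action or a higher-priority channel; no directive involved.
    (State.PLAYING, "pause"): State.PAUSED,
    (State.PAUSED, "resume"): State.PLAYING,
    # Stop or ClearQueue CLEAR_ALL directive.
    (State.PLAYING, "stop"): State.STOPPED,
    # End of content reached.
    (State.PLAYING, "end_of_content"): State.FINISHED,
    # Assumption: underrun is entered from Playing and left when data arrives.
    (State.PLAYING, "underrun"): State.BUFFER_UNDERRUN,
    (State.BUFFER_UNDERRUN, "refilled"): State.PLAYING,
}


def step(state, trigger):
    """Return the next state; unknown combinations leave the state unchanged."""
    return TRANSITIONS.get((state, trigger), state)


state = State.IDLE
for trigger in ["play", "pause", "resume", "end_of_content", "play"]:
    state = step(state, trigger)
print(state)
```

Ignoring unknown state and trigger pairs, instead of raising, keeps the player running if a directive arrives in a state the table does not cover.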
  22. 22. Tip #5: Design for the Future • Events and Directives • Directives can come in at any time – don’t assume order • New directives and events can be added at any time – drop unknown directives on the floor • Message Formats • New elements may be added to JSON formats at any time, so parsers should tolerate them • Software Updating • All AVS devices should have an OTA update mechanism • Updates should not “brick” the device and should support fallback
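Two of these points translate directly into code: drop directives you do not recognize, and ignore JSON elements you do not expect. A minimal sketch of a tolerant dispatcher follows. The handler table and the return strings are hypothetical, and the `futureField` key is a made-up stand-in for an element added later.

```python
import json


def on_speak(payload):
    return "speak"


def on_play(payload):
    return "play"


# Only the directives this client understands.
HANDLERS = {
    ("SpeechSynthesizer", "Speak"): on_speak,
    ("AudioPlayer", "Play"): on_play,
}


def dispatch(directive_json):
    """Route a directive to its handler; return None for unknown directives."""
    directive = json.loads(directive_json)["directive"]
    header = directive["header"]
    handler = HANDLERS.get((header["namespace"], header["name"]))
    if handler is None:
        return None  # unknown directive: drop it on the floor
    # Read only the keys we need, so extra elements are harmless.
    return handler(directive.get("payload", {}))


known = json.dumps({"directive": {
    "header": {"namespace": "SpeechSynthesizer", "name": "Speak", "futureField": 1},
    "payload": {},
}})
unknown = json.dumps({"directive": {
    "header": {"namespace": "NewInterface", "name": "NewDirective"},
    "payload": {},
}})

print(dispatch(known), dispatch(unknown))
```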
  23. 23. Hands-free Requires Hands-on • Building a hands-free experience requires sourcing multiple components and libraries • Plan months (>3) in advance for tuning of a hands-free solution • No all-in-one offerings today but multiple solutions to consider • Wake word spotter: • Front-end hardware: • Audio libraries:
  24. 24. Hands-free Front End Architecture • Mic Array: One or more input microphones (SNR >= 65dB, Sensitivity: -38dB ±1dB @ 94dB SPL) • Echo Cancellation: Hardware (DSP) or software solution to subtract device audio output from mic input • Wake Word Spotter: Software process and library trained to “spot” the Alexa wake word from an audio buffer • Beamforming (only for multiple mics): Decision making library to pick the best quality mic for capturing user utterance • Noise Reduction: Optional component to further reduce ambient noise and tune audio for ASR. All of these components need to be sourced or developed for your solution from 3rd party offerings or by hand.
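The slide describes beamforming as a decision step that picks the best quality mic. A deliberately crude sketch of that idea follows: it picks the channel with the highest RMS energy in one frame. This is a stand-in for illustration only. A production beamformer uses far more than energy, and the function names are hypothetical.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(sample * sample for sample in frame) / len(frame))


def pick_best_mic(frames):
    """Return the index of the microphone whose frame has the highest RMS.

    frames holds one list of samples per microphone, all covering the same
    time window. Ties resolve to the lowest index.
    """
    return max(range(len(frames)), key=lambda index: rms(frames[index]))


# Three microphones; the second one is closest to the speaker.
frames = [
    [100, -120, 90, -80],
    [900, -1100, 1000, -950],
    [300, -280, 310, -290],
]
print(pick_best_mic(frames))
```

In a real front end this decision would run after echo cancellation, so the device's own audio output does not win the comparison.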
  25. 25. Miles Kingston, General Manager, Smart Home Group. Intel and Amazon Partner To Enable Natural Voice Interaction For Consumers
  26. 26. Amazon + Intel CLOUD & DATA CENTER THINGS & DEVICES AWS IOT Alexa Voice Services • 10+ year partnership • Joint development • Shared customer passion • High performance + low costs • World class supply chain Amazon EC2 Amazon S3
  27. 27. Did You Know? Collateral & SW/HW Dev Kits Standards Influence Form Factor Reference Design Innovation Excellence Program ODM Reference System Ethnographic Research
  28. 28. Enabling Personal Assistance Design Speech Context Voice Audio
  29. 29. Enabling Personal Assistance Design Speech Context Voice Audio
  30. 30. Intel’s Solid Voice & Speech Expertise • Support for multiple designs and form factors • Broad set of voice processing components • Low power, highly optimized noise reduction • High quality tuning & configuration tools • Audio labs fully synchronized with leading partners
  31. 31. Enriching Daily Life with the Personal Experience and Simple, Natural Interaction of Voice Intel and Amazon are Collaborating to Extend Natural Voice Interaction For Consumers
  32. 32. Call to Action • Download the Sample from GitHub – build out a Raspberry Pi! ~ 2 hours • Start your new product today… https://github.com/alexa/alexa-avs-sample-app Port sample, Harden tap-to-talk solution, Integrate hands-free
  33. 33. Other Alexa Sessions Thursday 11:30am ALX202: How Amazon is enabling the future of Automotive Venetian, Level 3, Lido 3003 1pm ALX303: Building a Smarter Home with Alexa Venetian, Level 3, Murano 3203 3:30 ALX307: Voice-enabling Your Home and Devices with Amazon Alexa and AWS IoT Venetian, Level 2, Opaline Theatre 5pm ALX302: Build a Serverless Back End for Your Alexa-Based Voice Interactions Venetian, Level 2, Opaline Theatre 9:30am ALX304: Tips and Tricks on Bringing Alexa to Your Products Venetian, Level 1, Marco Polo 806 11am ALX305: From VUI to QA: Building a Voice-Based Adventure Game for Alexa Venetian, Level 1, Marco Polo 806 Friday 11am ALX203: Workshop: Creating Voice Experiences with Alexa Skills: From Idea to Testing in Two Hours Mirage, Jamaica B 1pm ALX306: State of the Union: Amazon Alexa and Recent Advances in Conversational AI Venetian, Level 2, Sands Showroom 11:30am and 2:30pm ALX204: Workshop: Build an Alexa-Enabled Product with Raspberry Pi Mirage, Antigua B 5pm ALX301: Alexa in the Enterprise: How JPL Leverages Alexa to Further Space Exploration with Internet of Things Venetian, Level 2, Venetian B Wednesday
  34. 34. Thank you!
  35. 35. Remember to complete your evaluations!
