Harnessing Artificial Intelligence in your Applications - Level 300

AWS offers a family of AI services that provide cloud-native Machine Learning and Deep Learning technologies allowing developers to build an entirely new generation of apps that can see, hear, speak, understand, and interact with the real world. In this session we take a look at Amazon Rekognition, Amazon Polly, and Amazon Lex.

Speakers:
Adam Larter, Developer Solutions Architect, Amazon Web Services
Alastair Cousins, Solutions Architect, Amazon Web Services

Published in: Technology

  1. 1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Adam Larter Principal Solutions Architect, Developer Specialist, Amazon Web Services Alastair Cousins Senior Solutions Architect, Amazon Web Services Harnessing Artificial Intelligence in Your Applications Amazon Rekognition, Amazon Polly, and Amazon Lex Level 300
  2. 2. Intelligent Multimodal Interfaces
  3. 3. What is Amazon Polly? • A service that converts text into lifelike speech • Offers 47 lifelike voices across 24 languages • Low latency responses enable developers to build real-time systems • Developers can store, replay, and distribute generated speech
  4. 4. Amazon Polly: Quality Natural-sounding speech A subjective measure of how close TTS output is to human speech. Accurate text processing Ability of the system to interpret common text formats such as abbreviations, numerical sequences, homographs etc. Today in Sydney Australia, it's 26°C It’s nice to know, we’re going to Nice Highly intelligible A measure of how comprehensible speech is. Peter Piper picked a peck of pickled peppers
  5. 5. Amazon Polly: SSML Speech Synthesis Markup Language (SSML) is a W3C recommendation: an XML-based markup language for speech synthesis applications <speak> My name is Adam Larter. It is spelled <prosody rate="x-slow"> <say-as interpret-as="characters">Larter</say-as> </prosody> </speak>
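A minimal Python sketch of the SSML slide above, assuming the boto3 SDK is used to call Polly. The helper names `spell_out_ssml` and `synthesize` and the voice `Brian` are illustrative choices, not taken from the deck; the Polly client is passed in so the helpers can be exercised without AWS credentials.

```python
from xml.sax.saxutils import escape


def spell_out_ssml(sentence: str, word: str, rate: str = "x-slow") -> str:
    """Build an SSML document that speaks `sentence`, then spells `word` slowly."""
    return (
        "<speak>"
        f"{escape(sentence)} "
        f"<prosody rate='{rate}'>"
        f'<say-as interpret-as="characters">{escape(word)}</say-as>'
        "</prosody>"
        "</speak>"
    )


def synthesize(polly_client, ssml: str, voice_id: str = "Brian") -> bytes:
    """Send SSML to Polly and return the MP3 audio bytes.

    `polly_client` is assumed to be boto3.client("polly"); the voice is an
    arbitrary choice for illustration.
    """
    response = polly_client.synthesize_speech(
        Text=ssml,
        TextType="ssml",  # tell Polly the text is SSML, not plain text
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return response["AudioStream"].read()
```

With real credentials, `synthesize(boto3.client("polly"), spell_out_ssml("It is spelled", "Larter"))` would return audio that can be written to a file or stored in S3.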
  6. 6. Example Use Case Adding speech synthesis to any app
  7. 7. Polly Voice Synthesis Demo Amazon Polly Amazon API Gateway Lambda function Amazon S3 Mobile App IoT Device Calling through API Gateway allows us to implement caching and use throttling and API Keys via Usage Plans
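The Lambda function in this demo could look roughly like the sketch below. It assumes the API Gateway Lambda proxy event shape (a JSON string in `event["body"]`) and a request field called `text`; the bucket name, the voice `Joanna`, and the factory `make_handler` are hypothetical. The object key is derived from a hash of the text, so the same phrase always maps to the same S3 object.

```python
import hashlib
import json


def make_handler(polly_client, s3_client, bucket: str, voice_id: str = "Joanna"):
    """Return a Lambda-style handler that synthesizes speech and stores it in S3.

    The clients are assumed to be boto3.client("polly") and boto3.client("s3");
    they are injected so the handler can be tested with stand-ins.
    """

    def handler(event, context):
        # Assumption: API Gateway proxy integration, body like {"text": "..."}
        text = json.loads(event["body"])["text"]

        # Deterministic key: the same voice and text always give the same object
        digest = hashlib.sha256(f"{voice_id}:{text}".encode("utf-8")).hexdigest()
        key = f"{digest}.mp3"

        speech = polly_client.synthesize_speech(
            Text=text, OutputFormat="mp3", VoiceId=voice_id
        )
        s3_client.put_object(
            Bucket=bucket,
            Key=key,
            Body=speech["AudioStream"].read(),
            ContentType="audio/mpeg",
        )
        return {"statusCode": 200, "body": json.dumps({"bucket": bucket, "key": key})}

    return handler
```

In a deployed function the handler would typically be created once at module load, for example `handler = make_handler(boto3.client("polly"), boto3.client("s3"), "my-speech-bucket")`.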
  8. 8. Images – Another Untapped Interface
  9. 9. Amazon Rekognition Deep learning-based image recognition service Search, verify, and organise millions of images Object and Scene Detection Facial Analysis Face Comparison Facial Recognition
  10. 10. Amazon Rekognition Deep learning-based image recognition service Search, verify, and organise millions of images Object and Scene Detection Facial Analysis Face Comparison Facial Recognition
  11. 11. Detecting Faces in a Crowd IoT Camera Amazon Rekognition Lambda function Amazon API Gateway DetectFaces() Image with Faces "Emotions": [ {"Confidence": 99.1335220336914, "Type": "HAPPY"}, {"Confidence": 3.3275485038757324, "Type": "CALM"}, {"Confidence": 0.31517744064331055, "Type": "SAD"} ], "Eyeglasses": {"Confidence": 99.8050537109375, "Value": false}, "EyesOpen": {"Confidence": 99.99979400634766, "Value": true},
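As a rough illustration of consuming the DetectFaces() output shown above, the sketch below requests all facial attributes and picks the most likely emotion per face. It assumes boto3; the helper names and the 50% cut-off are illustrative. Note that emotions are only returned when `Attributes=["ALL"]` is requested.

```python
def detect_faces(rekognition_client, image_bytes: bytes) -> list:
    """Call DetectFaces and return the list of face detail records.

    `rekognition_client` is assumed to be boto3.client("rekognition").
    """
    response = rekognition_client.detect_faces(
        Image={"Bytes": image_bytes},
        Attributes=["ALL"],  # the default attribute set omits Emotions
    )
    return response["FaceDetails"]


def dominant_emotion(face_detail: dict, min_confidence: float = 50.0):
    """Return the highest-confidence emotion type, or None if none is convincing."""
    emotions = face_detail.get("Emotions", [])
    if not emotions:
        return None
    best = max(emotions, key=lambda emotion: emotion["Confidence"])
    return best["Type"] if best["Confidence"] >= min_confidence else None
```

Applied to the sample response on the slide, `dominant_emotion` selects the HAPPY entry, since its confidence is far above the other two.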
  12. 12. Understanding Bounding Boxes Turn Ratios into X/Y co-ordinates: multiply by the image width/height "BoundingBox": { "Height": 0.3449999988079071, "Left": 0.09666666388511658, "Top": 0.27166667580604553, "Width": 0.23000000417232513 },
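The conversion described on this slide is a pair of multiplications. The sketch below applies it to the sample box, assuming (for illustration only) a 600 x 600 pixel source image; the function name `to_pixels` is hypothetical.

```python
def to_pixels(box: dict, image_width: int, image_height: int) -> tuple:
    """Convert a Rekognition ratio-based BoundingBox to pixel (left, top, width, height)."""
    left = round(box["Left"] * image_width)     # horizontal ratios scale by width
    top = round(box["Top"] * image_height)      # vertical ratios scale by height
    width = round(box["Width"] * image_width)
    height = round(box["Height"] * image_height)
    return left, top, width, height


bounding_box = {
    "Height": 0.3449999988079071,
    "Left": 0.09666666388511658,
    "Top": 0.27166667580604553,
    "Width": 0.23000000417232513,
}
print(to_pixels(bounding_box, 600, 600))  # (58, 163, 138, 207)
```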
  13. 13. Tip: Capture Additional Context Introduce a coefficient to capture additional image context by inflating the bounding box
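One possible reading of this tip, sketched in Python: grow the box by a fraction of its own size on each axis, keep it centred, and clamp it to the image. The coefficient value of 0.25 and the name `inflate_box` are assumptions; the deck does not state which coefficient it uses.

```python
def inflate_box(box: dict, coefficient: float = 0.25) -> dict:
    """Grow a ratio-based bounding box around its centre, clamped to the image.

    A coefficient of 0.25 makes the box 25% wider and 25% taller before clamping.
    """
    grow_w = box["Width"] * coefficient
    grow_h = box["Height"] * coefficient

    # Split the extra size evenly between the two sides, then clamp to [0, 1]
    left = max(0.0, box["Left"] - grow_w / 2)
    top = max(0.0, box["Top"] - grow_h / 2)
    right = min(1.0, box["Left"] + box["Width"] + grow_w / 2)
    bottom = min(1.0, box["Top"] + box["Height"] + grow_h / 2)

    return {"Left": left, "Top": top, "Width": right - left, "Height": bottom - top}
```

The inflated box is still expressed as ratios, so it can be converted to pixel coordinates in the same way as the original box before cropping.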
  14. 14. Cropping Faces
  15. 15. Scaling to Many Faces Amazon Rekognition Lambda function Amazon ElasticSearch Amazon SNS Lambda function Amazon S3 User’s Face Image Fan Out of Lambda Functions via SNS. 1 Notification per Face detected Metadata from DetectFaces() + S3 Object Ref to Face Image Metadata + Location + Timestamp User’s Face Image
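The fan-out step can be sketched as one SNS message per detected face, each of which would trigger its own downstream Lambda invocation. The message field names below are hypothetical, and the SNS client is assumed to be `boto3.client("sns")`.

```python
import json


def publish_faces(sns_client, topic_arn: str, bucket: str, key: str, face_details: list) -> int:
    """Publish one SNS notification per detected face and return the count.

    Each message carries the DetectFaces() metadata for a single face plus a
    reference to the source image in S3 (field names are illustrative).
    """
    for index, face in enumerate(face_details):
        message = {
            "bucket": bucket,
            "key": key,
            "faceIndex": index,
            "boundingBox": face["BoundingBox"],
        }
        sns_client.publish(TopicArn=topic_arn, Message=json.dumps(message))
    return len(face_details)
```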
  16. 16. Example Use Case Authentication using face image
  17. 17. Sign In Using Face • Cognito User Pools (CUP) as System of Record for users • Create a Developer-Authenticated Identity Provider (IdP) to perform AuthN using Amazon Rekognition • Federate CUP and Developer IdP through Cognito Identity Federation • CUP user names are unique – make use of the ExternalId parameter (named ExternalImageId in the Rekognition API) in indexFaces()
  18. 18. CUP and Developer Authenticated Identities will be linked after this call Linking Identities in Cognito Federation
  19. 19. Amazon Cognito User Pool Username and password sent to Cognito User Pools Identity Provider Link Face to Cognito User Mobile App
  20. 20. Amazon Cognito User Pool Cognito Identity Token returned Link Face to Cognito User Mobile App
  21. 21. Amazon Cognito Identity Pool Cognito Identity Token Link Face to Cognito User Mobile App User’s Face Image Amazon API Gateway Lambda function User’s Face Image + Cognito User Pool username stored in the Rekognition collection as the ExternalId for the user’s face vector Amazon Rekognition username as ExternalId Store in Collection Identities linked by call to getOpenIdTokenForDeveloperIdentity()
  22. 22. Amazon Cognito Identity Pool Mobile App User’s Face Image Amazon API Gateway Lambda function User’s Face Image ExternalId used as the unique user identifier in call to CognitoIdentity::getOpenIdTokenForDeveloperIdentity Amazon Rekognition Sign In Using Face FaceId + ExternalId AccessKeyId / SecretAccessKey / SessionToken
  23. 23. Sign In Using Face – Implementation Linking face to Cognito User: • Sign in first using Cognito User Pools via Cognito SDK • Take user’s picture & send image with JWT • Rekognition::indexFaces() to store user’s face vector in collection and use Cognito User Pools username as the External Id • CognitoIdentity::getOpenIdTokenForDeveloperIdentity to create a Cognito Token and link the identities together
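The linking steps above might be sketched as follows, assuming boto3 clients for Rekognition and Cognito Identity. In the Rekognition API the slide's "External Id" is the `ExternalImageId` parameter. The function name, argument names, and provider strings are hypothetical; the user pool provider is assumed to take the form `cognito-idp.<region>.amazonaws.com/<user_pool_id>`, and validation of the JWT is left out.

```python
def link_face_to_user(rekognition, cognito_identity, *, collection_id, identity_pool_id,
                      user_pool_provider, developer_provider, username, id_token,
                      image_bytes):
    """Index the user's face and link the user pool and developer identities."""
    # Store the face vector, tagged with the Cognito User Pools username
    indexed = rekognition.index_faces(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        ExternalImageId=username,
    )
    if not indexed["FaceRecords"]:
        raise ValueError("no face detected in the supplied image")

    # Passing both logins in one call links the two identities together
    token = cognito_identity.get_open_id_token_for_developer_identity(
        IdentityPoolId=identity_pool_id,
        Logins={
            user_pool_provider: id_token,   # JWT from the user pool sign-in
            developer_provider: username,   # developer-authenticated identity
        },
    )
    return {
        "FaceId": indexed["FaceRecords"][0]["Face"]["FaceId"],
        "IdentityId": token["IdentityId"],
        "Token": token["Token"],
    }
```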
  24. 24. Sign In Using Face – Implementation Sign in using face: • Rekognition::searchFacesByImage() to get External Id • Cognito::getOpenIdTokenForDeveloperIdentity() with retrieved External Id to generate the Token and Identity Id the client app needs • Client app then follows standard Cognito process using CognitoCachingCredentialsProvider()
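A server-side sketch of the sign-in steps, again assuming boto3 clients. The 90% similarity threshold, the function name, and the developer provider string are illustrative assumptions; a production system would need to choose its threshold and any additional checks with care.

```python
def sign_in_with_face(rekognition, cognito_identity, *, collection_id, identity_pool_id,
                      developer_provider, image_bytes, threshold=90.0):
    """Look up the face in the collection and issue a Cognito token for its owner.

    Returns None when no face in the collection matches closely enough.
    """
    result = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": image_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,  # only the best match is needed
    )
    matches = result["FaceMatches"]
    if not matches:
        return None

    # The username stored at indexing time comes back as ExternalImageId
    username = matches[0]["Face"]["ExternalImageId"]

    token = cognito_identity.get_open_id_token_for_developer_identity(
        IdentityPoolId=identity_pool_id,
        Logins={developer_provider: username},
    )
    return {"Username": username, "IdentityId": token["IdentityId"], "Token": token["Token"]}
```

The returned `IdentityId` and `Token` are what the client app would hand to its Cognito credentials provider to obtain temporary AWS credentials.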
  25. 25. Amazon Lex AWS Lambda Polly Amazon CloudWatch Monitoring Text Speech Text Amazon DynamoDB AWS IoT Amazon API Gateway Conversational Interfaces Applications
  26. 26. Walkthrough Lex Bot Creation Process
  27. 27. Example Use Case The Smart Assistant
  28. 28. Smart Assistant - Key Features • Triggers using any type of input, not just speech − This demo uses a camera, and on-device face detection with OpenCV – http://opencv.org • Hot word detection to get device’s attention − Snowboy - https://snowboy.kitt.ai/ • Silence detection during live speech capture − SoX - http://sox.sourceforge.net/ • NLU provided by Amazon Lex − Speech input SDK not yet available − Don’t let that stop you calling the API directly!
  29. 29. Smart Assistant Wait for Hot Word (Snowboy) Wait for Face to appear in camera view Listen for audio command START
  30. 30. Smart Assistant Wait for Face to appear in camera view Capture image from webcam (fswebcam) Recognise Face (Amazon Rekognition) Resize to improve process efficiency (ImageMagick) Detect face on device (OpenCV) Known User State Replay Audio Is the face in the collection? YES NO Run User Speech Dialogue Interaction and NLU
  31. 31. Smart Assistant Process intent (API Gateway/Lambda) Listen for speech input with silence detection (SoX) Play audio response & loop back to listen for speech input Construct Lex payload and submit to API (HTTPS Request) Parse response headers YES Run User Speech Dialogue Interaction and NLU Is the interaction Ready for Fulfillment? NO Listen for speech input with silence detection (SoX)
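The "parse response headers" decision in this flow could be sketched as below. It assumes the Lex runtime PostContent response, which reports the dialogue state in the `x-amz-lex-dialog-state` header alongside `x-amz-lex-intent-name` and `x-amz-lex-message`; the function names and the returned step labels are hypothetical.

```python
def parse_lex_headers(headers: dict) -> dict:
    """Extract the dialogue state, intent, and message from Lex response headers."""
    # HTTP header names are case-insensitive, so normalise before looking them up
    normalized = {name.lower(): value for name, value in headers.items()}
    state = normalized.get("x-amz-lex-dialog-state", "")
    return {
        "dialog_state": state,
        "intent": normalized.get("x-amz-lex-intent-name"),
        "message": normalized.get("x-amz-lex-message"),
        "ready": state == "ReadyForFulfillment",
    }


def next_step(parsed: dict) -> str:
    """Mirror the flowchart: fulfil the intent, or play the reply and keep listening."""
    if parsed["ready"]:
        return "process_intent"
    return "play_response_and_listen"
```

For example, a response with the state `ElicitSlot` leads back to the listening step, while `ReadyForFulfillment` hands the intent to the API Gateway/Lambda back end.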
  32. 32. Thank you!
