
AWS re:Invent 2016: NEW LAUNCH! Introducing Amazon Polly (MAC204)

This session will introduce you to Amazon Polly, a new deep learning service that turns text into lifelike speech. Polly enables existing applications to speak as a first-class feature and creates the opportunity for entirely new categories of speech-enabled products – from mobile apps and cars, to devices and appliances. Polly includes 47 lifelike voices and support for 24 languages, so you can select the ideal voice and distribute your speech-enabled applications in many geographies. Polly is easy to use – you just send the text you want converted into speech to the Polly API, and Polly immediately returns the audio stream to your application so you can play it directly or store it in a standard audio file format, such as MP3. Polly supports Speech Synthesis Markup Language (SSML) tags like prosody so you can adjust the speech rate, pitch, or volume. Polly is a secure service that delivers all of these benefits at high scale and at low latency. You can cache and replay Polly’s generated speech at no additional cost. Polly lets you convert 5M characters per month for free during the first year. Polly’s pay-as-you-go pricing, low cost per request, and lack of restrictions on storage and reuse of voice output make it a cost-effective way to enable speech synthesis everywhere. Join this session to learn more and find out how you can get started with Amazon Polly today!

Published in: Technology

  1. 1. MAC204 NEW LAUNCH! Introducing Amazon Polly: A Service that Turns Text into Lifelike Speech. Rafal Kuklinski – Amazon Text-to-Speech. November 30, 2016. © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
  2. 2. What to expect from the session • Introduction to Amazon Polly • Features and functionality • Text-to-speech: Under the hood • Getting started • Pricing • Case studies • Q&A
  3. 3. Introduction to Amazon Polly
  4. 4. Why we built Amazon Polly • Apps using voice to communicate with end-users are becoming more common every day • Naturalness of generated speech is a key element of user experience • Integration of speech varies across use cases
  5. 5. What is Amazon Polly? • A service that converts text into lifelike speech • Offers 47 lifelike voices across 24 languages • Low latency responses enable developers to build real-time systems • Developers can store, replay, and distribute generated speech
  6. 6. Amazon Polly: Quality • Natural-sounding speech: a subjective measure of how close TTS output is to human speech. • Accurate text processing: the ability of the system to interpret common text formats such as abbreviations, numerical sequences, homographs, etc. Examples: Today in Las Vegas, NV it's 54°F. / "We live for the music", live from the Madison Square Garden. • Highly intelligible: a measure of how comprehensible speech is. Example: "Peter Piper picked a peck of pickled peppers."
  7. 7. Amazon Polly: Language Portfolio Americas: • Brazilian Portuguese • Canadian French • English (US) • Spanish (US) A-PAC: • Australian English • Indian English • Japanese EMEA: • British English • Danish • Dutch • French • German • Icelandic • Italian • Norwegian • Polish • Portuguese • Romanian • Russian • Spanish • Swedish • Turkish • Welsh • Welsh English
  8. 8. Features and Functionality
  9. 9. Amazon Polly features: SSML Speech Synthesis Markup Language is a W3C recommendation, an XML-based markup language for speech synthesis applications
      <speak>
        My name is Kuklinski. It is spelled
        <prosody rate='x-slow'>
          <say-as interpret-as="characters">Kuklinski</say-as>
        </prosody>
      </speak>
  10. 10. Amazon Polly features: Lexicons Enables developers to customize the pronunciation of words or phrases. Example: My daughter’s name is Kaja.
      <lexeme>
        <grapheme>Kaja</grapheme>
        <grapheme>kaja</grapheme>
        <grapheme>KAJA</grapheme>
        <phoneme>"kaI.@</phoneme>
      </lexeme>
  11. 11. Text-to-Speech: Under the Hood
  12. 12. Main Challenges of Text-to-Speech Goal: Convert text into intelligible, accurate, and natural speech Challenges: • Homographs: words written identically that have different pronunciations, e.g. "I live in Las Vegas" vs. "This presentation broadcasts live from Las Vegas" • Text normalization: disambiguation of abbreviations, acronyms, units, e.g. ‘St.’ expanded as ‘street’ or ‘saint’ • Conversion of text to phonemes (Grapheme-to-Phoneme) in languages with complex mapping such as English, e.g. tough, through, though • Foreign words (déjà vu), proper names (François Hollande), slang (ASAP, LOL), etc.
  13. 13. [Diagram: TTS pipeline] • TEXT: Market grew by > 20%. • WORDS: Market grew by more than twenty percent • PHONEMES: ˈmɑɹ.kət ˈgɹu baɪ ˈmoʊɹ ˈðæn ˈtwɛn.ti pɚ.ˈsɛnt • Stage labels: text processing, prosody contour, unit selection and adaptation (with a speech units inventory), prosody modification, streaming
  14. 14. Unit Selection Conversion of phoneme sequence to waveform Database of recorded audio Unit – diphone Coverage of diphones and various features e.g. Allophonic variation • Pin vs Spin vs limping
  15. 15. Recording Data for TTS • Tons of text • Automatic selection of texts • Recording script: covers all combinations of diphones and significant features in a language • Few weeks of recordings
  16. 16. an error occurred while searching for your route because snaps weren't all so obedient anymore, now we say apple again. and we say apple, general electric soars today. information on general electric quick breads, zucchini, holiday, crock pot, cake, so are you still keeping tabs on your old team, that weighs more than four tons, disrupts the herring's swim … An apple a day, keeps …
  17. 17. Getting started
  18. 18. Get started
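  19. 19. First app
      from boto3 import Session
      from contextlib import closing

      # Create a Polly client from the default AWS session (credentials and region)
      polly = Session().client("polly")

      # Request MP3 audio for a short text using the Joanna voice
      response = polly.synthesize_speech(
          Text="Hello world!",
          OutputFormat="mp3",
          VoiceId="Joanna")

      # Read the returned audio stream and save it to a file
      with closing(response["AudioStream"]) as stream:
          with open("speech.mp3", "wb") as file:
              file.write(stream.read())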
  20. 20. Amazon Polly is cost-effective • Pay-as-you-go • $4 for 1M characters • Free Tier of 5M characters/month - first year • You can store and reuse generated speech
  21. 21. Learning a language with TTS: Amazon Polly in Duolingo. Severin Hacker, CTO. 11/30/2016
  22. 22. Efficacy 34h of Duolingo = 1 college semester [Vesselinov et al, 2012]
  23. 23. Bots Demo
  24. 24. Why voice matters • Spoken language crucial for language learning • Accurate pronunciation matters • Faster iteration thanks to TTS • As good as natural human speech
  25. 25. How good is Amazon Polly?
  26. 26. A/B Testing Voices • Voice A vs. Voice B • Learning = 5, Engagement = 5 • Learning = 10, Engagement = 10 • All results statistically significant (p=0.05)
  27. 27. Polly (Salli) English Old voice Winner! “The new voice is a huge improvement! I really like it, the old one was terrible at times.”
  28. 28. Polly (Vitoria) Portuguese Old voice Winner! “Just today, I started getting a new voice for my Portuguese lessons! It's SO much better than the previous one (...) in terms of comprehension it's miles better.”
  29. 29. Polly (Hans) German Old voice Winner! “The German male TTS is music to my ears”
  30. 30. We use … Danish: Naja (female) Dutch: Ruben (male) English: Salli (female), Joey (male) German: Hans (male) Spanish: Miguel (male) French: Mathieu (male) Italian: Carla (female) Norwegian: Liv (female) Polish: Maja (female) Portuguese: Vitoria (female), Ricardo (male) Swedish: Astrid (female) Turkish: Filiz (female) Welsh: Gwyneth (female)
  31. 31. Infrastructure (architecture diagram) • Components: webserver, worker, Amazon Beanstalk, Amazon SQS, Amazon DynamoDB, Amazon S3, Amazon CloudFront, Amazon Polly, Other TTS • Flow labels: TTS request, TTS meta-data, TTS files, global distribution, download
  32. 32. GoAnimate Case Study: Using Amazon Polly in Animated Video. Stacy Adams, Head of Marketing, GoAnimate, @atl2oz
  33. 33. About GoAnimate • Do-it-yourself animated video creation platform • Less resource-intensive than professional video creation • Companies use GoAnimate for: • Training and eLearning • HR • Marketing • GoAnimate for Schools supports K–12 educators and their students
  34. 34. Use cases for text-to-speech • Multi-language communication • Training or HR professionals who have to create content in many languages • Video preproduction • Video makers who need to iterate and fine-tune before the text-to-speech is eventually replaced by a professional voiceover • K–12 education • Students who make videos and don’t have access to professional voices or time for or knowledge of voiceover
  35. 35. GoAnimate demo
  36. 36. Thank you!
  37. 37. Remember to complete your evaluations!
  38. 38. Duolingo voices its language learning service using Polly. Duolingo is a free language learning service where users help translate the web and rate translations. “With Amazon Polly our users benefit from the most lifelike Text-to-Speech voices available on the market.” Severin Hacker, CTO, Duolingo • Spoken language crucial for language learning • Accurate pronunciation matters • Faster iteration thanks to TTS • As good as natural human speech
  39. 39. With Polly, GoAnimate gives voice to the characters in their animations. GoAnimate is a cloud-based, animated video creation platform. “Amazon Polly gives GoAnimate users the ability to immediately give voice to the characters they animate using our platform.” Alvin Hung, CEO, GoAnimate • Multi-language communication • Training or HR professionals who have to create content in many languages • Video preproduction • Video makers who need to iterate and fine-tune before the text-to-speech is eventually replaced by a professional voiceover • K–12 education • Students who make videos and don’t have access to professional voices or time for or knowledge of voiceover
  40. 40. RNIB provides the largest library in the UK for people with sight loss. Royal National Institute of Blind People creates and distributes accessible information in the form of synthesized content. “Amazon Polly delivers incredibly lifelike voices which captivate and engage our readers.” John Worsfold, Solutions Implementation Manager, RNIB • RNIB delivers the largest library of audiobooks in the UK for nearly 2 million people with sight loss • Naturalness of generated speech is critical to captivate and engage readers • No restrictions on speech redistribution enable RNIB to create and distribute accessible information in the form of synthesized content
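
The SSML slide (slide 9) and the "First app" slide (slide 19) can plausibly be combined. The following is a minimal sketch, not code from the talk: the helper names `build_spelling_ssml` and `ssml_request` are hypothetical, and the sketch assumes the client's `synthesize_speech` call accepts `TextType="ssml"` alongside the parameters shown on slide 19. It only builds the request, so it runs without AWS credentials.

```python
from xml.sax.saxutils import escape, quoteattr


def build_spelling_ssml(name, rate="x-slow"):
    """Return SSML that says a name, then spells it out slowly (hypothetical helper)."""
    safe = escape(name)  # escape &, <, > so the SSML stays well-formed XML
    return (
        "<speak>"
        f"My name is {safe}. It is spelled "
        f"<prosody rate={quoteattr(rate)}>"
        f'<say-as interpret-as="characters">{safe}</say-as>'
        "</prosody>"
        "</speak>"
    )


def ssml_request(ssml, voice_id="Joanna", output_format="mp3"):
    """Build keyword arguments for a synthesize_speech call (hypothetical helper)."""
    return {
        "Text": ssml,
        "TextType": "ssml",  # assumption: marks the text as SSML rather than plain text
        "OutputFormat": output_format,
        "VoiceId": voice_id,
    }


request = ssml_request(build_spelling_ssml("Kuklinski"))
print(request["Text"])
```

With credentials configured, passing these arguments to the client from slide 19, as in `polly.synthesize_speech(**request)`, should return a response whose `AudioStream` can be saved the same way as in the first app.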

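The pricing on slide 20 can be turned into a rough monthly estimate. This sketch uses only the figures quoted in the talk ($4 per 1M characters, 5M free characters per month in the first year). It assumes the free allowance is simply subtracted from the monthly total before billing, and the function name `estimate_monthly_cost` is hypothetical. Current prices may differ from these 2016 figures.

```python
PRICE_PER_MILLION_USD = 4.0   # from the pricing slide (2016 figure)
FREE_TIER_CHARS = 5_000_000   # free characters per month, first year only


def estimate_monthly_cost(characters, in_first_year=False):
    """Estimate the monthly cost in USD for a number of synthesized characters."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    # Assumption: the free tier is deducted from the monthly character count
    free = FREE_TIER_CHARS if in_first_year else 0
    billable = max(0, characters - free)
    return billable * PRICE_PER_MILLION_USD / 1_000_000


for chars, first_year in [(3_000_000, True), (6_000_000, True), (6_000_000, False)]:
    cost = estimate_monthly_cost(chars, first_year)
    print(f"{chars:>10,} characters, first year={first_year}: ${cost:.2f}")
```

Under these assumptions, 6M characters in a first-year month leaves 1M billable characters. Because the slides state that generated speech can be stored and reused, caching repeated phrases would reduce the billable character count further.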