Learning Objectives:
- Learn about applying conversational interfaces in applications through Amazon Polly
- Learn about popular use cases for Amazon Polly
- Learn how specific AWS customers have implemented Amazon Polly in different workflows
Amazon Polly is a service that turns text into lifelike speech, making it easy to develop applications that use high-quality speech to increase engagement and accessibility. In this tech talk, we will introduce Amazon Polly and walk through popular use cases in four industries where Amazon Polly's natural-sounding voices improve the user experience and enable new ways to consume content: Education, Gaming, Content Creation, and Telephony.
2. What to Expect from this Webinar
1. What is Amazon Polly?
2. Popular Use Cases
a. Education
b. Gaming
c. Content Creation
d. Telephony
3. Business motivations for using Amazon Polly
4. Q & A
3. What is Amazon Polly?
• AWS service that converts text into lifelike speech
• 48 voices across 24 languages
• Developers can generate, store, and replay or distribute generated speech
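As a rough illustration of the "generate and store" step above, here is a minimal Python sketch. It assumes the AWS SDK for Python (boto3) and its Polly `synthesize_speech` call; the helper name `synthesize_to_file`, the voice "Joanna", and the file name are illustrative choices, not something taken from the slides. The client is passed in as a parameter so the helper can also be exercised with a stand-in object.

```python
def synthesize_to_file(polly_client, text, path, voice_id="Joanna"):
    """Convert text to MP3 speech and save it so it can be replayed later.

    polly_client is assumed to behave like boto3.client("polly").
    Returns the number of audio bytes written.
    """
    response = polly_client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,  # "Joanna" is only an example voice
    )
    # The audio comes back as a stream; read it fully before storing.
    audio = response["AudioStream"].read()
    with open(path, "wb") as f:
        f.write(audio)
    return len(audio)


def demo():
    """Example call; assumes boto3 is installed and AWS credentials
    plus a default region are configured."""
    import boto3

    client = boto3.client("polly")
    synthesize_to_file(client, "Hello from Amazon Polly.", "hello.mp3")
```

Calling `demo()` would write `hello.mp3`, which can then be replayed or distributed without calling the service again.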
6. Amazon Polly: Recent Feature Updates
4/19: Speech Marks
4/19: Whispering Voice Tag
5/18: Vicki, new German voice
Watch the SSML feature overview in the 3/27 webinar “How to get the most out of Amazon Polly”
Available at https://aws.amazon.com/polly/getting-started/
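The two April features above can be sketched in Python as follows. This assumes that whispering is requested with the SSML tag `<amazon:effect name="whispered">` and that Speech Marks are returned as one JSON object per line with `time` (milliseconds), `type`, `start`, `end`, and `value` fields (requested with `OutputFormat="json"` and `SpeechMarkTypes=["word"]`). The helper names and the sample payload are hypothetical, made up for illustration.

```python
import json
from xml.sax.saxutils import escape


def whisper_ssml(text):
    """Wrap plain text in SSML that asks for a whispered voice."""
    return (
        '<speak><amazon:effect name="whispered">'
        + escape(text)  # escape &, < and > so the SSML stays well-formed
        + "</amazon:effect></speak>"
    )


def parse_speech_marks(payload):
    """Parse line-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def word_at(marks, time_ms):
    """Return the word being spoken at time_ms, or None before the first word.

    Assumes marks are ordered by time, as in the returned stream.
    """
    current = None
    for mark in marks:
        if mark["type"] != "word":
            continue
        if mark["time"] > time_ms:
            break
        current = mark["value"]
    return current


# Illustrative payload only; real timings depend on the voice and text.
SAMPLE = (
    '{"time":6,"type":"word","start":0,"end":4,"value":"Mary"}\n'
    '{"time":373,"type":"word","start":5,"end":8,"value":"had"}\n'
)
marks = parse_speech_marks(SAMPLE)
highlighted = word_at(marks, 400)  # the word to highlight 400 ms into playback
```

A reading app could call `word_at` on each playback tick to highlight the current word, which is the kind of experience described in the Education use case.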
7. Use Case: Education
Reading
“With Speech Marks from Polly, AppWriter can deliver an enhanced
reading experience which truly levels the playing field for anyone
struggling with reading and writing.” Stefan Pal, COO, Wizkids
Language Learning
“We have found that the Amazon Polly voices are not just high in
quality, but are as good as natural human speech for teaching a
language.” Severin Hacker, CTO, Duolingo
8. Use Case: Gaming
Pre-production voicing
<A quote from Felix Duchesneau, Lumberyard team?>
Production voicing
<A quote from Michael Robinson, Amazon Rapids?>
9. Use Case: Content Creation
Animated Videos
“Amazon Polly gives GoAnimate users the ability to immediately give
voice to the characters they animate using our platform.”
Alvin Hung, CEO and founder of GoAnimate
Personalized Videos
“Speech personalization was the most requested feature by our
customers and AWS Polly accelerated our ability to deliver this
feature.” Greg Clark, CEO and founder of Storybulbs
10. Use Case: Content Creation Cont’d
Voiced Articles
“Amazon Polly gives us speed, quality and amazing phoneme
customization options. Its true-to-life audio allowed us to switch from
voice actors to a full programmatic solution at a fraction of the cost. As
an added benefit the audio ingestion process now occurs in real time
so users can listen to stories as they happen, not hours or days later.”
Pierre Bonnais, Founder & CEO of Aloud
11. Use Case: Telephony
Telephony
“By adding Amazon Polly TTS to the Aculab platforms, we have given
our customers even more choice and flexibility, along with extremely
high quality, natural sounding voices. Furthermore, it has enabled us to
make business automation viable and affordable to both enterprises
and SMBs.” David Samuel, Managing Director and CEO at Aculab
“We rely on Amazon Polly to do all the synthesized speech for our
automated telephone calls, which we make as a trusted partner of the
NHS... There are many individuals who are excluded from the digital
world for social, economic or health reasons. Amazon Polly helps us to
make sure these patients are being cared for.” Mike Wray, Inhealthcare
12. Amazon Polly: Business Motivations
Voice Talent
• 100% natural
VS
Amazon Polly TTS
• Lifelike voices
• Generate speech immediately, anytime
• Very economical
13. Voice Talent vs Amazon Polly: Cost Comparison
“As a comparison, one hour of studio rental alone is equivalent
to generating audio for tens of millions of characters using
Amazon Polly, the TTS provider with the best cost-
effectiveness. (This amount of audio is the equivalent of
reading A Christmas Carol by Charles Dickens a couple of
hundred times.)” André Kenji Horie, Duolingo
https://aws.amazon.com/blogs/ai/powering-language-learning-on-duolingo-with-amazon-polly/
14. Professional Voice Talent
Steps for initial recording include:
1. Find a company that sources voice talents
2. Find independent party to evaluate speech quality
3. Record and evaluate quality of speech samples
4. Record all required speech
5. QA all recorded speech
Subsequent recordings require only steps 4 and 5 above
16. Amazon Polly: Pricing
Pricing: $4 per 1M characters of speech or Speech Marks
Free Tier: 5M characters per month, for the first 12 months
Caching: Cache and replay generated speech or
Speech Marks at no additional cost
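One way to take advantage of the caching point above is to store each generated clip under a key derived from the voice and the text, and call the service only on a cache miss. The Python sketch below is an assumption-laden illustration: `cached_speech`, the on-disk layout, and the `synthesize` callback (any function that takes text and a voice ID and returns audio bytes) are hypothetical and not part of Amazon Polly.

```python
import hashlib
import os


def cached_speech(text, voice_id, cache_dir, synthesize):
    """Return audio bytes for (text, voice_id), synthesizing at most once.

    synthesize is a caller-supplied function (text, voice_id) -> bytes,
    for example a thin wrapper around a Polly client.
    """
    # The same voice and text always map to the same file name.
    key = hashlib.sha256(f"{voice_id}\n{text}".encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".mp3")

    if os.path.exists(path):
        # Cache hit: replay stored audio, no new characters are processed.
        with open(path, "rb") as f:
            return f.read()

    # Cache miss: generate once, then store for later replays.
    audio = synthesize(text, voice_id)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as f:
        f.write(audio)
    return audio
```

With this pattern, a frequently played prompt is synthesized once and every later request is served from the stored file.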
17. Amazon Polly: Pricing Examples
Example | Text Length | Speech Duration | Cost
1K requests, 1K chars per request | 1M chars | ~23 hrs, 8 min | $4.00
10K requests, 100 chars per request | 1M chars | ~23 hrs, 8 min | $4.00
“Adventures of Huckleberry Finn” by Mark Twain | ~600K chars, 224 pages | ~13 hrs, 50 min | $2.40
Typical news article | ~6.5K chars, 3 pages | ~9 min | $0.03
Storytelling with text highlighting for children | 10K chars of speech + 10K chars of Speech Marks | ~13 min | $0.08
18. Amazon Polly: News Article Pricing Example
Suppose you want to voice 1,000 news articles in June 2017
• ~6.5K chars per article x 1K articles = ~6.5M total characters
• The first 5M characters are free
• 1.5M characters @ $4 per 1M characters = $6
* Once you generate the audio for a given article, you can cache it and replay it for free.
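The arithmetic in this example can be checked with a small Python sketch. It uses only the figures quoted on the pricing slides ($4 per 1M characters, 5M free characters per month); the function name `monthly_cost` is hypothetical, and current pricing may differ from these numbers.

```python
PRICE_PER_MILLION_CHARS = 4.00  # USD, as quoted on the pricing slide
FREE_TIER_CHARS = 5_000_000     # per month, first 12 months


def monthly_cost(total_chars, free_tier=True):
    """Estimate the monthly charge in USD for total_chars characters."""
    free = FREE_TIER_CHARS if free_tier else 0
    billable = max(0, total_chars - free)
    return round(billable * PRICE_PER_MILLION_CHARS / 1_000_000, 2)


# 1,000 articles at ~6.5K characters each, with the free tier applied.
news_cost = monthly_cost(6_500 * 1_000)  # 6.0
```

Without the free tier, the same function gives $2.40 for the ~600K-character novel and $0.03 for a single ~6.5K-character article, matching the pricing examples.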
20. Q & A
• Contact us with any questions about this webinar or Polly in general
polly-webinars-feedback@amazon.com
• Previous Amazon Polly Webinars
https://aws.amazon.com/polly/getting-started/
• Amazon Polly Blog Posts
https://aws.amazon.com/polly/developers/#blog-posts