2. What is Speech Recognition?
Instead of listening to an automated voice recording and
pressing buttons, a person can speak specific words into a
device and issue commands with the help of a speech
recognition program.
3. The Uses
Individuals With Disabilities – Assists those who have visual
impairment, hand immobility, dyslexia, etc.
Medical Transcription – Reduces delays in writing out
medical transcriptions.
Dictation – Converts spoken words to text in emails or other
word documents (also helpful for English Language Learners).
Access Menu Commands – Opens files using voice commands.
5. How does it work?
Speech recognition functions as a pipeline:
The pipeline converts PCM (pulse code modulation) digital audio
from a sound card into recognized speech.
7. Transforming PCM Digital Audio
• 16,000 PCM values per second form a “wavy line” that repeats while the user speaks.
• The information is converted for better recognition in the program.
• The fast Fourier transform identifies the frequency components of a specific sound.
• The program can then approximate how our ears distinguish the sound.
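To make the idea of "frequency components" concrete, here is a minimal sketch that finds the strongest frequency in one short slice of PCM audio. It assumes the 16,000 values per second mentioned above and uses a plain discrete Fourier transform from the Python standard library; a real recognizer would use an FFT, which computes the same values faster. The function names and the 1,000 Hz test tone are made up for illustration.

```python
import cmath
import math

SAMPLE_RATE = 16_000            # PCM values per second (from the slide)
FRAME_LEN = SAMPLE_RATE // 100  # 160 values in a 1/100th-of-a-second slice


def amplitude_graph(frame):
    """Return the amplitude of each frequency component in one frame.

    Plain discrete Fourier transform, O(n^2). An FFT gives the same
    numbers with far less work.
    """
    n = len(frame)
    amplitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        amplitudes.append(abs(total) / n)
    return amplitudes


def dominant_frequency(frame):
    """Frequency in Hz of the strongest component, ignoring the 0 Hz bin."""
    amps = amplitude_graph(frame)
    k = max(range(1, len(amps)), key=lambda i: amps[i])
    return k * SAMPLE_RATE / len(frame)


# Hypothetical input: a pure 1,000 Hz tone standing in for recorded speech.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
print(dominant_frequency(tone))  # prints 1000.0
```

With 160 values per slice, each entry in the amplitude graph covers a band 100 Hz wide (16,000 / 160), so the tone shows up as a single peak in the tenth band.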
8. Transform PCM digital audio using Fast-Fourier Transform
The fast Fourier transform analyzes each 1/100th of a second of audio
and converts the audio data into the frequency domain.
Each 1/100th-of-a-second slice produces a graph of the amplitudes of its frequency components.
Reference graphs are stored in a database called a “codebook”.
Each sound is matched to the most similar entry in the codebook.
The sound is given a number that describes it, called the “feature
number”.
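The codebook lookup can be sketched as a nearest-match search. The example below is an illustration only: the four-band codebook entries and the input graphs are invented, and the sketch assumes that "most similar" means the smallest Euclidean distance and that an entry's position in the codebook is its feature number. Real systems use much larger codebooks and may measure similarity differently.

```python
import math

# Hypothetical codebook: each entry is a reference amplitude graph
# (only 4 frequency bands here, for readability). The index of an
# entry serves as its feature number.
CODEBOOK = [
    [0.9, 0.1, 0.0, 0.0],  # feature number 0: energy mostly in the lowest band
    [0.1, 0.8, 0.1, 0.0],  # feature number 1
    [0.0, 0.1, 0.8, 0.1],  # feature number 2
    [0.0, 0.0, 0.2, 0.8],  # feature number 3: energy mostly in the highest band
]


def feature_number(graph, codebook=CODEBOOK):
    """Return the index of the codebook entry closest to `graph`."""
    return min(
        range(len(codebook)),
        key=lambda i: math.dist(graph, codebook[i]),
    )


# Three made-up amplitude graphs, one per 1/100th-of-a-second slice.
slices = [
    [0.8, 0.2, 0.1, 0.0],
    [0.1, 0.7, 0.2, 0.0],
    [0.0, 0.1, 0.3, 0.7],
]
print([feature_number(g) for g in slices])  # prints [0, 1, 3]
```

One second of speech would produce 100 such feature numbers, one per slice, which is the stream the rest of the pipeline works with.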
9. Two Categories
Small Vocabulary/many-users:
• Leaves room for speech variation (e.g., accents)
• Limited, preset number of commands that can be used
Large Vocabulary/limited-users:
• Best for business settings
• Train system to work with a small number of users
• Accuracy increases as the system learns its users
10. Discrete vs. Continuous Speech
Discrete
• Easier for program to understand
• Noticeable pause after each word
Continuous
• Allows speaking at conversational speed
• Used in most modern systems
Programs now can recognize accents and pronunciations better. In
earlier programs, accents, pronunciations, speed, and background noise
were all variables that made sounds difficult for programs to understand.
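One way to see why discrete speech is easier for a program: if every word is followed by a noticeable pause, word boundaries can be found just by looking for quiet stretches. The sketch below assumes one loudness value per audio slice; the function name, the threshold, and the minimum pause length are hypothetical choices for illustration, not how any particular product works.

```python
def split_on_pauses(energies, threshold=0.1, min_pause=3):
    """Group slices into words separated by quiet stretches.

    energies: one loudness value per slice (hypothetical units).
    A run of at least `min_pause` slices below `threshold` ends a word.
    Returns a list of (start, end) slice indices, end exclusive.
    """
    words = []
    start = None  # index where the current word began
    quiet = 0     # consecutive quiet slices seen inside the current word
    for i, e in enumerate(energies):
        if e >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                words.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Audio ended mid-word: close it, dropping any trailing quiet slices.
        words.append((start, len(energies) - quiet))
    return words


# Discrete speech: two words with a long pause between them.
discrete = [0, 0, .5, .6, .4, 0, 0, 0, 0, .7, .8, 0, .6, 0, 0]
print(split_on_pauses(discrete))    # prints [(2, 5), (9, 13)]

# Continuous speech: only brief dips, so no boundaries are found.
continuous = [.5, .6, 0, .7, .8, 0, .6]
print(split_on_pauses(continuous))  # prints [(0, 7)]
```

With continuous speech the pauses are too short to serve as boundaries, so the whole utterance comes back as one segment and the program has to find the words some other way.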
11. Using Talk – Text to Voice
This app lets you type text and then have the device read it aloud.
In this case, instead of pronouncing Taryne as “Ta-rin”, the device
pronounced it as “Ta-reen”. This is an example of how speech recognition
programs still need some work, here because of which syllable is
emphasized. The codebook did not have Taryne in it, so the app was unable
to pronounce her name correctly.
12. The Future of Assistive Technology in Schools
Students who need assistance in their writing skills because
they have stronger oral skills.
Students who were absent for a class, have poor memory, or
need assistance hearing the lesson.
Students who need assistance during Guided Reading.
Students who are English Language Learners.
Students with visual/hearing impairments and learning
disabilities regarding reading/spelling/writing.