Speech recognition

Tiny Ears
Using Speech Recognition To Teach Kids To Read
Emily Toop
Radical Robot
Brighton iPhone Creators
November 2011

What is Speech Recognition?
• Converting spoken words to text
• Not targeted at a single speaker (that is voice recognition)
• Utterances are converted into phonemes, which are compared against a language model & grammar to generate a hypothesis
• A recognition score gives confidence in the hypothesis
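
As a rough illustration of the last two bullets, the sketch below picks the best-scoring hypothesis and accepts it only above a confidence threshold. The `Hypothesis` type, the 0-to-1 score range and the 0.6 threshold are assumptions made for illustration; they are not taken from OpenEars or PocketSphinx.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hypothesis:
    text: str     # candidate transcription of the utterance
    score: float  # hypothetical confidence in [0, 1]; higher is better


def best_hypothesis(candidates: List[Hypothesis],
                    threshold: float = 0.6) -> Optional[Hypothesis]:
    """Return the highest-scoring candidate, or None if nothing is confident enough."""
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h.score)
    # Reject the winner too if its score falls below the (assumed) threshold.
    return best if best.score >= threshold else None
```

Returning `None` rather than a low-confidence guess lets the caller decide whether to ask the speaker to repeat the word.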

Why is Speech Recognition Hard?

The human brain is incredibly specialised: speech recognition & vision have taken millions of years to perfect. It is hard to make a computer do the same thing.

• Background Noise
• Detecting gaps
• Too many hypotheses generated
• Accents
• Other Languages
• Dictionary words vs unknown words (e.g. names)

How Does Siri Work?

• Protocol Cracked - https://github.com/plamoni/SiriProxy
• Server Based because of CPU & live data updates - doesn’t work offline
• Limited vocabulary with well designed grammar

Device Based Recognition

• Works offline
• Immediate response for real time processing
• No need for expensive data plans for your app to work

Device Based Recognition

• OpenEars - http://www.politepix.com/openears
• PocketSphinx / CMU Sphinx - http://cmusphinx.sourceforge.net/2010/03/pocketsphinx-0-6-release/
• Limited Language Model
• Limited Grammar

Number Recogniser

• Import OpenEars .xcodeproj into project
• Add OpenEars as target dependency
• Link libOpenEarsLibrary.a binary
• Add OpenEars, SphinxBase & PocketSphinx to Header Search Path

Number Recogniser
• Create and start audioSessionManager in the app delegate's didFinishLaunchingWithOptions

Number Recogniser

• Rename .m file that runs PocketSphinxController to .mm
• Add OpenEarsEventObserverDelegate
Number Recogniser

• -(void)pocketsphinxRecognitionLoopDidStart{}
• -(void)fliteDidFinishSpeaking{} (if using flite for text to speech)
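
Once a hypothesis arrives, a number recogniser still has to turn the recognised words into digits. The sketch below shows one way to do that, assuming the hypothesis is a space-separated string of number words; that format, the `DIGITS` table and the function name are assumptions for illustration, not part of the OpenEars API.

```python
from typing import List

# Hypothetical vocabulary for a digit recogniser; "OH" is treated as zero.
DIGITS = {
    "ZERO": 0, "OH": 0, "ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4,
    "FIVE": 5, "SIX": 6, "SEVEN": 7, "EIGHT": 8, "NINE": 9,
}


def digits_from_hypothesis(hypothesis: str) -> List[int]:
    """Map each recognised number word to its digit, skipping anything else."""
    words = hypothesis.upper().split()
    return [DIGITS[word] for word in words if word in DIGITS]
```

Words outside the table are dropped silently here; a real app might instead treat them as a sign that recognition failed.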

Improving Recognition with Face Detection

• Determine when user is speaking directly to app and not to another person to enhance accuracy
• Stop listening when face not detected.
• Detect when app has been abandoned & shut down audio manager etc.
• Start listening when face is detected again
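
The behaviour in these bullets can be sketched as a small state machine driven by face-detection results. The state names, the `ListeningGate` class and the 30-second abandonment timeout are assumptions for illustration; the face detector itself (Core Image in the talk) is outside the sketch and is represented only by a boolean.

```python
class ListeningGate:
    """Decide whether the recogniser should listen, pause, or shut down."""

    def __init__(self, abandon_after: float = 30.0, start: float = 0.0):
        self.abandon_after = abandon_after  # seconds without a face (assumed value)
        self.last_face_time = start         # time a face was last seen
        self.state = "paused"

    def update(self, face_detected: bool, now: float) -> str:
        if face_detected:
            # A face is present: (re)start listening.
            self.last_face_time = now
            self.state = "listening"
        elif now - self.last_face_time >= self.abandon_after:
            # No face for too long: treat the app as abandoned.
            self.state = "shut_down"
        else:
            # Face briefly lost: stop listening but keep the audio session.
            self.state = "paused"
        return self.state
```

Calling `update` once per processed camera frame keeps the recogniser in step with whether someone is actually facing the device.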

Demo

• Decorator
• Using Core Image for face detection: WWDC Session Videos numbers 419 & 422

Tiny Ears
• iPad Storybook using Speech Recognition to listen to children as they read aloud
• Detect when child stumbles or does not recognise a word & intervene with assistance to teach child to read word
• Track reading progress over time to provide targeted feedback.
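
One way to spot a stumble is to align what the recogniser heard against the text on the page and report the expected words that were skipped or replaced. The sketch below does this with Python's standard `difflib`; it is an illustrative assumption about how such a check could work, not the method used in Tiny Ears.

```python
import difflib
from typing import List


def missed_words(expected: str, heard: str) -> List[str]:
    """Return expected words that were skipped or replaced in what was heard."""
    expected_words = expected.lower().split()
    heard_words = heard.lower().split()
    matcher = difflib.SequenceMatcher(a=expected_words, b=heard_words,
                                      autojunk=False)
    missed: List[str] = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace" = a different word was said; "delete" = the word was skipped.
        if tag in ("replace", "delete"):
            missed.extend(expected_words[i1:i2])
    return missed
```

The returned words are candidates for intervention, and counting them per session would give a crude measure of progress over time.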

Problems - Educational

• Large Age Range - different kids have different reading abilities and therefore require different levels of feedback/intervention
• Presenting learning in a fun way so nothing is so difficult child will give up rather than learn

Problems - Speech Recognition

• 4 year olds speak very differently from adults
• How do we detect errors? - unknown words & mispronunciations
• ‘noise’ words, detecting coughs, laughs or sounds indicating distress or difficulty
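
To make the distinction between known words, unknown words and 'noise' concrete, the sketch below sorts recognised tokens into those three groups. The bracketed noise markers and the idea that the recogniser emits them as tokens are hypothetical; they stand in for whatever a real recogniser reports for coughs and laughs.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical markers for non-speech sounds (not a real recogniser's output).
NOISE_TOKENS: Set[str] = {"[cough]", "[laugh]", "[sigh]"}


def classify_tokens(tokens: Iterable[str],
                    story_vocabulary: Set[str],
                    noise_tokens: Set[str] = NOISE_TOKENS) -> Dict[str, List[str]]:
    """Group tokens into noise, words from the story, and unknown words."""
    groups: Dict[str, List[str]] = {"noise": [], "known": [], "unknown": []}
    for token in tokens:
        key = token.lower()
        if key in noise_tokens:
            groups["noise"].append(token)
        elif key in story_vocabulary:
            groups["known"].append(token)
        else:
            groups["unknown"].append(token)
    return groups
```

A run of noise or unknown tokens in the middle of a sentence could then be treated as a hint that the child is struggling.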

Problems - Speech Recognition

• Is the child present?
• Is there more than one person present?
• Whose speech should we process?
• Can we even tell?
• Can we detect if the child is in distress or struggling?
• Can we detect reading ability through Speech Recognition?

Startup Chile

• Startup Accelerator run by Chilean government
• US$40k for 6 months, no equity
• Starting January 16th
• Looking for collaborators from education, business, artificial intelligence - email me

Questions?

• http://emilytoop.com
• @fluffyemily
• emily@radicalrobot.co.uk
• http://radicalrobot.co.uk
