Speech recognition

The talk I gave at Brighton iPhone Dev group on November 24th on Speech Recognition on iOS devices and my new startup, Tiny Ears.

  • Background Noise - a possible solution is noise-rejection microphones. These are getting better but still aren’t fantastic.
    Detecting gaps - needs loads of training data to train a statistical model on expected speech patterns.
    Hypotheses - lots of CPU is required to whittle them down to the most likely one.
    Accents - more training data to cover accents, and more CPU to match against language/grammar models.
    Other Languages - need a new model for every language.
  • Error detection - car/care, ph vs f, and silent letters (e.g. hour)
  • 1) Should we ignore or accept sound input as speech?
    3) Visually or through ‘noise’ word detection
  • Speech recognition

    1. Tiny Ears: Using Speech Recognition To Teach Kids To Read
       Emily Toop, Radical Robot
       Brighton iPhone Creators, November 2011
    2. What is Speech Recognition?
       • Converting spoken words to text
       • Not targeted to a single speaker (voice recognition)
       • Utterances converted into phonemes that are compared against language model & grammar to generate a hypothesis
       • Recognition score to give confidence in hypothesis
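The phoneme-matching idea on the slide above can be illustrated with a toy sketch. This is not how PocketSphinx or OpenEars work internally (real recognisers use statistical acoustic and language models); the dictionary, the `best_hypothesis` function and the similarity measure are all assumptions chosen for illustration. The sketch compares a heard phoneme sequence against a tiny pronunciation dictionary and returns the closest word with a score standing in for the recognition score.

```python
from difflib import SequenceMatcher

# Hypothetical pronunciation dictionary: word -> phoneme sequence (ARPAbet-style).
PRONUNCIATIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
    "three": ["TH", "R", "IY"],
    "four": ["F", "AO", "R"],
}


def best_hypothesis(heard):
    """Return (word, score) for the dictionary word closest to the heard phonemes."""
    scored = [
        (word, SequenceMatcher(None, heard, phonemes).ratio())
        for word, phonemes in PRONUNCIATIONS.items()
    ]
    # The highest similarity wins; the score plays the role of a confidence value.
    return max(scored, key=lambda pair: pair[1])


print(best_hypothesis(["T", "UW"]))   # exact match for "two"
print(best_hypothesis(["F", "AO"]))   # final phoneme missing, still closest to "four"
```

A small vocabulary keeps the number of competing hypotheses low, which is one reason a limited language model suits on-device recognition.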
    3. Why is Speech Recognition Hard?
       The human brain is incredibly specialised - speech recognition & vision have taken millions of years to perfect. Hard to make a computer do the same thing.
       • Background Noise
       • Detecting gaps
       • Too many hypotheses generated
       • Accents
       • Other Languages
       • Dictionary words vs unknown words (e.g. names)
    4. How Does Siri Work?
       • Protocol Cracked - https://github.com/plamoni/SiriProxy
       • Server Based because of CPU & live data updates - doesn’t work offline
       • Limited vocabulary with well designed grammar
    5. Device Based Recognition
       • Works offline
       • Immediate response for real time processing
       • No need for expensive data plans for your app to work
    6. Device Based Recognition
       • OpenEars - http://www.politepix.com/openears
       • Pocket Sphinx / Sphinx CMU - http://cmusphinx.sourceforge.net/2010/03/pocketsphinx-0-6-release/
       • Limited Language Model
       • Limited Grammar
    7. Demo: Number Recogniser
    8. Number Recogniser
       • Import OpenEars .xcodeproj into project
       • Add OpenEars as target dependency
       • Link libOpenEarsLibrary.a binary
       • Add OpenEars, SphinxBase & PocketSphinx to Header Search Path
    9. Number Recogniser
       • Create and start audioSessionManager in the app delegate's didFinishLaunchingWithOptions
    10. Number Recogniser
       • Rename .m file that runs PocketSphinxController to .mm
       • Add OpenEarsEventObserverDelegate
    11. Number Recogniser
    12. Number Recogniser
       • -(void)pocketsphinxRecognitionLoopDidStart{}
       • -(void)fliteDidFinishSpeaking{} (if using flite for text to speech)
    13. Improving Recognition with Face Detection
       • Determine when user is speaking directly to app and not to another person to enhance accuracy
       • Stop listening when face not detected.
       • Detect when app has been abandoned & shut down audio manager etc.
       • Start listening when face is detected again
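The face-gating rules on the slide above amount to a small state machine. The sketch below is a language-neutral illustration in Python rather than the demo's actual code: the class name, state names and the 30-second abandonment timeout are assumptions, and the face-detection result is passed in as a plain boolean instead of coming from Core Image.

```python
class FaceGatedListener:
    """Tracks whether the recogniser should be listening, based on face detection."""

    def __init__(self, abandon_after=30.0):
        # Seconds without a face before the app counts as abandoned (assumed value).
        self.abandon_after = abandon_after
        self.state = "listening"
        self.last_seen = 0.0

    def update(self, face_detected, now):
        """Feed one detection result with its timestamp in seconds; return the new state."""
        if face_detected:
            self.last_seen = now
            self.state = "listening"   # start (or resume) listening
        elif now - self.last_seen >= self.abandon_after:
            self.state = "shut_down"   # app looks abandoned: release audio resources
        else:
            self.state = "paused"      # user looked away: stop listening for now
        return self.state


listener = FaceGatedListener(abandon_after=30.0)
for face, t in [(True, 0.0), (False, 5.0), (False, 40.0), (True, 41.0)]:
    print(t, listener.update(face, t))
```

In an app, each state change would be the point at which the recogniser and audio session are suspended, torn down or restarted.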
    14. Demo
       • Decorator
       • Using Core Image for face detection: WWDC Session Videos numbers 419 & 422
    15. Kitten Break
    16. Kitten Break
    17. Tiny Ears
       • iPad Storybook using Speech Recognition to listen to children as they read aloud
       • Detect when child stumbles or does not recognise a word & intervene with assistance to teach child to read word
       • Track reading progress over time to provide targeted feedback.
    18. Problems - Educational
       • Large Age Range - different kids have different reading abilities and therefore require different levels of feedback/intervention
       • Presenting learning in a fun way so nothing is so difficult child will give up rather than learn
    19. Problems - Speech Recognition
       • 4 year olds speak very differently from adults
       • How do we detect errors? - unknown words & mispronunciations
       • ‘Noise’ words, detecting coughs, laughs or sounds indicating distress or difficulty
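One possible way to approach the error-detection question above is to align the recogniser's output with the text the child is expected to read and flag where they diverge. The sketch below is an assumption about how this could be done, not Tiny Ears' method: `reading_errors` is a hypothetical helper, and it presumes the recogniser returns a plain string of words, which glosses over unknown words and noise.

```python
from difflib import SequenceMatcher


def reading_errors(expected_text, recognised_text):
    """Return (expected_words, heard_words) pairs for spans where the reading diverged."""
    expected = expected_text.lower().split()
    heard = recognised_text.lower().split()
    matcher = SequenceMatcher(None, expected, heard, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Covers substitutions, skipped words and extra words alike.
            errors.append((expected[i1:i2], heard[j1:j2]))
    return errors


print(reading_errors("the cat sat on the mat", "the cat sit on mat"))
```

Each flagged pair is a candidate point for intervention, for example prompting the child with the expected word.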
    20. Problems - Speech Recognition
       • Is the child present?
       • Is there more than one person present?
         • Whose speech should we process?
         • Can we even tell?
       • Can we detect if the child is in distress or struggling?
       • Can we detect reading ability through Speech Recognition?
    21. Startup Chile
       • Startup Accelerator run by Chilean government
       • US$40k for 6 months, no equity
       • Starting January 16th
       • Looking for collaborators from education, business, artificial intelligence - email me
    22. Questions?
       • http://emilytoop.com
       • @fluffyemily
       • emily@radicalrobot.co.uk
       • http://radicalrobot.co.uk
