Speech Recognition


Published in: Technology, Education
  • Average vocabulary of a 3-year-old
  • Beam search is an optimization of best-first search that reduces its memory requirements

    1. Matt Oliverio, Shane Conceicao, and Sean Hanley
    2. In 1952, Bell Laboratories designed the “Audrey” system, which recognized digits spoken by a single voice. At the 1962 World’s Fair, IBM demonstrated its “Shoebox” machine, which could understand 16 words spoken in English and solve arithmetic problems on voice command.
    3. The U.S. Department of Defense’s DARPA Speech Understanding Research (SUR) program in the 1970s was responsible for Carnegie Mellon’s Harpy system. Harpy could understand 1,011 words. Harpy was significant because it was the first to use beam search technology: a predetermined number of the best partial solutions are kept as candidates, and the search estimates how close each candidate is to a complete solution.
    4. In the 1980s, speech recognition vocabularies jumped dramatically thanks to a new statistical method called the Hidden Markov Model (HMM). Instead of using templates for words and looking for sound patterns, HMMs estimated the probability that unknown sounds were words. This gave speech recognition programs the potential to recognize an unlimited number of words.
    5. Introduced in 1987; children could train the doll to respond to their voice. http://www.youtube.com/watch?feature=player_embedded&v=UkU9SbIictc
    6. In the 1990s, faster computers made it possible for ordinary people to have speech recognition software. In 1990, Dragon Dictate came out for $9,000. Seven years later, Dragon NaturallySpeaking arrived for $695. It could understand words at a natural speed, but you had to train it for 45 minutes.
    7. By 2001, computer speech recognition had topped out at 80 percent accuracy, and progress seemed to stall until the end of the decade. Google’s voice search app for the iPhone and Apple’s Siri brought speech recognition back to the forefront.
    8. Interact with the calendar. Search contacts. Read and write messages (text and email). Interact with the Maps app and location services. Use search providers. Can understand English (US, UK, Australia), French, German, and Japanese.
    9. Mobile app that lets the user speak one language into the phone and produces a spoken translation. Runs on iPhone and Android. Languages: Thai, Chinese, French, German, Iraqi, Japanese, Korean, Spanish, Tagalog-English; German-Spanish.
    10. Ford SYNC technology: music, directions, hands-free calling (http://www.youtube.com/watch?v=MyIgbcdOliw). Nuance Dragon NaturallySpeaking: Audi, BMW, Fiat, Hyundai, Mercedes, Jaguar, Porsche, Volkswagen; “One-Shot Destination Entry” and full control of the “infotainment system.”
    11. Uses the microphone on the Kinect. Start by saying “Xbox,” then say one of the commands shown on screen. Understands English, French, German, Italian, Spanish, and Japanese. Works best with minimal background noise; clear speech is important.
    12. Medical: allows doctors to dictate notes into a patient’s file during examinations. Court of law: record and digitize court proceedings in real time; reduce time and cost, increase efficiency. Educational: rapid text-to-speech, aiding kids with disabilities (http://tinyurl.com/5wtl8wv).
    13. Speeds up “writing.” Improvements in spelling. Beneficial for people with physical or mental disabilities. Ability to multitask: frees the user from the physical limitation of using one’s hands.
    14. Inaccuracy: slang, homonyms, needs a quiet environment, needs distinct delivery. Requires learning commands. “Speaking to the paper”: professionalism, editing. Expensive.
    15. Do you feel that the pros outweigh the cons? Is it worth investing in this software despite current limitations? Does anyone have Siri? Does it actually help you? Would anyone prefer to use speech-to-text software to write papers?
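
The beam search idea from slide 3 (keep only a fixed number of the best partial solutions at each step) can be sketched in a few lines of Python. This is a toy illustration, not Harpy's actual algorithm: the word candidates, the `ACOUSTIC` and `BIGRAM` tables, and all probabilities are invented for the example, and hypotheses are ranked by cumulative log-probability rather than by any heuristic Harpy may have used.

```python
import math
from typing import Dict, List, Tuple

# A hypothesis is (words so far, cumulative log-probability).
Hypothesis = Tuple[List[str], float]

# Hypothetical toy scores, invented for illustration only.
# ACOUSTIC[t][w]: probability that the sound at step t is word w.
ACOUSTIC: List[Dict[str, float]] = [
    {"wreck": 0.6, "recognize": 0.4},
    {"a": 0.5, "speech": 0.5},
]
# BIGRAM[(prev, w)]: probability that word w follows word prev.
BIGRAM: Dict[Tuple[str, str], float] = {
    ("wreck", "a"): 0.5,
    ("wreck", "speech"): 0.1,
    ("recognize", "a"): 0.1,
    ("recognize", "speech"): 0.9,
}


def beam_search(beam_width: int) -> List[Hypothesis]:
    """Return the surviving hypotheses, best first."""
    beams: List[Hypothesis] = [([], 0.0)]
    for step in ACOUSTIC:
        candidates: List[Hypothesis] = []
        for words, score in beams:
            for word, p_acoustic in step.items():
                # The first word has no predecessor, so no bigram factor.
                p_follow = BIGRAM.get((words[-1], word), 1.0) if words else 1.0
                new_score = score + math.log(p_acoustic) + math.log(p_follow)
                candidates.append((words + [word], new_score))
        # Pruning step: keep only the beam_width best partial solutions.
        candidates.sort(key=lambda h: h[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


if __name__ == "__main__":
    for width in (1, 2):
        words, score = beam_search(width)[0]
        print(width, " ".join(words), round(math.exp(score), 4))
```

With a beam width of 1 the search commits to "wreck" after the first step and ends on "wreck a"; with a width of 2 it keeps "recognize" alive long enough to find the higher-scoring "recognize speech". A wider beam costs more memory and time, which is the trade-off the slide note about memory requirements refers to.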
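
Slide 4's point about Hidden Markov Models, scoring how probable it is that a run of sounds is a given word, can be illustrated with the forward algorithm. The sketch below is an assumption-laden toy: the two word models (`YES` and `NO`), their states, the sound symbols, and every probability are made up for the example, and real recognizers work on acoustic feature vectors rather than letter-like symbols.

```python
from typing import Dict, List

Probs = Dict[str, float]


def forward_likelihood(obs: List[str], start: Probs,
                       trans: Dict[str, Probs], emit: Dict[str, Probs]) -> float:
    """P(observed sounds | word model), computed with the forward algorithm."""
    states = list(start)
    # alpha[s]: probability of the sounds so far, ending in state s.
    alpha = {s: start[s] * emit[s].get(obs[0], 0.0) for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit[s].get(symbol, 0.0)
            * sum(alpha[p] * trans[p].get(s, 0.0) for p in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical left-to-right word models; all numbers are invented.
YES = {
    "start": {"y": 1.0, "eh": 0.0, "s": 0.0},
    "trans": {"y": {"y": 0.3, "eh": 0.7},
              "eh": {"eh": 0.4, "s": 0.6},
              "s": {"s": 1.0}},
    "emit": {"y": {"Y": 0.8, "E": 0.2},
             "eh": {"E": 0.7, "O": 0.2, "Y": 0.1},
             "s": {"S": 0.9, "E": 0.1}},
}
NO = {
    "start": {"n": 1.0, "oh": 0.0},
    "trans": {"n": {"n": 0.4, "oh": 0.6}, "oh": {"oh": 1.0}},
    "emit": {"n": {"N": 0.8, "Y": 0.1, "E": 0.1},
             "oh": {"O": 0.8, "E": 0.1, "S": 0.1}},
}
MODELS = {"yes": YES, "no": NO}


def recognize(obs: List[str]) -> str:
    """Pick the word whose model gives the observed sounds the highest likelihood."""
    return max(MODELS, key=lambda word: forward_likelihood(obs, **MODELS[word]))


if __name__ == "__main__":
    print(recognize(["Y", "E", "S"]))
    print(recognize(["N", "O"]))
```

Because each word is a small probabilistic model rather than a stored template, adding a word means adding another entry to `MODELS`, which is one way to read the slide's claim about an unlimited vocabulary. This sketch treats every word as equally likely beforehand; a fuller system would also weight each word by how common it is.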