Speech Recognition
  • - Avg. vocab of a 3-year-old - Beam search is an optimization of best-first search that reduces its memory requirements

Speech Recognition Presentation Transcript

  • Matt Oliverio, Shane Conceicao, and Sean Hanley
  • In 1952, Bell Laboratories designed the “Audrey” system, which recognized digits spoken by a single voice. At the 1962 World’s Fair, IBM demonstrated its “Shoebox” machine, which could understand 16 words spoken in English and solve arithmetic problems on voice command.
  • In the 1970s, the U.S. Department of Defense’s DARPA Speech Understanding Research (SUR) program was responsible for Carnegie Mellon’s Harpy system. Harpy could understand 1,011 words. Harpy was significant because it was the first to use beam search: a predetermined number of the best partial solutions are kept as candidates, and each is scored by an estimate of how close it is to a complete solution.
  • In the 1980s, speech recognition vocabularies jumped dramatically thanks to a new statistical method called the Hidden Markov Model (HMM). Instead of using templates for words and looking for sound patterns, an HMM estimated the probability that unknown sounds were words. This gave speech recognition programs the potential to recognize an unlimited number of words.
  • Introduced in 1987: children could train the doll to respond to their voice. http://www.youtube.com/watch?feature=player_embedded&v=UkU9SbIictc
  • In the 1990s, faster computers made it possible for ordinary people to have speech recognition software. In 1990, Dragon Dictate came out for $9,000. Seven years later, Dragon NaturallySpeaking arrived for $695. It could understand words at a natural speed, but you had to train it for 45 minutes.
  • By 2001, computer speech recognition had topped out at 80 percent accuracy, and progress seemed to stall until the end of the decade. Google’s voice search app for the iPhone and Apple’s Siri brought speech recognition back to the forefront.
  • Interact with the calendar. Search contacts. Read and write messages (text and email). Interact with the Maps app and location services. Utilize search providers. Can understand English (US, UK, Australia), French, German, and Japanese.
  • Mobile app that lets the user speak one language into the phone and produces a verbal translation. Available for iPhone and Android. Languages: Thai, Chinese, French, German, Iraqi, Japanese, Korean, Spanish, Tagalog-English; German-Spanish.
  • Ford SYNC technology: music, directions, hands-free calling (http://www.youtube.com/watch?v=MyIgbcdOliw). Nuance Dragon NaturallySpeaking: Audi, BMW, Fiat, Hyundai, Mercedes, Jaguar, Porsche, Volkswagen; “One-Shot Destination Entry” and full control of the “infotainment system.”
  • Uses the microphone on Kinect. Start by saying “Xbox,” and then say one of the commands on screen. Understands English, French, German, Italian, Spanish, and Japanese. Works best with minimal background noise; clear speech is important.
  • Medical: allow doctors to talk into a patient’s file to record notes during examinations. Court of law: record and digitize court proceedings in real time; reduce time and cost, increase efficiency. Educational: rapid text-to-speech, aiding kids with disabilities (http://tinyurl.com/5wtl8wv).
  • Speeds up “writing.” Improvements in spelling. Beneficial for the handicapped, whether physically or mentally. Ability to multitask: removes the physical limitation of having to use one’s hands.
  • Inaccuracy: slang, homonyms, needs a quiet environment, needs distinct delivery. Requires learning commands. “Speaking to the paper”: professionalism, editing. Expensive.
  • Do you feel that the pros outweigh the cons? Is it worth investing in this software despite its current limitations? Does anyone have Siri? Does it actually help you? Would anyone prefer to use speech-to-text software to write papers?
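
The two technical ideas from the history slides, beam search (Harpy) and the Hidden Markov Model, can be sketched together in a few lines of Python. This is a toy illustration, not Harpy's or any real recognizer's implementation: the two hidden "words" (`yes`, `no`), the two observed "sounds" (`high`, `low`), and every probability below are made-up assumptions. At each step the decoder extends every surviving partial word sequence, scores it by probability, and keeps only the `beam_width` best candidates.

```python
import math

# Toy HMM. All names and numbers here are illustrative assumptions.
STATES = ["yes", "no"]                      # hidden "words"
START = {"yes": 0.5, "no": 0.5}             # P(first word)
TRANS = {                                   # P(next word | current word)
    "yes": {"yes": 0.7, "no": 0.3},
    "no": {"yes": 0.4, "no": 0.6},
}
EMIT = {                                    # P(observed sound | word)
    "yes": {"high": 0.8, "low": 0.2},
    "no": {"high": 0.1, "low": 0.9},
}


def beam_decode(observations, beam_width=2):
    """Return (best_word_sequence, log_probability) found by beam search."""
    first = observations[0]
    # Each hypothesis is (log probability, list of words so far).
    beam = [
        (math.log(START[s]) + math.log(EMIT[s][first]), [s])
        for s in STATES
    ]
    beam = sorted(beam, key=lambda h: h[0], reverse=True)[:beam_width]

    for obs in observations[1:]:
        candidates = []
        for logp, path in beam:
            prev = path[-1]
            for s in STATES:
                score = logp + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
                candidates.append((score, path + [s]))
        # Pruning step: keep only the best partial solutions.
        beam = sorted(candidates, key=lambda h: h[0], reverse=True)[:beam_width]

    best_logp, best_path = beam[0]
    return best_path, best_logp


if __name__ == "__main__":
    words, logp = beam_decode(["high", "high", "low"], beam_width=2)
    print(words, math.exp(logp))
```

A smaller `beam_width` uses less memory and time but can discard the sequence that would have turned out best; a width large enough to keep every candidate is an exhaustive search.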