Speech recognizers & generators

Let’s Get Started…
Presented by: P. Kahoro
Presented to: Prof P. Okanda

Speech Recognizers: What are they?
Speech is the vocalized form of human communication.
In computer science and electrical engineering, speech recognition (SR) is the translation of spoken words, such as dictation, into text. It is also known as "automatic speech recognition" (ASR).
Speech recognition has evolved considerably over the past few years. Initially, it worked in discrete dictation mode, where you had to pause between each spoken word. Today, however, it uses continuous dictation. It has also become smarter, with its own set of grammar rules to work out the meaning of what is being said.

Terms and Concepts
•Utterances
•Pronunciation
•Grammar
•Speaker Dependent System
•Speaker Independent System
•Training
•Accuracy

Terms & Concepts
Utterances:
An utterance is any stream of speech between two periods of silence. Silence delineates the start and end of an utterance. An utterance can be a single word, or it can contain multiple words (a phrase or a sentence).
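The definition above can be illustrated with a minimal sketch that splits a list of amplitudes into utterances wherever a run of silence occurs. The function name `split_utterances`, the threshold value, and the minimum silence length are all assumptions chosen for illustration; real engines use more robust voice activity detection.

```python
def split_utterances(amplitudes, silence_threshold=0.1, min_silence=3):
    """Return (start, end) index pairs for speech separated by silence.

    A sample counts as silent when its absolute amplitude is below
    silence_threshold. An utterance ends once min_silence silent samples
    occur in a row. The end index is exclusive. Both parameters are
    illustrative assumptions, not values from any real engine.
    """
    utterances = []
    start = None      # index where the current utterance began
    silent_run = 0    # consecutive silent samples seen inside an utterance
    for i, amplitude in enumerate(amplitudes):
        if abs(amplitude) >= silence_threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence:
                # The utterance stops at the first sample of the silent run.
                utterances.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # The signal ended mid-utterance; drop any trailing silent samples.
        utterances.append((start, len(amplitudes) - silent_run))
    return utterances


signal = [0.0, 0.0, 0.5, 0.7, 0.6, 0.0, 0.0, 0.0, 0.4, 0.3, 0.0]
print(split_utterances(signal))  # [(2, 5), (8, 10)]
```

Here the three silent samples in the middle separate the signal into two utterances.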
Pronunciations:
One piece of information that the speech recognition engine uses to process a word is its pronunciation, which represents what the speech engine thinks a word should sound like.
Words can have multiple pronunciations associated with them. For example, the word “the” has at least two pronunciations in U.S. English: “thee” and “thuh”.

Cont…
Grammar: Grammars define the domain, or context, within which the recognition engine works. The engine compares the current utterance against the words and phrases in the active grammars. If the user says something that is not in the grammar, the speech engine will not be able to understand it correctly. For this reason, speech engines usually work with very large grammars.
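As a rough sketch of the idea, the snippet below checks an utterance (already converted to text) against a small active grammar. The phrases in `ACTIVE_GRAMMAR` and the function name `recognize` are hypothetical; a real engine matches against the grammar acoustically rather than by exact string comparison.

```python
# Hypothetical active grammar: the only phrases this toy engine accepts.
ACTIVE_GRAMMAR = {"dial office", "dial home", "call back", "hang up"}


def recognize(utterance, grammar=ACTIVE_GRAMMAR):
    """Return the matched phrase, or None if it is outside the grammar."""
    # Normalize case and collapse repeated whitespace before comparing.
    phrase = " ".join(utterance.lower().split())
    return phrase if phrase in grammar else None


print(recognize("Dial  Office"))   # dial office
print(recognize("order a pizza"))  # None
```

The second call returns `None` because the phrase is not part of the active grammar, which mirrors the behaviour described above.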
Accuracy: A recognizer's ability can be assessed by measuring its accuracy, that is, how well it recognizes utterances.
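One common way to put a number on accuracy is the word error rate (WER): the word-level edit distance between what was said and what was recognized, divided by the number of words said. The slides do not name a metric, so treat this as one possible choice; the function name `word_error_rate` is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete every reference word
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert every hypothesis word
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                 # deletion
                dist[i][j - 1] + 1,                 # insertion
                dist[i - 1][j - 1] + substitution,  # match or substitution
            )
    return dist[-1][-1] / len(ref)


said = "how far is it to the motorway junction"
heard = "how far is it to the motor way junction"
print(word_error_rate(said, heard))  # 0.25
```

Two edits (one substitution, one insertion) over eight reference words give a WER of 0.25, or 75% word accuracy.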
Training:
Some speech recognizers have the ability to adapt to a speaker. When the system has this ability, it may allow training to take place.

Cont…
Speaker Dependent Systems: Speech recognition systems that require a user to train the system to his/her voice are known as speaker-dependent systems. Most desktop dictation systems, such as IBM ViaVoice, are speaker dependent.
Speaker Independent Systems:
Speech recognition systems that do not require a user to train the system are known as speaker-independent systems.

How do humans do it?
Articulation produces sound waves, which the ear conveys to the brain for processing.

How might computers do it?
•Digitization
•Acoustic analysis of the speech signal
•Linguistic interpretation
[Diagram labels: acoustic waveform, acoustic signal, speech recognition]

How Does Speech Recognition Work?
•Audio input
•Apply a "grammar" so the speech recognizer knows what phonemes to expect.
•Acoustic Model
•Recognized text

How do computers do it?
•First, the user gives a voice command over the microphone, which is passed to the sound card in the system. This analog signal is sampled and converted into digital form using a technique called Pulse Code Modulation (PCM). The resulting digital waveform is a stream of amplitudes that, when plotted, looks like a wavy line.
•The digital signal is then split into short frames, and each frame is converted into the frequency domain. The incoming stream is now a set of discrete frequency bands, in a form that can be used by the speech recognizer.
•The next stage involves recognizing these bands of frequencies. For this, the speech recognition software has a database containing thousands of reference frequency patterns for the basic sound units of speech, or "phonemes", as they’re called.
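The first two steps above can be sketched in a few lines: a pure tone stands in for the microphone signal, `pcm_encode` samples and quantizes it, and a naive discrete Fourier transform converts one frame into frequency bands. The sample rate, frame size, 8-bit depth, and all function names are assumptions for illustration; real recognizers use optimized FFTs and further feature extraction.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed)
FRAME_SIZE = 64     # samples in one analysis frame (assumed)


def pcm_encode(analog, sample_rate, num_samples, bits=8):
    """Sample analog(t), with values in [-1, 1], and quantize to signed ints."""
    max_level = 2 ** (bits - 1) - 1
    return [round(analog(i / sample_rate) * max_level)
            for i in range(num_samples)]


def dft_magnitudes(frame):
    """Naive DFT: magnitude of each frequency bin from 0 up to N/2."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        magnitudes.append(abs(total))
    return magnitudes


def tone(t):
    """A 1000 Hz sine wave standing in for the microphone signal."""
    return math.sin(2 * math.pi * 1000 * t)


samples = pcm_encode(tone, SAMPLE_RATE, FRAME_SIZE)
magnitudes = dft_magnitudes(samples)
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(peak_hz)  # 1000.0
```

The strongest frequency band lands at 1000 Hz, the frequency of the input tone. A recognizer would compare such band patterns against its stored phoneme patterns.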

Hardware:
Sound Cards
Sound cards with the cleanest A/D (analog-to-digital) conversion are recommended.
Microphone
The best choice of microphone is the headset style.
Computers / Processors
The faster the processor, the better speech recognition works. For good speech recognition, a 1 GHz processor and 1 GB of RAM are recommended.

Where can it be used?
•GPS: System control/navigation e.g. GPS-connected digital maps: “How far is it to the motorway junction?”
•Commercial/industrial applications: in-car steering systems
•Mobile telephony: voice dialing for hands-free use of a mobile phone in the car, e.g. “Dial office”
•Home automation: heating, ventilation and air conditioning

Where can it be used?
•Military: system control/navigation, e.g. high-performance fighter aircraft, helicopters, and training of air traffic controllers
•Computer and Video Games: Speech input has been used in a limited number of computer and video games. The Microsoft Xbox, Nintendo GameCube, and Sony PlayStation 2 consoles all offer games with speech input/output.
•Education: support for students who are blind
•Voice Security System: security locks of gates and doors
•Wearable Computers: The most futuristic application is in the use and functionality of wearable computers.

Speech Recognition Software
•Dragon NaturallySpeaking
•IBM ViaVoice
•Microsoft Speech Recognition System
•MacSpeech Dictate
•Philips SpeechMagic

Pros of Speech Recognition
•Faster than “hand-writing”.
•Allows for better spelling, whether it be in text or documents.
•Helpful for people with a mental or physical disability.
•Hands-free capability.

Cons of Speech Recognition
•No program is 100% perfect
•Factors that affect the accuracy of speech recognition are: slang, homonyms, signal-to-noise ratio, and overlapping speech
•Can be expensive depending on the program
•Easily misinterprets vocal commands, e.g. Siri

Conclusion
•Speech recognition can revolutionize the way people conduct business over the Web and differentiate world-class e-businesses.
•VoiceXML ties speech recognition and telephony together.
•Voice-enabled Web solutions are available today.

Generators:
•Software generators are programs that build other programs. In computer science, a generator is a special routine that can be used to control the iteration behavior of a loop. In fact, all generators are iterators.
•A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.
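The contrast described above can be seen in Python, which supports generators through the `yield` keyword. This is a minimal sketch; the function names `squares_list` and `squares_gen` are made up for illustration.

```python
def squares_list(n):
    """Build the whole list in memory, then return it at once."""
    return [i * i for i in range(n)]


def squares_gen(n):
    """Yield one square at a time; nothing is computed until asked for."""
    for i in range(n):
        yield i * i


gen = squares_gen(5)
print(next(gen))         # 0
print(next(gen))         # 1
print(list(gen))         # [4, 9, 16]
print(iter(gen) is gen)  # True
```

The caller can start using the first values before the rest exist, and the last line shows that a generator object is its own iterator.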

Types of software generators:
•Key generator (key-gen)
•Random password generator
•Code generator
•Natural language generator
•Random test generator
•Pseudorandom number generator

Key generator (key-gen)
•A key generator (key-gen) is a computer program that generates a product licensing key, such as a serial number, necessary to activate a software application for use.
•Key-gens may be legitimately distributed by software manufacturers for licensing software in commercial environments where software has been licensed in bulk for an entire site or enterprise, or they may be distributed illegitimately in circumstances of copyright infringement or software piracy.
•A software license is a legal instrument that governs the usage and distribution of computer software.
•Illegitimate key generators are typically distributed by software crackers. For example, key-gens used to activate pirated copies of Windows, such as Windows 8, are already available.

Random password generator
•A random password generator is a software program or hardware device that takes input from a random or pseudo-random number generator and automatically generates a password. Random passwords can be generated manually, using simple sources of randomness such as dice or coins, or they can be generated using a computer.
•While there are many examples of "random" password generator programs available on the Internet, generating randomness can be tricky, and many programs do not generate random characters in a way that ensures strong security. A common recommendation is to use open source security tools where possible, since they allow independent checks on the quality of the methods used. Note that simply generating a password at random does not ensure the password is a strong password, because it is possible, although highly unlikely, to generate an easily guessed or cracked password. In fact, there is no need at all for a password to have been produced by a perfectly random process: it just needs to be sufficiently difficult to guess.
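A minimal sketch of such a generator in Python is shown below. It draws characters with the standard library's `secrets` module, which is designed for security-sensitive randomness. The alphabet, the default length, and the function name `generate_password` are assumptions for illustration.

```python
import secrets
import string

# Assumed character set: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=16):
    """Return a password of the given length drawn from ALPHABET."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # secrets.choice uses the operating system's secure random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

As the slide notes, random generation alone does not guarantee strength; a longer length and a larger alphabet simply make an easily guessed result less likely.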

Pseudorandom number generators
•A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers.
•Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.
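Reproducibility is easy to see with a linear congruential generator (LCG), one of the simplest PRNG algorithms. The sketch below is only an illustration: the class name `LCG` is made up, the multiplier and increment are the widely quoted constants from Numerical Recipes, and an LCG like this is not suitable for security purposes.

```python
class LCG:
    """Linear congruential generator: state = (A * state + C) mod M."""

    A = 1664525      # multiplier (Numerical Recipes constants)
    C = 1013904223   # increment
    M = 2 ** 32      # modulus

    def __init__(self, seed):
        self.state = seed % self.M

    def next_int(self):
        """Advance the state and return it as the next number."""
        self.state = (self.A * self.state + self.C) % self.M
        return self.state

    def next_float(self):
        """Return the next number scaled into the range [0, 1)."""
        return self.next_int() / self.M


first = LCG(seed=42)
second = LCG(seed=42)
run_a = [first.next_int() for _ in range(5)]
run_b = [second.next_int() for _ in range(5)]
print(run_a == run_b)  # True
```

Because the algorithm is deterministic, two generators started from the same seed produce the same sequence, which is what makes experiments and tests repeatable.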

Code generator
•In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine.
•Sophisticated compilers typically perform multiple passes over various intermediate forms. This multi-stage process is used because many algorithms for code optimization are easier to apply one at a time, or because the input to one optimization relies on the completed processing performed by another optimization.
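To make the idea concrete, here is a toy code generator in Python. It takes an intermediate representation (an expression tree written as nested tuples) and emits instructions for a small stack machine, which `run` then executes. The instruction set, the tree format, and the names `generate` and `run` are all invented for this sketch; real compilers target actual machine code and apply optimization passes in between.

```python
# Map each operator in the tree to a stack-machine opcode (assumed set).
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def generate(node):
    """Convert an expression tree into a list of stack instructions."""
    if isinstance(node, (int, float)):
        return [("PUSH", node)]
    operator, left, right = node
    # Emit code for both operands first, then the operation itself.
    return generate(left) + generate(right) + [(OPCODES[operator], None)]


def run(program):
    """Execute the generated instructions on a simple stack machine."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


tree = ("+", 2, ("*", 3, 4))  # represents 2 + 3 * 4
program = generate(tree)
for instruction in program:
    print(instruction)
print(run(program))  # 14
```

The generated program pushes 2, 3 and 4, multiplies the top two values, then adds, giving 14.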

Men have become the tools of their tools. -P. Kahoro
The End
