ROBOTIC CONTROL THROUGH SPEECH
INTRODUCTION
This voice recognition project consists of two major components: a speech recognition module and a motorized robot.
The speech recognition module is programmable, and its code is written in Visual DSP++ (the programming environment for the ADSP 2181 architecture).
The motorized robot has two DC motors, which drive it in the forward and backward directions.
DEPARTMENT OF ECE
PROJECT DESCRIPTION
Speaker recognition can be divided into two phases:
1. Training phase
2. Testing phase
Training Phase
In the training phase, the frequency components of the given speech signal are extracted.
Each registered speaker has to provide samples of their speech (the given words) so that the system can build, or train, a reference model for that speaker.
Testing Phase
In the testing phase, the input speech is matched against the stored reference model(s).
The recognition decision is made on the basis of Mel Frequency Cepstrum Coefficients (MFCC).
The recognized command is observed through the operation of the stepper motor and the DC motor, and through the control signals sent to the DC motor.
ARCHITECTURE OF ADSP 2181
FEATURES OF ADSP 2181 PROCESSOR
25 ns Instruction Cycle Time from 20 MHz Crystal at 5.0 Volts
Single-Cycle Instruction Execution
Multifunction Instructions
Low Power Dissipation in Idle Mode
16K Words On-Chip Program Memory RAM
16K Words On-Chip Data Memory RAM
Independent ALU, Multiplier/Accumulator, and Barrel Shifter Units
3-Bus Architecture Allows Dual Operand Fetches in Every Instruction Cycle
ALU and MAC
The ALU performs a standard set of arithmetic and logic operations, in addition to division primitives.
The MAC performs single-cycle multiply, multiply/add and multiply/subtract operations.
SHIFTER
The shifter performs logical and arithmetic shifts, normalization, de-normalization, and derive-exponent operations.
The shifter implements numeric format control, including multiword floating-point representations.
SPEECH
The input speech is given in the form of spoken numbers such as 1, 2, 3.
The frequency range of the human voice is taken as 4 kHz, hence the sampling frequency is taken as 8 kHz (twice the highest frequency).
In the code only 2000 samples are considered, because one character takes only 0.25 s (2000 samples at 8 kHz).
REPRESENTATION OF SPEECH SIGNAL
Block Diagram
Input speech via mic -> CODEC -> FRAMING -> WINDOWING -> FFT -> MEL FREQUENCY WRAPPING -> MEL SPECTRUM -> MEL CEPSTRUM -> DC MOTOR
Processor: ADSP 2181
FRAMING
The speech signal is blocked into frames of N samples (N = 256).
Adjacent frames are separated by M samples (M = 100), so consecutive frames overlap by N - M = 156 samples.
Frame 1 = samples 0-255
Frame 2 = samples 100-355
18 such frames are required for the 2000 samples of one character.
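The framing step can be sketched in a few lines of Python. This is only an illustration of the arithmetic above, not the project's Visual DSP++ code; the function name `frame_signal` and the placeholder input are assumptions made for the example.

```python
def frame_signal(samples, frame_len=256, hop=100):
    """Split samples into overlapping frames; only complete frames are kept."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


# Placeholder data standing in for 0.25 s of speech sampled at 8 kHz.
signal = list(range(2000))
frames = frame_signal(signal)

print(len(frames))                   # 18
print(frames[1][0], frames[1][-1])   # 100 355
```

With N = 256 and M = 100 the last complete frame starts at sample 1700, which gives the 18 frames stated above.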
Windowing
Windowing minimizes the signal discontinuity at the edges of each frame, which reduces spectral distortion.
The windowed signal is obtained by
    $y_1(n) = x_1(n)\,w(n), \quad 0 \le n \le N-1$
where $w(n)$ is the Hamming window, given by
    $w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$
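As a rough illustration, the Hamming window and its application to one frame might look like this in Python. The names `hamming` and `apply_window` are hypothetical, and floating-point arithmetic is used here rather than the fixed-point format of the DSP.

```python
import math


def hamming(N):
    """Hamming window w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)) for n = 0..N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def apply_window(frame):
    """Multiply each sample of the frame by the matching window value."""
    w = hamming(len(frame))
    return [x * wn for x, wn in zip(frame, w)]


w = hamming(256)
windowed = apply_window([1.0] * 256)
```

The window tapers to 0.08 at both ends of the frame and stays below 1 in the middle, which is what suppresses the discontinuity at the frame edges.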
Result of Windowing
256 values are output by this process.
These values are given as the input to the FFT.
Some windowed values for a 1 kHz input are shown:
0x0000 0x0826 0x0BE6 0x08B7 0x000F 0xF6C7 0xF26C 0xF5FC 0xFFE8 0x0AA9 0x0FC7
Fast Fourier Transform
The FFT converts the time-domain signal into a frequency-domain signal.
The power spectrum is obtained from the real and imaginary parts of the frequency-domain speech signal.
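The sketch below shows one way to compute a power spectrum from a 256-sample frame, assuming a textbook recursive radix-2 FFT and taking the power as real² + imag² per bin. It is not the routine used on the ADSP 2181; the function names and the 1 kHz test tone are assumptions for illustration.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_spectrum(frame):
    """Power per FFT bin: real part squared plus imaginary part squared."""
    return [c.real ** 2 + c.imag ** 2 for c in fft(frame)]


N, fs = 256, 8000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(N)]
p = power_spectrum(tone)
peak_bin = max(range(N // 2), key=lambda k: p[k])
print(peak_bin, peak_bin * fs / N)   # 32 1000.0
```

At 8 kHz sampling each of the 256 bins is 31.25 Hz wide, so a 1 kHz tone falls exactly on bin 32.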
Wrapping
A subjective pitch for each frequency is computed using the mel scale.
The mel frequency scale is given by
    $\mathrm{mel}(f) = 2595\,\log_{10}\left(1 + \frac{f}{700}\right)$
where $f$ is the frequency in Hz.
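A minimal Python version of the mel mapping is shown below, together with its algebraic inverse. The inverse is not given in the slides; it is included only because it is convenient when placing filters on the mel scale. The function names are hypothetical.

```python
import math


def hz_to_mel(f):
    """mel(f) = 2595 * log10(1 + f / 700), with f in Hz."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel, obtained by solving the formula for f."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


mel_1k = hz_to_mel(1000.0)   # close to 1000 mel
mel_4k = hz_to_mel(4000.0)   # upper edge of the 4 kHz speech band
```

The scale is roughly linear below 1 kHz and logarithmic above it, so 1000 Hz maps to about 1000 mel while 4000 Hz maps to far less than 4000 mel.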
Mel Frequency Coefficients
MFCC
MFCC stands for Mel Frequency Cepstrum Coefficients.
It consists of various frequency coefficient components.
It contains:
    Mel Spectrum (frequency domain)
    Mel Cepstrum (time domain)
SPECTRUM
The samples are convolved with the mel filter bank to obtain the mel frequency spectrum.
The mel frequency spectrum is given by
    $s(n) = y(n) * f(n)$
where
    s(n) = mel frequency spectrum
    y(n) = samples
    f(n) = filter coefficients
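One common reading of this step is that each mel spectrum value is a weighted sum of power-spectrum bins, with triangular weights spaced evenly on the mel scale. The sketch below follows that reading. The number of filters (20), the triangular shape, and all function names are assumptions, since the slides do not specify them.

```python
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(num_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(fs / 2)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        weights = []
        for k in range(n_fft // 2 + 1):
            f = k * fs / n_fft          # centre frequency of FFT bin k in Hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))   # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank


def mel_spectrum(power, bank):
    """Weighted sum of the power spectrum under each filter."""
    return [sum(w * p for w, p in zip(weights, power)) for weights in bank]


bank = mel_filterbank(num_filters=20, n_fft=256, fs=8000)
flat = [1.0] * 129                      # placeholder power spectrum (bins 0..128)
s = mel_spectrum(flat, bank)
```

Only the first 129 bins of the 256-point power spectrum are used, because the upper half mirrors the lower half for a real input signal.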
Inverse Discrete Cosine Transformation
The mel frequency power spectrum is a frequency-domain function.
To obtain a time-domain function, the signal undergoes the IDCT.
The mel frequency spectrum is thereby converted into the mel frequency cepstrum.
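CEPSTRUM
The log mel spectrum coefficients are real numbers and are converted to the time domain using the IDCT.
The time-domain coefficients are called mel frequency cepstrum coefficients (MFCC).
The MFCC are given by
    $c(n) = \sum_{k=1}^{K} \log(S_k)\,\cos\left(\frac{n\,(k - 0.5)\,\pi}{K}\right)$
where $S_k$ is the k-th mel spectrum coefficient and $K$ is the number of mel spectrum coefficients.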
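The cosine sum for the cepstrum coefficients can be written directly in Python. The sketch assumes the natural logarithm, K = 20 mel spectrum values, and 12 output coefficients; none of these choices is stated in the slides, and the function name is hypothetical.

```python
import math


def mfcc_from_mel_spectrum(mel_spec, num_coeffs):
    """c(n) = sum over k=1..K of log(S_k) * cos(n * (k - 0.5) * pi / K)."""
    K = len(mel_spec)
    log_s = [math.log(s) for s in mel_spec]   # every S_k must be positive
    return [
        sum(log_s[k - 1] * math.cos(n * (k - 0.5) * math.pi / K)
            for k in range(1, K + 1))
        for n in range(1, num_coeffs + 1)
    ]


# A flat mel spectrum has no spectral shape, so its coefficients are ~0.
flat_spec = [math.e] * 20
coeffs = mfcc_from_mel_spectrum(flat_spec, num_coeffs=12)
```

Because the cosine terms sum to zero for every n from 1 to K-1, a constant spectrum yields coefficients that vanish up to rounding error; only the shape of the spectrum contributes.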
LEAST MEAN SQUARE ALGORITHM (LMS)
This algorithm is used to find the minimum deviation between sets of values.
During the testing phase, the input speech is compared with the 4 stored values.
The stored value with the least deviation is selected and sent.
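The matching step described here can be sketched as a nearest-template search using the mean squared deviation. Note that this is a simple template comparison, not the adaptive LMS filter that the same name often refers to. The feature vectors below are made-up numbers, and the function names and the labels "1" to "4" are assumptions for illustration.

```python
def mean_square_deviation(a, b):
    """Mean of the squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def recognize(features, references):
    """Return the label of the stored reference with the least deviation."""
    return min(references,
               key=lambda name: mean_square_deviation(features, references[name]))


# Four stored reference vectors (made-up values, one per trained command).
references = {
    "1": [1.0, 0.5, -0.2],
    "2": [0.1, 0.9, 0.4],
    "3": [-0.7, 0.2, 0.8],
    "4": [0.3, -0.6, 0.1],
}

print(recognize([0.2, 0.8, 0.5], references))   # 2
```

In the real system the vectors would be the MFCC of the training and test utterances, and the winning label would select the motor command.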
INTERFACING PC WITH KIT
PC <-> RS-232 SERIAL CABLE <-> DSP PROCESSOR
DSP TO DC MOTOR
CIRCUIT DIAGRAM
HARDWARE DETAILS
The latched output from the latch IC is given to the relays via a resistor and a transistor.
According to the predefined input, the relay coil is energized and the relay is switched to the ON position.
An SPDT relay is used here.
Switching the relay ON causes current to flow in the DC motor.
Details of DC motor
Speed of the motor: 300 rpm
Current: 750 mA
Voltage: 7.5 V
