This is my University Project.

- 1. ROBOTIC CONTROL THROUGH SPEECH<br />
- 2. INTRODUCTION<br />This voice recognition project consists of two major components: a speech recognition module and a motorized robot.<br />The programmable module is programmed in Visual DSP++ (the environment for programming applications on the ADSP 2181 architecture).<br />The motorized robot has two DC motors, which drive it in the forward and backward directions.<br />DEPARTMENT OF ECE<br />
- 3. PROJECT DESCRIPTION<br />Speaker recognition is divided into two phases:<br /> 1. Training phase<br /> 2. Testing phase<br />
- 4. Training Phase<br />In the training phase, the frequency components of the given speech signal are extracted.<br />Each registered speaker has to provide samples of their speech (the given words) so that the system can build, or train, a reference model for that speaker.<br />
- 5. Testing phase<br /><ul><li>In the testing phase, the input speech is matched against the stored reference model(s).
- 6. The recognition decision is made on the basis of Mel Frequency Cepstrum Coefficients (MFCC).
- 7. Command recognition is observed through the operation of the stepper motor and DC motor, and through the control signals sent to the DC motor.</li></ul>
- 8. ARCHITECTURE OF ADSP 2181<br />
- 9. FEATURES OF ADSP 2181 PROCESSOR<br />25 ns Instruction Cycle Time from 20 MHz Crystal at 5.0 Volts<br />Single-Cycle Instruction Execution<br />Multifunction Instructions<br />Low Power Dissipation in Idle Mode<br />16K Words On-Chip Program Memory RAM<br />16K Words On-Chip Data Memory RAM<br />Independent ALU, Multiplier/Accumulator, and Barrel Shifter Units<br />3-Bus Architecture Allows Dual Operand Fetches in every Instruction Cycle<br />
- 10. ALU and MAC<br />The ALU performs a standard set of arithmetic and logic operations in addition to division primitives.<br />The MAC performs single-cycle multiply, multiply/add and multiply/subtract operations.<br />
- 11. SHIFTER<br />The shifter performs logical and arithmetic shifts, normalization, de-normalization, and derive exponent operations.<br />The shifter implements numeric format control including multiword floating-point representations.<br />
- 12. SPEECH<br />The input speech is given in the form of spoken numbers such as 1, 2, 3.<br />The frequency range of the human voice is taken as 4 kHz, so the sampling frequency is 8 kHz (twice the highest frequency).<br />In the code, only 2000 samples are considered per character, because one character takes only 0.25 s (2000 samples at 8000 samples per second).<br />
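
The sampling figures on this slide can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the constant and function names are illustrative and are not taken from the project code.

```python
# Sampling arithmetic for one spoken character (values from the slide).
MAX_SPEECH_HZ = 4000      # highest speech frequency considered
SAMPLES_PER_CHAR = 2000   # samples kept per spoken character


def nyquist_rate(max_freq_hz):
    """Minimum sampling rate that avoids aliasing for a band-limited signal."""
    return 2 * max_freq_hz


sample_rate_hz = nyquist_rate(MAX_SPEECH_HZ)
duration_s = SAMPLES_PER_CHAR / sample_rate_hz
print(sample_rate_hz, duration_s)  # prints: 8000 0.25
```
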
- 13. REPRESENTATION OF SPEECH SIGNAL<br />
- 14. Block Diagram<br />Input speech via mic → CODEC → Framing → Windowing → FFT → Mel frequency wrapping → Mel spectrum → Mel cepstrum → DC motor (processing on the ADSP 2181)<br />
- 15. FRAMING<br />The speech signal is blocked into frames of N samples (N = 256).<br />Adjacent frames are separated by M samples (M = 100).<br />Frame 1 = samples 0-255<br />Frame 2 = samples 100-355<br />18 such frames are required for the 2000 samples of one character.<br />
- 16. FRAMING<br />
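
The framing step can be sketched as follows, assuming that a trailing partial frame is simply dropped (the slides do not say how the tail is handled). The function name `split_into_frames` is hypothetical. With N = 256 and M = 100, the last complete frame starts at sample 1700, which gives the 18 frames quoted on the slide.

```python
def split_into_frames(samples, frame_len=256, frame_shift=100):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += frame_shift
    return frames


# Stand-in for 2000 speech samples: each value is its own sample index.
signal = list(range(2000))
frames = split_into_frames(signal)
print(len(frames))                  # prints: 18
print(frames[1][0], frames[1][-1])  # prints: 100 355
```
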
- 17. Windowing<br />Windowing minimizes the signal discontinuities at the edges of each frame.<br />This reduces spectral distortion.<br />The windowed signal is obtained by<br /> $y_1(n) = x_1(n)\,w(n), \quad 0 \le n \le N-1$<br />where $w(n)$ is the Hamming window, given by<br /> $w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$<br />
- 18. Windowing<br />
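
A minimal floating-point sketch of the Hamming window and its application to one frame is given here. The project itself runs on the fixed-point ADSP 2181, so this Python version only illustrates the formula; the function names are hypothetical.

```python
import math


def hamming_window(n_points):
    """w(n) = 0.54 - 0.46 * cos(2*pi*n / (N - 1)) for n = 0 .. N-1."""
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
        for n in range(n_points)
    ]


def apply_window(frame, window):
    """Multiply a frame by the window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


N = 256
window = hamming_window(N)
frame = [1.0] * N                 # constant test frame, so the output equals the window
windowed = apply_window(frame, window)
print(round(windowed[0], 2), round(windowed[-1], 2))  # both edges are tapered to 0.08
```

The taper toward the frame edges is what suppresses the discontinuity that would otherwise appear when a frame is cut out of the running signal.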
- 19. Result of Windowing<br />This process outputs 256 values per frame, which are given as the input to the FFT.<br />Some windowed values for a 1 kHz input are shown:<br />0x0000<br />0x0826<br />0x0BE6<br />0x08B7<br />0x000F<br />0xF6C7<br />0xF26C<br />0xF5FC<br />0xFFE8<br />0x0AA9<br />0x0FC7<br />
- 20. Fast Fourier Transform<br />The FFT converts the time domain signal into a frequency domain signal.<br />The power spectrum is obtained from the real and imaginary parts of the frequency domain speech signal.<br />
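
To illustrate how the power spectrum follows from the real and imaginary parts, here is a small sketch using a textbook recursive radix-2 FFT. It is not the FFT routine used on the DSP kit, and the 1 kHz test tone and all names are assumptions made for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectrum(frame):
    """Power per bin: real part squared plus imaginary part squared."""
    return [c.real ** 2 + c.imag ** 2 for c in fft(frame)]


FS = 8000   # sampling frequency in Hz (from the slides)
N = 256     # frame length (from the slides)
tone = [math.sin(2 * math.pi * 1000 * n / FS) for n in range(N)]
power = power_spectrum(tone)
peak_bin = max(range(N // 2), key=lambda k: power[k])
print(peak_bin, peak_bin * FS / N)  # prints: 32 1000.0
```

Bin k corresponds to the frequency k * FS / N, so a 1 kHz tone sampled at 8 kHz falls exactly on bin 32 of a 256-point transform.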
- 21. Wrapping<br />A subjective pitch for each frequency is computed using the mel scale.<br />The mel frequency scale is given by $\mathrm{mel}(f) = 2595 \log_{10}(1 + f/700)$, with $f$ in Hz.<br />
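
The mel formula translates directly into code. The sketch below also includes the inverse mapping, which is obtained by solving the formula for f; the function names are illustrative.

```python
import math


def hz_to_mel(f_hz):
    """mel(f) = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel, solved for f."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


for f in (0, 500, 1000, 2000, 4000):
    print(f, round(hz_to_mel(f), 1))
```

The scale is close to linear below about 1 kHz (1000 Hz maps to roughly 1000 mel) and compresses higher frequencies logarithmically.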
- 22. Mel Frequency Coefficients<br />
- 23. MFCC<br />MFCC stands for Mel Frequency Cepstrum Coefficients.<br />It is built from several frequency coefficient components:<br /> Mel spectrum (frequency domain)<br /> Mel cepstrum (time domain)<br />
- 24. SPECTRUM<br />The samples are convolved with the mel filter bank to obtain the mel frequency spectrum.<br />The mel frequency spectrum is given by<br /> $s(n) = y(n) * f(n)$<br />where $s(n)$ is the mel frequency spectrum, $y(n)$ the samples, and $f(n)$ the filter coefficients.<br />
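
The slides do not give the filter coefficients f(n). The sketch below assumes the commonly used construction: triangular filters whose centres are equally spaced on the mel scale, each applied as a weighted sum over the power spectrum bins. The choice of 20 filters and all function names are assumptions for illustration only.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_bank(num_filters, n_fft, sample_rate):
    """Triangular filters with edges equally spaced on the mel scale."""
    top_mel = hz_to_mel(sample_rate / 2)
    # num_filters + 2 edge points: filter m spans edge points m-1, m, m+1.
    edges_hz = [
        mel_to_hz(top_mel * i / (num_filters + 1))
        for i in range(num_filters + 2)
    ]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    n_bins = n_fft // 2 + 1
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        weights = [0.0] * n_bins
        for k in range(lo, mid):      # rising slope up to the centre bin
            weights[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling slope from the centre bin
            weights[k] = (hi - k) / (hi - mid)
        bank.append(weights)
    return bank


def mel_spectrum(power, bank):
    """One output per filter: weighted sum of the power spectrum bins."""
    return [sum(w * p for w, p in zip(weights, power)) for weights in bank]


bank = mel_filter_bank(num_filters=20, n_fft=256, sample_rate=8000)
flat_power = [1.0] * (256 // 2 + 1)   # flat test spectrum, not real speech
mel = mel_spectrum(flat_power, bank)
print(len(mel))  # prints: 20
```

Because the filter edges are equally spaced in mel, the filters are narrow at low frequencies and wider at high frequencies, which mirrors the subjective pitch scale described under Wrapping.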
- 25. Inverse Discrete Cosine Transformation<br />The mel frequency power spectrum is a frequency domain function.<br />To obtain a time domain function, the signal undergoes the IDCT.<br />This converts the mel frequency spectrum into the mel frequency cepstrum.<br />
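- 26. CEPSTRUM<br />The mel spectrum coefficients are real numbers and are converted to the time domain using the IDCT.<br />The resulting time domain coefficients are called mel frequency cepstrum coefficients (MFCC).<br />The MFCC are given by<br /> $c(n) = \sum_{k=1}^{K} \log(S_k)\,\cos\left(n\left(k - \tfrac{1}{2}\right)\frac{\pi}{K}\right)$<br />where $S_k$ are the mel spectrum coefficients and $K$ is their number.<br />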
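
The cepstrum formula can be sketched as a direct double loop. This assumes that K is the number of mel spectrum coefficients and that n runs from 1 upward; the number of coefficients kept (12 here) and the function name are illustrative choices, not values from the slides.

```python
import math


def mel_cepstrum(mel_spectrum, num_coeffs):
    """c(n) = sum over k = 1..K of log(S_k) * cos(n * (k - 0.5) * pi / K)."""
    K = len(mel_spectrum)
    log_s = [math.log(s) for s in mel_spectrum]  # requires S_k > 0
    return [
        sum(
            log_s[k - 1] * math.cos(n * (k - 0.5) * math.pi / K)
            for k in range(1, K + 1)
        )
        for n in range(1, num_coeffs + 1)
    ]


flat = [math.e] * 20   # flat test spectrum, so log(S_k) = 1 for every k
coeffs = mel_cepstrum(flat, 12)
print(len(coeffs))     # prints: 12
```

For a perfectly flat spectrum the cosine terms cancel and every coefficient is zero up to rounding error, which is a convenient sanity check: the coefficients describe the shape of the spectrum, not its overall level.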
- 27. LEAST MEAN SQUARE ALGORITHM (LMS)<br />This algorithm is used to find the minimum deviation between sets of values.<br />During the testing phase, the input speech is compared with the 4 stored values.<br />The value with the least deviation is sent.<br />
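
The matching step described on this slide can be sketched as a search for the stored reference with the smallest mean squared deviation from the input. Note that this is plain minimum-deviation matching, not the adaptive LMS filter that the abbreviation usually refers to. The template vectors below are made-up placeholder numbers, not measured MFCC values, and the function names are hypothetical.

```python
def mean_squared_deviation(a, b):
    """Average of the squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def closest_command(features, templates):
    """Return the name of the stored template that deviates least from the input."""
    return min(
        templates,
        key=lambda name: mean_squared_deviation(features, templates[name]),
    )


# Four stored references, one per trained word (placeholder values only).
templates = {
    "1": [1.0, 2.0, 3.0],
    "2": [3.0, 1.0, 0.5],
    "3": [0.0, 0.0, 4.0],
    "4": [2.5, 2.5, 2.5],
}
observed = [1.1, 1.9, 3.2]
print(closest_command(observed, templates))  # prints: 1
```
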
- 28. INTERFACING PC WITH KIT<br />The PC is connected to the DSP processor by an RS-232 serial cable.<br />
- 29. DSP TO DC MOTOR<br />
- 30. CIRCUIT DIAGRAM<br />
- 31. HARDWARE DETAILS<br /><ul><li>The latched output from the latch IC is fed to the relays through a resistor and a transistor.
- 32. Depending on the predefined input, the relay coil is energized and the relay switches to the ON position.
- 33. An SPDT relay is used here.
- 34. The switched relay causes current to flow in the DC motor.</li></ul>
- 35. Details of DC motor<br />Speed: 300 rpm<br />Current: 750 mA<br />Voltage: 7.5 V<br />
- 36. Advantages<br />It is controlled by speech<br />Processing time is short<br />Easy and efficient to use<br />Useful for physically disabled people<br />Low cost<br />Easy to maintain<br />
- 37. Limitations<br />A frequency mismatch may affect compatibility with the hardware.<br />Each user's voice must be trained before it can be tested.<br />
- 38. APPLICATIONS<br />A device friendly to physically and visually impaired users, since only the user's speech signal is required.<br />In acute situations such as system crashes, this method can be used in an emergency.<br />
- 39. CONCLUSION and FUTURE MODIFICATIONS<br />Speech recognition is still an active research area.<br />Speech recognition enables communication between human and machine.<br />This project recognizes the given speech signal and displays the recognized word on the PC.<br />
- 40. THANK YOU<br />

