Detection of Alertness Based on
Analysis of Speech Signal
Pulak Sarangi
Ojaswa Anand
Induja Sreekant
Bibek Kabi

Under the Guidance of
Prof. Aurobinda Routray
Department of Electrical Engineering
Indian Institute of Technology Kharagpur
Objectives
• Design and develop a system capable of
detecting the alertness of a person by analyzing
the speech signal
• Implementation on GPU
• Implement the system on STM32E
development board
• Implementation as app on Android 4.2
(Jelly Bean, target API 17)
Work Plan
Week 1
• Literature Survey
Week 2
• Formulation of Algorithm
Week 3
• Algorithm testing on MATLAB
Week 4
• Conversion of MATLAB code to C code
• Conversion of MATLAB code to Java code
Week 5
• Implementation on GPU
• Implementation on STM32E
• Implementation on Android platform
1. Model As Implemented in MATLAB and C/C++
S(n) Recording
→ Framing & Windowing
→ Formation of Hankel Matrix
→ Noise Removal using SVD
→ De-framing for Enhanced Signal
→ Extraction of Wavelet Features
→ Classification of voiced/silence parts based on energy
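The enhancement blocks of this model (Hankel matrix, noise removal using SVD) can be sketched in Python with NumPy as follows. This is only an illustration under stated assumptions: the number of Hankel rows, the retained rank, and the use of anti-diagonal averaging to map the low-rank matrix back to a frame are choices made here, not values or steps given on the slides, and the function names are hypothetical.

```python
import numpy as np

def hankel_matrix(frame, rows):
    """Arrange a 1-D frame into a Hankel matrix (constant anti-diagonals)."""
    frame = np.asarray(frame, dtype=float)
    cols = len(frame) - rows + 1
    return np.array([frame[i:i + cols] for i in range(rows)])

def hankel_to_frame(H):
    """Average each anti-diagonal to get back a 1-D frame (assumed mapping)."""
    rows, cols = H.shape
    out = np.zeros(rows + cols - 1)
    counts = np.zeros(rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            out[i + j] += H[i, j]
            counts[i + j] += 1
    return out / counts

def svd_denoise(frame, rows, rank):
    """Keep the `rank` largest singular values of the frame's Hankel matrix."""
    H = hankel_matrix(frame, rows)
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s[rank:] = 0.0                      # discard the assumed noise subspace
    return hankel_to_frame((U * s) @ Vt)
```

A sampled sinusoid gives a Hankel matrix of rank 2, so `svd_denoise(frame, rows=20, rank=2)` leaves a clean sinusoidal frame essentially unchanged while attenuating components outside the two dominant singular directions.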
Selection of Wavelet Features
Enhanced Speech
→ Segmentation of speech signal into overlapping samples
→ 6-level decomposition of signal using Daubechies wavelet
→ Computation of ratio of 62.5–1000 Hz energy to the total energy
→ E(i)
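The computation of E(i) can be sketched in plain Python as below. Several things are assumptions rather than facts from the slides: the order of the Daubechies wavelet is not stated, so the simplest member of the family (db1, the Haar wavelet) is used; the 62.5–1000 Hz band is taken to be detail levels 3 to 6, which holds only for an 8 kHz sampling rate; and the function names are hypothetical.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar (db1) transform."""
    s = 1.0 / math.sqrt(2.0)
    half = len(x) // 2                  # a trailing odd sample is dropped
    approx = [(x[2 * i] + x[2 * i + 1]) * s for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) * s for i in range(half)]
    return approx, detail

def band_energy_ratio(segment, levels=6, band_levels=(3, 4, 5, 6)):
    """Ratio of the energy in the chosen detail levels to the total energy."""
    approx = list(segment)
    energy = {}
    for level in range(1, levels + 1):
        approx, detail = haar_step(approx)
        energy[level] = sum(d * d for d in detail)
    total = sum(energy.values()) + sum(a * a for a in approx)
    if total == 0.0:
        return 0.0                      # all-zero segment: no energy anywhere
    return sum(energy[l] for l in band_levels) / total
```

The segment length should be a multiple of 2^6 = 64 so that no samples are dropped during the six halvings. A square wave with a period of 8 samples puts all of its energy into detail level 3, so its ratio is 1, while a constant segment or one alternating between +1 and -1 gives a ratio of 0.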
Classification

E(i) input

<0.3

Comparis
on with
threshold

Silence

Single Segment
with same pre
& post segment

>0.8

Voiced

Series Segment
with same pre
& post
segment

Single or Series
with different
pre & post
segment
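A possible reading of this flowchart is sketched below in Python. The thresholds 0.3 and 0.8 come from the slide; everything else is an assumption made for illustration: the slide does not state the outcome of the pre/post-segment branches, so here an in-between segment (or series of segments) inherits the label of its neighbours when both neighbours agree and is marked "undecided" otherwise. The function name is hypothetical.

```python
SILENCE_MAX = 0.3   # E(i) below this: silence (from the slide)
VOICED_MIN = 0.8    # E(i) above this: voiced (from the slide)

def classify_segments(ratios):
    """Label each segment from its energy ratio E(i)."""
    labels = []
    for e in ratios:
        if e < SILENCE_MAX:
            labels.append("silence")
        elif e > VOICED_MIN:
            labels.append("voiced")
        else:
            labels.append(None)         # in-between: resolve from neighbours
    i, n = 0, len(labels)
    while i < n:
        if labels[i] is not None:
            i += 1
            continue
        j = i
        while j < n and labels[j] is None:
            j += 1                      # [i, j) is a single segment or a series
        pre = labels[i - 1] if i > 0 else None
        post = labels[j] if j < n else None
        # Assumed rule: same pre and post label -> inherit it.
        fill = pre if pre is not None and pre == post else "undecided"
        for k in range(i, j):
            labels[k] = fill
        i = j
    return labels
```

For example, a series of in-between segments surrounded by silence on both sides is labelled silence, while an in-between segment sitting between a silent and a voiced segment stays undecided under this assumed rule.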
PROGRESS
• Fully Functional MATLAB & C/C++ code
• Fully Functional Java Code
• Literature survey on porting the C/Java code to the embedded/Android
platforms and to the GPU.
Results
Speech Signal    Voiced    Silence
1                288       311
2                186       413
3                151       448
4                54        545
2. Model As Implemented in MATLAB and C/C++
S(n) Recording
→ Framing & Windowing
→ Formation of Hankel Matrix
→ Noise Removal using SVD
→ De-framing for Enhanced Signal
→ Feature Extraction (MFCCs, LPCCs)
→ Classification of voiced/silence parts based on Generalized Eigenvalue
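As an illustration of the LPCC part of the feature extraction block, the sketch below computes linear prediction coefficients with the Levinson-Durbin recursion and converts them to cepstral coefficients. It assumes the autocorrelation method and the predictor convention x[n] ≈ Σ a_k x[n-k]; the prediction order and the function names are hypothetical choices, since the slides give no such details.

```python
def autocorrelation(frame, order):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0..order."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order]."""
    a = [0.0] * (order + 1)
    err = r[0]
    if err == 0.0:
        return [0.0] * order, 0.0       # silent frame: nothing to predict
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                   # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def lpc_to_cepstrum(a):
    """LPCCs c[1..p] from predictor coefficients via the standard recursion."""
    c = []
    for n in range(1, len(a) + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

For a first-order process with autocorrelation r = [1, 0.5, 0.25], the recursion returns a = [0.5, 0], and the cepstral coefficients are 0.5 and 0.125, matching the series expansion of -ln(1 - 0.5 z^-1).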
Observation
• After feature extraction, the covariance of the features was used instead of
independent statistical properties such as mean, standard deviation and kurtosis,
which made processing much faster.
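The slides do not say which distance between covariance matrices was used. One common choice that fits the "Generalized Eigenvalue" block of this model is the distance built from the generalized eigenvalues λ of the pair (A, B), d(A, B) = sqrt(Σ ln² λ). The NumPy sketch below uses that distance purely as an assumption; the function names are hypothetical.

```python
import numpy as np

def feature_covariance(features):
    """Covariance of a (frames x coefficients) feature matrix."""
    return np.cov(np.asarray(features, dtype=float), rowvar=False)

def covariance_distance(A, B):
    """sqrt(sum(ln^2 lambda)) over the generalized eigenvalues of (A, B).

    Assumes A and B are symmetric positive definite, so the eigenvalues
    of inv(B) @ A are real and positive.
    """
    eigvals = np.linalg.eigvals(np.linalg.solve(B, A)).real
    return float(np.sqrt(np.sum(np.log(eigvals) ** 2)))
```

With this definition the distance of a matrix to itself is zero and the value is the same in both directions, so two recordings can be compared by the covariance matrices of their MFCC or LPCC frames.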
Results
Speech Signal    Distance between covariance matrices
1                4.013
2                6.831
PROGRESS
• Fully Functional MATLAB code
• Literature survey for implementation of C/C++ code on GPU
3. Model As Implemented in MATLAB and C/C++
S(n) Recording
→ Framing & Windowing
→ Formation of Hankel Matrix
→ Noise Removal using SVD
→ De-framing for Enhanced Signal
→ Feature Extraction (MFCCs, LPCCs)
→ Classification of voiced/silence parts by GMM, SVM classifier
PROGRESS
• Fully Functional MATLAB code
• Literature survey for implementation of C/C++ code on GPU
PLAN FOR FURTHER WORK
• Implementation on GPU
• Implementation on STM32E development board
• Implementation as Android App for Android 4.2 (API 17, Jelly Bean)
• Comparison of Results with other algorithms
Thank You
Voiced and Unvoiced Sounds
• Fundamental difference :
o Vibrations of the vocal cords produce voiced sounds.
o Rate at which the vocal cords vibrate dictates the pitch of the sound.
o Unvoiced sounds do not rely on the vibration of the vocal cords.
o Unvoiced sounds are created by the constriction of the vocal tract.
o Vocal cords remain open and the constrictions of the vocal tract force air out to produce
the unvoiced sounds
• The fundamental frequency of voiced segments ranges from 60 to 500 Hz
• The ratio of the energy in the bands between 62.5 Hz and 1000 Hz to the energy in all bands
is computed and used in our algorithm as the fundamental parameter in formulating the
V/UV decision.
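The band edges 62.5 Hz and 1000 Hz line up with dyadic wavelet sub-bands if the speech is sampled at 8 kHz, which is an assumption here since the slides do not give the sampling rate. Under that assumption, the short Python sketch below lists which detail levels of a 6-level decomposition fall inside the band; the function names are hypothetical.

```python
def detail_band(level, fs):
    """Nominal frequency range (Hz) of the detail coefficients at `level`."""
    return fs / 2 ** (level + 1), fs / 2 ** level

def levels_in_band(fs, low_hz, high_hz, levels=6):
    """Detail levels whose nominal band lies inside [low_hz, high_hz]."""
    selected = []
    for level in range(1, levels + 1):
        lo, hi = detail_band(level, fs)
        if lo >= low_hz and hi <= high_hz:
            selected.append(level)
    return selected
```

At 8 kHz, level 3 covers 500–1000 Hz and level 6 covers 62.5–125 Hz, so `levels_in_band(8000.0, 62.5, 1000.0)` selects levels 3 to 6, a range that contains the 60–500 Hz fundamental frequencies of voiced speech almost entirely.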
Literature Review
• Speech Enhancement using Singular Value
Decomposition (SVD)
• Wavelet-based Voiced/Unvoiced Classification
Algorithm

Summer Research Project. Final Presentation 2013
