Summer Research Project. Final Presentation 2013

This document outlines a project to develop a system that detects a person's alertness by analyzing speech signals. The objectives are to design the system and implement it on a GPU, an STM32E development board, and the Android platform. The work plan covers a literature survey, algorithm formulation and testing in MATLAB, and conversion of the MATLAB code to C and Java. Three models are presented, all of which remove noise using SVD on a Hankel matrix: the first classifies voiced and silent segments from wavelet energy features, the second uses generalized eigenvalues with MFCC/LPCC features, and the third uses GMM and SVM classifiers with MFCC/LPCC features. Progress so far includes functional MATLAB, C/C++, and Java code. Future work involves implementation on the GPU, the STM32E board, and as an Android app, and comparison of results with other algorithms.

Detection of Alertness Based on
Analysis of Speech Signal
Pulak Sarangi
Ojaswa Anand
Induja Sreekant
Bibek Kabi

Under the Guidance of
Prof. Aurobinda Routray
Department of Electrical Engineering
Indian Institute of Technology Kharagpur

Objectives
• Design and develop a system capable of detecting the alertness of a person by analyzing the speech signal
• Implementation on GPU
• Implementation on STM32E development board
• Implementation as an app on Android 4.2 (Jelly Bean, target API 17)

Work Plan
• Week 1: Literature Survey
• Week 2: Formulation of Algorithm
• Week 3: Algorithm testing on MATLAB
• Week 4: Conversion of MATLAB code to C code; Conversion of MATLAB code to Java code
• Week 5: Implementation on GPU; Implementation on STM32E; Implementation on Android platform

1. Model As Implemented in MATLAB and C/C++
Flowchart: S(n) Recording → Framing & Windowing → Formation of Hankel Matrix → Noise Removal using SVD → De-framing for Enhanced Signal → Extraction of Wavelet Features → Classification of voiced/silence parts based on energy
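The Hankel-matrix and SVD steps of the flowchart can be sketched as follows. This is a minimal illustration, not the project's code: the function names, the number of Hankel rows, and the retained rank are all assumptions, and a power iteration with deflation stands in for a full SVD routine so that the example needs no external library.

```python
import math
import random

def hankel(frame, rows):
    """Build the Hankel matrix H[i][j] = frame[i + j] with `rows` rows."""
    cols = len(frame) - rows + 1
    return [[frame[i + j] for j in range(cols)] for i in range(rows)]

def dehankel(H):
    """Map a matrix back to a 1-D frame by averaging each anti-diagonal."""
    rows, cols = len(H), len(H[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += H[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

def top_singular_triplet(A, iters=100, seed=0):
    """Dominant singular value and vectors of A by power iteration on A^T A."""
    rows, cols = len(A), len(A[0])
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(cols)]
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(A[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # A is zero: nothing left to extract
            return 0.0, [0.0] * rows, [0.0] * cols
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma == 0.0:
        return 0.0, [0.0] * rows, v
    return sigma, [x / sigma for x in u], v

def low_rank(A, rank):
    """Sum of the `rank` largest singular components (truncated SVD)."""
    rows, cols = len(A), len(A[0])
    residual = [row[:] for row in A]
    approx = [[0.0] * cols for _ in range(rows)]
    for _ in range(rank):
        sigma, u, v = top_singular_triplet(residual)
        for i in range(rows):
            for j in range(cols):
                term = sigma * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term  # deflate the extracted component
    return approx

def denoise_frame(frame, rows, rank):
    """Hankel matrix -> keep dominant singular components -> back to a frame."""
    return dehankel(low_rank(hankel(frame, rows), rank))
```

A single sinusoid yields a rank-2 Hankel matrix, so `denoise_frame(frame, rows=10, rank=2)` returns a clean sinusoidal frame essentially unchanged; for noisy speech the rank to keep is a tuning choice that the slides do not specify.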

Selection of Wavelet Features
Flowchart: Enhanced Speech → Segmentation of speech signal into overlapping segments → 6-level decomposition of the signal using a Daubechies wavelet → Computation of the ratio E(i) of the 62.5–1000 Hz energy to the total energy
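The feature computation above can be sketched as follows. The slides name neither the Daubechies order nor the sampling rate, so this sketch assumes the 4-tap Daubechies-2 filter and an 8 kHz sampling rate, under which detail levels 3 to 6 of a 6-level decomposition cover 62.5–1000 Hz. The function names and the periodic boundary handling are also assumptions.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Daubechies-2 (4-tap) low-pass filter; the high-pass is its quadrature mirror.
LOW = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
HIGH = [LOW[3], -LOW[2], LOW[1], -LOW[0]]

def dwt_step(x):
    """One analysis level with periodic extension; len(x) must be even."""
    n = len(x)
    approx = [sum(LOW[k] * x[(i + k) % n] for k in range(4)) for i in range(0, n, 2)]
    detail = [sum(HIGH[k] * x[(i + k) % n] for k in range(4)) for i in range(0, n, 2)]
    return approx, detail

def energy(x):
    """Sum of squared coefficients."""
    return sum(v * v for v in x)

def band_energy_ratio(segment, levels=6, band_levels=(3, 4, 5, 6)):
    """E(i): energy of the selected detail levels over the total energy.

    With an assumed 8 kHz sampling rate, detail levels 3..6 span 62.5-1000 Hz.
    """
    if len(segment) % (2 ** levels) != 0:
        raise ValueError("segment length must be a multiple of 2**levels")
    approx = list(segment)
    detail_energy = {}
    for level in range(1, levels + 1):
        approx, detail = dwt_step(approx)
        detail_energy[level] = energy(detail)
    total = sum(detail_energy.values()) + energy(approx)
    if total == 0.0:
        return 0.0  # all-zero segment: define the ratio as 0
    return sum(detail_energy[level] for level in band_levels) / total
```

Under these assumptions a 256-sample segment containing a low-frequency tone gives a ratio close to 1, while a tone near 3 kHz gives a ratio close to 0.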

Classification
Flowchart: E(i) input → Comparison with threshold
• E(i) < 0.3: Silence
• E(i) > 0.8: Voiced
• Remaining segments are handled in three cases:
o Single segment with same pre & post segment
o Series segment with same pre & post segment
o Single or series with different pre & post segment
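The threshold comparison can be sketched as follows. The two thresholds come from the slide; everything else is an assumption made for illustration. In particular, the slide does not state the outcome of the three pre/post cases, so this sketch assumes that an in-between segment (or run of segments) takes the label shared by its neighbours and is marked "undecided" when the neighbours differ or are missing.

```python
SILENCE_MAX = 0.3  # from the slide: E(i) < 0.3 -> silence
VOICED_MIN = 0.8   # from the slide: E(i) > 0.8 -> voiced

def classify_segments(ratios):
    """Label each segment from its energy ratio E(i).

    Assumption (not stated in the slides): a run of in-between segments
    inherits the label of its neighbours when both neighbours agree,
    and is marked "undecided" otherwise.
    """
    labels = []
    for e in ratios:
        if e < SILENCE_MAX:
            labels.append("silence")
        elif e > VOICED_MIN:
            labels.append("voiced")
        else:
            labels.append(None)  # resolved from context below

    i, n = 0, len(labels)
    while i < n:
        if labels[i] is not None:
            i += 1
            continue
        j = i
        while j < n and labels[j] is None:  # find the end of the run
            j += 1
        pre = labels[i - 1] if i > 0 else None
        post = labels[j] if j < n else None
        fill = pre if pre is not None and pre == post else "undecided"
        for k in range(i, j):
            labels[k] = fill
        i = j
    return labels
```

For example, `classify_segments([0.9, 0.5, 0.95])` labels all three segments as voiced, because the middle segment sits between two voiced neighbours.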

PROGRESS
• Fully Functional MATLAB & C/C++ code
• Fully Functional Java Code
• Literature survey for implementing the C and Java code on the embedded platform, the Android platform, and the GPU.

Results (one row per speech signal)
Voiced    Silence
288       311
186       413
151       448
54        545

2. Model As Implemented in MATLAB and C/C++
Flowchart: S(n) Recording → Framing & Windowing → Formation of Hankel Matrix → Noise Removal using SVD → De-framing for Enhanced Signal → Feature Extraction (MFCCs, LPCCs) → Classification of voiced/silence parts based on Generalized Eigenvalue
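The windowing and LPCC extraction steps can be sketched as follows, using the textbook route of autocorrelation, the Levinson-Durbin recursion, and the LPC-to-cepstrum recursion. The slides give no parameters, so the Hamming window, the prediction order, the number of coefficients, and all function names here are assumptions.

```python
import math

def hamming(n_samples):
    """Hamming window of the given length (n_samples > 1)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a frame."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_p with x[n] ~ sum a_k x[n-k]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of the all-pole model 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] (gain term) is omitted
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

def lpcc(frame, order=12, n_ceps=12):
    """Window a frame and return its LPCC feature vector."""
    windowed = [s * w for s, w in zip(frame, hamming(len(frame)))]
    r = autocorrelation(windowed, order)
    if r[0] == 0.0:  # silent frame
        return [0.0] * n_ceps
    a, _ = levinson_durbin(r, order)
    return lpc_to_cepstrum(a, n_ceps)
```

For a first-order model with coefficient a, the recursion gives c_n = a^n / n, which is a quick sanity check on the implementation.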

Observation
• After feature extraction, the covariance of the features was used instead of independent statistical properties such as mean, standard deviation, and kurtosis, which made processing much faster.

Results (one row per speech signal)
Distance between covariance matrices
4.013
6.831
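The slides do not say which distance between covariance matrices was used. One common choice that fits the generalized-eigenvalue step in the flowchart is the square root of the summed squared logarithms of the generalized eigenvalues of the two matrices. The sketch below assumes that metric and restricts itself to 2-D feature vectors so the eigenvalues have a closed form. All names are hypothetical, and the values it produces are not meant to reproduce the figures above.

```python
import math

def covariance_2d(features):
    """Sample covariance matrix of a list of 2-D feature vectors."""
    n = len(features)
    mx = sum(f[0] for f in features) / n
    my = sum(f[1] for f in features) / n
    sxx = sum((f[0] - mx) ** 2 for f in features) / (n - 1)
    syy = sum((f[1] - my) ** 2 for f in features) / (n - 1)
    sxy = sum((f[0] - mx) * (f[1] - my) for f in features) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def generalized_eigenvalues_2x2(A, B):
    """Roots of det(A - lambda * B) = 0 for symmetric 2x2 matrices."""
    a = B[0][0] * B[1][1] - B[0][1] ** 2
    b = A[0][0] * B[1][1] + A[1][1] * B[0][0] - 2.0 * A[0][1] * B[0][1]
    c = A[0][0] * A[1][1] - A[0][1] ** 2
    root = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    return (b - root) / (2.0 * a), (b + root) / (2.0 * a)

def covariance_distance(A, B):
    """Assumed metric: sqrt(sum of ln^2 of the generalized eigenvalues).

    Both matrices must be symmetric positive definite.
    """
    l1, l2 = generalized_eigenvalues_2x2(A, B)
    return math.sqrt(math.log(l1) ** 2 + math.log(l2) ** 2)
```

This metric is symmetric in its arguments and is zero when the two covariance matrices are equal.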

PROGRESS
• Fully Functional MATLAB code
• Literature survey for implementation of C/C++ code in GPU

3. Model As Implemented in MATLAB and C/C++
Flowchart: S(n) Recording → Framing & Windowing → Formation of Hankel Matrix → Noise Removal using SVD → De-framing for Enhanced Signal → Feature Extraction (MFCCs, LPCCs) → Classification of voiced/silence parts by GMM, SVM classifier

PROGRESS
• Fully Functional MATLAB code
• Literature survey for implementation of C/C++ code in GPU

PLAN FOR FURTHER WORK
• Implementation on GPU
• Implementation on STM32E development board
• Implementation as Android App for Android 4.2 (API 17, Jelly Bean)
• Comparison of Results with other algorithms

Voiced and Unvoiced Sounds
• Fundamental difference:
o Voiced sounds are produced by vibration of the vocal cords.
o The rate at which the vocal cords vibrate determines the pitch of the sound.
o Unvoiced sounds do not rely on vibration of the vocal cords; they are created by constriction of the vocal tract.
o The vocal cords remain open, and the constrictions of the vocal tract force air out to produce the unvoiced sounds.
• The fundamental frequency of voiced segments ranges from 60 Hz to 500 Hz.
• The ratio of the energy in the bands between 62.5 Hz and 1000 Hz to the energy of all bands is computed and used in our algorithm as the fundamental parameter for the voiced/unvoiced (V/UV) decision.
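As a worked form of this ratio, assume a sampling rate of 8 kHz (not stated in the slides), so that detail level j of a dyadic wavelet decomposition spans f_s / 2^(j+1) to f_s / 2^j. The symbols d_{j,k}(i) and a_{6,k}(i), denoting the level-j detail coefficients and the level-6 approximation coefficients of segment i, are notation introduced here for illustration.

```latex
% Assumption: f_s = 8 kHz; D_j denotes the band covered by detail level j.
\[
D_j:\ \left[\frac{f_s}{2^{j+1}},\ \frac{f_s}{2^{j}}\right]
\quad\Rightarrow\quad
D_3: 500\text{--}1000\ \mathrm{Hz},\quad
D_4: 250\text{--}500\ \mathrm{Hz},\quad
D_5: 125\text{--}250\ \mathrm{Hz},\quad
D_6: 62.5\text{--}125\ \mathrm{Hz}
\]
\[
E(i) \;=\;
\frac{\displaystyle\sum_{j=3}^{6}\sum_{k} d_{j,k}^{2}(i)}
     {\displaystyle\sum_{k} a_{6,k}^{2}(i) \;+\; \sum_{j=1}^{6}\sum_{k} d_{j,k}^{2}(i)}
\]
```

Under this assumption the 62.5–1000 Hz band corresponds exactly to detail levels 3 through 6, and the level-6 approximation holds the content below 62.5 Hz.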

Literature Review
• Speech Enhancement using Singular Value Decomposition (SVD)
• Wavelet-based Voiced/Unvoiced Classification Algorithm
