This document summarizes research on implementing musical instrument recognition using convolutional neural networks (CNNs) and support vector machines (SVMs). The researchers aim to preprocess audio excerpts into images and use CNNs to achieve high accuracy in instrument classification. They will then combine CNN and SVM classifications and take a weighted average to achieve even higher accuracy. The document reviews several related works that used features like MFCCs and classifiers like SVMs, GMMs, and neural networks for instrument recognition. The researchers intend to use mel spectrograms and MFCCs to represent audio as images for CNN classification and improve music information retrieval and organization.
This document discusses an approach to extract features from speech signals using Mel Frequency Cepstral Coefficients (MFCC). MFCC is a popular feature extraction technique that models human auditory perception. It involves preprocessing the speech signal, framing it, applying the discrete cosine transform to the mel scale filter bank energies, and optionally adding delta and acceleration features. MFCC extraction was tested on isolated word samples to obtain coefficients representing the short-term power spectrum. These coefficients can then be used for applications like speech recognition by matching patterns in the feature space.
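The extraction pipeline described above (pre-emphasis, framing, mel filter bank energies, DCT) can be sketched end to end with NumPy and SciPy. The frame length, hop size, filter count, and the synthetic test tone below are illustrative choices standing in for an isolated-word sample, not values from the document:

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr, n_filters=26, n_ceps=13, frame_len=400, hop=160):
    # 1. Pre-emphasis to boost high frequencies.
    signal = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2. Framing with a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 3. Power spectrum of each frame.
    n_fft = 512
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 4. Triangular mel-scale filter bank energies.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    energies = np.log(power @ fbank.T + 1e-10)
    # 5. DCT of the log filter bank energies; keep the first n_ceps.
    return dct(energies, type=2, axis=1, norm='ortho')[:, :n_ceps]

sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)  # stand-in for an isolated-word sample
coeffs = mfcc(tone, sr)
print(coeffs.shape)  # one 13-coefficient vector per frame
```

Delta and acceleration features, when wanted, are simply first and second differences of these coefficient tracks over time.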
International Journal of Engineering and Science Invention (IJESI) (inventionjournals)
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews across the whole field of Engineering, Science and Technology, new teaching methods, assessment, validation and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
IRJET- Machine Learning and Noise Reduction Techniques for Music Genre Classi... (IRJET Journal)
This document discusses using machine learning and deep learning techniques to classify music genres automatically. It proposes applying noise reduction techniques to audio files using Fourier analysis before feeding them into models. A convolutional neural network is trained on mel-spectrograms of audio to classify genres. Supervised machine learning models like random forest and XGBoost are also explored using extracted audio features. The proposed system applies noise reduction to preprocessed audio then uses a CNN or supervised learning models to classify music genres.
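The Fourier-analysis noise reduction step mentioned above can be illustrated by simple spectral magnitude thresholding: transform, zero out low-magnitude bins assumed to be noise, and transform back. The signal, noise level, and threshold below are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
clean = np.sin(2 * np.pi * 440 * t)            # stand-in for a music signal
noisy = clean + 0.5 * rng.standard_normal(sr)  # additive white noise

# Fourier-domain denoising: zero out low-magnitude bins assumed to be noise.
spectrum = np.fft.rfft(noisy)
threshold = 0.2 * np.abs(spectrum).max()
spectrum[np.abs(spectrum) < threshold] = 0.0
denoised = np.fft.irfft(spectrum, n=len(noisy))

def rms_error(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

print(rms_error(noisy, clean), rms_error(denoised, clean))
```

Because the tonal content concentrates its energy in a few strong bins while white noise spreads thinly across all of them, the thresholded reconstruction lies much closer to the clean signal than the noisy input does.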
Performance analysis of the convolutional recurrent neural network on acousti... (journalBEEI)
In this study, we attempted to find the optimal hyper-parameters of the convolutional recurrent neural network (CRNN) by investigating its performance on acoustic event detection. Important hyper-parameters such as the input segment length, learning rate, and criterion for the convergence test, were determined experimentally. Additionally, the effects of batch normalization and dropout on the performance were measured experimentally to obtain their optimal combination. Further, we studied the effects of varying the batch data on every iteration during the training. From the experimental results using the TUT sound events synthetic 2016 database, we obtained optimal performance with a learning rate of 1/10000. We found that a longer input segment length aided performance improvement, and batch normalization was far more effective than dropout. Finally, performance improvement was clearly observed by varying the starting points of the batch data for each iteration during the training.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Speech Emotion Recognition is a recent research topic in the Human Computer Interaction (HCI) field. The need has arisen for a more natural communication interface between humans and computers, as computers have become an integral part of our lives. A great deal of work is currently going on to improve the interaction between humans and computers. To achieve this goal, a computer would have to be able to assess its present situation and respond differently depending on that observation. Part of this process involves understanding a user's emotional state. To make human computer interaction more natural, the objective is that the computer should be able to recognize emotional states in the same way a human does. The efficiency of an emotion recognition system depends on the type of features extracted and the classifier used for detection of emotions. The proposed system aims at identifying basic emotional states such as anger, joy, neutral and sadness from human speech. While classifying different emotions, features like MFCC (Mel Frequency Cepstral Coefficient) and energy are used. In this paper, a standard emotional database (an English database) is used, which gives more satisfactory detection of emotions than recorded samples. This methodology describes and compares the performances of a Learning Vector Quantization Neural Network (LVQ NN), a multiclass Support Vector Machine (SVM), and their combination for emotion recognition.
Depth Estimation of Sound Images Using Directional Clustering and Activation... (NAIST Graduate School of Information Science)
The document proposes two methods for estimating the depth of sound images using direction of arrival (DOA) information extracted from stereo signals.
Method 1 estimates depth based on the shape of the DOA distribution modeled using a generalized Gaussian distribution (GGD), where a smoother distribution indicates a source is farther away.
Method 2 applies activation-shared nonnegative matrix factorization (NMF) to extract features while maintaining directional information and reduce noise interfering with DOA estimation. Conventional NMF generates artificial fluctuations, while shared activation preserves source direction.
An experiment calculates the correlation between estimated and reference depths from 6 datasets containing varying source combinations. Results show the proposed method achieves high correlation, confirming its efficacy over conventional methods.
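Method 1's use of the GGD shape can be illustrated with `scipy.stats.gennorm`: a smoother, flatter sample distribution yields a larger fitted shape parameter. The two sample sets below are synthetic stand-ins for DOA histograms, not data from the experiment:

```python
import numpy as np
from scipy.stats import gennorm

# Synthetic stand-ins for DOA samples: a peaky (Laplace-like, beta = 1)
# distribution for a near source and a smoother (Gaussian-like, beta = 2)
# distribution for a farther source.
near = gennorm.rvs(1.0, size=2000, random_state=0)
far = gennorm.rvs(2.0, size=2000, random_state=0)

# Fit a generalized Gaussian to each sample set; the first returned
# value is the estimated shape parameter beta.
beta_near, _, _ = gennorm.fit(near)
beta_far, _, _ = gennorm.fit(far)

# A larger fitted shape parameter indicates a smoother DOA distribution,
# which Method 1 interprets as a more distant source.
print(beta_near, beta_far)
```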
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa... (NAIST Graduate School of Information Science)
This document summarizes a research presentation on an online divergence switching method for a hybrid source separation technique. The hybrid method combines directional clustering for spatial separation with supervised nonnegative matrix factorization (SNMF) for spectral separation. The proposed method switches between KL divergence and Euclidean distance for the SNMF, depending on the amount of spectral gaps from the directional clustering. When there are many gaps, Euclidean distance is better for basis extrapolation. When gaps are fewer, KL divergence gives better separation. In experiments, the proposed online switching method outperformed using only one divergence, achieving higher signal-to-distortion ratios for music source separation.
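The two divergences the method switches between can be tried with scikit-learn's `NMF` on a toy nonnegative matrix. This is only a sketch of plain NMF under the two cost functions, not the paper's supervised, directional-clustering-aware variant:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Toy nonnegative "magnitude spectrogram": 64 frequency bins x 100 frames
# built from 3 spectral bases, mimicking the SNMF setting.
bases = rng.random((64, 3))
activations = rng.random((3, 100))
V = bases @ activations

def factorize(beta_loss):
    # The multiplicative-update solver supports both cost functions.
    model = NMF(n_components=3, beta_loss=beta_loss, solver='mu',
                init='random', random_state=0, max_iter=500)
    W = model.fit_transform(V)
    H = model.components_
    return np.linalg.norm(V - W @ H) / np.linalg.norm(V)

# The hybrid method switches between these two divergences depending on
# the amount of spectral gaps left by directional clustering.
err_kl = factorize('kullback-leibler')
err_eu = factorize('frobenius')
print(err_kl, err_eu)
```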
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ... (NAIST Graduate School of Information Science)
This document proposes a sound field reproduction system that utilizes an image sensor to estimate a listener's position and orientation. It summarizes challenges with existing spectral division methods that rely on a fixed reference listening position. The proposed system applies an "equiangular filter" to the spectral division method to locally synthesize sound fields around the listener by limiting the spatial bandwidth, as estimated from the image sensor. Simulation experiments and subjective assessments show the proposed system can accurately reproduce sound fields at the listener's position regardless of frequency, with improved directional perception over conventional methods. However, it did not clearly outperform alternatives in terms of sound quality. Overall, the system aims to address limitations of existing approaches by adapting sound field synthesis based on estimated listener parameters.
This document summarizes a research paper on speaker recognition using vocal tract features. It discusses extracting pitch using FFT as a vocal source feature and Mel Frequency Cepstral Coefficients (MFCCs) as vocal tract features. These features are used as inputs to a Support Vector Machine (SVM) classifier for binary speaker classification. Experimental results show the SVM achieves up to 94.42% accuracy in classifying speakers based on their MFCC features, and accuracy increases with higher signal-to-noise ratios when noise is added to speech signals. The paper concludes vocal source and tract features can be used together with SVM for text-independent speaker recognition.
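A minimal sketch of the MFCC-plus-SVM classification stage, using synthetic Gaussian clusters as stand-ins for per-utterance MFCC feature vectors (a real system would extract these from speech with an MFCC front end):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic stand-ins for 13-dimensional MFCC feature vectors
# from two speakers with distinct vocal-tract characteristics.
speaker_a = rng.normal(0.0, 1.0, (200, 13))
speaker_b = rng.normal(1.5, 1.0, (200, 13))
X = np.vstack([speaker_a, speaker_b])
y = np.array([0] * 200 + [1] * 200)

# Hold out a quarter of the vectors for testing, as in any SID evaluation.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25,
                                          random_state=0)
clf = SVC(kernel='rbf').fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(acc)
```

Adding noise to the feature vectors would degrade this accuracy, matching the paper's observation that accuracy rises with signal-to-noise ratio.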
IRJET- Music Genre Classification using Machine Learning Algorithms: A Compar... (IRJET Journal)
This document presents research on classifying music genres using machine learning algorithms. The researchers built multiple classification models using the Free Music Archive dataset and compared the models' performance in predicting genre accuracy. Some models were trained on mel-spectrograms of songs and their audio features, while others used only spectrograms. The researchers found that a convolutional neural network model trained solely on spectrograms achieved the highest accuracy among the tested models. The goal of the research was to develop a machine learning approach for automatic music genre classification that performs better than existing methods.
GENDER RECOGNITION SYSTEM USING SPEECH SIGNAL (IJCSEIT Journal)
In this paper, a system developed for speech encoding, analysis, synthesis and gender identification is presented. A typical gender recognition system can be divided into a front-end system and a back-end system. The task of the front-end system is to extract the gender-related information from a speech signal and represent it by a set of vectors called features. Features like power spectral density and the frequency at maximum power carry speaker information. The features are extracted using the Fast Fourier Transform (FFT) algorithm. The task of the back-end system (also called the classifier) is to create a gender model to recognize the gender from a speech signal in the recognition phase. This paper also presents the digital processing of speech signals (the pronounced letters "A" and "B") taken from 10 persons, 5 of them male and the rest female. Power spectrum estimation of the signal is examined, and the frequency at maximum power of the English phonemes is extracted from the estimated power spectrum. The system uses a threshold technique as its identification tool. The recognition accuracy of this system is 80% on average.
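The frequency-at-maximum-power and threshold steps can be sketched as follows. The 165 Hz threshold is a hypothetical value based on typical male (~85-155 Hz) and female (~165-255 Hz) fundamental-frequency ranges, not the paper's tuned threshold, and the test signals are synthetic tones rather than recorded phonemes:

```python
import numpy as np

def peak_frequency(signal, sr):
    # Power spectrum via FFT; return the frequency of the strongest bin.
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return freqs[np.argmax(spectrum)]

def classify_gender(signal, sr, threshold_hz=165.0):
    # Hypothetical threshold between typical male and female
    # fundamental-frequency ranges.
    return 'female' if peak_frequency(signal, sr) > threshold_hz else 'male'

sr = 8000
t = np.arange(sr) / sr
male_like = np.sin(2 * np.pi * 120 * t)    # synthetic vowel at 120 Hz
female_like = np.sin(2 * np.pi * 210 * t)  # synthetic vowel at 210 Hz
print(classify_gender(male_like, sr), classify_gender(female_like, sr))
```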
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition... (TELKOMNIKA JOURNAL)
Speech recognition can be defined as the process of converting voice signals into a sequence of words by applying a specific algorithm implemented in a computer program. Research on speech recognition in Indonesia is relatively limited. This paper studies which feature extraction method, Linear Predictive Coding (LPC) or Mel Frequency Cepstral Coefficients (MFCC), is best for speech recognition in the Indonesian language. This is important because a method that produces high accuracy for one language does not necessarily produce the same accuracy for other languages, considering that every language has different characteristics. This research will thus hopefully help accelerate the adoption of automatic speech recognition for the Indonesian language. There are two main processes in speech recognition: feature extraction and recognition. The feature extraction methods compared in this study are LPC and MFCC, while recognition uses a Hidden Markov Model (HMM). The test results showed that the MFCC method is better than LPC for Indonesian-language speech recognition.
This document presents research on developing a machine learning model to recognize gender from voice recordings. The researchers collected a dataset of over 60,000 audio samples from online repositories. They extracted acoustic features from the recordings like mean, median and standard deviation of dominant frequencies. These features were used to train four machine learning algorithms - Support Vector Machine, Decision Tree, Gradient Boosted Trees and Random Forest. The Random Forest model achieved the highest testing accuracy of 89.3% for gender recognition. Feature importance analysis showed that standard deviation, mean and kurtosis were the most important discriminative features for the tree-based models. The research demonstrates that machine learning can effectively perform voice-based gender recognition using acoustic features extracted from speech.
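A minimal sketch of the tree-based stage, training a Random Forest on synthetic stand-ins for the dominant-frequency summary features. The feature distributions below are invented for illustration; the real study extracted its features from over 60,000 recordings:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Hypothetical summary statistics of each recording's dominant-frequency
# track: [mean, median, std] in Hz; labels 0 = male-like, 1 = female-like.
n = 500
male = np.column_stack([rng.normal(120, 10, n), rng.normal(118, 10, n),
                        rng.normal(15, 3, n)])
female = np.column_stack([rng.normal(210, 15, n), rng.normal(208, 15, n),
                          rng.normal(25, 4, n)])
X = np.vstack([male, female])
y = np.array([0] * n + [1] * n)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.score(X, y))
# Per-feature importances, analogous to the paper's importance analysis.
print(clf.feature_importances_.round(2))
```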
Deep learning for music classification, 2016-05-24 (Keunwoo Choi)
This document describes a presentation on deep learning for music classification. It discusses using deep convolutional neural networks (CNNs) for music classification tasks like genre classification, instrument identification, and automatic music tagging. CNNs can learn hierarchical music features from raw audio or time-frequency representations directly from data without requiring designed features. The presentation provides examples of applying CNNs to automatically tag music with descriptive keywords using a multi-label classification approach.
On the use of voice activity detection in speech emotion recognition (journalBEEI)
Emotion recognition through speech has many potential applications, however the challenge comes from achieving a high emotion recognition while using limited resources or interference such as noise. In this paper we have explored the possibility of improving speech emotion recognition by utilizing the voice activity detection (VAD) concept. The emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database LQ Audio Dataset are firstly preprocessed by VAD before feature extraction. The features are then passed to the deep neural network for classification. In this paper, we have chosen MFCC to be the sole determinant feature. From the results obtained using VAD and without, we have found that the VAD improved the recognition rate of 5 emotions (happy, angry, sad, fear, and neutral) by 3.7% when recognizing clean signals, while the effect of using VAD when training a network with both clean and noisy signals improved our previous results by 50%.
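The VAD preprocessing idea can be sketched with a simple short-time-energy detector that discards silent frames before feature extraction. Real VADs are more elaborate, and the frame length and threshold below are illustrative choices:

```python
import numpy as np

def energy_vad(signal, frame_len=160, threshold_ratio=0.1):
    # Frame the signal and keep only frames whose short-time energy
    # exceeds a fraction of the maximum frame energy.
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.sum(frames ** 2, axis=1)
    active = energy > threshold_ratio * energy.max()
    return frames[active].ravel(), active

sr = 8000
t = np.arange(sr) / sr
speech = np.sin(2 * np.pi * 200 * t)   # stand-in for one second of voice
silence = np.zeros(sr)
signal = np.concatenate([silence, speech, silence])

voiced, mask = energy_vad(signal)
print(len(signal), len(voiced))  # only the voiced middle second survives
```

Feature extraction (e.g. MFCC) would then run on `voiced` only, which is the gain the paper attributes to applying VAD before classification.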
This document proposes a video genre classification method using only audio features extracted from video clips. It uses Multivariate Adaptive Regression Splines (MARS) to build classification models for different genres based on low-level audio features like MFCCs, zero crossing rate, short-time energy, etc., extracted from a dataset of news, cartoon, sports, music and drama video clips. The models are able to classify video genres accurately, with an overall classification rate of 91.83%, based on the important audio features identified for each genre by the MARS models.
Wavelet Based Noise Robust Features for Speaker Recognition (CSCJournals)
Extraction and selection of the best parametric representation of the acoustic signal is the most important task in designing any speaker recognition system. A wide range of possibilities exists for parametrically representing the speech signal, such as Linear Predictive Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC) and others. MFCC are currently the most popular choice for any speaker recognition system, though one of the shortcomings of MFCC is that the signal is assumed to be stationary within the given time frame and the method is therefore unable to analyze non-stationary signals. It is thus not suitable for noisy speech signals. To overcome this problem, several researchers have used different types of AM-FM modulation/demodulation techniques for extracting features from the speech signal. Other approaches propose using wavelet filterbanks for extracting the features. In this paper, a technique for extracting features by combining the above-mentioned approaches is proposed: features are extracted from the envelope of the signal and then passed through a wavelet filterbank. It is found that the proposed method outperforms the existing feature extraction techniques.
Adaptive wavelet thresholding with robust hybrid features for text-independe... (IJECEIAES)
The robustness of a speaker identification system over an additive noise channel is crucial for real-world applications. In speaker identification (SID) systems, the features extracted from each speech frame are an essential factor in building a reliable identification system. In clean environments the identification system works well; in noisy environments there is additive noise, which affects the system. To eliminate the problem of additive noise and to achieve high accuracy in a speaker identification system, a proposed algorithm for feature extraction based on speech enhancement and combined features is presented. In this paper, a wavelet thresholding pre-processing stage and feature warping (FW) techniques are used with two combined features, power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC), to improve the identification system's robustness against different types of additive noise. A universal background model Gaussian mixture model (UBM-GMM) is used for feature matching between the claimed and actual speakers. The results showed performance improvements for the proposed feature extraction algorithm compared with conventional features over most types of noise and different SNR ratios.
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI... (cscpconf)
Analysis of speech for recognition of stress is important for identifying the emotional state of a person. This can be done using linear techniques, which use parameters such as pitch, vocal tract spectrum, formant frequencies, duration, MFCC, etc. for extracting features from speech. TEO-CB-Auto-Env is a non-linear method of feature extraction. Analysis is done using the TU-Berlin (Technical University of Berlin) German database. Here, emotion recognition is performed for different emotions such as neutral, happy, disgust, sad, boredom and anger. Emotion recognition is used in lie detectors, database access systems, and in the military for identifying soldiers' emotional states during war.
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas... (ijceronline)
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
This paper describes a morphing concept in which we convert the voice of any person into a pre-analyzed or pre-recorded voice of an animal. As the user generates a pre-established voice, his pitch, timbre, vibrato and articulation can be modified to resemble those of a pre-recorded and pre-analyzed animal voice. This technique is based on SMS. Using this concept we can develop many fun applications, which can be used on mobile devices, personal computers, etc. for entertainment.
Deep Convolutional Neural Networks - Overview (Keunwoo Choi)
The document provides an overview of convolutional neural networks (CNNs) including their structures and applications. CNNs use locally connected, shared weights and convolutional layers to learn hierarchical representations of input data. They have been successfully applied to tasks involving images and music such as visual recognition, segmentation, style transfer, tagging, chord recognition and onset detection.
Noise reduction in speech processing using improved active noise control (anc... (eSAT Journals)
Abstract: An improved feed-forward adaptive Active Noise Control (ANC) scheme is proposed, using a Voice Activity Detector (VAD) and a Wiener filtering method. The 'speech-plus-noise' periods and 'noise-only' periods are separated using the VAD, and the unwanted noise is removed by adaptive filtering. Using the Speech Distortion Weighted Multichannel Wiener Filtering (SDW-MWF) algorithm, the noise present along with the speech signal is processed and filtered out. The background noise accompanying the speech samples is removed by the iterative filtering procedure. Feed-forward ANC is used to achieve a system with better noise reduction in speech processing. The adaptive filtering process is carried out, and the speech signal without the background noise can be obtained. Key Words: Active Noise Control (ANC), Noise reduction, Adaptive filtering, Feed-forward ANC
Speaker Recognition System using MFCC and Vector Quantization Approach (ijsrd.com)
This paper presents an approach to speaker recognition using frequency spectral information with Mel frequency to improve speech feature representation in a Vector Quantization (VQ) codebook-based recognition approach. The Mel frequency approach extracts the features of the speech signal to obtain the training and testing vectors. The VQ codebook approach uses the training vectors to form clusters and recognizes speakers accurately with the help of the LBG algorithm.
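A compact sketch of LBG codebook training on stand-in MFCC vectors. The splitting factor, iteration count, and synthetic training data are illustrative choices, not the paper's settings:

```python
import numpy as np

def lbg_codebook(vectors, size, eps=0.01, n_iter=20):
    # LBG: start from the global mean, repeatedly split each codeword,
    # then refine with nearest-neighbour (k-means style) updates.
    codebook = vectors.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        codebook = np.vstack([codebook * (1 + eps), codebook * (1 - eps)])
        for _ in range(n_iter):
            # Assign every training vector to its nearest codeword.
            d = np.linalg.norm(vectors[:, None] - codebook[None], axis=2)
            nearest = d.argmin(axis=1)
            # Move each codeword to the centroid of its assigned vectors.
            for k in range(len(codebook)):
                members = vectors[nearest == k]
                if len(members):
                    codebook[k] = members.mean(axis=0)
    return codebook

rng = np.random.default_rng(0)
# Stand-in for training MFCC vectors from one speaker (13-dim frames).
train = rng.normal(0, 1, (500, 13))
codebook = lbg_codebook(train, size=8)
print(codebook.shape)  # (8, 13)
```

At recognition time, a test utterance would be scored against each speaker's codebook by its average quantization distortion, and the lowest-distortion codebook wins.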
Presentation slides discussing the theory and empirical results of a text-independent speaker verification system I developed based upon classification of MFCCs. Both minimum-distance classification and likelihood-ratio classification using Gaussian Mixture Models are discussed.
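The likelihood-ratio classification can be sketched with scikit-learn Gaussian mixtures: score test frames under a target-speaker GMM and a background GMM, and accept when the average log-likelihood ratio exceeds a threshold. The target/background data below are synthetic stand-ins for MFCC frames:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-ins for MFCC frames: the target speaker and a background
# (impostor) population with different statistics.
target_train = rng.normal(0.0, 1.0, (500, 13))
background_train = rng.normal(2.0, 1.0, (500, 13))

target_gmm = GaussianMixture(n_components=4, random_state=0).fit(target_train)
ubm = GaussianMixture(n_components=4, random_state=0).fit(background_train)

def llr(frames):
    # Average per-frame log-likelihood ratio; accept if above a
    # tuned threshold (zero is used here for illustration).
    return np.mean(target_gmm.score_samples(frames)
                   - ubm.score_samples(frames))

genuine = rng.normal(0.0, 1.0, (200, 13))
impostor = rng.normal(2.0, 1.0, (200, 13))
print(llr(genuine), llr(impostor))
```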
IJERA (International Journal of Engineering Research and Applications) is an international, online, ... peer-reviewed journal. For more details, or to submit your article, please visit www.ijera.com
IRJET- Musical Instrument Recognition using CNN and SVM (IRJET Journal)
This document discusses a study that uses convolutional neural networks (CNNs) and support vector machines (SVMs) to recognize musical instruments in audio recordings. The researchers aim to convert audio excerpts to images and use CNNs to classify instruments, then combine the CNN classifications with SVM classifications to improve accuracy. They discuss related work on instrument recognition using other methods. The proposed model uses MFCC features with SVM and passes audio converted to images through four convolutional layers and fully connected layers in the CNN. Combining the CNN and SVM results through weighted averaging is expected to provide higher accuracy than either method alone for classifying instruments in the IRMAS dataset.
IRJET- A Review of Music Analysis Techniques (IRJET Journal)
This document reviews various techniques for analyzing music, including extracting and recognizing musical instruments, estimating pitch and tempo, and extracting and recognizing musical notes. It discusses using matrix factorization techniques like non-negative matrix factorization (NMF) and its variants to separate music into instrumental tracks. It also reviews using neural networks with different features to recognize instruments. Other techniques discussed are using the continuous wavelet transform (CWT) to capture timbre characteristics and linear discriminant analysis (LDA) for classification. It discusses various pitch estimation methods like autocorrelation, average magnitude difference function, and linear predictive coding. It also reviews approaches for tempo estimation and extracting and recognizing musical notes.
Robust Sound Field Reproduction against Listener's Movement Utilizing Image ... (NAIST Graduate School of Information Science)
This document proposes a sound field reproduction system that utilizes an image sensor to estimate a listener's position and orientation. It summarizes the challenges with existing spectral division methods that rely on a fixed reference listening position. The proposed system applies an "equiangular filter" to the spectral division method to locally synthesize sound fields around the listener, limiting the spatial bandwidth according to the position and orientation estimated from the image sensor. Simulation experiments and subjective assessments show the proposed system can accurately reproduce sound fields at the listener's position regardless of frequency, with improved directional perception over conventional methods. However, it did not clearly outperform alternatives in terms of sound quality. Overall, the system aims to address limitations of existing approaches by adapting sound field synthesis based on estimated listener parameters.
This document summarizes a research paper on speaker recognition using vocal tract features. It discusses extracting pitch using FFT as a vocal source feature and Mel Frequency Cepstral Coefficients (MFCCs) as vocal tract features. These features are used as inputs to a Support Vector Machine (SVM) classifier for binary speaker classification. Experimental results show the SVM achieves up to 94.42% accuracy in classifying speakers based on their MFCC features, and accuracy increases with higher signal-to-noise ratios when noise is added to speech signals. The paper concludes vocal source and tract features can be used together with SVM for text-independent speaker recognition.
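The MFCC-plus-SVM classification pipeline that this summary describes can be sketched with scikit-learn; the Gaussian blobs below merely stand in for real MFCC vectors of two speakers, and the paper's 94.42% figure is not reproduced:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
# Stand-in "MFCC" vectors (13-dimensional) for two speakers, offset so the
# classes are separable; real MFCCs would come from framed speech
speaker_a = rng.normal(0.0, 1.0, (100, 13))
speaker_b = rng.normal(2.0, 1.0, (100, 13))
X = np.vstack([speaker_a, speaker_b])
y = np.array([0] * 100 + [1] * 100)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)   # binary speaker classifier
accuracy = clf.score(X_te, y_te)
```

The paper's noise experiments would correspond to adding noise to the audio before feature extraction and observing how `accuracy` degrades at lower SNRs.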
IRJET- Music Genre Classification using Machine Learning Algorithms: A Compar... (IRJET Journal)
This document presents research on classifying music genres using machine learning algorithms. The researchers built multiple classification models using the Free Music Archive dataset and compared the models' performance in predicting genre accuracy. Some models were trained on mel-spectrograms of songs and their audio features, while others used only spectrograms. The researchers found that a convolutional neural network model trained solely on spectrograms achieved the highest accuracy among the tested models. The goal of the research was to develop a machine learning approach for automatic music genre classification that performs better than existing methods.
GENDER RECOGNITION SYSTEM USING SPEECH SIGNAL (IJCSEIT Journal)
In this paper, a system developed for speech encoding, analysis, synthesis, and gender identification is presented. A typical gender recognition system can be divided into a front-end system and a back-end system. The task of the front-end system is to extract gender-related information from a speech signal and represent it by a set of vectors called features. Features like power spectral density and frequency at maximum power carry speaker information. The features are extracted using the Fast Fourier Transform (FFT) algorithm. The task of the back-end system (also called the classifier) is to create a gender model that recognizes the speaker's gender from his or her speech signal in the recognition phase. This paper also presents the digital processing of speech signals (the pronounced letters "A" and "B") taken from 10 persons, 5 of them male and the rest female. Power spectrum estimation of the signal is examined, and the frequency at maximum power of the English phonemes is extracted from the estimated power spectrum. The system uses a threshold technique as its identification tool. The recognition accuracy of this system is 80% on average.
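The front-end/back-end split described in this abstract (FFT-based dominant-frequency extraction followed by a threshold classifier) can be sketched in NumPy; the tone frequencies and the 165 Hz threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Frequency (Hz) of the bin with maximum power in the FFT spectrum."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
# Toy stand-ins for vowels: a 120 Hz "male" tone and a 220 Hz "female" tone
male_like = np.sin(2 * np.pi * 120 * t)
female_like = np.sin(2 * np.pi * 220 * t)

THRESHOLD_HZ = 165.0  # assumed decision boundary, not from the paper

def classify(signal, fs):
    """Back-end: threshold on the front-end's dominant-frequency feature."""
    return "male" if dominant_frequency(signal, fs) < THRESHOLD_HZ else "female"
```

Real speech has harmonics and noise, which is why the paper reports only 80% average accuracy for this simple scheme.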
Comparison of Feature Extraction MFCC and LPC in Automatic Speech Recognition... (TELKOMNIKA JOURNAL)
Speech recognition can be defined as the process of converting voice signals into a sequence of words by applying a specific algorithm implemented in a computer program. Research on speech recognition in Indonesia is relatively limited. This paper studies which feature extraction method, Linear Predictive Coding (LPC) or Mel Frequency Cepstral Coefficients (MFCC), is best for speech recognition in the Indonesian language. This matters because a method that produces high accuracy for one particular language does not necessarily produce the same accuracy for other languages, considering that every language has different characteristics. This research is therefore hoped to help accelerate the adoption of automatic speech recognition for the Indonesian language. There are two main processes in speech recognition: feature extraction and recognition. The feature extraction methods compared in this study are LPC and MFCC, while recognition uses a Hidden Markov Model (HMM). The test results showed that the MFCC method is better than LPC for Indonesian-language speech recognition.
This document presents research on developing a machine learning model to recognize gender from voice recordings. The researchers collected a dataset of over 60,000 audio samples from online repositories. They extracted acoustic features from the recordings like mean, median and standard deviation of dominant frequencies. These features were used to train four machine learning algorithms - Support Vector Machine, Decision Tree, Gradient Boosted Trees and Random Forest. The Random Forest model achieved the highest testing accuracy of 89.3% for gender recognition. Feature importance analysis showed that standard deviation, mean and kurtosis were the most important discriminative features for the tree-based models. The research demonstrates that machine learning can effectively perform voice-based gender recognition using acoustic features extracted from speech.
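A minimal version of the pipeline described above (summary statistics of dominant-frequency tracks fed to a Random Forest) might look as follows; the synthetic tracks and class means are invented for illustration, and only three of the paper's features are shown:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

def summary_features(freq_tracks):
    """Mean, median, and standard deviation of dominant-frequency tracks,
    one row of statistics per recording."""
    return np.column_stack([
        freq_tracks.mean(axis=1),
        np.median(freq_tracks, axis=1),
        freq_tracks.std(axis=1),
    ])

# Toy dominant-frequency tracks (Hz): 200 recordings per class, 50 frames each
male_tracks = rng.normal(120, 15, (200, 50))
female_tracks = rng.normal(210, 25, (200, 50))
X = np.vstack([summary_features(male_tracks), summary_features(female_tracks)])
y = np.array([0] * 200 + [1] * 200)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_  # which statistic discriminates most
```

The `feature_importances_` vector is the same mechanism the researchers used to conclude that spread and central-tendency statistics were the most discriminative features.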
Deep learning for music classification, 2016-05-24 (Keunwoo Choi)
This document describes a presentation on deep learning for music classification. It discusses using deep convolutional neural networks (CNNs) for music classification tasks like genre classification, instrument identification, and automatic music tagging. CNNs can learn hierarchical music features from raw audio or time-frequency representations directly from data without requiring designed features. The presentation provides examples of applying CNNs to automatically tag music with descriptive keywords using a multi-label classification approach.
On the use of voice activity detection in speech emotion recognition (journalBEEI)
Emotion recognition through speech has many potential applications, however the challenge comes from achieving a high emotion recognition while using limited resources or interference such as noise. In this paper we have explored the possibility of improving speech emotion recognition by utilizing the voice activity detection (VAD) concept. The emotional voice data from the Berlin Emotion Database (EMO-DB) and a custom-made database LQ Audio Dataset are firstly preprocessed by VAD before feature extraction. The features are then passed to the deep neural network for classification. In this paper, we have chosen MFCC to be the sole determinant feature. From the results obtained using VAD and without, we have found that the VAD improved the recognition rate of 5 emotions (happy, angry, sad, fear, and neutral) by 3.7% when recognizing clean signals, while the effect of using VAD when training a network with both clean and noisy signals improved our previous results by 50%.
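A crude energy-based detector can stand in for the VAD preprocessing stage described above; real VADs are more sophisticated, and the frame length and threshold here are arbitrary assumptions:

```python
import numpy as np

def simple_vad(signal, frame_len=160, threshold=0.01):
    """Mark a frame as voiced when its mean energy exceeds the threshold.
    A toy energy detector standing in for the paper's VAD stage."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    return energy > threshold

rng = np.random.default_rng(3)
fs = 8000
silence = rng.normal(0, 0.005, 1600)  # low-energy "noise-only" span
speech = 0.5 * np.sin(2 * np.pi * 200 * np.arange(1600) / fs)  # "voiced" span
signal = np.concatenate([silence, speech])

voiced = simple_vad(signal)  # boolean mask, one value per frame
```

Frames flagged `False` would be discarded before MFCC extraction, which is the mechanism the paper credits for its improved recognition rates.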
This document proposes a video genre classification method using only audio features extracted from video clips. It uses Multivariate Adaptive Regression Splines (MARS) to build classification models for different genres based on low-level audio features like MFCCs, zero-crossing rate, and short-time energy, extracted from a dataset of news, cartoon, sports, music, and drama video clips. The models accurately classify video genres, with an overall classification rate of 91.83%, based on the important audio features identified for each genre by the MARS models.
Wavelet Based Noise Robust Features for Speaker Recognition (CSCJournals)
Extraction and selection of the best parametric representation of the acoustic signal is the most important task in designing any speaker recognition system. A wide range of possibilities exists for parametrically representing the speech signal, such as Linear Prediction Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and others. MFCC are currently the most popular choice for any speaker recognition system, though one shortcoming of MFCC is that the signal is assumed to be stationary within the given time frame, so the method is unable to analyze non-stationary signals. It is therefore not suitable for noisy speech signals. To overcome this problem, several researchers have used different types of AM-FM modulation/demodulation techniques for extracting features from the speech signal. Some approaches propose using wavelet filterbanks for extracting the features. In this paper, a technique for extracting features by combining the above-mentioned approaches is proposed. Features are extracted from the envelope of the signal and then passed through a wavelet filterbank. It is found that the proposed method outperforms the existing feature extraction techniques.
Adaptive wavelet thresholding with robust hybrid features for text-independe... (IJECEIAES)
The robustness of a speaker identification (SID) system over an additive-noise channel is crucial for real-world applications. In SID systems, the features extracted from each speech frame are an essential factor in building a reliable identification system. In clean environments the identification system works well; in noisy environments, additive noise degrades it. To eliminate the problem of additive noise and achieve high accuracy in the speaker identification system, a feature extraction algorithm based on speech enhancement and combined features is presented. In this paper, a wavelet-thresholding pre-processing stage and feature warping (FW) are used with two combined features, power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC), to improve the identification system's robustness against different types of additive noise. A universal background model Gaussian mixture model (UBM-GMM) is used for feature matching between the claimed and actual speakers. The results showed performance improvement for the proposed feature extraction algorithm compared with conventional features over most noise types and different SNR ratios.
ANALYSIS OF SPEECH UNDER STRESS USING LINEAR TECHNIQUES AND NON-LINEAR TECHNI... (cscpconf)
Analysis of speech for recognition of stress is important for identifying the emotional state of a person. This can be done using linear techniques, which use parameters like pitch, vocal tract spectrum, formant frequencies, duration, and MFCCs for extracting features from speech. TEO-CB-Auto-Env is a non-linear method of feature extraction. Analysis is done using the TU-Berlin (Technical University of Berlin) German database. Here, emotion recognition is performed for emotions such as neutral, happy, disgust, sad, boredom, and anger. Emotion recognition is used in lie detectors, database access systems, and in the military for identifying soldiers' emotions during war.
Emotion Recognition based on audio signal using GFCC Extraction and BPNN Clas... (ijceronline)
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
This paper describes a morphing concept in which the voice of any person is converted into a pre-analyzed or pre-recorded animal voice. As the user speaks, his or her pitch, timbre, vibrato, and articulation are modified to resemble those of a pre-recorded and pre-analyzed animal voice. The technique is based on SMS. Using this concept, many entertaining applications can be developed for enjoyment on mobile devices, personal computers, and other platforms.
Deep Convolutional Neural Networks - Overview (Keunwoo Choi)
The document provides an overview of convolutional neural networks (CNNs) including their structures and applications. CNNs use locally connected, shared weights and convolutional layers to learn hierarchical representations of input data. They have been successfully applied to tasks involving images and music such as visual recognition, segmentation, style transfer, tagging, chord recognition and onset detection.
Noise reduction in speech processing using improved active noise control (anc... (eSAT Journals)
Abstract: An improved feed-forward adaptive Active Noise Control (ANC) scheme is proposed, using a Voice Activity Detector (VAD) and Wiener filtering. The 'speech-plus-noise' periods and 'noise-only' periods are separated using the VAD, and the unwanted noise is removed by adaptive filtering. Using the Speech Distortion Weighted Multichannel Wiener Filtering (SDW-MWF) algorithm, the noise present along with the speech signal is processed and filtered out. The background noise in the speech samples is removed by the iterative filtering procedure. Feed-forward ANC is used to achieve a system with better noise reduction in speech processing. The adaptive filtering process yields the speech signal without the background noise. Key Words: Active Noise Control (ANC), Noise reduction, Adaptive filtering, Feed-forward ANC
Noise reduction in speech processing using improved active noise control (anc... (eSAT Publishing House)
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Speaker Recognition System using MFCC and Vector Quantization Approach (ijsrd.com)
This paper presents an approach to speaker recognition using frequency spectral information with Mel frequency for the improvement of speech feature representation in a Vector Quantization codebook based recognition approach. The Mel frequency approach extracts the features of the speech signal to get the training and testing vectors. The VQ Codebook approach uses training vectors to form clusters and recognize accurately with the help of LBG algorithm.
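The codebook idea can be sketched with k-means, which minimizes the same average-distortion objective as the LBG algorithm used in the paper (LBG additionally grows the codebook by iterative splitting); the training vectors are synthetic stand-ins for MFCCs:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
# Stand-in MFCC training vectors for one speaker (two feature clusters)
train_vectors = np.vstack([
    rng.normal(0, 0.3, (100, 13)),
    rng.normal(3, 0.3, (100, 13)),
])

# k-means plays the role of LBG here: both minimize the average distortion
# between training vectors and their nearest codeword
codebook = KMeans(n_clusters=2, n_init=10, random_state=0).fit(train_vectors)

def quantization_distortion(vectors, km):
    """Average distance from each vector to its nearest codeword; speaker
    recognition picks the speaker whose codebook gives minimum distortion."""
    nearest = km.cluster_centers_[km.predict(vectors)]
    return np.linalg.norm(vectors - nearest, axis=1).mean()

distortion = quantization_distortion(train_vectors, codebook)
```

At test time, one codebook is trained per enrolled speaker, and an utterance is assigned to the speaker whose codebook quantizes it with the lowest distortion.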
Knn a machine learning approach to recognize a musical instrument (IJARIIT)
An outline is provided of a proposed system to recognize musical instruments using machine learning techniques. The system first extracts features from audio files using the MIR toolbox in Matlab. It then uses a hybrid feature selection method and vector quantization to identify instruments. Specifically, the key audio descriptors are selected and feature vectors are generated and matched to standard vectors to classify the instrument. The k-nearest neighbors algorithm is used for classification. Preliminary results show the system can accurately recognize instruments based on extracted acoustic features.
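The match-against-labelled-vectors step with k-nearest neighbors can be sketched as follows; the three-dimensional features and cluster placement are invented stand-ins for the MIR-toolbox descriptors the outline mentions:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(7)
# Toy feature vectors standing in for selected audio descriptors,
# 50 labelled examples per instrument class
flute = rng.normal([1, 0, 0], 0.2, (50, 3))
guitar = rng.normal([0, 1, 0], 0.2, (50, 3))
piano = rng.normal([0, 0, 1], 0.2, (50, 3))
X = np.vstack([flute, guitar, piano])
y = np.array([0] * 50 + [1] * 50 + [2] * 50)   # 0=flute, 1=guitar, 2=piano

# k-NN matches an unknown clip's feature vector against the labelled vectors
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
unknown = np.array([[0.05, 0.95, -0.02]])      # near the guitar cluster
prediction = int(knn.predict(unknown)[0])
```

The choice of k trades off noise sensitivity (small k) against blurring class boundaries (large k), which is typically tuned on held-out clips.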
This document provides an overview of recent developments in sound recognition techniques. It discusses several methods for sound recognition, including matching pursuit algorithms with MFCC features, probabilistic distance support vector machines using generalized gamma modeling of STE features, and frequency vector principal component analysis. The document also reviews related literature on environmental sound recognition using time-frequency audio features and sound event recognition. It aims to present an updated survey on sound recognition methods and discuss future research trends in the field.
Literature Survey for Music Genre Classification Using Neural Network (IRJET Journal)
The document discusses literature on classifying music genres using neural networks. It summarizes several past studies that used techniques like convolutional neural networks (CNNs) and mel-frequency cepstral coefficients (MFCCs) on datasets like GTZAN to classify music into genres like blues, classical, country, etc. The document also outlines the system design for a proposed music genre classification system, including collecting the GTZAN dataset, preprocessing the audio files into mel-spectrograms, extracting features using MFCCs, and training a CNN model to classify segments of songs into genres. Classification accuracy of different models from prior studies ranged from 40-80%.
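The mel-scale filterbank at the heart of the mel-spectrogram/MFCC preprocessing mentioned in these studies follows a standard textbook recipe (triangular filters evenly spaced on the mel scale); the filter count, FFT size, and sample rate below are common defaults, not values from any one paper:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=26, n_fft=512, fs=16000):
    """Triangular filters spaced evenly on the mel scale. Applying this
    matrix to an FFT power spectrum gives mel-band energies; taking the
    log and a DCT of those energies yields MFCCs."""
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(fs / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising edge of triangle
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling edge of triangle
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

fbank = mel_filterbank()
```

Libraries such as librosa package this same construction, but writing it out shows why mel features compress high frequencies: the triangles widen as frequency increases.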
International Journal of Biometrics and Bioinformatics (IJBB) Volume (1) Issue... (CSCJournals)
This document is the front matter for the International Journal of Biometrics and Bioinformatics (IJBB) for volume 1, issue 1 published in 2007. It provides publishing details such as the editor in chief, publication date, copyright information, and a table of contents. The main article in this issue is titled "A linear-discriminant-analysis-based approach to enhance the performance of fuzzy c-means clustering in spike sorting with low-SNR data" by Chien-Wen Cho, Wen-Hung Chao, and You-Yin Chen.
IRJET- A Personalized Music Recommendation System (IRJET Journal)
This document describes a personalized music recommendation system that uses collaborative filtering and convolutional neural networks. The system provides three types of recommendations: popularity-based recommendations based on the most popular songs among all users, item-based recommendations of similar songs based on a user's listening history using collaborative filtering, and genre-based recommendations based on the genres of songs a user has listened to previously as determined by a convolutional neural network classifier. The system was tested on a dataset of music listening logs and audio files and evaluated based on its ability to provide personalized music recommendations to users.
Optimized audio classification and segmentation algorithm by using ensemble m... (Venkat Projects)
The document proposes an optimized audio classification and segmentation algorithm that segments audio streams into four types - pure speech, music, environment sound, and silence - using ensemble methods. It uses a hybrid classification approach of bagged support vector machines and artificial neural networks. The algorithm aims to accurately segment audio with minimum misclassification and requires less training data, making it suitable for real-time applications. It segments non-speech portions into music or environment sound and further divides speech into silence and pure speech. The algorithm achieves approximately 98% accurate segmentation.
IRJET- Music Genre Recognition using Convolution Neural Network (IRJET Journal)
1. The document describes a study that uses a Convolutional Neural Network (CNN) model to classify music genres based on labeled Mel spectrograms of audio clips.
2. A CNN model is trained on a dataset of 1000 audio clips across 10 genres. The trained model is then used to classify new, unlabeled audio clips by genre based on their Mel spectrogram representation.
3. CNNs are well-suited for this task as their convolutional layers can extract hierarchical features from the Mel spectrogram images that are indicative of different genres. The study aims to develop an automated music genre classification system using deep learning techniques.
IRJET- Music Genre Classification using GMM (IRJET Journal)
This document presents a technique for classifying music genres using Gaussian mixture models (GMM). Tempogram features are extracted from music audio to characterize the temporal structure. A GMM is used to model the distribution of the acoustic features for each genre class. The model is trained using example music clips labelled with their genres. New clips can then be classified by determining which trained GMM distribution they best match. The method achieved an accuracy of 94% on classifying clips as classic, pop or rock music using a 10-component GMM. GMM allows the features from each genre to be represented as a mixture of Gaussian distributions.
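The per-genre GMM scoring scheme described above can be sketched with scikit-learn; the two-dimensional features and component count are illustrative, and the paper's tempogram features and 94% result are not reproduced:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Toy 2-D features standing in for tempogram statistics of two genres
classic = rng.normal([0, 0], 0.5, (300, 2))
rock = rng.normal([4, 4], 0.5, (300, 2))

# One GMM per genre, as in the paper's per-class modelling
gmm_classic = GaussianMixture(n_components=2, random_state=0).fit(classic)
gmm_rock = GaussianMixture(n_components=2, random_state=0).fit(rock)

def classify(clip_features):
    """Pick the genre whose GMM gives the higher average log-likelihood
    over the clip's feature frames."""
    scores = {
        "classic": gmm_classic.score(clip_features),
        "rock": gmm_rock.score(clip_features),
    }
    return max(scores, key=scores.get)

test_clip = rng.normal([4, 4], 0.5, (20, 2))   # unlabelled clip near "rock"
```

Scoring a whole clip rather than single frames averages out frame-level ambiguity, which is why GMM genre classifiers evaluate the summed log-likelihood.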
A novel convolutional neural network based dysphonic voice detection algorit... (IJECEIAES)
This paper presents a convolutional neural network (CNN) based noninvasive pathological voice detection algorithm using signal processing approach. The proposed algorithm extracts an acoustic feature, called chromagram, from voice samples and applies this feature to the input of a CNN for classification. The main advantage of chromagram is that it can mimic the way humans perceive pitch in sounds and hence can be considered useful to detect dysphonic voices, as the pitch in the generated sounds varies depending on the pathological conditions. The simulation results show that classification accuracy of 85% can be achieved with the chromagram. A comparison of the performances for the proposed algorithm with those of other related works is also presented.
This document discusses a proposed system for classifying audio scenes in action movies. It aims to provide scene recognition and detection by separating audio classes and obtaining better sound classification accuracy. The system extracts audio features like zero-crossing rate, short-time energy, volume root mean square, and volume dynamic range. It then uses hidden Markov models and support vector machines to classify audio scenes, labeling them as happy, miserable, or action scenes. Sound event types classified include gunshots, screams, car crashes, talking, laughter, fighting, shouting, and background crowd noise. The goal is to index and retrieve interesting events from action movies to engage viewers.
A Noise Reduction Method Based on Modified Least Mean Square Algorithm of Rea... (IRJET Journal)
This document presents a modified least mean square (LMS) algorithm to reduce noise in real-time speech signals. The proposed approach modifies the standard LMS algorithm by incorporating a Wiener filter. Experiments are conducted on speech samples from the NOIZEUS database with various types of noise at different signal-to-noise ratios. Objective measures like segmental SNR, log likelihood ratio, Itakura-Saito spectral distance, and cepstrum are used to evaluate the performance of the proposed algorithm compared to the standard LMS algorithm. The results show that the modified LMS algorithm with Wiener filter outperforms the standard LMS algorithm in enhancing the quality of noisy speech signals based on the objective measure values.
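The baseline LMS cancellation that the paper modifies can be sketched in NumPy; the signals, tap count, and step size are illustrative, and the Wiener-filter modification itself is not implemented here:

```python
import numpy as np

def lms_filter(noisy, reference, n_taps=8, mu=0.01):
    """Standard LMS noise canceller: adapt FIR weights so the filtered
    reference tracks the noise in the primary input; the error signal
    is the enhanced output."""
    w = np.zeros(n_taps)
    enhanced = np.zeros_like(noisy)
    for i in range(n_taps - 1, len(noisy)):
        x = reference[i - n_taps + 1 : i + 1][::-1]  # newest sample first
        y = w @ x                      # current noise estimate
        e = noisy[i] - y               # error = cleaned sample
        w += 2 * mu * e * x            # LMS weight update
        enhanced[i] = e
    return enhanced

rng = np.random.default_rng(6)
fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
clean = np.sin(2 * np.pi * 300 * t)      # stand-in "speech"
noise = rng.normal(0.0, 0.5, len(t))     # additive noise
noisy = clean + noise

# Reference input: here the noise itself, i.e. an ideal reference sensor
enhanced = lms_filter(noisy, noise)
residual_power = np.mean((enhanced[-2000:] - clean[-2000:]) ** 2)
noise_power = np.mean(noise ** 2)
```

Objective measures like segmental SNR in the paper quantify exactly this kind of residual: how much of the original noise power remains after adaptation.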
Speech emotion recognition with light gradient boosting decision trees machine (IJECEIAES)
Speech emotion recognition aims to identify the emotion expressed in the speech by analyzing the audio signals. In this work, data augmentation is first performed on the audio samples to increase the number of samples for better model learning. The audio samples are comprehensively encoded as the frequency and temporal domain features. In the classification, a light gradient boosting machine is leveraged. The hyperparameter tuning of the light gradient boosting machine is performed to determine the optimal hyperparameter settings. As the speech emotion recognition datasets are imbalanced, the class weights are regulated to be inversely proportional to the sample distribution where minority classes are assigned higher class weights. The experimental results demonstrate that the proposed method outshines the state-of-the-art methods with 84.91% accuracy on the Berlin database of emotional speech (emo-DB) dataset, 67.72% on the Ryerson audio-visual database of emotional speech and song (RAVDESS) dataset, and 62.94% on the interactive emotional dyadic motion capture (IEMOCAP) dataset.
MediaEval 2015 - UNIZA System for the "Emotion in Music" task at MediaEval 20... (multimediaeval)
1) The document describes a system for determining emotion in music based on valence and arousal scores using support vector regression and feature optimization techniques.
2) The system tests different feature sets, including the baseline features and two optimized feature sets selected using genetic algorithm and particle swarm optimization, finding the optimized sets improve prediction accuracy.
3) Results show the optimized feature sets achieve better correlation and RMSE than the baseline features or MIR toolbox features for predicting valence and arousal dimensions.
Recognition of music genres using deep learning (IRJET Journal)
This document discusses using deep learning techniques to recognize music genres from audio files. It evaluates three approaches: extracting Mel-spectrograms, MFCC plots, and chroma STFT features from audio and using those as input to CNN models. A CNN architecture with 5 conv layers performed best on Mel-spectrograms, achieving over 90% accuracy. MFCC plots achieved over 70% accuracy. Chroma STFT features performed worst at around 57% accuracy. In conclusion, Mel-spectrograms were found to be the most effective audio feature for music genre classification using deep learning.
Audio Classification using Artificial Neural Network with Denoising Algorithm... (IRJET Journal)
This document describes the development of an intelligent music player application using audio classification and a noise filtering algorithm. The application uses an artificial neural network trained on audio features to classify music files into categories like beats and melodies. It also includes a music jockey plugin to mix songs and a noise filtering algorithm to improve audio quality. The goal is to enhance the user experience of music players by intelligently organizing music libraries and improving sound quality. Key aspects of the application include a graphical user interface, neural network training process, audio analysis modules, and noise filtering algorithm implementation.
IRJET- A Review on Audible Sound Analysis based on State Clustering throu... (IRJET Journal)
This document reviews progress in acoustic modeling for statistical parametric speech synthesis, from early hidden Markov models to recent neural network approaches. It discusses how hidden Markov models were previously dominant but artificial neural networks are now replacing them due to improvements in naturalness. The document also examines developing accurate audio classifiers using machine learning techniques on public datasets and improving classification accuracy beyond current levels of 50-79% by employing strategies like cross-validation.
This document proposes an automatic emotion recognition system that analyzes audio information to classify human emotions. It uses spectral features and MFCC coefficients for feature extraction from voice signals. Then, a deep learning-based LSTM algorithm is used for classification. The system is evaluated on three audio datasets. Recurrent convolutional neural networks are proposed to capture temporal and frequency dependencies in speech spectrograms. The system aims to improve on existing methods which have lower accuracy and require more computational resources for implementation.
Similar to IRJET- Implementing Musical Instrument Recognition using CNN and SVM (20)
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
1) The document discusses the Sungal Tunnel project in Jammu and Kashmir, India, which is being constructed using the New Austrian Tunneling Method (NATM).
2) NATM involves continuous monitoring during construction to adapt to changing ground conditions, and makes extensive use of shotcrete for temporary tunnel support.
3) The methodology section outlines the systematic geotechnical design process for tunnels according to Austrian guidelines, and describes the various steps of NATM tunnel construction including initial and secondary tunnel support.
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
This study examines the effect of response reduction factors (R factors) on reinforced concrete (RC) framed structures through nonlinear dynamic analysis. Three RC frame models with varying heights (4, 8, and 12 stories) were analyzed in ETABS software under different R factors ranging from 1 to 5. The results showed that displacement increased as the R factor decreased, indicating less linear behavior for lower R factors. Drift also decreased proportionally with increasing R factors from 1 to 5. Shear forces in the frames decreased with higher R factors. In general, R factors of 3 to 5 produced more satisfactory performance with less displacement and drift. The displacement variations between different building heights were consistent at different R factors. This study evaluated how R factors influence
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
This study compares the use of Stark Steel and TMT Steel as reinforcement materials in a two-way reinforced concrete slab. Mechanical testing is conducted to determine the tensile strength, yield strength, and other properties of each material. A two-way slab design adhering to codes and standards is executed with both materials. The performance is analyzed in terms of deflection, stability under loads, and displacement. Cost analyses accounting for material, durability, maintenance, and life cycle costs are also conducted. The findings provide insights into the economic and structural implications of each material for reinforcement selection and recommendations on the most suitable material based on the analysis.
Effect of Camber and Angles of Attack on Airfoil Characteristics (IRJET Journal)
This document discusses a study analyzing the effect of camber, position of camber, and angle of attack on the aerodynamic characteristics of airfoils. Sixteen modified asymmetric NACA airfoils were analyzed using computational fluid dynamics (CFD) by varying the camber, camber position, and angle of attack. The results showed the relationship between these parameters and the lift coefficient, drag coefficient, and lift to drag ratio. This provides insight into how changes in airfoil geometry impact aerodynamic performance.
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos... (IRJET Journal)
This document reviews the progress and challenges of aluminum-based metal matrix composites (MMCs), focusing on their fabrication processes and applications. It discusses how various aluminum MMCs have been developed using reinforcements like borides, carbides, oxides, and nitrides to improve mechanical and wear properties. These composites have gained prominence for their lightweight, high-strength and corrosion resistance properties. The document also examines recent advancements in fabrication techniques for aluminum MMCs and their growing applications in industries such as aerospace and automotive. However, it notes that challenges remain around issues like improper mixing of reinforcements and reducing reinforcement agglomeration.
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-... (IRJET Journal)
This document discusses research on using graph neural networks (GNNs) for dynamic optimization of public transportation networks in real-time. GNNs represent transit networks as graphs with nodes as stops and edges as connections. The GNN model aims to optimize networks using real-time data on vehicle locations, arrival times, and passenger loads. This helps increase mobility, decrease traffic, and improve efficiency. The system continuously trains and infers to adapt to changing transit conditions, providing decision support tools. While research has focused on performance, more work is needed on security, socio-economic impacts, contextual generalization of models, continuous learning approaches, and effective real-time visualization.
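The graph representation described above — stops as nodes, connections as edges, with real-time quantities such as passenger load attached to each stop — can be sketched in a minimal way. The following Python fragment is purely illustrative (stop names, feature values, and the simple mean-aggregation update are assumptions, not the paper's model); it shows one GNN-style message-passing step in which each stop blends its own feature with the average of its neighbours'.

```python
# Hypothetical sketch: a transit network as a graph plus one
# GNN-style message-passing step (mean aggregation over neighbours).
# Stop names and feature values are illustrative only.

# Nodes are stops; each carries a feature, e.g. current passenger load.
features = {"A": 10.0, "B": 30.0, "C": 20.0}

# Edges are direct connections between stops (undirected here).
edges = [("A", "B"), ("B", "C")]

def neighbours(node):
    """All stops directly connected to `node`."""
    return [b if a == node else a for a, b in edges if node in (a, b)]

def message_pass(feats):
    """One aggregation step: each stop averages its own feature
    with the mean feature of its neighbours."""
    updated = {}
    for node, value in feats.items():
        nbrs = neighbours(node)
        nbr_mean = sum(feats[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        updated[node] = 0.5 * value + 0.5 * nbr_mean
    return updated

new_feats = message_pass(features)  # e.g. stop "A" moves toward "B"'s load
```

Stacking several such steps (with learned weights in place of the fixed 0.5 blend) lets information about loads and delays propagate across the network, which is the intuition behind using a GNN here.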
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape... (IRJET Journal)
This document summarizes a research project that aims to compare the structural performance of conventional slab and grid slab systems in multi-story buildings using ETABS software. The study will analyze both symmetric and asymmetric building models under various loading conditions. Parameters like deflections, moments, shears, and stresses will be examined to evaluate the structural effectiveness of each slab type. The results will provide insights into the comparative behavior of conventional and grid slabs to help engineers and architects select appropriate slab systems based on building layouts and design requirements.
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg... (IRJET Journal)
This document summarizes and reviews a research paper on the seismic response of reinforced concrete (RC) structures with plan and vertical irregularities, with and without infill walls. It discusses how infill walls can improve or reduce the seismic performance of RC buildings, depending on factors like wall layout, height distribution, connection to the frame, and relative stiffness of walls and frames. The reviewed research paper analyzes the behavior of infill walls, effects of vertical irregularities, and seismic performance of high-rise structures under linear static and dynamic analysis. It studies response characteristics like story drift, deflection and shear. The document also provides literature on similar research investigating the effects of infill walls, soft stories, plan irregularities, and different
This document provides a review of machine learning techniques used in Advanced Driver Assistance Systems (ADAS). It begins with an abstract that summarizes key applications of machine learning in ADAS, including object detection, recognition, and decision-making. The introduction discusses the integration of machine learning in ADAS and how it is transforming vehicle safety. The literature review then examines several research papers on topics like lightweight deep learning models for object detection and lane detection models using image processing. It concludes by discussing challenges and opportunities in the field, such as improving algorithm robustness and adaptability.
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,... (IRJET Journal)
The document analyzes temperature and precipitation trends in Asosa District, Benishangul Gumuz Region, Ethiopia from 1993 to 2022 based on data from the local meteorological station. The results show:
1) The average maximum and minimum annual temperatures have generally decreased over time, with trend slopes of -0.0341 for maximum and -0.0152 for minimum temperatures.
2) Mann-Kendall tests found the decreasing temperature trends to be statistically significant for annual maximum temperatures but not for annual minimum temperatures.
3) Annual precipitation in Asosa District showed a statistically significant increasing trend.
The conclusions recommend development planners account for rising summer precipitation and declining temperatures in
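The Mann-Kendall test cited above is nonparametric: its S statistic sums the signs of all pairwise differences in the series, with S > 0 suggesting an increasing trend and S < 0 a decreasing one. A minimal sketch (data values are made up for illustration, not the Asosa station record):

```python
# Illustrative sketch of the Mann-Kendall S statistic.
# S = sum over all pairs (i < j) of sign(x_j - x_i).

def mann_kendall_s(series):
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    return s

# A monotonically increasing toy series gives the maximum S = n(n-1)/2.
s_up = mann_kendall_s([1.0, 2.0, 3.0, 4.0])    # 6
s_down = mann_kendall_s([4.0, 3.0, 2.0, 1.0])  # -6
```

Significance testing (as done in the study) then compares S against its variance under the no-trend null hypothesis; that step is omitted here.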
P.E.B. Framed Structure Design and Analysis Using STAAD Pro (IRJET Journal)
This document discusses the design and analysis of pre-engineered building (PEB) framed structures using STAAD Pro software. It provides an overview of PEBs, including that they are designed off-site with building trusses and beams produced in a factory. STAAD Pro is identified as a key tool for modeling, analyzing, and designing PEBs to ensure their performance and safety under various load scenarios. The document outlines modeling structural parts in STAAD Pro, evaluating structural reactions, assigning loads, and following international design codes and standards. In summary, STAAD Pro is used to design and analyze PEB framed structures to ensure safety and code compliance.
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre... (IRJET Journal)
This document provides a review of research on innovative fiber integration methods for reinforcing concrete structures. It discusses studies that have explored using carbon fiber reinforced polymer (CFRP) composites with recycled plastic aggregates to develop more sustainable strengthening techniques. It also examines using ultra-high performance fiber reinforced concrete to improve shear strength in beams. Additional topics covered include the dynamic responses of FRP-strengthened beams under static and impact loads, and the performance of preloaded CFRP-strengthened fiber reinforced concrete beams. The review highlights the potential of fiber composites to enable more sustainable and resilient construction practices.
Survey Paper on Cloud-Based Secured Healthcare System (IRJET Journal)
This document summarizes a survey on securing patient healthcare data in cloud-based systems. It discusses using technologies like facial recognition, smart cards, and cloud computing combined with strong encryption to securely store patient data. The survey found that healthcare professionals believe digitizing patient records and storing them in a centralized cloud system would improve access during emergencies and enable more efficient care compared to paper-based systems. However, ensuring privacy and security of patient data is paramount as healthcare incorporates these digital technologies.
Review on studies and research on widening of existing concrete bridges (IRJET Journal)
This document summarizes several studies that have been conducted on widening existing concrete bridges. It describes a study from China that examined load distribution factors for a bridge widened with composite steel-concrete girders. It also outlines challenges and solutions for widening a bridge in the UAE, including replacing bearings and stitching the new and existing structures. Additionally, it discusses two bridge widening projects in New Zealand that involved adding precast beams and stitching to connect structures. Finally, safety measures and challenges for strengthening a historic bridge in Switzerland under live traffic are presented.
React-based full-stack edtech web application (IRJET Journal)
The document describes the architecture of an educational technology web application built using the MERN stack. It discusses the frontend developed with ReactJS, backend with NodeJS and ExpressJS, and MongoDB database. The frontend provides dynamic user interfaces, while the backend offers APIs for authentication, course management, and other functions. MongoDB enables flexible data storage. The architecture aims to provide a scalable, responsive platform for online learning.
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ... (IRJET Journal)
This paper proposes integrating Internet of Things (IoT) and blockchain technologies to help implement objectives of India's National Education Policy (NEP) in the education sector. The paper discusses how blockchain could be used for secure student data management, credential verification, and decentralized learning platforms. IoT devices could create smart classrooms, automate attendance tracking, and enable real-time monitoring. Blockchain would ensure integrity of exam processes and resource allocation, while smart contracts automate agreements. The paper argues this integration has potential to revolutionize education by making it more secure, transparent and efficient, in alignment with NEP goals. However, challenges like infrastructure needs, data privacy, and collaborative efforts are also discussed.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE (IRJET Journal)
This document provides a review of research on the performance of coconut fibre reinforced concrete. It summarizes several studies that tested different volume fractions and lengths of coconut fibres in concrete mixtures with varying compressive strengths. The studies found that coconut fibre improved properties like tensile strength, toughness, crack resistance, and spalling resistance compared to plain concrete. Volume fractions of 2-5% and fibre lengths of 20-50mm produced the best results. The document concludes that using a 4-5% volume fraction of coconut fibres 30-40mm in length with M30-M60 grade concrete would provide benefits based on previous research.
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi... (IRJET Journal)
The document discusses optimizing business management processes through automation using Microsoft Power Automate and artificial intelligence. It provides an overview of Power Automate's key components and features for automating workflows across various apps and services. The document then presents several scenarios applying automation solutions to common business processes like data entry, monitoring, HR, finance, customer support, and more. It estimates the potential time and cost savings from implementing automation for each scenario. Finally, the conclusion emphasizes the transformative impact of AI and automation tools on business processes and the need for ongoing optimization.
Multistoried and Multi Bay Steel Building Frame by using Seismic Design (IRJET Journal)
The document describes the seismic design of a G+5 steel building frame located in Roorkee, India according to Indian codes IS 1893-2002 and IS 800. The frame was analyzed using the equivalent static load method and the response spectrum method, and its response in terms of displacements and shear forces was compared between the two. Based on the analysis, the frame was designed as a seismic-resistant steel structure according to IS 800:2007. The software STAAD Pro was used for the analysis and design.
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr... (IRJET Journal)
This research paper explores using plastic waste as a sustainable and cost-effective construction material. The study focuses on manufacturing pavers and bricks using recycled plastic and partially replacing concrete with plastic alternatives. Initial results found that pavers and bricks made from recycled plastic demonstrate comparable strength and durability to traditional materials while providing environmental and cost benefits. Additionally, preliminary research indicates incorporating plastic waste as a partial concrete replacement significantly reduces construction costs without compromising structural integrity. The outcomes suggest adopting plastic waste in construction can address plastic pollution while optimizing costs, promoting more sustainable building practices.
ACEP Magazine 4th edition launched on 05.06.2024 (Rahul)
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Introduction: e-waste definition, sources of e-waste, hazardous substances in e-waste, effects of e-waste on the environment and human health, need for e-waste management, e-waste handling rules, waste minimization techniques for managing e-waste, recycling of e-waste, disposal and treatment methods of e-waste, mechanism of extraction of precious metals from leaching solution, global scenario of e-waste, e-waste in India, case studies.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Harnessing WebAssembly for Real-time Stateless Streaming Pipelines (Christina Lin)
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
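The defining property of such a pipeline is that each record is transformed independently, with no state carried between calls — which is what makes the function easy to compile to WebAssembly and run inside the broker. As a language-agnostic sketch (written in Python for brevity; field names are illustrative, and real Redpanda data transforms are built against the broker's transform SDK rather than written like this), a stateless masking transform might look like:

```python
# Sketch of a stateless per-record masking transform.
# Each call depends only on its input record, never on prior records.
import json

def mask_record(raw: bytes) -> bytes:
    """Mask the 'email' field of a JSON record; pass everything else through."""
    record = json.loads(raw)
    if "email" in record:
        user, _, domain = record["email"].partition("@")
        record["email"] = user[:1] + "***@" + domain
    return json.dumps(record).encode()

masked = mask_record(b'{"email": "alice@example.com", "id": 7}')
```

Because the function is pure, the broker can run it inline on each message with no coordination overhead — the property the talk exploits for low-latency, high-volume scenarios.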
Understanding Inductive Bias in Machine Learning (SUTEJAS)
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
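The core idea — that a learner's built-in assumptions determine how it generalizes — can be shown with a toy example. The sketch below (all numbers illustrative) fits the same linearly generated data with two hypothesis classes: a "constant" learner that assumes the target never varies, and a "linear" learner that assumes a straight-line relationship. The correct bias generalizes to unseen inputs; the wrong one does not.

```python
# Toy illustration of inductive bias: two learners, same data,
# different built-in assumptions about the target function.

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]   # underlying rule: y = 2x + 1

# Learner 1: constant hypothesis class -> best fit is the mean of ys.
const_pred = sum(ys) / len(ys)

# Learner 2: linear hypothesis class -> ordinary least-squares fit.
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Evaluate both on an unseen point x = 10 (true y = 21).
x_new, y_true = 10.0, 21.0
err_const = abs(const_pred - y_true)                   # large: wrong bias
err_linear = abs(slope * x_new + intercept - y_true)   # ~0: right bias
```

The same trade-off appears in the algorithms the presentation discusses: a decision tree's axis-aligned splits and a neural network's smoothness preferences are just richer versions of these built-in assumptions.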
A review on techniques and modelling methodologies used for checking electrom... (nooriasukmaningtyas)
The proper functioning of the integrated circuit (IC) in a hostile electromagnetic environment has been a serious concern throughout the decades of revolution in electronics, from discrete devices to today's integrated-circuit technology, in which billions of transistors are combined on a single chip. The automotive industry, and smart vehicles in particular, confronts design issues such as susceptibility to electromagnetic interference (EMI). Under EMI, electronic control devices calculate incorrect outputs and sensors report misleading values, which can prove fatal in automotive applications. In this paper, the authors present a non-exhaustive review of research concerned with the investigation of EMI in ICs and the prediction of this EMI using various modelling methodologies and measurement setups.
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte... (University of Maribor)
Slides from talk presenting:
Aleš Zamuda: Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapter and Networking.
Presentation at IcETRAN 2024 session:
"Inter-Society Networking Panel GRSS/MTT-S/CIS
Panel Session: Promoting Connection and Cooperation"
IEEE Slovenia GRSS
IEEE Serbia and Montenegro MTT-S
IEEE Slovenia CIS
11TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONIC AND COMPUTING ENGINEERING
3-6 June 2024, Niš, Serbia
DEEP LEARNING FOR SMART GRID INTRUSION DETECTION: A HYBRID CNN-LSTM-BASED MODEL (gerogepatton)
As digital technology becomes more deeply embedded in power systems, protecting the communication networks of Smart Grids (SG) has emerged as a critical concern. Distributed Network Protocol 3 (DNP3) is a multi-tiered application-layer protocol extensively used in Supervisory Control and Data Acquisition (SCADA)-based smart grids to facilitate real-time data gathering and control. Because the interconnection of these networks makes them vulnerable to a variety of cyberattacks, robust Intrusion Detection Systems (IDS) are necessary for early threat detection and mitigation. To address this, the paper develops a hybrid Deep Learning (DL) model specifically designed for intrusion detection in smart grids, combining a Convolutional Neural Network (CNN) with the Long Short-Term Memory (LSTM) algorithm. A recent intrusion detection dataset (DNP3), which focuses on unauthorized commands and Denial of Service (DoS) cyberattacks, was used to train and test the model. The experiments show that the CNN-LSTM method detects smart grid intrusions markedly better than other deep learning classifiers, improving accuracy, precision, recall, and F1 score, and achieving a detection accuracy of 99.50%.
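The hybrid idea — convolutional layers extracting local patterns from the input features, an LSTM summarizing the resulting sequence, and a sigmoid head scoring "intrusion vs. normal" — can be sketched with plain NumPy. Everything below is illustrative (random weights, made-up dimensions, a single hand-rolled conv filter bank and LSTM cell); it shows the data flow of such an architecture, not the paper's trained model.

```python
# Minimal NumPy sketch of a CNN-LSTM data flow for intrusion scoring.
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Valid 1D convolution with ReLU: x is (T,), kernels is (F, K).
    Returns a (T-K+1, F) feature sequence."""
    F, K = kernels.shape
    windows = np.stack([x[t:t + K] for t in range(len(x) - K + 1)])
    return np.maximum(windows @ kernels.T, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_last_state(seq, Wx, Wh, b, hidden):
    """Run a single-layer LSTM over seq (T, F); return the final hidden state."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in seq:
        gates = Wx @ x_t + Wh @ h + b            # (4*hidden,)
        i, f, o, g = np.split(gates, 4)          # input/forget/output/candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)
        h = o * np.tanh(c)
    return h

# Illustrative dimensions: 20 raw features per flow, 4 conv filters of
# width 3, an LSTM with 8 hidden units, and a sigmoid output head.
x = rng.normal(size=20)
kernels = rng.normal(size=(4, 3))
hidden = 8
Wx = rng.normal(size=(4 * hidden, 4)) * 0.1
Wh = rng.normal(size=(4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)
w_out = rng.normal(size=hidden)

feats = conv1d(x, kernels)                       # (18, 4) feature sequence
h_final = lstm_last_state(feats, Wx, Wh, b, hidden)
score = sigmoid(w_out @ h_final)                 # probability-like score
```

In practice the two stages would be built and trained end-to-end in a deep learning framework; the sketch only makes the CNN-to-LSTM hand-off concrete.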
Literature Review Basics and Understanding Reference Management.pptx (Dr Ramhari Poudyal)
Slides from a three-day training on academic research, focusing on analytical tools, held at United Technical College with support from the University Grants Commission, Nepal, 24-26 May 2024.