The document summarizes a study that analyzed the acoustic properties of an imitator impersonating the voice of a South Indian actor. The imitator was able to closely match the timing of some words but diverged over time. Mean formant frequencies and pitch did not match the target voice. While the imitation sounded close to the human ear, acoustic analysis showed the voices were distinct in timing, formant frequencies, and pitch. The study concludes that voice is unique and difficult to perfectly copy acoustically.
DYNAMIC PHONE WARPING – A METHOD TO MEASURE THE DISTANCE BETWEEN PRONUNCIATIONS cscpconf
Human beings generate different speech waveforms when speaking the same word at different times. Different speakers also have different accents and produce significantly varying speech waveforms for the same word. Measuring the distances between words is therefore needed to facilitate the preparation of pronunciation dictionaries. This paper presents a new algorithm called Dynamic Phone Warping (DPW). It uses a dynamic programming technique for global alignment and shortest-distance measurement. The DPW algorithm can be used to enhance the pronunciation dictionaries of well-known languages such as English or to build pronunciation dictionaries for less well-known, sparsely resourced languages. The precision measurement experiments show 88.9% accuracy.
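The abstract does not give the DPW cost model, so the following is only a minimal sketch of the general idea it names: a dynamic programming global alignment between two phone sequences. The function name `phone_distance`, the uniform substitution and insertion/deletion costs, and the example phone symbols are all assumptions made for illustration, not the paper's method.

```python
def phone_distance(a, b, sub_cost=1.0, indel_cost=1.0):
    """Global-alignment distance between two phone sequences.

    Assumption: uniform costs. A real pronunciation metric would
    probably use phone-dependent substitution costs instead.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] = cheapest alignment of the first i phones of a
    # with the first j phones of b.
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel_cost  # delete every phone of a so far
    for j in range(1, cols):
        d[0][j] = j * indel_cost  # insert every phone of b so far
    for i in range(1, rows):
        for j in range(1, cols):
            step = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i][j] = min(
                d[i - 1][j - 1] + step,    # match or substitution
                d[i - 1][j] + indel_cost,  # deletion
                d[i][j - 1] + indel_cost,  # insertion
            )
    return d[-1][-1]


# Two hypothetical pronunciations of one word, as phone lists.
variant_1 = ["t", "ah", "m", "ey", "t", "ow"]
variant_2 = ["t", "ah", "m", "aa", "t", "ow"]
print(phone_distance(variant_1, variant_2))
```

With uniform costs this reduces to the classic edit distance; replacing `sub_cost` with a lookup table of phone-to-phone costs would be the natural next refinement.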
Research Inventy : International Journal of Engineering and Scienceinventy
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed
Accents of English have been investigated for many years from the perspective of both native and non-native speakers of the language. Various research results imply that non-native speakers of English produce certain speech characteristics which are uncommon in native speakers’ speech. This is because non-native speakers do not produce the same tongue movements as native speakers. This paper presents an isolated English word recognition system built with the speech of local Bangladeshi people, who are non-native speakers of English. Here, we have also noticed a different speech characteristic which is not found in the speech of native English speakers. Two acoustic features, ‘pitch’ and ‘formants’, have been utilized to develop the system. The system is speaker-independent and uses a template-based approach. The recognition method applied here is very simple and the recognition accuracy is also very satisfactory.
Voice Recognition System using Template MatchingIJORCS
It is easy for humans to recognize a familiar voice, but using computer programs to distinguish one voice from others is a herculean task. The difficulty lies in developing an algorithm that recognizes the human voice: it is impossible to say a word exactly the same way on two different occasions, and computer analysis of human speech gives different interpretations depending on the speed of delivery. This research paper gives a detailed description of the process behind the implementation of an effective voice recognition algorithm. The algorithm utilizes the discrete Fourier transform to compare the frequency spectra of two voice samples, because the spectrum remains unchanged when speech is slightly varied. The Chebyshev inequality is then used to determine whether the two voices came from the same person. The algorithm is implemented and tested using MATLAB.
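The spectral comparison described above can be illustrated with a small sketch. It is written in Python rather than the paper's MATLAB, it covers only the discrete Fourier transform step (the Chebyshev decision rule is not specified in the abstract and is left out), and the helper names `magnitude_spectrum` and `spectral_distance` are hypothetical. It demonstrates one well-known property that makes magnitude spectra useful here: a circular time shift of a signal leaves its DFT magnitudes unchanged.

```python
import cmath
import math


def magnitude_spectrum(x):
    """Magnitudes of the discrete Fourier transform of x (direct O(n^2) sum)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def spectral_distance(x, y):
    """Euclidean distance between the magnitude spectra of two equal-length signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    mx, my = magnitude_spectrum(x), magnitude_spectrum(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mx, my)))


# Synthetic test tones standing in for voice samples (an assumption).
n = 32
tone = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
shifted = tone[5:] + tone[:5]  # same tone, circularly delayed
other = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]

print(spectral_distance(tone, shifted))  # close to zero
print(spectral_distance(tone, other))    # clearly larger
```

Real speech would need framing, windowing and a fast transform; the direct sum is used here only to keep the sketch dependency-free.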
Upgrading the Performance of Speech Emotion Recognition at the Segmental Level IOSR Journals
Abstract: This paper presents an efficient approach for maximizing the accuracy of automatic speech emotion recognition in English using minimal inputs, minimal features, lower algorithmic complexity and reduced processing time. Whereas the findings reported here are based on the exclusive use of vowel formants, most related previous works used tens or even hundreds of other features. Despite using a greater level of signal processing, the recognition accuracy reported earlier was often lower than that obtained by our approach. The method is based on vowel utterances, and the first step comprises statistical pre-processing of the vowel formants. This is followed by identification of the best formants using the K-means, K-nearest neighbor and Naive Bayes classifiers. The artificial neural network used for the final classification gave an accuracy of 95.6% on elicited emotional speech. Nearly 1500 speech files from ten female speakers in the neutral and six basic emotions were used to prove the efficiency of the proposed approach. Such a result has not been reported earlier for English and is of significance to researchers, sociologists and others interested in speech.
Keywords: Artificial Neural Networks, Emotions, Formants, Preprocessing, Vowels.
A NOVEL METHOD FOR OBTAINING A BETTER QUALITY SPEECH SIGNAL FOR COCHLEAR IMPL...acijjournal
Cochlear implant devices have existed for a long time. The purpose of the present work is to develop a speech algorithm for obtaining robust speech. In this paper, the technique of cochlear implantation is first introduced, followed by a discussion of some of the existing techniques available for obtaining speech. The next section introduces a new technique for obtaining robust speech. The key feature of this technique is an integrated approach that combines an estimation technique such as a Kalman filter with a non-linear filter bank strategy, using the Dual Resonance Non-Linear (DRNL) and Single Side Band (SSB) encoding methods. A comparative study indicates that the proposed method performs well compared to the existing method.
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...IJECEIAES
In this paper, I present high-level speaker-specific feature extraction considering intonation, linguistic rhythm, linguistic stress and prosodic features directly from speech signals. I assume that the rhythm is related to language units such as syllables and appears as changes in measurable parameters such as fundamental frequency, duration, and energy. In this work, syllable-type features are selected as the basic unit for expressing the prosodic features. The approximate segmentation of continuous speech into syllable units is achieved by automatically locating the vowel starting point. The knowledge of high-level speaker’s specific speakers is used as a reference for extracting the prosodic features of the speech signal. High-level speaker-specific features extracted using this method may be useful in applications such as speaker recognition where explicit phoneme/syllable boundaries are not readily available. The efficiency of the specific features used for automatic speaker recognition was evaluated on the TIMIT and HTIMIT corpora initially sampled in the TIMIT at 16 kHz to 8 kHz. In summary, the experiment, the basic discriminating system, and the HMM system are formed on the TIMIT corpus with a set of 48 phonemes. The proposed ASR system shows 1.99%, 2.10%, 2.16% and 2.19% efficiency improvements compared to a traditional ASR system for and of 16 kHz TIMIT utterances.
An Introduction to Various Features of Speech SignalSpeech featuresSivaranjan Goswami
An overview of various temporal, spectral and cepstral features of speech signal used in digital speech processing.
For more tutorials visit:
https://sites.google.com/site/enggprojectece
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Decoding Digital Audio: Visualizing and Annotating Linear Time-Based Media 2015Philip Desenne
Presentation slides for the "Catching Waves" Panel Discussion on Sustainable Digital Audio Delivery held at Lamont Library, Harvard University, Friday May 8th, 2015
Artificially enhancing better-ear glimpsing cues to improve understanding of ...HEARnet _
Artificially enhancing better-ear glimpsing cues to improve understanding of speech in noise for listeners with hearing loss
Sipij040305SPEECH EVALUATION WITH SPECIAL FOCUS ON CHILDREN SUFFERING FROM AP...sipij
Speech disorders are very complicated in individuals suffering from Apraxia of Speech (AOS). In this paper, the pathological cases of speech-disabled children affected by AOS are analyzed. Speech signal samples from children aged between three and eight years are considered for the present study. These speech signals are digitized and enhanced using Speech Pause Index, Jitter, Skew and Kurtosis analysis. This analysis is conducted on speech data samples concerned with both place of articulation and manner of articulation. The speech disability of the pathological subjects was estimated using the results of the above analysis.
Hospitality businesses in Nigeria are growing at an alarming rate. It is sad to note that the majority of these investors have little or no knowledge of the impact that their investment has on the environment. This is because most investors only carry out a feasibility study on the profit, forfeiting the sustainability of the business, which depends on the environment. Today hospitality investors in Nigeria have refused to answer this question: “Are the activities of hospitality businesses environmentally friendly?” At this point it is pertinent to note that the sustainability of any hospitality business is anchored largely on environmental sustainability. In other words, every hospitality business needs the environment to thrive. Therefore, in view of the above, this paper critically diagnosed the activities of Soarak Hotel and Casino, Lagos, towards environmental sustainability. Its primary objective is to identify the impact of Soarak Hotel on its immediate surroundings. This research made extensive use of interviews and questionnaires as instruments of data collection. In conclusion, the paper recommends sustainable measures to mitigate the impact of hospitality business activities on the environment.
The global economic recession has posed new challenges to the world and, coupled with the challenges of new energy technology in response to global warming, has dwindled the economies of many nations today, particularly petro-mono-economy countries like Nigeria whose revenue base depends mostly on proceeds from crude oil exportation. Overdependence on crude oil points to a bleak future if the oil dries up. What would be the fate of the economy? Therefore it is time for Nigeria to spread the tentacles of her economic prowess to other service sectors such as tourism for sustainable economic exploitation. Tourism is one of the most promising drivers of growth for the world economy. As a development vehicle, tourism resources are inexhaustible, unlike crude oil. Nigeria, specifically Awka, has vast tourism potential awaiting development. Little wonder the administration of former President Olusegun Obasanjo set machinery in motion to turn the sector into a major revenue earner. The machinery included the introduction of various festivals across the country to promote its rich cultural heritage and woo foreign tourists. Given Nigeria's appealing tourism resources, this paper advocates the development and exploitation of these resources for wealth creation.
The efficacy of lemon (Citrus lemonirisso) juice on wound healing in albino Wistar rats was investigated, along with the potential haemostatic mechanism associated with administration of the extract. Results showed that lemon juice extract decreased haemoglobin concentration and packed cell volume, while it had no significant effect on platelet count, white blood cell count or white cell differential counts in albino rats. Furthermore, the bleeding and clotting times were shortened, as was the period of wound healing, suggesting that lemon juice could possess some elements that affect the haemostatic mechanism.
The purpose of this study is to review and understand the decision making influences faced by female segment in choosing and purchasing specific Brands of Cars in the Muscat region of the Sultanate of Oman. The paper reports an empirical analysis by classifying female segments into cohorts based on the decision making characteristics which provides qualitative insights of the female car owner driver segment residing in Oman. The Car Manufacturers will benefit from the data collected, especially in helping to build good customer relationships with the segment and in determining factors impacting on the profitability.
Air pollution is a global environmental challenge that has continued to receive worldwide attention despite the recent decline in concentration of atmospheric pollutants following stringent environmental protection regulations. The major source of this pollution remains fossil fuels; hence the urgent need for cleaner energy sources. This study presents a review of the models applied in monitoring ambient air quality. The primary aim of air pollution modeling is to identify and quantitatively characterize pollutant emission at its source and subsequent dispersion through the atmosphere, subject to meteorological conditions, physical and chemical transformations. The common models and model assumptions for modeling air pollution and quality were critically reviewed and analyzed in this work for application in both forecasting and estimation of air pollutants on the basis of considered causes and in air quality assessment and air pollution control.
It has been observed that ongoing anthropogenic activities, namely farming, tree harvesting, seasonal fire regimes, the introduction of exotic tree species such as Eucalyptus and Grevillea, and the collection of herbs for medicinal use, form a major threat to the orchid Polystachya fusiformis (Thou.) Lindl. This study determined the relative abundance and distribution of the species Polystachya fusiformis (Thou.) Lindl. in the Manga range ecosystem of Kisii, Western Kenya, during two flowering seasons. Other results of the present study were analyzed with SPSS version 17 for paired sample correlations, OriginPro7 for the t-test and ANOVA, and Minitab 16 for the chi-square test. The analysis shows a significant correlation between altitude and the number of orchid population clusters, with a p-value of 0.008, in the distribution of Polystachya fusiformis (Thou.) Lindl., which led to rejection of the null hypothesis. Levene’s test for equal variance shows that at α = 0.05 there is a significant difference between altitude and number of clusters, as indicated by the p-value of 0.00004. Of the 88 sites sampled, only 41 sites had orchid clusters. Principal component analysis using Unscrambler 9.7 indicated that many of the orchid population clusters fell within the range of one or two orchid population clusters. The score plots from the two Hotelling’s outputs show how well the data are distributed, including sample patterns, groupings, similarities and differences during the study. The two analyses illustrated how fire affects the orchid population on fire-prone sites of the range. Orchid population clusters progressively increased with increasing altitude (from 1800m to 1850m above sea level), but the number of orchid population clusters decreased towards 1950m. Sites with minimal anthropogenic disturbance (1796m, 1830m, 1854m, 1886m, and 1890m) had a higher number of orchid population clusters.
Barley is one of the most important traditional crops in Ethiopia which is a major center of genetic diversity for barley along with other crop plants species. Two hundred seven accessions and 18 released varieties were laid down in 15*15 simple lattice design and planted in 2008 main cropping season (June to Nov) at Kokate. The objective of the study was to conduct the morphological characterization and to determine the nature and degree of variability in morpho- agronomic traits of landrace of barley in southern Ethiopia collections. The proportion of genotypes in kernel row number were 26.6, 15.3, 16.6, 41.5 and 0.4% for two rowed with lateral floret, two rowed deficient, irregular, six rowed with awns on lateral floret and branched heads, respectively. Genotypes with white kernel color (57.5%) and amber (normal) lemma color (50%) were dominant. The highest diversity indices pooled over the characters within zones/ special woredas were recorded for accessions sampled from Dawro (H’= 0.75 ± 0.05) followed by Sheka (H’=0.74 ± 0.07), Gamgofa (H’ =0.70 ± 0.05) and Keffa (H’= 0.70 ± 0.08). These zones can be used for in situ conservation for barley landraces as representatives of southern Ethiopian high lands. The barley genotypes were clustered into five distinct groups of various sizes based on 8 qualitative traits. The estimates of diversity index (H’) for each trait in each of the three altitudinal class has shown that polymorphism was common in varying degrees for most traits, implying the existence of a wide range of variation in the materials.
Stress has become a major concern of modern times, as it can harm employees’ health and performance. Work-related stress costs organizations billions of dollars each year through sickness, turnover and absenteeism. It is therefore necessary for every organization to know the factors causing stress among its employees, as well as how they cope with stress, in order to make employees more participative and productive. The research study titled “A STUDY ON STRESS MANAGEMENT AMONG EMPLOYEES AT SAKTHI FINANCE LIMITED, COIMBATORE” was conducted to find out the factors causing stress among employees and to learn how they cope with stress. The research design used was descriptive research. The primary data were collected through a questionnaire. The sample design used in the study was the Convenience Sampling Technique, with a sample size of 60. The collected data were analysed through various tools such as Percentage Analysis, the Chi-Square Test, ANOVA, and Factor Analysis.
A Phonetic Forensic Analysis of Imran Khan’s Speeches.pdfFaiz Ullah
The objective of this research was to analyze speeches generated by AI tools and speeches made by Imran Khan. Praat played a crucial
role in conducting this analysis. Nowadays, there are numerous fake videos and audio recordings associated with specific
individuals. For instance, speeches generated by AI tools, such as Imran Khan's speech after being imprisoned, were
released.
International Journal of Engineering Research and DevelopmentIJERD Editor
Electrical, Electronics and Computer Engineering,
Information Engineering and Technology,
Mechanical, Industrial and Manufacturing Engineering,
Automation and Mechatronics Engineering,
Material and Chemical Engineering,
Civil and Architecture Engineering,
Biotechnology and Bio Engineering,
Environmental Engineering,
Petroleum and Mining Engineering,
Marine and Agriculture engineering,
Aerospace Engineering.
STUDY OF ACOUSTIC PROPERTIES OF NASAL AND NONNASAL VOWELS IN TEMPORAL DOMAINcscpconf
There has been a considerable amount of work done in exploring the acoustic correlates of nasalized and non-nasalized vowels in the frequency domain. Nasalized vowels are characterized by the presence of extra pole-zero pairs near the first formant region and across the spectrum. Several other automatically extractable acoustic features have been proposed by researchers across the globe. This area has not been explored much in the temporal domain. In this study we have tried to find quantifiable differences/similarities between the nasal and non-nasal vowel /a/ in the temporal domain at the pitch-synchronous level. The results show significant differences between the nasalized and non-nasalized vowel /a/.
S A M P L E P A P E R S54INHIBITORY INFLUENCES ON ASYCHRO.docxagnesdcarey33086
Inhibitory Influences on Asychrony as a Cue for Auditory Segregation
Auditory grouping involves the formation of auditory objects from the sound mixture
reaching the ears. The cues used to integrate or segregate these sounds and so form auditory
objects have been defined by several authors (e.g., Bregman, 1990; Darwin, 1997; Darwin &
Carlyon, 1995). The key acoustic cues for segregating concurrent acoustic elements are
differences in onset time (e.g., Dannenbring & Bregman, 1978; Rasch, 1978) and harmonic
relations (e.g., Brunstrom & Roberts, 1998; Moore, Glasberg, & Peters, 1986). In an example of
the importance of onset time, Darwin (1984a, 1984b) showed that increasing the level of a
harmonic near the first formant (F1) frequency by adding a synchronous pure tone changes the
phonetic quality of a vowel. However, when the added tone began a few hundred milliseconds
before the vowel, it was essentially removed from the vowel percept.… [section continues].
General Method
Overview
In the experiments reported here, we used a paradigm developed by Darwin to assess the
perceptual integration of additional energy in the F1 region of a vowel through its effect on
phonetic quality (Darwin, 1984a, 1984b; Darwin & Sutherland, 1984).…[section continues].
Stimuli
Amplitude and phase values for the vowel harmonics were obtained from the vocal-tract
transfer function using cascaded formant resonators (Klatt, 1980). F1 values varied in 10-Hz
steps from 360–550 Hz—except in Experiment 3, which used values from 350– 540 Hz—to
produce a continuum of 20 tokens.…[section continues].
Figure 2.2. Sample Two-Experiment Paper (The numbers refer to numbered sections in the Publication Manual. This abridged manuscript illustrates the organizational structure characteristic of multiple-experiment papers. Of course, a complete multiple-experiment paper would include a title page, an abstract page, and so forth.)
Paper adapted from “Inhibitory Influences on Asychrony as a Cue for Auditory Segregation,” by S. D. Holmes and B. Roberts, 2006, Journal of Experimental Psychology: Human Perception and Performance, 32, pp. 1231–1242. Copyright 2006 by the American Psychological Association.
Listeners
Listeners were volunteers recruited from the student population of the University of
Birmingham and were paid for their participation. All listeners were native speakers of British
English who reported normal hearing and had successfully completed a screening procedure
(described below). For each experiment, the data for 12 listeners are presented.…[section
continues].
Procedure
At the start of each session, listeners took part in a warm-up block. Depending on the
number of conditions in a particular experiment, the warm-up blo.
Novel cochlear filter based cepstral coefficients for classification of unvoi...ijnlc
In this paper, the use of new auditory-based features derived from cochlear filters has been proposed for
classification of unvoiced fricatives. Classification attempts have been made for sibilants (i.e., /s/,
/sh/) vs. non-sibilants (i.e., /f/, /th/) as well as for fricatives within each sub-category (i.e., intra-sibilants
and intra-non-sibilants). Our experimental results indicate that the proposed feature set, viz., Cochlear Filter-based
Cepstral Coefficients (CFCC), performs better for individual fricative classification (i.e., a jump of
3.41 % in average classification accuracy and a fall of 6.59 % in EER) in clean conditions than the
state-of-the-art feature set, viz., Mel Frequency Cepstral Coefficients (MFCC). Furthermore, under signal
degradation conditions (i.e., additive white noise), classification accuracy using the proposed feature set
drops much more slowly (i.e., from 86.73 % in clean conditions to 77.46 % at an SNR of 5 dB) than with MFCC
(i.e., from 82.18 % in clean conditions to 46.93 % at an SNR of 5 dB).
Audio descriptive analysis of singer and musical instrument identification in...eSAT Journals
Abstract: Music information retrieval (MIR) has reached a reasonably stable state following advances in low-level audio descriptors (LLDs) and feature extraction techniques. The analysis of sound has become simpler through the continuous efforts and research of the MIR community in the field of signal processing over the last two decades. In North Indian classical music, a singer is accompanied by instruments such as the harmonium, violin or flute. These instruments are tuned to the same musical scale (pitch range) in which the singer is singing. Separate studies have been carried out in the recent past to identify a musical instrument and a singer. In this paper, we have analyzed the low-level audio descriptors for singing voice and musical instrument sounds together, which appear similar to the human ear with respect to ‘timbre’, to see whether we could treat them alike and use identification/classification routines to classify them into their classes. We have used the Hybrid Selection algorithm from the wrapper technique (the one that also uses a classifier in the feature selection process) to identify and extract the features, and K-Means and K-nearest-neighbor classifiers to classify and cross-verify the accuracy of classification. The accuracy of classification achieved was 91.1%, which shows that musical instruments and singing voices that sound similar in timbral aspect can be grouped together and that classification is possible with a mixed database of instruments and singing voices. Keywords: Music Information Retrieval (MIR), Timbre, Singing Voice, Low Level Descriptors (LLD), North Indian Classical music, MIRTOOL BOX
"Heart failure is a typical clinical accompanied by symptoms syndrome (e.g. shortness of breath, ankle swelling and fatigue) that lead to structural or functional abnormalities of the heart (e.g. high venous pressure, pulmonary edema and peripheral edema).
In recent years, the significant role of B-type natriuretic peptide in the pathogenesis of heart disease has been revealed, and the use of the drug sacubitril/valsartan has started. It has a positive effect on the regulation of the level of B-type natriuretic peptide in the body. It is evident from the world literature that natriuretic peptides play an important role in the pathophysiology of heart failure. For this reason, many studies point to the importance of natriuretic peptides in the diagnosis and treatment of heart failure.
For this reason, we tried to investigate the effects of a comprehensive medication therapy including sacubitril/valsartan in patients with chronic heart failure."
Parallel generators of pseudo random numbers with control of calculation errors
International Journal of Science and Research (IJSR)
ISSN (Online): 2319-7064
Impact Factor (2012): 3.358
Imitation of the Voice – An Acoustic Study
Babi Duli
Ph.D Research Scholar, Dept. of Linguistics and Phonetics, School of Phonetics and Spoken English
EFL University, Hyderabad – 500605 Email:- bobbyciefl@gmail.com Ph: +91 9494006604
Abstract: The present study examines an imitation of the voice of the South Indian film star Mahesh Babu by a
professional imitator (mimicry artist) on a TV show. The imitator reproduced a dialogue spoken by the hero in his blockbuster movie
Dookudu ([du:kuu). The aim of the study is to investigate how closely the imitation matched the original voice in the selected
acoustic parameters. It was found that he was able to imitate the dialogue very closely, but timing at the segmental level showed little change
in the direction of the targets. Mean formant frequencies did not match the target voice. The imitation is too distant from the original in timing,
mean formant frequencies, and pitch.
Keywords: imitation, fundamental frequency, formant frequency, pitch, timing, forensic, phoneticians, culprits, acoustics, YouTube
1. Introduction
Imitation is the act of attempting to reproduce another
person's voice, which is unique from birth. It can be done
either with the knowledge of the person imitated or in
his/her absence. If it is done in his/her presence, there is
very little chance of it being used to make that person
appear a culprit or to carry out any immoral activity, but
in his/her absence it can go to any extent (whether it
entertains or not). If it entertains, the imitator receives
the awards; if not, only the original speaker bears the
consequences. According to a well-known legal principle, a
hundred culprits may escape punishment, but one innocent
person should not suffer injustice. For this reason, forensic
experts are specially trained to trace culprits and to
protect the innocent. Phoneticians are likewise trained to
match a culprit's voice against a collection of voice
samples so that justice is done to the ordinary citizen.
2. Earlier research review
Only a handful of studies exist in this area, and to our
knowledge only two that deal with acoustic parameters in
some detail. Bessler [2] has studied a caricatured
impersonation of de Gaulle. The most relevant finding with
respect to the present study was the fact that the
impersonator exaggerated both mean fundamental frequency
level and range. Stressed syllable durations were also
exaggerated. The impersonation was primarily meant for
entertainment purposes and not accuracy of imitation. The
generalisability of the results may therefore be questioned.
In the other study [3], vowel formant frequencies and
fundamental frequencies in imitations were compared with
the corresponding values for the original voices. Although
the imitators managed to change their formant and
fundamental frequencies in the direction of the target values
“they were not able to adapt these parameters to match or
even be similar to those of the imitated persons.” [3, p.
1842]
3. Methodology
The imitation sample was collected from YouTube, from the
popular Telugu comedy show Jabardasth on E-TV. The original
voice sample was taken from the hit movie Dookudu. The video
files were converted into .wav files using ‘video pad’ and
‘wave pad’. Praat software was used to segment and analyze
the voice samples.
4. Acoustic Analysis
The speech materials were digitized and analyzed
acoustically for fundamental frequency and pitch level. All
the speech files were labelled at the word level. The mean
fundamental frequency, the mean formant frequencies, and the
mean pitch were then computed. The first three formant
frequencies were calculated for all the words present in the
material. All tokens were numbered consecutively to ensure
that comparisons between tokens in the original and in the
imitation were made in identical contexts.
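The per-word means described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper computed its values in Praat and does not describe a script, so the function name `mean_per_word`, the frame and interval layout, and the sample numbers below are all hypothetical.

```python
from statistics import mean

def mean_per_word(frames, words):
    """Average a frame-level track (e.g. F0 in Hz) inside each labelled word.

    frames: list of (time_s, value) pairs; a value of 0.0 marks an
            unvoiced or undefined frame and is skipped (an assumption).
    words:  list of (label, start_s, end_s) word-level intervals.
    """
    result = {}
    for label, start, end in words:
        values = [v for t, v in frames if start <= t < end and v > 0.0]
        result[label] = mean(values) if values else None
    return result

# Made-up illustration data, NOT measurements from the study.
frames = [(0.00, 0.0), (0.01, 120.0), (0.02, 124.0), (0.03, 0.0),
          (0.04, 110.0), (0.05, 112.0), (0.06, 108.0)]
words = [("w1", 0.00, 0.03), ("w2", 0.03, 0.07)]

word_means = mean_per_word(frames, words)
print(word_means)
```

Numbering the tokens consecutively, as the paper does, then amounts to comparing the entries of two such dictionaries key by key, one for the original and one for the imitation.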
5. Results
5.1 Timing
The imitator came closest to the original timing on the
first word of the continuous speech but did not sustain it.
Initially the difference between the imitator and Mahesh is
very small, but on the following words the imitator's
durations fall short of Mahesh's and the differences become
negative. Timing is one of the key factors in identifying a
culprit. The result supports the view that voice is unique
in nature. Table 1 gives the timing of the voice samples.
Table 1: Total timing of the utterance
Token I-Total (ms) M-Total (ms) Dif (ms)
/baja:nike/ 595 567 28
/mi:ni/ 319 431 -112
/telijani/ 232 312 -80
/blarana:di/ 686 804 -118
ms stands for milliseconds, I for the imitator's voice, M for
Mahesh's voice, and Dif for the difference between them (I minus M).
Volume 3 Issue 10, October 2014
www.ijsr.net
Licensed Under Creative Commons Attribution CC BY
The timing is shown in more detail in a line diagram. On the
first word the imitator is very close to the original voice,
but on the later words the imitator's line drops below that
of the original voice. This is clearly seen in Figure 1.
Paper ID: OCT14474 1176
Figure 1: Timing of the utterance
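The differences reported in Table 1 can be rechecked with a short Python sketch. The durations are copied from Table 1; the relative (percentage) difference is an extra quantity added here for illustration and is not reported in the paper, and the function name is hypothetical.

```python
# Word durations in ms copied from Table 1: (imitator, Mahesh).
durations_ms = {
    "/baja:nike/": (595, 567),
    "/mi:ni/": (319, 431),
    "/telijani/": (232, 312),
    "/blarana:di/": (686, 804),
}

def timing_differences(durations):
    """Return {token: (I minus M in ms, difference as % of M)}."""
    rows = {}
    for token, (imitator, original) in durations.items():
        diff = imitator - original
        # Relative difference: an illustrative addition, not from the paper.
        percent = 100.0 * diff / original
        rows[token] = (diff, round(percent, 1))
    return rows

timing = timing_differences(durations_ms)
for token, (diff, percent) in timing.items():
    print(f"{token:14s} {diff:+5d} ms  {percent:+6.1f} %")
```

Only the first token has a positive difference; the imitator is shorter than the original on the other three, which matches the divergence described for Figure 1.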
5.2 Formant frequency
The second way of examining the voice samples is to take
the mean of the formant frequencies. For F1, F2, and F3,
Mahesh's values are higher than the imitator's on every
token. The imitator did not reach the values of the original
voice. He attempted to, but failed, possibly because of
insufficient preparation. Even so, it is not possible for
anybody to reproduce the original voice exactly because, as
stated above, voice is unique in nature. Forensic experts
and phoneticians also examine the mean formant frequencies
to identify culprits or voices. Table 2 gives the mean
formant frequencies.
Table 2 : Mean of the formant frequencies
Token M-F1 I-F1 Dif M-F2 I-F2 Dif M-F3 I-F3 Dif
/baja:nike/ 718.39 564.51 153.88 1704.26 1603.07 101.19 2535.72 2449.05 86.67
/mi:ni/ 389.56 376.47 13.09 2006.89 1788.94 217.95 2590.42 2357 233.42
/telijani/ 562.91 471.42 91.49 1882.86 1697.43 185.43 2575.49 2362.59 212.9
/blarana:di/ 672.87 582.28 90.59 1673.09 1614.60 58.49 2712.80 2549.11 163.69
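F1, F2 and F3 stand for the first three formant frequencies (in Hz); M stands for Mahesh, I for the imitator, and Dif for the difference (M minus I).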
The bar chart in Figure 2 represents the mean formant
frequencies given in Table 2.
Figure 2: Mean of the formant frequencies
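As a check on Table 2, the following Python sketch recomputes each Dif value as Mahesh minus imitator and averages the differences per formant. The measurements are copied from Table 2; the per-formant average is an illustrative summary that the paper itself does not report, and the function names are hypothetical.

```python
from statistics import mean

# Mean formant frequencies in Hz from Table 2: (Mahesh, imitator).
formants_hz = {
    "/baja:nike/": {"F1": (718.39, 564.51), "F2": (1704.26, 1603.07), "F3": (2535.72, 2449.05)},
    "/mi:ni/": {"F1": (389.56, 376.47), "F2": (2006.89, 1788.94), "F3": (2590.42, 2357.00)},
    "/telijani/": {"F1": (562.91, 471.42), "F2": (1882.86, 1697.43), "F3": (2575.49, 2362.59)},
    "/blarana:di/": {"F1": (672.87, 582.28), "F2": (1673.09, 1614.60), "F3": (2712.80, 2549.11)},
}

def formant_differences(table):
    """Return {token: {formant: Mahesh minus imitator, in Hz}}."""
    return {
        token: {name: round(m - i, 2) for name, (m, i) in values.items()}
        for token, values in table.items()
    }

def mean_difference(diffs, name):
    """Average difference for one formant across all tokens (illustrative)."""
    return mean(d[name] for d in diffs.values())

diffs = formant_differences(formants_hz)
for name in ("F1", "F2", "F3"):
    print(name, round(mean_difference(diffs, name), 2))
```

Every recomputed difference is positive, which is the numerical form of the statement that Mahesh's formant values are higher than the imitator's on all tokens.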
5.3. Pitch
Pitch plays a crucial role in any voice. Table 3 shows that
the imitator's pitch lies between 148 Hz and 160 Hz, whereas
Mahesh's lies between 100 Hz and 125 Hz. In other words, the
imitator's pitch range begins above the point where Mahesh's
ends. On this basis, pitch plays a very crucial role in
characterizing an individual's voice.
Table 3: Pitch of the voice
Token I-Pitch (Hz) M-Pitch (Hz) Dif (Hz)
/baja:nike/ 156.07 121.78 34.29
/mi:ni/ 158.71 123.25 35.46
/telijani/ 152.78 109.38 43.4
/blarana:di/ 148.67 100.13 48.54
The imitator has the higher pitch and Mahesh the lower. The
pitch levels may sound close to the naked ear, but they are
far apart when analyzed acoustically. The line diagram in
Figure 3 shows how distant the voices are.
Figure 3: Pitch of the voice
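Because pitch is perceived roughly logarithmically, the gap in Table 3 can also be expressed in semitones using the standard relation 12 * log2(f1 / f2). The sketch below applies it to the values copied from Table 3. The semitone measure is an addition for illustration only; the paper reports differences in Hz.

```python
import math

# Mean pitch in Hz copied from Table 3: (imitator, Mahesh).
pitch_hz = {
    "/baja:nike/": (156.07, 121.78),
    "/mi:ni/": (158.71, 123.25),
    "/telijani/": (152.78, 109.38),
    "/blarana:di/": (148.67, 100.13),
}

def semitones(f_a, f_b):
    """Musical interval from f_b up to f_a, in semitones."""
    return 12.0 * math.log2(f_a / f_b)

pitch_gap = {
    token: (round(imitator - original, 2), semitones(imitator, original))
    for token, (imitator, original) in pitch_hz.items()
}

for token, (diff_hz, diff_st) in pitch_gap.items():
    print(f"{token:14s} {diff_hz:6.2f} Hz  {diff_st:5.2f} semitones")
```

On these values the imitator sits several semitones above the original on every token, and the gap widens toward the end of the utterance as Mahesh's pitch falls faster than the imitator's.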
6. Conclusion
Based on the present study, the author found that voice is
unique in nature and extremely difficult to copy. Although
the imitation sounds very close to the original voice to the
naked ear, acoustic analysis reveals many factors showing
that no two voices are alike. In this study the imitation is
too distant from the original in timing, mean formant
frequencies, and pitch. Other voices and imitations remain
to be studied, such as those of famous Tollywood
actors/artists: NTR (Nandamuri Taraka Ramarao), ANR
(Akkineni Nageswara Rao), Krishna, Chiranjeevi, etc.
References
[1] Eriksson, A. & P. Wretling. How flexible is the human
voice? – A case study of mimicry. Umeå University,
S-901 87 Umeå, Sweden.
[2] Bessler, P. (1991) La caricature de de Gaulle par Tissot:
Étude phonostylistique. In: Information/Communication,
12, 19–32. Canadian Scholars’ Press.
[3] Endres, W., W. Bambach & G. Flösser. (1971) Voice
spectrograms as a function of age, voice disguise, and
voice imitation. J. Acoust. Soc. Am., 49, 1842–1848.
[4] Rose, P. (2002) Forensic Speaker Identification.
Taylor and Francis Series. London.
[5] Traunmüller, H. & A. Eriksson. (1995) The perceptual
evaluation of F0 excursions in speech as evidenced in
liveliness estimations. J. Acoust. Soc. Am., 97, 1905–
1915.
Author Profile
Mr. Babi Duli is a Ph.D. research scholar at the EFL
University, in the Department of Linguistics and
Phonetics, School of Phonetics and Spoken English,
where he is continuing his research in the area of forensic
phonetics. He holds a master's degree from Adikavi
Nannaya University, Rajahmundry. He has experience in
teaching pronunciation, has worked for several prestigious
institutions, and has trained many students in the area of
pronunciation.