This document describes two experiments that investigated the effects of amplitude compression on speech intelligibility and attentional load. Experiment 1 found that amplitude compression reduced performance on a simultaneous visual-motor tracking task, indicating increased listening effort. Experiment 2 used a lexical decision task instead of tracking but found no significant differences in performance between compressed and uncompressed speech conditions. The overall goal was to evaluate compression effects using alternative measures beyond traditional speech recognition tests.
Predicting the Level of Emotion by Means of Indonesian Speech Signal (TELKOMNIKA JOURNAL)
This document summarizes a study that aimed to evaluate how well Mel-Frequency Cepstral Coefficients (MFCC) features extracted from Indonesian speech signals relate to four emotions: happy, sad, angry, and fear. Nearly 300 speech signals were collected from actors speaking Indonesian sentences with different emotions. Using support vector machine classification, the study found that the Teager energy feature and the first MFCC coefficient were most crucial for prediction, achieving 86% accuracy. Additional initial MFCC features increased accuracy slightly, but more than four features had negligible effects.
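As a rough illustration of the kind of pipeline this summary describes (a few initial MFCC coefficients fed to a support vector machine), a minimal Python sketch is shown below; the file names, labels, and parameter choices are hypothetical and this is not the study's code.

import numpy as np
import librosa
from sklearn.svm import SVC

def mfcc_features(path, n_mfcc=4):
    # average the first few MFCC coefficients over time for one utterance
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# hypothetical file names and labels; the Indonesian corpus is not reproduced here
paths = ["happy_01.wav", "sad_01.wav", "angry_01.wav", "fear_01.wav"]
labels = ["happy", "sad", "angry", "fear"]

X = np.stack([mfcc_features(p) for p in paths])
clf = SVC(kernel="rbf").fit(X, labels)   # fit on the tiny illustrative set
print(clf.predict(X[:1]))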
Performance estimation based recurrent-convolutional encoder decoder for spee... (karthik annam)
This document discusses a proposed Recurrent-Convolutional Encoder-Decoder (R-CED) network for speech enhancement. The R-CED network aims to overcome challenges with existing methods by estimating the a priori and posteriori signal-to-noise ratios to separate noise from speech. The R-CED consists of convolutional layers with increasing and decreasing numbers of filters to encode and decode features. Performance will be evaluated using metrics like PESQ, STOI, CER, MSE, SNR, and SDR. The proposed method aims to improve speech enhancement accuracy and recover enhanced speech quality compared to other techniques.
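The encoder-decoder idea described here (convolutional layers whose filter counts first increase and then decrease) can be sketched in a few lines of PyTorch; the layer sizes, kernel widths, and input shape below are hypothetical and do not reproduce the paper's R-CED architecture.

import torch
import torch.nn as nn

class ConvEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),   # encoder: filters increase
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 16, kernel_size=5, padding=2), nn.ReLU(),  # decoder: filters decrease
            nn.Conv1d(16, 1, kernel_size=5, padding=2),
        )

    def forward(self, x):        # x: (batch, 1, n_freq) noisy spectral frame
        return self.net(x)       # estimate of the clean spectral frame

model = ConvEncoderDecoder()
noisy = torch.randn(8, 1, 129)   # a batch of hypothetical magnitude-spectrum frames
print(model(noisy).shape)        # torch.Size([8, 1, 129])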
Conversational transfer learning for emotion recognition (Takato Hayashi)
1) The document proposes an approach called TL-ERC that uses transfer learning to improve emotion recognition in conversations. TL-ERC pre-trains a hierarchical dialogue model on multi-turn conversation data and transfers its parameters to an emotion classifier.
2) Experiments show that TL-ERC improves performance and robustness over randomly initialized models, especially with limited training data. TL-ERC also reaches optimal validation performance in fewer training epochs.
3) Comparisons indicate TL-ERC outperforms previous state-of-the-art models for emotion recognition and is better able to leverage pre-trained weights than training from scratch.
This document summarizes a research paper on understanding and estimating emotional expression using acoustic analysis of natural speech. The paper explores identifying seven emotional states (anger, surprise, sadness, happiness, fear, disgust, and neutral) using fifteen acoustic features extracted from the SAVEE speech database. Three models using different combinations of features were evaluated using various machine learning algorithms. The results showed that Model 2, using energy intensity, pitch, standard deviation, jitter, and shimmer, achieved the highest classification accuracy. Estimation of emotions using confidence intervals showed that most emotions could be accurately estimated using energy intensity and pitch. The paper concludes that expanding the study to include more features and databases could improve emotional state recognition.
In present-day communications, speech signals are contaminated by various kinds of noise that degrade speech quality and adversely impact speech recognition performance. To overcome these issues, a novel approach to speech enhancement using modified Wiener filtering is developed, and power spectrum computation is applied to the degraded signal to obtain the noise characteristics from the noisy spectrum. In the next phase, an MMSE technique is applied in which the Gaussian distribution of each signal, i.e., the original and the noisy signal, is analyzed. The Gaussian distribution provides the spectrum estimate and spectral coefficient parameters that can be used to formulate a probabilistic model. Moreover, a-priori-SNR computation is incorporated for coefficient updating and noise-presence estimation, which operates similarly to conventional VAD. However, the conventional VAD scheme relies on a hard threshold that cannot deliver satisfactory performance, so a soft-decision threshold is developed to improve speech enhancement performance. An extensive simulation study is carried out in MATLAB on the NOIZEUS speech database, and a comparative study shows the proposed approach outperforming the existing technique.
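For illustration only, the core computation sketched in this abstract (a Wiener-type gain driven by a decision-directed a-priori SNR estimate) could look roughly like the following; the noise PSD, smoothing constant, and frame layout are placeholders, and the abstract's MMSE and soft-decision details are not reproduced.

import numpy as np

def wiener_enhance(noisy_frames, noise_psd, alpha=0.98):
    # noisy_frames: (n_frames, n_bins) complex STFT; noise_psd: (n_bins,) noise estimate
    enhanced = np.empty_like(noisy_frames)
    prev_clean_psd = np.zeros(noisy_frames.shape[1])
    for t, frame in enumerate(noisy_frames):
        post_snr = np.abs(frame) ** 2 / noise_psd                    # a posteriori SNR
        # decision-directed a priori SNR: blend previous estimate with current excess
        prio_snr = alpha * prev_clean_psd / noise_psd \
                   + (1 - alpha) * np.maximum(post_snr - 1.0, 0.0)
        gain = prio_snr / (1.0 + prio_snr)                           # Wiener gain
        enhanced[t] = gain * frame
        prev_clean_psd = np.abs(enhanced[t]) ** 2
    return enhanced

# usage with hypothetical data (100 frames, 257 frequency bins)
noisy = np.random.randn(100, 257) + 1j * np.random.randn(100, 257)
clean_estimate = wiener_enhance(noisy, noise_psd=np.ones(257))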
Emotion Detection from Voice Based Classified Frame-Energy Signal Using K-Mea... (ijseajournal)
Emotion detection is a new research area in health informatics and forensic technology. Despite some challenges, voice-based emotion recognition is gaining popularity because, in situations where a facial image is not available, the voice is the only way to detect the emotional or psychiatric condition of a person. However, the voice signal is so dynamic, even within a short-time frame, that the voice of the same person can differ within a very subtle period of time. Therefore, this research considers two key criteria: first, the training data must be partitioned according to the emotional state of each individual speaker; second, rather than using the entire voice signal, short-time significant frames can be used, which are enough to identify the emotional condition of the speaker. Cepstral coefficients (CC) are used as the voice feature, and a fixed-value k-means clustering method is used for feature classification. The value of k depends on the number of emotional states being evaluated, so it does not need to scale with the size of the experimental dataset. In this experiment, three emotional conditions (happy, angry, and sad) were detected from eight female and seven male voice signals. The methodology increased the emotion detection accuracy rate significantly compared with some recent works and also reduced the CPU time for cluster formation and matching.
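A minimal sketch of the clustering step described above, assuming frame-level cepstral features and a fixed k equal to the number of target emotions; the feature matrix here is synthetic and stands in for the paper's actual cepstral-coefficient pipeline.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
frame_features = rng.normal(size=(500, 13))    # e.g. 500 frames x 13 cepstral coefficients

k = 3                                           # happy / angry / sad, as in the abstract
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frame_features)
print(np.bincount(km.labels_))                  # number of frames assigned to each cluster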
ASERS-LSTM: Arabic Speech Emotion Recognition System Based on LSTM Models (ipij)
Rapid progress in the field of human-computer interaction (HCI) has increased interest in speech emotion recognition (SER) systems, which identify the emotional states of human beings from their voice. There is substantial SER work for other languages, but few studies have implemented Arabic SER systems, largely because of the shortage of available Arabic speech emotion databases; the most commonly studied languages are English and other European and Asian languages. Several machine-learning classifiers have been used to distinguish emotional classes: SVMs, random forests, the KNN algorithm, hidden Markov models (HMMs), MLPs, and deep learning. This paper proposes ASERS-LSTM, an Arabic speech emotion recognition model based on an LSTM. Five features are extracted from the speech: Mel-frequency cepstral coefficients (MFCC), chromagram, Mel-scaled spectrogram, spectral contrast, and tonal centroid features (tonnetz). The model is evaluated on an Arabic speech dataset named the Basic Arabic Expressive Speech corpus (BAES-DB). In addition, a DNN is constructed to classify the emotions and its accuracy is compared with the LSTM model: 93.34% for the DNN versus 96.81% for the LSTM.
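The five feature types named in this abstract can all be computed with standard tooling; the sketch below uses librosa and is only illustrative (the file name and the simple time-averaging are assumptions, not the authors' procedure).

import numpy as np
import librosa

y, sr = librosa.load("utterance.wav", sr=None)   # hypothetical recording
stft = np.abs(librosa.stft(y))

features = np.concatenate([
    librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40).mean(axis=1),                    # MFCC
    librosa.feature.chroma_stft(S=stft, sr=sr).mean(axis=1),                     # chromagram
    librosa.feature.melspectrogram(y=y, sr=sr).mean(axis=1),                     # Mel-scaled spectrogram
    librosa.feature.spectral_contrast(S=stft, sr=sr).mean(axis=1),               # spectral contrast
    librosa.feature.tonnetz(y=librosa.effects.harmonic(y), sr=sr).mean(axis=1),  # tonnetz
])
print(features.shape)   # one fixed-length vector per utterance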
This document summarizes two experiments and an agent-based simulation that investigated optimization strategies in human spatial cognition and exploration. The studies found that people explore novel environments in two distinct patterns - extensive exploration that prioritizes increasing spatial knowledge, and limited exploration that aims to reduce travel distance. Both strategies reflect a trade-off between memory demands and travel costs. An agent-based model was able to simulate these two exploration behaviors and reproduce the same memory-distance trade-off as seen in humans. Overall, the research confirmed that human spatial exploration involves optimizing strategies to balance gaining spatial knowledge with minimizing travel efforts.
A Cognitive Constraint Model of Dual-Task Trade-offs in a Highly Dynamic Driv... (DBrumby)
Describes a modeling study of the strategic variations in distracted driving and their effects on driver performance. Demonstrates how a constraint modeling approach can be applied to complex dynamic tasks.
Age-related Driving Performance: Effect of fog under dual-task conditions (jkcrash12)
The present study investigated the driving performance of older and younger drivers using a dual-task paradigm. Drivers were required to perform a car-following task while detecting a signal-light change in a light array above the roadway in a driving simulator under different fog conditions. Decreased accuracy and longer response times were recorded for older drivers compared with younger drivers, especially under dense fog conditions. In addition, older drivers showed decreased car-following performance when simultaneously performing the light-detection task. These results suggest that under poor weather conditions (e.g., fog), with reduced visibility, older drivers may have an increased accident risk because of a decreased ability to perform multiple tasks.
This document provides information about a mentor for the 8th semester Bachelor of Architecture students at Sunder Deep Group of Institutions for the even semester of the 2016-2017 academic year. The mentor is Ar. Hemraj Chail from the Architecture department and he is responsible for 39 students in Section A, though their roll numbers are not listed.
In normal individuals, attention can be divided across many stimuli at a time, but this ability is impaired in patients with Parkinson's disease, leading to slowing of one or both tasks and affecting quality of daily life. People with Parkinson's also have difficulty performing movements in a sequence; these patients give preference to the cognitive task over the motor task, which leads to serious difficulties in their daily living.
This document summarizes theories of divided attention from psychological literature. It describes dual task experiments and factors like task similarity, difficulty, and practice that influence performance. Early theories proposed either a single, limited central processor (Kahneman) or multiple specialized modules (Allport). Later theories like multiple resource theory (Navon & Gopher) and Baddeley's model of working memory provided a synthesis, combining a central executive with modality-specific subsystems to better explain dual task findings. However, all theories have limitations in fully specifying the cognitive architecture underlying divided attention.
Psychology is the scientific study of behavior and mental processes. Its goals are to describe, predict, understand, and control behavior. It has many subfields including clinical, cognitive, developmental, and social psychology. Important figures in its history include Wilhelm Wundt, who founded the first psychology lab, and William James, who brought experimental psychology to the US. Major perspectives include biological, psychodynamic, cognitive, behavioral, and humanistic approaches.
The document discusses duality theory in linear programming (LP). It explains that for every LP primal problem, there exists an associated dual problem. The primal problem aims to optimize resource allocation, while the dual problem aims to determine the appropriate valuation of resources. The relationship between primal and dual problems is fundamental to duality theory. The document provides examples of primal and dual problems and their formulations. It also outlines some general rules for constructing the dual problem from the primal, as well as relations between optimal solutions of primal and dual problems.
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-b... (Jinho Choi)
This document summarizes a research paper that proposes a new transformer model for span-based question answering on dialogue transcripts. The model is pretrained on tasks like masked language modeling at the token and utterance level, as well as utterance order prediction, using the Friends TV show transcript corpus. It is then fine-tuned jointly on two tasks: utterance ID prediction and token span prediction. Evaluation on the FriendsQA dataset shows the proposed model outperforms BERT and RoBERTa baselines. However, analysis finds the model still struggles with inference in dialogues and representing speakers.
The document summarizes research on music generation using deep learning techniques. It reviews several papers on using RNNs, LSTMs and GANs for music style transfer, generation of melodies and accompaniments, and modeling long-term dependencies in music. It describes the Bach dataset used, representing music pieces as MIDI files. The proposed model architecture uses LSTM layers, dropout and fully connected layers with softmax. Experiments varied hyperparameters like number of LSTM layers, sizes and dropout rates to optimize F-measure and loss on training and test data.
The study evaluated the effectiveness of a computer assisted pronunciation training (CAPT) system called PARLING for teaching English pronunciation to Italian children. 28 children participated and were split into a control group that received teacher-led training and an experimental group that used PARLING. Both groups showed significant improvement in pronunciation quality from pre-test to post-test, with no significant differences between the groups, indicating that PARLING was as effective as teacher-led instruction. Difficult and unknown words showed greater improvement than easy words known by the children.
This document summarizes a study on how the structural position of sounds affects their acquisition by English learners of Spanish. It tested if learners rely on distributional information when acquiring sounds. The study found that learners were most successful with sounds that have overlapping distributions in English and Spanish, and least successful with sounds only in Spanish. This suggests learners do use distribution to learn sounds and confirms the importance of comparing sound systems between languages.
1) The document examines how reading ability influences the ability to learn talker-specific phonetic details through perceptual learning.
2) An experiment with average and advanced readers found that advanced readers had a faster rate of learning talkers' voices during training and showed greater benefits of talker familiarity on word recognition after training compared to average readers.
3) The results suggest that individual differences in talker-specific perceptual learning are related to differences in reading abilities.
This document summarizes a research workshop on task-based telecollaboration between Russian and English speakers and the development of strategic competence. It discusses how online collaboration can help develop intercultural strategic competence, including metacognitive, cognitive, social, affective, and compensation strategies. The document then summarizes several studies on collaboration between Russian and American students that show collaboration helps improve language skills, cultural competence, and motivation through increased strategy use, including planning, feedback, and social interaction.
The study examined the effect of context cue exposure on long-term memory recall performance. Sixty undergraduate students studied a word pair list while listening to classical music (context cue). In the test phase, half the students heard the music again while recalling the words, whereas the other half did not hear the music. Results showed that recall performance was significantly better for students who heard the context cue (classical music) in both phases compared to those who only heard it during study. The findings suggest that reinstating an encoding context cue can improve long-term memory recall.
Predictability of Consonant Perception Ability Through a Listening Comprehens... (Kosuke Sugai)
This study examined whether a typical English listening comprehension test can predict learners' ability to perceive English consonants. The researchers administered a 30-item listening comprehension test to 107 Japanese EFL learners and selected 22 learners who scored between 25-27. These learners then completed a phoneme judgment task with 17 minimal word pairs differing in initial consonants. The results showed the listening test did not predict learners' consonant perception abilities and that learners with similar listening scores varied in their overall and individual consonant perception skills. The study supports the idea that common listening tests do not measure phonetic abilities.
The document provides information about an upcoming psychology exam, including its length, structure, and marking scheme. It also defines key terms like studies and theories, and provides guidelines for describing and evaluating studies. Specific studies discussed include Craik and Tulving's 1975 experiment demonstrating better recall of words processed semantically versus structurally or phonetically. Memory and forgetting are defined in relation to encoding, storage and retrieval. Several memory theories are also outlined, such as the multi-store model, levels of processing theory, trace decay theory, and cue-dependent theory.
The Effect of Presence and Type of Encoding Cue on Memory - Erica Starr (Erica Starr)
Hofstra University Cognitive Psychology research seminar (Psy 190) - Erica Starr - Dec. '12 Final PowerPoint Presentation based on a research paper written for the course
The document summarizes a study on the effects of computer-assisted pronunciation readings (CPR) on ESL learners' use of pausing, stress, intonation, and comprehensibility. The study involved 75 intermediate ESL students who did 11 weeks of CPR tasks or no treatment (control group). Results showed the CPR group significantly improved in perceiving pausing, word stress, and intonation, as well as producing word stress, but not comprehensibility. The study concluded CPR helped perception and controlled production of some prosodic features but not spontaneous speech comprehensibility.
3. speech processing algorithms for perception improvement of hearing impaire... (k srikanth)
This document presents a novel algorithm to improve speech perception for hearing impaired patients using dichotic speech processing and presentation techniques. The algorithm splits speech into multiple frequency bands using dyadic filters to achieve bands of constant bandwidth, 1/3 octave bandwidth, and critical bandwidth. Listening tests on 5 subjects with hearing loss were conducted using 15 syllable speech material processed with the different filter sets. The results showed that processing with 1/3 octave bands improved recognition scores and reduced response times the most compared to unprocessed speech and speech processed with the other two filter sets. The algorithm aims to overcome issues like spectral and temporal masking that impair speech perception for those with sensorineural hearing loss.
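As a rough illustration of one of the filter sets mentioned (1/3-octave bands), the sketch below splits a signal into 1/3-octave bands with ordinary Butterworth band-pass filters; the centre-frequency range and filter order are assumptions, and the paper's dyadic filter design is not reproduced.

import numpy as np
from scipy.signal import butter, sosfiltfilt

def third_octave_bands(x, fs, f_low=125.0, f_high=4000.0, order=4):
    bands = []
    fc = f_low
    while fc <= f_high:
        lo, hi = fc / 2 ** (1 / 6), fc * 2 ** (1 / 6)    # 1/3-octave band edges
        sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
        bands.append(sosfiltfilt(sos, x))
        fc *= 2 ** (1 / 3)                               # next 1/3-octave centre frequency
    return np.array(bands)

fs = 16000
x = np.random.randn(fs)                                  # 1 s of hypothetical signal
print(third_octave_bands(x, fs).shape)                   # (number of bands, samples)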
Louise Stringer and Paul Iverson from UCL investigated how accent influences word recognition and electrophysiological measures of speech processing for native English and Spanish listeners. They found that a regional Scottish accent and non-native Spanish accent showed some influence on early phonological and lexical processing even in quiet conditions. More intelligible accents in noise elicited larger brain responses, suggesting processing difficulties with accented speech occur even without noise. Accents may affect listeners' expectations about upcoming words.
Improving students speaking skill through socio drama at... (Lily Andryani)
The document discusses using role play to improve students' speaking skills in an 11th grade social science class. It outlines how role play can engage students and help them practice communication. The study aims to determine if role play improves test scores in speaking ability. It describes collecting data through student interviews and dramas, then analyzing the results using a paired t-test to see if role play significantly impacts speaking skills. The significance of the study is that it could help teachers improve scores and make lessons more creative.
This study investigated differences in brain structural connectivity and the functional default mode network between deaf and hearing individuals using MRI. Results found increased activation in the posterior cingulate cortex, precuneus, and medial temporal lobes in the deaf group's default mode network. Analysis of structural connectivity found differences in node degree and fiber density in these areas and the motor cortex for the deaf group, suggesting neuronal plasticity related to sign language processing. Preliminary results provide new insights into brain network adaptations related to deafness and sign language use.
Multimedia Vocabulary Acquisition Research Article Critique (miahoward)
This study investigated the effect of different multimedia presentation modes on children's vocabulary learning using an interactive multimedia story. 135 Spanish children ages 8-9 were randomly assigned to groups that saw English words presented with words only, pictures only, or words and pictures. Those who saw words only performed better on immediate and delayed vocabulary post-tests compared to those who saw words with pictures or pictures only. The findings suggest that for children, presenting just words without additional pictures is more effective for second language vocabulary learning than combining words with pictures or using pictures alone.
Relationship between Vocabulary Size and Reading Comprehension (Bangkok University)
This study investigated the relationship between vocabulary size and reading comprehension ability in 30 Thai first-year university students studying English. The students completed vocabulary level tests assessing knowledge of words at the 2000, 3000 and 5000 frequency levels, as well as a reading comprehension test. Results showed the students' vocabulary size decreased as the word frequency levels increased. Vocabulary size at all three levels had a significant positive correlation with reading comprehension ability, with stronger correlations at higher frequency levels. The findings suggest vocabulary instruction should be a key part of English courses to help students build the vocabulary knowledge needed for effective reading.
BoysTownJobTalk
1. Exploratory Studies on Measuring Attentional Load Using a Dual-Task Paradigm: An Alternative Way to Evaluate Effects of Amplitude Compression
University of Nebraska-Lincoln
Sangsook Choi
2. 2
Overview
- Background and interests
- Motivation
- Hearing aids are not perfect
- A lot more to be discovered to improve hearing and communication for individuals with hearing loss
4. 4
Acoustic Consequences of Compression
- Temporal changes
- Reduced envelope modulation (illustrated in the sketch below)
- Distortion in amplitude envelopes
- Alteration in average spectrum level
- Other kinds of potential distortion
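A small numerical sketch of the "reduced envelope modulation" point above: a generic instantaneous power-law compressor (an assumption standing in for real WDRC processing) visibly shrinks the modulation depth of an amplitude-modulated tone.

import numpy as np
from scipy.signal import hilbert

fs = 16000
t = np.arange(fs) / fs
envelope = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)          # 4 Hz, speech-like modulation
x = envelope * np.sin(2 * np.pi * 1000 * t)               # modulated 1 kHz carrier

compressed = np.sign(x) * np.abs(x) ** 0.5                # generic ~2:1 amplitude compression

def modulation_depth(sig):
    env = np.abs(hilbert(sig))[200:-200]                  # Hilbert envelope, edges trimmed
    return (env.max() - env.min()) / (env.max() + env.min())

print(modulation_depth(x), modulation_depth(compressed))  # depth is smaller after compression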
5. 5
Compression & Intelligibility
- Conflicting results
- Compression both enhances and degrades the signal
- Different theoretical perspectives (temporal approach)
- Still in search of optimal compression to provide comfort and improve speech intelligibility
6. 6
Traditional Intelligibility Measures
- Recognition of phonemes, words, & sentences tested at threshold or suprathreshold levels
- Engineering tradition of measuring speech intelligibility to evaluate communication systems, based upon articulation index theory
7. 7
Limitations of Intelligibility Measures
- Validity, reliability, & sensitivity of speech recognition tests
- Low predictability of real-life performance
- Poor replicability
- High performance in word recognition often possible with poorly specified speech (e.g., cochlear-implant-processed signals)
8. 8
Limitations of Intelligibility Measures (cont.)
- Recognition-based tests oversimplify the listening process
- Understanding speech involves more than recognizing a sequence of phonemes
  - The listener integrates the acoustic signal with other information to comprehend the meaning of an utterance
  - Auditory disability is a complex problem
  - Recognition ability is just one measure of speech intelligibility
9. 9
Traditional Approach to Intelligibility
- Accuracy measures based on percent correct
- Overall success rate in speech tests is the result of true detectability plus the subject's internal state
  - Attention and arousal
  - Biases and expectations
- High intelligibility scores achieved through perceptual effort are often not taken into account
10. 10
Alternative Approaches
- Influence of cognitive psychology
  - Attention and immediate memory in the study of hearing and behavior (Broadbent, 1958; Rabbitt, 1966)
  - Listeners as a cognitive interface (Pisoni, 1981)
- Indices of processing difficulty (listening effort)
  - Reaction time measures (Pratt, 1981)
  - STM & free/serial recall (Luce et al., 1982)
  - Dual-task performance (Downs & Crum, 1978; Gordon et al., 1992)
11. 11
Dual-Task Performance & Listening Effort
- Attention is a limited resource (Broadbent, 1958)
- For multiple tasks, attention must be shared
- Kahneman's channel capacity model (1973)
- When processing capacity is overloaded, dual-task performance decreases
- Increased listening effort shows up as decreased performance on the dual task
12. 12
Overall Purpose
- Because of the limitations of traditional intelligibility measures, an alternative method was sought to supplement the traditional perspective
- To evaluate acoustic changes in speech, listening effort was considered as an additional measure of speech perception
- Capacity demands were measured as an index of listening effort using a dual task
13. 13
Hypotheses
- Increased processing demands due to distortion in the speech material may not be well reflected in speech recognition test performance, because of cognitive intervention (i.e., active top-down processes)
- The increase in processing demands may instead be reflected as decreased performance on a dual task, indicating increased attentional load
14. 14
Experiment 1
- The effect of amplitude compression on speech intelligibility and attention allocation was investigated
- The attentional load required for processing compressed speech was measured using a dual task to quantify listening effort
- A visual-motor tracking task was used to increase task demands
16. 16
Experimental measures
- Measure 1 (primary task): word recognition ability, scored as percent correct words
- Measure 2 (secondary task): visual-motor tracking ability, scored as percent of time on target
17. 17
Participants
- N = 64 (n1 = n2 = 32)
- Normal hearing
- Native English speakers
- Ages 19 to 55 (mean = 27)
- 60 females and 4 males
- Primarily students
18. 18
Material for Word Recognition
- Monosyllabic words
- Digitally recorded by a male talker with a Midwestern dialect
- Digitally mixed with speech-shaped noise at a 6 dB SNR (see the mixing sketch below)
- Two types of stimuli:
  - Compressed: WDRC
  - Uncompressed: linear
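A minimal sketch of the 6 dB SNR mixing step, assuming a mono signal and using white noise as a stand-in for speech-shaped noise; this is not the actual stimulus-preparation code.

import numpy as np

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)   # placeholder for a recorded word (1 s at 16 kHz)
noise = rng.standard_normal(16000)    # white noise stands in for speech-shaped noise

snr_db = 6.0
speech_rms = np.sqrt(np.mean(speech ** 2))
noise_rms = np.sqrt(np.mean(noise ** 2))
noise_scaled = noise * speech_rms / (noise_rms * 10 ** (snr_db / 20))   # scale noise for target SNR

mixed = speech + noise_scaled
print(20 * np.log10(speech_rms / np.sqrt(np.mean(noise_scaled ** 2))))  # resulting SNR, ~6 dB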
20. 20
Procedures for Word Recognition Task
- Random presentation by IDRT (Tice & Carrell, 1998) and TDT
- Binaural presentation
- Approximately 72 dB SPL at circumaural headphones
- Word repetition performed alone and along with the tracking task
21. 21
Visual Motor Task
- Pursuit Rotor: visual-motor learning
- Computerized version (Dlhopolsky, 2000)
- Tracking a dot using a mouse
- Scored as percent time on target (see the scoring sketch below)
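For illustration, the percent-time-on-target score could be computed from sampled cursor and target positions roughly as follows; the sample data and tolerance radius are hypothetical, not taken from the Pursuit Rotor program.

import numpy as np

rng = np.random.default_rng(1)
target = rng.uniform(-1, 1, size=(1000, 2))            # target position at each sample
cursor = target + rng.normal(0, 0.1, size=(1000, 2))   # cursor follows with some error

on_target = np.linalg.norm(cursor - target, axis=1) < 0.15   # within a tolerance radius
percent_time_on_target = 100.0 * on_target.mean()
print(round(percent_time_on_target, 1))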
28. 28
Conclusion and Discussion
- The effect of compression was to reduce performance on the simultaneous pursuit-rotor tracking task; the simultaneous tracking task did not reduce performance on the word-repetition task
- The pursuit-rotor performance decrement was interpreted as reflecting increased capacity demands for processing compressed speech
29. 29
Limitations
- Fatigue and learning effects
- Small effect size and weak statistical power
- This version of Pursuit Rotor is not customizable
- Calibrating the level of vigilance (the right amount of attention, which does not distract but helps concentration)
30. 30
Experiment 2
- A more sensitive measure of listener attention and effort was sought; a linguistic (lexical decision) task was investigated as the simultaneous task
- Linguistic distraction was explored to increase the level of interference, based upon the modal model of attention theory
31. 31
Modality of Interference
- Modular theories
- The degree of similarity between two tasks
- Similar tasks compete for the same processing modules
32. 32
Hypotheses
- It was assumed that the lexical decision task would draw on the same resources as word repetition to access the lexicon. The lexical decision task was therefore expected to produce a higher level of distraction than the visual-motor tracking task, and so to differentiate compressed from uncompressed speech more clearly.
- It was hypothesized that simultaneous lexical decision performance would be better for uncompressed than for compressed speech, based on the presumed reduction of cues available in compressed words.
33. 33
Experimental Measures
- Measure 1 (primary): word recognition
  - Percent correct words
  - Open-set format: verbal repetition
- Measure 2 (secondary): lexicality
  - P(c)max: unbiased measure from signal detection theory (see the computation sketch below)
  - Forced choice: word vs. nonword
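P(c)max is the bias-free proportion correct derived from d' in equal-variance signal detection theory; a minimal sketch of the computation, with hypothetical hit and false-alarm counts, is given below.

from scipy.stats import norm

hits, misses = 45, 5          # hypothetical "word" trials
fas, crs = 12, 38             # hypothetical "nonword" trials

hit_rate = hits / (hits + misses)
fa_rate = fas / (fas + crs)

d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
pc_max = norm.cdf(d_prime / 2.0)      # proportion correct of an unbiased observer
print(round(d_prime, 2), round(pc_max, 3))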
34. 34
Participants
- N = 40 (n1 = n2 = 20)
- Normal hearing
- Native English speakers
- Ages 19 to 41 (mean = 24)
- 37 females and 3 males
- Primarily students
35. 35
Visual Stimuli for Lexical Decision
- Lexical lists consisted of 50% words (Balota & Spieler, 1998) and 50% nonwords (Washington University Lexicon Project website)
- Subjective familiarity, word length, reaction time, and orthographic frequency were matched across the two lexical lists
- 48 points tall, Arial typeface
- 15-inch monitor
- Each word placed on a Gaussian noise background
- Prepared in Corel PhotoPaint 11
36. 36
Examples of Visual Stimuli
The left panel shows the word "THAW" and the right panel shows the nonword "ACAS", as used in the lexical decision task.
37. 37
Apparatus & Procedure for Lexical Decision
- Random presentation of visual stimuli controlled using Presentation 0.7 (Neurobehavioral Systems, 2003)
- Displayed on a 15-inch monitor
- Resolution:
- Distance between the subject and the computer screen: approximately 63 cm
- Visual stimuli displayed at the center of the screen for 500 ms
- Subjects were asked to push a button labeled either "word" or "nonword" corresponding to their lexical decision
38. 38
The participant pushes the lexicality button in response to a visual stimulus. The word "SCHEME" is displayed for the lexical decision task.
41. 41
Conclusions and Discussion
- The dual task using the lexical decision task failed to measure increased listening effort due to compression
- Linguistic distraction was no better than visual-motor distraction
- The inconsistent findings across dual tasks might have resulted from an insufficient level of distraction (task difficulty)
42. 42
Conclusion and Discussion (cont.)
- Use of sensory memory (echoic vs. iconic)
- Multiple resource theory seems to apply to the findings: stage of processing, code of processing, and modalities of input and output
- Two-thirds of participants reported more difficulty performing a dual task than a single task; there might be a relation between task difficulty and level of attention
46. 46
Overall Conclusion and Discussion
- Compression yielded lower word intelligibility compared to linear processing
- Compression decreased tracking ability but not lexicality
- Task difficulty seems to be a more important factor than task similarity
47. 47
Future Research
- Additional levels of distraction should be investigated to find performance functions for simultaneous tasks
- Accurate measures of listener effort
- Similar levels of distraction to be developed across different tasks
- Additional dependent measures might be examined
  - Non-word vocalization, slurred articulation, and vocal loudness