THE FLORIDA STATE UNIVERSITY 
COLLEGE OF MUSIC 
THE EFFECTS OF MUSIC TRAINING AND SELECTIVE ATTENTION ON WORKING 
MEMORY DURING BIMODAL PROCESSING OF AUDITORY AND VISUAL STIMULI 
By 
JENNIFER D. JONES 
A Dissertation submitted to the 
College of Music 
in partial fulfillment of the 
requirements for the degree of 
Doctor of Philosophy 
Degree Awarded: 
Summer Semester, 2006
UMI Number: 3232396 
Copyright 2006 by 
Jones, Jennifer D. 
All rights reserved. 
UMI Microform 3232396 
Copyright 2006 by ProQuest Information and Learning Company. 
All rights reserved. This microform edition is protected against 
unauthorized copying under Title 17, United States Code. 
ProQuest Information and Learning Company 
300 North Zeeb Road 
P.O. Box 1346 
Ann Arbor, MI 48106-1346
The members of the Committee approve the dissertation of Jennifer D. Jones defended on June 15, 2006. 
____________________________________________ 
Jayne M. Standley 
Professor Directing Dissertation 
___________________________________________ 
Jeffrey James 
Outside Committee Member 
____________________________________________ 
John M. Geringer 
Committee Member 
____________________________________________ 
Clifford K. Madsen 
Committee Member 
The Office of Graduate Studies has verified and approved the above named committee members.
ACKNOWLEDGEMENT 
I wish to thank Dr. Jayne Standley for always having answers to my questions and for 
supporting my research interests. To Dr. Cliff Madsen, I wish to say thank you for ‘getting me.’ 
It was an honor to teach with you. To Dr. Geringer, you challenged me to think harder than I 
even thought was possible! Thanks for the stats-induced headaches. 
I wish to thank my parents for the roots and wings. Thank you, Mother, for always 
having time to listen to me. Thank you, Daddy, for determination and great math genes. There is 
no way to adequately thank my husband, Jon Jones. You have had many roles – data-miner, 
editor-in-chief, financier, chef, computer technician, cat-feeder, and many more. Mostly, I thank 
you for always saying, “Yes, you can” every time I claimed I couldn’t, and “Yes, you will” when 
I claimed I wouldn’t. Better than better!
TABLE OF CONTENTS 
List of Tables vii 
List of Figures viii 
Abstract ix 
1. INTRODUCTION 1 
2. REVIEW OF LITERATURE 5 
Theoretical Framework 5 
Attention Research 8 
Measuring Attention and Attention as a Central Resource 8 
Endogenous and Exogenous Attention 11 
Attention Research with Infants and Children 12 
Attention, Intelligence, and Development 14 
Selective Attention Research 16 
Auditory and Visual Stimuli 17 
Commonalities and Synergy 17 
Localization and Cueing Differences with Cross-modal Stimuli 19 
Visual Attention During Auditory Distraction – Irrelevant Sound Paradigm 20 
Auditory and Visual Dominance Theories 23 
Dichotic Listening Paradigm 25 
Dual Audio Tasks – Pitch and Duration Judgments (non-dichotic) 29 
Music and Memory 30 
Memory and Songs – The Influence of Auditory Structure on Serial Verbal Recall 30 
Theories of Music Processing and Memory 34 
Melody Recognition 35 
Developmental and Training Differences 35 
Pitch, Rhythm, Contour, and Timbre Discrimination 36 
Searching for Melodic Targets 39 
Attention During Multi-Voice Music 40 
Error Detection and Expectancy as Evidence of Focus of Attention During Multi-Voice Music 42 
Gestalt in Music – Extracting Parts from Wholes 46 
Attention to Music – Focused and Therapeutic Listening 47 
Bimodal Experiences with Music – Complementary and Non-complementary Audio-Visual 50 
Encoding and Decoding Music Through Visual, Auditory and Tactile Senses 52
Music Training and Memory Research 54 
Gender Differences in Music and Memory Research 58 
The Present Study 61 
3. METHOD 63 
Pilot Study Materials 63 
Music (Auditory Stimuli) 63 
Images (Visual Stimuli) 68 
Video Development 70 
Posttest Construction 70 
Procedure for Pilot Study 72 
Results of Pilot Study 73 
Changes to Music Stimuli for Main Experiment 80 
Changes to Test Construction for Main Experiment 81 
Main Experiment 83 
Participants 83 
Design 86 
Procedures 88 
4. RESULTS 92 
Familiarity – Ratings and Total Correct Scores 92 
Perception of Attention Allocation 94 
Analyses of Modality of Error Scores 95 
Analysis of Question Type under Music Conditions 97 
Memory Strategies 100 
Analyses for Memory Decay 103 
Analyses for Serial Position Effects 104 
Analyses of Posttest Questions 107 
5. DISCUSSION 109 
Summary of Results 109 
Music Training Effects 109 
Recognition Versus Rejection of Stimuli in Working Memory 111 
The Role of Strategy 112 
Rhythm, Attention, and Information Processing 113 
Contour- Similar or Dissimilar to Target Melody 114 
Serial Position Effects – Expectancy Theory 115 
Attention States 116 
Implications for Music Education and Music Therapy 118 
Appendix A: Main Experiment – Composition of Audio Distractors for The Bailiff’s Daughter and Pomp and Circumstance 120 
Appendix B: Informed Consent Form and Approval Letter from Human Subjects Committee 127 
Appendix C: Pilot Study – Pre-Experiment Questionnaire 130 
Appendix D: Main Experiment – Posttest Form for Pomp and Circumstance 132 
Appendix E: Main Experiment – Posttest Form for The Bailiff’s Daughter 135 
Appendix F: Pilot Study – Post-Experiment Questionnaire 138 
Appendix G: Pilot Study – Instructions Script 140 
Appendix H: Pilot Study – Practice Test Form 143 
Appendix I: Main Experiment – Pre-Experiment Questionnaire 145 
Appendix J: Main Experiment – Practice Test Form 147 
Appendix K: Main Experiment – Post-Experiment Questionnaire 149 
Appendix L: Audio Instructions Accompanying Introductory and Instruction Slides in Experiment Videos 152 
Appendix M: Introductory, Instruction, and Practice Test Slides 155 
Appendix N: Main Experiment – The Bailiff’s Daughter Posttest 159 
Appendix O: Main Experiment – Pomp and Circumstance Posttest 163 
Appendix P: The Bailiff’s Daughter Video 167 
Appendix Q: Pomp and Circumstance Video 169 
Appendix R: Raw Data Spreadsheets 171 
REFERENCES 226 
BIOGRAPHICAL SKETCH 241
LIST OF TABLES 
1. Musicianship as Defined by a Sample of Reviewed Literature 59 
2. Descriptive Statistics on Pilot Data Group by Music Type 76 
3. Question Analysis for The Bailiff’s Daughter – Pilot Study 77 
4. Question Analysis for Pomp and Circumstance – Pilot Study 78 
5. Academic Majors of the Participants 85 
6. Mean Estimated Hours of Music Heard and Performed By Participants Groups 86 
7. Study Design – Participant Distribution by Gender, Major, and Instruction Type 87 
8. Perception of Attention Allocation to Music and Pictures for Familiar and Unfamiliar Music 94 
9. Mean Correct for Each Question Type for The Bailiff’s Daughter 97 
10. Mean Correct for Each Question Type for Pomp and Circumstance 99 
11. Frequency Distribution of Strategies Used by Participants 101 
12. Mean Total Score for Familiar and Unfamiliar Music By Strategy Type 102 
13. Frequency Distribution of Strategies by Instruction Type for Music Conditions 103 
14. Frequency Distribution of Total Correct Responses for Pictures by Serial Position 105 
15. The Bailiff’s Daughter Frequency Distribution of Total Correct Responses for Music Measures 106 
16. Pomp and Circumstance Frequency Distribution of Total Correct Responses for Music Measures 107 
17. Bailiff’s Daughter Distractor Music Items Failing Criterion 108
LIST OF FIGURES 
1. Pomp and Circumstance – original 64 
2. Pomp and Circumstance – pilot study version 65 
3. The Bailiff’s Daughter – original 65 
4. The Bailiff’s Daughter – pilot study version 66 
5. Hail to the Chief – original notation 67 
6. Hail to the Chief – pilot study version 67 
7. The Farmer’s Boy – original key and notation 67 
8. The Farmer’s Boy – pilot study version 68 
9. The Bailiff’s Daughter order of visual training stimuli (black images on white screen) 69 
10. Pomp and Circumstance order of visual training stimuli (blue images on white screen) 69 
11. Hail to the Chief order of images (green on white screen) for training stimulus, practice test 2 70 
12. The Farmer’s Boy order of images (red on white screen) for training stimulus, practice test 1 70 
13. Pomp and Circumstance – main experiment 81 
14. The Bailiff’s Daughter – main experiment 81 
15. Experimental laboratory set-up 89 
16. Familiarity ratings by major interaction 92 
17. Total correct scores – interaction between major and instruction 93 
18. Picture errors – major by instruction interaction 96 
19. Music errors – major by instruction interaction 96 
20. The Bailiff’s Daughter – question type by gender interaction 98 
21. Pomp and Circumstance – question type by gender interaction 99 
22. Pomp and Circumstance – question type by major interaction 100 
23. Pomp and Circumstance – test half by order interaction 104 
24. Question #22 distractor music 107
ABSTRACT 
Researchers have investigated participants’ abilities to recall various auditory and visual 
stimuli presented simultaneously during conditions of divided and selective attention. These 
investigations have rarely used actual music as the auditory stimuli. Music researchers have 
thoroughly investigated melodic recognition, but non-complementary visual stimuli and attention 
conditions have rarely been applied during such studies. The purpose of this study was to 
examine the effects of music training and selective attention on recall of paired melodic and 
pictorial stimuli in a recognition memory paradigm. 
A total of 192 music and non-music majors viewed one of six researcher-prepared 
training videotapes containing eight images sequenced with a highly familiar music selection and 
an unfamiliar music selection under one of three attention conditions: divided attention, selective 
attention to music, and selective attention to pictures. A 24-question posttest presented bimodal 
test items that were paired during the training, paired distractors, a music trainer with a picture 
distractor, or a picture trainer with a music distractor. Total correct scores, error scores by 
modality, and scores by question type were obtained and analyzed. 
Results indicated that there were significant differences between music and non-music 
majors’ recall of the bimodal stimuli under selective attention conditions. Music majors 
consistently outperformed non-music majors in divided attention and selective attention to music 
conditions, while non-music majors outperformed music majors during selective attention to 
pictures. Music majors were better able to reject distractor music than were non-music majors. 
Music majors made fewer music errors than non-music majors. However, an unanticipated effect 
of gender was found. Females were better at recognizing paired trainers and males were better at 
rejecting distractors for both music conditions. Individually selected memory strategies did not 
significantly impact total scores. 
Analyses of sample error rates to individual questions revealed memory effects for music 
due to serial position and rhythmic complexity of stimuli. Participants poorly recalled the final 
measure of both music conditions. This finding was unusual since this position is generally 
memorable in serial recall tasks. Simple rhythmic contexts were not remembered as well as more 
complex ones. The measures containing four quarter notes were not well recalled, even when 
tested two times. 
This study confirmed that selective attention protocols could be successfully applied to a 
melodic recognition paradigm with participants possessing various levels of music training. The 
effect of rhythmic complexity on memory requires further investigation, as does the effect of 
gender on recognition of melody. A better understanding of what makes a melody memorable 
would allow music educators and music therapists the opportunity to devise and teach effective 
strategies. 
CHAPTER 1 
INTRODUCTION 
Can people do two things at once? When asked, most individuals readily answer “yes” or 
“no” to this question. Each seems to understand his/her own capacity and preference for 
‘multitasking.’ Those who answer “yes” perceive that their performance is not compromised and 
may be enhanced in highly stimulating environments. Those who answer “no” perceive that they 
perform best when completing a single task at a time. Is either of these groups correct? Researchers 
have investigated many aspects of this conundrum – are humans single-, double-, or multi-channel 
thinkers? What conditions influence performance accuracy? What stimulus properties influence 
the outcome of dual task events? What roles do attention, memory, and experience/training play? 
Researchers have discovered partial answers to many of these questions, but as questions are 
answered, technology advances and researchers are faced with new sensory environments that pose 
new challenges for research. 
Modern environments are saturated with stimuli; one prevalent environmental stimulus is 
music, made more readily available to listeners than at any other time in history by the iPod and 
other portable devices. Researchers have confirmed that music is ever-present in today’s society; 
we are influenced by music everywhere from work (Lesiuk, 2005) to restaurants (Caldwell & 
Hibbert, 2002). Many times the hearer chooses the music; other times, listeners have little 
control over sound environments (North, Hargreaves, & Hargreaves, 2004). Teens and young 
adults have reported listening to music from 2.5 or 3 hours per day to as much as 40 hours per 
week (Gardstrom, 1999; North, Hargreaves, & O’Neill, 2000; Radvansky, Fleming, & Simmons, 
1995; Schwartz & Fouts, 2003; Tarrant, North, & Hargreaves, 2000). While listening to music is 
a highly valued leisure activity, it is frequently secondary to another media event, such as 
watching television and reading (Kubey & Larson, 1990). Therefore, listening to music can be 
researched in the context of dual task experiments. 
In fact, many aspects of music listening constitute a dual task, even when listening to the 
music is the primary objective. Most music contains both pitch and rhythm information. 
Researchers have investigated the degree to which listeners can attend separately to each of these 
components (Byo, 1997; Demorest & Serlin, 1997; Sink, 1983). Additionally, music can present 
different timbres and degrees of intensity for listeners’ attentional foci (Madsen & Geringer, 
1990; Radvansky et al., 1995; Wolpert, 1990). Songs introduce yet another stimulus by the 
presence of text (Bonnel, Faita, Peretz, & Besson, 2001). Performing music also provides a 
number of dual task opportunities, including the dual auditory tasks of listening to one’s own 
performance while being aware of the performance of others and the inclusion of visual tasks 
when performing from notation. 
Investigations of bimodal audio-visual processing have included musical and non-musical 
stimuli. Research on the effects of soundtracks to movies provides clarity on music’s 
influence upon mood (Boltz, 2001) in addition to its impact upon memory for mood-related 
aspects of films (Marshall & Cohen, 1988). Other research involving short tone sequences and 
common sounds (door bell, duck quack) paired with visual images has revealed developmental 
differences in the reliance upon our eyes and ears for information (Napolitano & Sloutsky, 2004; 
Robinson & Sloutsky, 2004; Sloutsky & Napolitano, 2003). Sloutsky (2003, 2004) and 
colleagues discovered that young children (4-year-olds) relied more heavily on auditory 
information when encoding bimodal stimuli. This was termed an auditory processing bias. 
Additionally, children were not able to shift their attention to visual aspects when instructed or 
able to use this information successfully during testing. In contrast, the visually-dominant adults 
could shift their attention to auditory inputs successfully. 
Baddeley’s (1986) components of working memory, namely the phonological loop, 
visuospatial sketchpad, and central executive function, provide a framework for examining recall 
for visual and auditory events. According to this proposal, visual and auditory events are 
processed in different memory sub-components with central executive function acting like a 
coordinator for incoming information. Contemporary information processing theory concurs with 
Baddeley’s ideas of separate stores for different incoming stimuli. Information processing 
theorists propose that there are filters or buffers that prevent the working memory system from 
overloading by prioritizing information into sensory traces, data-driven, and process-driven 
concepts (Klahr & MacWhinney, 1998). Information processing theory also categorizes 
information as serial, including music, speech, and other events that unfold in time, or parallel, 
including many visual stimuli. While speech and music are both examples of serial auditory 
processing, brain studies have discovered that different areas of the brain are used when 
processing these events. 
Through brain scanning technology, researchers have found that verbal, auditory (Mirz et 
al., 1999), and musical stimuli are processed differently. Generally, the left hemisphere is 
specialized for speech (Jeffries, Fritz, & Braun, 2003) and words (Samson & Zatorre, 1991) 
while the right hemisphere processes melody. Rhythm judgment did not appear to be lateralized 
to the left or right hemisphere (Dennis & Hopyan, 2001; Plenger et al., 1996). Curiously, though 
some aspects of music are processed in the opposite hemisphere from verbal data, people with 
musical training demonstrate superior verbal memory (Ho, Cheung, & Chan, 2003; Kilgour, 
Jakobson, & Cuddy, 2000). No such advantage was found for visual memory (Ho et al., 2003). 
Seemingly, musicians’ systematic use of their auditory attention and processing yields superior 
skills in the general auditory domain. However, few studies have examined how musicians 
compare to others when bimodal audio (musical)-visual tasks are presented to them. 
Other research has focused on the differences in the male and female brain. The male 
brain is characterized as being designed for understanding and building systems (and extracting 
rules that govern systems) while the female brain is more socially oriented (Baron-Cohen, 2005). 
Likewise, males and females, both infants and adults, responded to music differently, particularly 
when under stress (Standley, 1998, 2000), and female infants have more acute hearing (Cassidy 
& Ditty, 2001). However, studies examining tonal memory (Norris, 2000), attention responses 
(Richard, Normandau, Brun, & Maillet, 2004), and mental capacity (Johnson, Im-Bolter, & 
Pascual-Leone, 2003) found no differences between the sexes. Often researchers assign equal 
numbers of each sex to participant groups without reporting differences. At this point, it is not 
conclusive whether differences in audio-visual memory between men and women exist. 
The present study was designed to investigate how musicians versus nonmusicians and 
males versus females remember paired musical (auditory) and visual components following 
bimodal encoding examined in a recognition memory paradigm. The effects of attention on the 
recall of bimodal stimuli among the groups were tested through selective attention instructions. 
Additionally, this study sought to determine the differences between dual encoding of unfamiliar 
music with unfamiliar images and retrieval of familiar music paired with encoding of 
unfamiliar images. It was expected that differences in the recall of musical events between 
musicians and nonmusicians would emerge, though a difference between males and females was 
not projected. The natural patterns of attention to visual and audio/musical events were examined 
in the group receiving no selective attention instructions. The degrees to which participants could 
manipulate their attention patterns to musical and visual stimuli were also tested. Encoding and 
memory strategies of the groups were categorized. 
CHAPTER 2 
REVIEW OF LITERATURE 
Theoretical Framework 
Researchers have long investigated how humans remember (James, 1902). Numerous 
forms of memory have been differentiated, including implicit and explicit memory (Schacter, 
1993), autobiographical memory, episodic, and semantic memory (Nelson, 1993) among the 
more generally recognized long-term and short-term memory divisions. Few theorists debate the 
existence and functions of a long-term memory component, though short-term memory theory 
has undergone and continues to undergo revision. Alterations in short-term memory theory from 
the late 1950s to the early 1970s included a shift from the dominant view of memory as a 
“relatively undifferentiated unitary system” (Baddeley, 1976, p. 187) to one of distinct stores for 
acoustically based, limited-capacity, short-term storage and a more durable long-term store of 
considerable capacity. Even in the mid 1970s, growing evidence that sensory systems (visual, 
auditory, and kinesthetic) may have a unique memory store compelled further differentiation of 
short-term memory theory. Baddeley’s (1986) conceptualization of working memory in three 
major divisions has proven hardy enough to withstand rigorous research and has spawned years 
of debate. This concept of the central executive with its slave components, the phonological loop 
and visuospatial sketchpad, provided a framework not only for researching visual and auditory 
information processing, but also for examining how information is selected, organized, 
coordinated, stored, and ultimately remembered (Baddeley & Hitch, 1974). 
Broadbent (1971) contributed the concept of information selection. In the 
initial proposal in 1958, there was a single channel with a limited capacity through which all 
information funneled. The capacity limit of the single channel was a function of the rate of 
information flow, meaning that the organism needed time to process the stream of incoming 
information. In order to accommodate this limit, selective perceptual processing utilized a buffer 
store and a filtering system (Broadbent, 1971). Information could be held in the store and 
selected information filtered out for immediate processing. Broadbent also contributed the 
concepts of vigilance and expectancy to early work in information processing. Today, 
vigilance and expectancy are probably best conceived as functions of attention, including 
selective attention, sustained attention, and inhibiting or ignoring distracting stimuli. Jones 
(1999) credited Broadbent with the concept of selective attention, particularly for auditory 
events, that has been exhaustively researched through the dichotic listening paradigm. 
The role of attention during information processing has continued to be important to 
working memory theory development. Treisman and Davies (1973) proposed the concept of 
parallel attention and processing, thereby expanding the single channel theory. In parallel 
processing, incoming stimuli of different modalities or different properties of the same stimuli 
can be analyzed at the same time because they do not use the same resources. Two studies 
conducted by Allport, Antonis, and Reynolds (1972) supported the hypothesis. Allport et al. 
(1972) proposed a multi-channel hypothesis after finding that participants could accurately 
recall photographs presented while they were engaged in speech shadowing. A second study 
involved music majors who were able to sight-read piano music while speech shadowing easy and 
difficult prose passages. Memory posttest scores for the prose passages did not differ 
significantly between shadowing while sight-reading on the piano and shadowing alone. 
Furthermore, error rates during the piano task were not significantly different in session 2 
under divided or focused attention. This research clarified that 
when the simultaneous tasks are different enough, each can be completed successfully. The 
allocation of attention to one task, speech shadowing, did not affect visual memory or motor 
performance. 
Cowan (1995) proposed that attention and memory intersected and conceptualized 
working memory in terms of focus of attention. Cowan (1998) devised a definition of working 
memory as follows: “Working memory is the collection of mental processes that permit 
information to be held temporarily in an accessible state, in the service of some mental activity” 
(Cowan, 1998, p. 77). He further compared his working memory system with Baddeley’s 
system (central executive, phonological and visuospatial stores and processors) acknowledging 
the differing sensory memories for visual and auditory events. Cowan’s working memory system 
is composed of a capacity-limited focus of attention along with temporarily activated information 
in permanent memory. According to Cowan, attention to events summons long-term memory 
resources that allow for semantic processing (Cowan, 2005). The use of long-term resources, 
such as chunking and schemata, was not a new concept. 
Miller (1956) proposed chunking as a means of overcoming capacity limits for short-term 
memory. The basic concept of chunking was that like information was perceptually grouped such 
that information was encoded in small groups rather than a string. One of the most common 
examples of chunking is remembering telephone numbers in groups of three or four. Researchers 
have investigated the degree to which chunks are rendered by stimulus features or by cognitive 
processes of the observer (Crawley, Acker-Mills, Pastore, & Weil, 2002; Green & McKeown, 2001). 
Stimulus-driven chunking would provide evidence of a bottom-up or data-driven approach while 
schema-driven chunking would be indicative of a top-down or concept-driven approach, using 
information technology language (Klahr & MacWhinney, 1998). Chunking obviously occurs; the 
degree to which it is perceptually (automatic) or conceptually (thought) driven is still under 
investigation. 
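The telephone-number example of chunking can be illustrated with a minimal sketch. The function name `chunk_digits`, the group sizes, and the ten-digit number are hypothetical choices made for illustration only; they are not drawn from Miller (1956) or from the present study.

```python
def chunk_digits(digits, sizes):
    """Split a digit string into consecutive groups of the given sizes."""
    if sum(sizes) != len(digits):
        raise ValueError("group sizes must add up to the number of digits")
    chunks = []
    start = 0
    for size in sizes:
        # Take the next `size` digits as one chunk.
        chunks.append(digits[start:start + size])
        start += size
    return chunks


number = "8505550142"  # hypothetical ten-digit telephone number
chunks = chunk_digits(number, [3, 3, 4])
print(chunks)  # ['850', '555', '0142']
print(len(number), "digits held as", len(chunks), "chunks")
```

In this sketch, ten separate digits are held as three groups, which is the sense in which chunking is said to ease short-term capacity limits.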
Information processing theory has developed alongside the understanding and 
development of computer programming. Based upon storage units, filters, and schemata 
proposed by psychological theorists, programmers have designed computer models that imitate 
human information processing. One particular area of interest has been auditory selective 
attention, a topic intriguing to cognitive psychologists, neurologists, musicians, and 
computational engineers alike (Wrigley & Brown, 2004). Psychologists have developed and 
tested theories with behavioral experiments, and computer engineers have integrated information 
from electrophysiology and neurology. A model was developed that grouped incoming streams 
of intentional information and allowed for ‘leaks’ representative of the processed unintentional 
auditory streams. The model represented a culmination of the research on the processing of 
attended and unattended auditory events. Psychologists have tested this phenomenon by using 
multiple auditory streams or auditory streams in addition to other input and asking participants to 
divide attention across streams or select a single stream. Many researchers have applied the 
selective attention paradigm to both auditory and bimodal (often auditory and visual) tasks. 
The attention research for bimodal audio-visual and dual/multiple audio events has been 
framed by Baddeley’s three components of working memory (Baddeley, 1976, 1986; Baddeley 
& Hitch, 1974) and contemporary theories of information processing with an elaborate system 
of sensory-specific filters and stores. Investigations have been designed to better understand the 
capacity limits for unimodal information and bimodal information, particularly the unique 
differences between auditory and visual events. Investigations on the recall of serial and 
nonserial information as well as paired or associated events have provided understanding of the 
coordination processes during information processing. Researchers have manipulated encoding 
methods, rehearsal times and strategies, and output. Measurement of performance has included 
reaction or response times, accuracy rates, including hit and false alarm rates, and a number of 
brain scan technologies. Designs have included detection, recognition, and discrimination 
protocols, each contributing different and, at times, conflicting outcomes. Participants have been 
instructed to attend to stimuli, ignore stimuli, and divide attention across events. Through these 
expansive and complex investigations of attention and working memory during dual tasks or 
multisensory environments, much has been learned about how humans process and retain 
information. 
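The hit and false alarm rates mentioned above can be computed from recognition responses as in the following sketch. The function name `recognition_rates` and the response data are hypothetical and serve only to show the arithmetic; they do not reproduce any data set cited in this review.

```python
def recognition_rates(responses):
    """Return (hit_rate, false_alarm_rate) for a recognition test.

    `responses` is a list of (was_presented, said_yes) boolean pairs:
    was_presented is True for a previously presented item and False for a
    distractor; said_yes is True when the participant judged the item as
    previously presented.
    """
    old_total = sum(1 for presented, _ in responses if presented)
    new_total = sum(1 for presented, _ in responses if not presented)
    if old_total == 0 or new_total == 0:
        raise ValueError("need at least one presented item and one distractor")
    hits = sum(1 for presented, yes in responses if presented and yes)
    false_alarms = sum(1 for presented, yes in responses if not presented and yes)
    return hits / old_total, false_alarms / new_total


# Hypothetical responses: four presented items followed by four distractors.
example = [
    (True, True), (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
hit_rate, false_alarm_rate = recognition_rates(example)
print(hit_rate, false_alarm_rate)  # 0.75 0.25
```

A hit is a "yes" response to a presented item, and a false alarm is a "yes" response to a distractor; reporting both separates correct recognition from a general tendency to respond "yes."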
While educators are particularly invested in understanding how learning is differentially 
achieved through vision and audition, research on learning through different senses has been of 
interest to a broad audience. The effects of multisensory input are of special interest to music 
therapists and music educators. While the researchers studying attention during dual audio tasks 
have systematically manipulated frequency, duration, and intensity in short tone sequences, 
fewer researchers have used actual music. Music as a stimulus provides a number of foci for 
attention, including rhythm, pitch, melody, and harmony, to name a few. Additionally, cognitive 
memory strategies can be examined using music (Madsen & Madsen, 2002). The use of music as 
an agent for facilitating the learning of non-musical information has been investigated in addition 
to the teaching of music understanding and performance. The effect of music training upon 
memory for verbal and nonverbal information has provided a fertile ground for examining the 
role of experience in memory and attention. 
Attention Research 
Measuring Attention and Attention as a Central Resource 
Auditory attention has proven to be a difficult construct to measure, though James (1902) 
claimed that we all know what attention is. Two tests have claimed to measure auditory 
selective attention, namely the Goldman-Fristoe-Woodcock Auditory Selective Attention Test and 
the Flowers Auditory Test of Selective Attention. Glass, Franks, and Potter (1986) compared 
these tests to determine if they indeed measured the same construct; the researchers found the 
tests to correlate (r = .44). Though the correlation was positive, the relative weakness of the 
relationship indicated that the construct of auditory selective attention was broad. Auditory 
selective attention ranges from being aware of and localizing sounds, to perceptual processing of 
relevance, to ignoring distraction and maintaining focus of attention over time. Despite 
the difficulties presented by empirical measures of attention, a study by Kahneman, Ben-Ishai, 
and Lotan (1973) demonstrated the validity of attention as a construct. Based upon a high 
number of accidents – a behavior attributed to poor attention – bus drivers were tested for 
auditory attention. Kahneman et al. (1973) found moderate, positive correlations between the 
number of accidents per year by professional bus drivers in Israel and a test of selective auditory 
attention. 
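One way to gauge the relative weakness of the r = .44 correlation between the two selective attention tests is to square it, giving the proportion of variance the two sets of scores share. This is a standard statistical interpretation added here for illustration, not a figure reported by Glass, Franks, and Potter (1986), and the function name `shared_variance` is hypothetical.

```python
def shared_variance(r):
    """Return r squared, the proportion of variance shared by two measures."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r ** 2


r = 0.44  # correlation between the two tests as cited in the text
print(round(shared_variance(r), 4))  # 0.1936
```

Under this reading, the two tests share roughly 19% of their variance, which is consistent with the view that auditory selective attention is a broad construct.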
Cowan (1995, 2005) has established the importance of attention to working memory; 
therefore, measures of working memory may provide an estimate of one’s attention. One 
aspect of attention that has been frequently measured is the ability to resist distraction. 
Resistance to auditory distraction, referred to as the irrelevant sound paradigm (Jones, 1999), is a 
common technique. Beaman (2004) gave participants the Operations Span Task (OSPAN) for 
working memory during conditions of auditory distraction. He then examined the 
relationship between the test scores and behavior. The OSPAN was not predictive of the irrelevant 
sound effect on serial or free recall of verbal material (Beaman, 2004). Irrelevant speech sounds 
and words affected both high and low scorers on the OSPAN. Though the relationship was not 
strong enough to reach significance, low span individuals were more likely to 
experience intrusion from previous list trials than high span individuals. It does appear that these 
tests are spotlighting the same concept, though they have demonstrated flaws. 
Morey and Cowan (2004, 2005) verified attention to be a central resource in working 
memory. In the 2004 study, participants were asked to recite their 7-digit phone number, a 
random set of 7 digits, 2 digits, or no digits while examining visual arrays with 4, 6, or 8 colored 
squares and subsequently making same-different judgments on the visual arrays. The recitation 
of numbers was designed to prevent verbal rehearsal of the positions and colors of squares in the 
visual array by directing attention to numerical recitation. There were significant differences in 
scores by visual array size and recital condition with no significant interactions between the 
variables. Performance was best for the smallest array size. The participants’ scores were poorest 
when reciting 7 random digits in comparison to the other three conditions, which did not differ 
significantly from one another. The authors concluded that some shared space in working 
memory was used for digits and visual information. The 2005 study provided additional support 
for the central resource for attention. Morey and Cowan (2005) found that participants were able 
to make visual array judgments of same or different equally well under no digit recall conditions 
and silent rehearsal of digits conditions, but vocal rehearsal of the digit lists significantly 
impacted visual array judgments. The recitation of the to-be-remembered digits, whether before 
the first visual array or after, interfered with visual memory. Attention was presumably drawn 
away from the visual task during recall of the list, indicative of a central attentional resource 
(Morey & Cowan, 2005). The distracting effect of speaking was likely the result of tapping into 
resources from the phonological loop, particularly since digits were not disruptive to visual array 
judgments when silently rehearsed. The impact of the distracting spoken digits occurred 
regardless of the location of the distractor in the sequence (before arrays or during). 
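The structure of the 2004 experiment summarized above can be laid out as a grid of conditions. The sketch below simply enumerates the cells; it assumes the two factors were fully crossed, which the reported test for an interaction suggests but the summary does not state outright, and the names used are illustrative rather than taken from Morey and Cowan (2004).

```python
from itertools import product

# Factor levels as described in the summary of Morey and Cowan (2004).
RECITAL_CONDITIONS = (
    "own 7-digit phone number",
    "7 random digits",
    "2 digits",
    "no digits",
)
ARRAY_SIZES = (4, 6, 8)  # number of colored squares per visual array


def design_cells():
    """Return every recital-by-array-size cell, assuming a fully crossed design."""
    return list(product(RECITAL_CONDITIONS, ARRAY_SIZES))


cells = design_cells()
print(len(cells))  # 4 recital conditions x 3 array sizes
for recital, size in cells:
    print(f"{recital:>26} | {size} squares")
```

Viewed this way, the reported result is that accuracy varied across the rows (recital condition) and across the columns (array size) without the two factors interacting.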
Further evidence from both studies that a central attention resource was responsible for 
auditory and visual information was the relationship between accuracy in both modalities. Morey 
and Cowan (2004) found that visual array comparisons were significantly more accurate when 
accompanied by correct digit list recall than when the digit recall was incorrect. The same 
relationship existed for correct digit recall as correct lists were accompanied by correct visual 
arrays. The co-occurrence of error (and accuracy) confounded the idea of a simple trade-off 
between stimulus types under dual-task conditions. The same relationship was found in the 2005 
study; when the digits were recalled incorrectly, accuracy on the visual array task was lower than 
when digits were correctly recalled. Demonstration of successful attention allocation was 
evinced by correct recall of both digits and visual arrays. 
Research in the auditory distraction paradigm also supports attention as a major 
component of working memory. Berti and Schroger (2003) had listeners identify the duration of 
tones, the majority of which were 1000 Hz (90%) and a few of which (10%) were 1050 Hz or 
950 Hz, immediately (low-load task) or upon the arrival of the next tone (high-load task). There 
was a significant interaction between response times and load, and main effects for response 
times. Under the high-load condition, there was less difference in response times between 
standard and deviant tones. Response times were lower for both standard and deviant tones in the 
low-load condition, as would be expected. The greater interruption of the irrelevant pitch change 
in the low-load condition appeared to be triggered by the preattentive detection system; however, 
the task requiring greater attention reduced the sensitivity of the preattentive system. Attention 
mediated response times until the task load was too large. An automatic response (faster reaction 
time) indicated a greater ability to ignore the irrelevant dimension of pitch during duration 
judgments. Additionally, this research provides evidence that the salience of stimulus features 
can be dependent upon task load. 
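The distribution of standard and deviant tones described for Berti and Schroger (2003) can be illustrated with a short sketch. Only the frequencies and the 90%/10% split come from the summary above; the trial count, the random ordering, the seed, and the function name are assumptions made for illustration, and tone durations are omitted because they are not given here.

```python
import random

STANDARD_HZ = 1000        # frequency of the majority (90%) of tones
DEVIANT_HZ = (950, 1050)  # task-irrelevant pitch deviants (10% of tones)


def make_tone_sequence(n_trials, deviant_rate=0.10, seed=0):
    """Build a shuffled list of tone frequencies with a fixed share of deviants."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    n_deviant = round(n_trials * deviant_rate)
    tones = [STANDARD_HZ] * (n_trials - n_deviant)
    tones += [rng.choice(DEVIANT_HZ) for _ in range(n_deviant)]
    rng.shuffle(tones)
    return tones


sequence = make_tone_sequence(200)  # 200 trials is an arbitrary choice
print(sequence.count(STANDARD_HZ), len(sequence) - sequence.count(STANDARD_HZ))
```

Because listeners judged duration rather than pitch, the deviant frequencies in such a sequence are irrelevant to the task, which is what allows slower responses to deviants to be read as distraction.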
Endogenous and Exogenous Attention 
While attention can be directed toward certain stimuli or specific stimulus attributes, 
attention can be ‘captured’ by salient aspects of events without intentional attention shifts. Green 
and McKeown (2001) discussed the differentiation between endogenous attention, the top-down, 
voluntarily controlled attention, and exogenous attention, the attention that is largely automatic. 
Their research results provided evidence for stimulus-driven control of frequency selection 
during informative and uninformative cue trials despite the participants’ intention to ignore 
frequency. This processing of unintentional auditory features, or in other cases auditory streams, 
was precisely what challenged the computer engineers’ computational model of auditory 
selective attention (Wrigley & Brown, 2004). 
Another study in auditory perception confirmed the role of stimulus-driven attention. 
Crawley et al. (2002) conducted research designed to determine differences between musicians’ 
and nonmusicians’ ability to use primitive (stimulus-driven, bottom-up) and scheme-driven 
grouping to detect single errors in 3-voice music with homophonic or polyphonic textures. The 
authors proposed that musicians would demonstrate better flexibility in selecting schemes due to 
experience, particularly the ability to attend to single lines in homophonic music despite the 
likely perceptual grouping of chords. However, the data refuted this hypothesis. Both musicians 
and nonmusicians were better at identifying subtle melodic changes in a homophonic texture, 
particularly when the change was chord-unrelated. The performance of both groups suffered 
when directed to search for melodic changes in a specific voice instead of any change in the 
overall texture. While musicians were significantly better at the error detection task overall than 
nonmusicians, Crawley et al. (2002) concluded that music training did not appear to provide 
musicians with the ability to override perceptual grouping tendencies but did give them better 
ability to use the information in error detection. While control of attention was the intention, 
stimulus properties evoked automatic perceptual strategies despite intention to select a specific 
cognitive strategy. 
Attention Research with Infants and Young Children 
Sustained attention to a stimulus and distraction by other stimuli can be investigated in 
very young participants. Richard et al. (2004) investigated attention getting (localization) and 
attention holding (habituation) patterns of infants (5 months old) exposed to simple and complex 
auditory stimuli (repeating scale pattern) using a looking paradigm. The data on localization and 
habituation differentiated the two attention processes in response to these auditory stimuli; a 
progressive decrease in attention-holding but not attention-getting was observed across trials. By 
simply turning their heads, the infants indicated awareness of the location of presented stimuli 
and indicated preference by the length of time they focused on a location. Infants 
preferred complex tones as indicated by longer looking times. These differences were significant. 
Distinct acoustic properties influenced the sustained attention of the infants. 
Infants attend to verbal and musical sounds differently. Kinney and Kagan (1976) tested 
the orienting responses of 7-month-old boys and girls for auditory stimuli, specifically short 
verbal (nonsense syllables) and musical phrases (varying in rhythm and timbre) that were 
presented along a continuum of variability from none to extreme. Using head turns and heart 
rate deceleration as indicators of an orienting response and captured attention, the hypothesis that 
the response would be curvilinear with moderate stimulus changes being more alerting than no 
change or extreme change was supported for both types of stimuli. However, some distinct 
response differences were noted. More infants vocalized during musical stimulus presentations 
with great variability among the boys in the sample. Girls’ fixation responses on variable stimuli 
were closer to the predicted quadratic trend while boys’ responses fell into an inverted U-shape. 
Clearly, infants’ responses discriminated among the variable verbal and musical stimuli, with 
differences between the sexes emerging. 
Infant attention to bimodal stimuli can also be tested. Ruff and Capozzoli (2003) 
investigated the attention-getting properties of audio and visual distractors in children engaged in 
play with toys. They compared casual, settled, and focused attention disruption by audio only, 
visual only, or audio-visual distractors on children 10 months, 26 months, and 42 months old. 
Based upon the number of head turns as an indicator of the attention-getting properties of the 
distractor, there were significant differences among the age groups by modality of distractor. 
While the three age groups did not differ in responses to visual only distractors, 10-month-old 
infants had more head turns in response to audio only (2- or 3-tone sequences) and audio-visual 
(tone sequences plus pictures on screen) distractors. The differences in complexity of the auditory 
distractor, 2 tones versus 3 tones, were evident in the 42-month-old group only. In the audio only 
condition, children looked longer at the monitor after simple (2-tone) tunes, but longer looking 
times were documented under audio-visual conditions following complex (3-tone) tunes. These 
findings seem to indicate that sustained attention by resisting distraction has a developmental 
sequence that is different based upon the modality of the distractor. The meaningfulness of the 
distractor, the actual tunes versus two tones, was a more salient distractor for the 42-month-old 
group. Perhaps the longer looking times were a reflection of conceptual processing. 
Bahrick and Lickliter (2000) provided convincing evidence that infants’ (5 months old) 
attention adhered to the intersensory redundancy hypothesis. The hypothesis proposed that 
information presented synchronously across two sense modalities was processed thoroughly as a 
result of focused attention to the event versus lesser attention for unimodal presentation. Infants 
were presented with a bimodal training stimulus where a red hammer pounded out distinct, 
synchronized auditory rhythms. When presented with a novel rhythm pattern during the test 
phase, significantly longer looking times were noted in comparison to repetition of the training 
rhythm. (Infants look longer at novel presentations.) However, when training was either visual or 
auditory alone, there were no significant differences in looking times during unimodal test 
phases. Not only were the infants stimulated by bimodal stimuli, the encoding of such events 
appeared to be more thorough and provided a basis for future decision making in comparison to 
singular inputs. These data provide further evidence of dual processing capabilities when the 
encoded stimuli recruit different sensory functions. 
Using a protocol similar to Bahrick and Lickliter (2000), Lewkowicz (2003) documented 
that infants as young as 4 months old detected changes in rhythm following audiovisual encoding 
of both syllables and sounds (toy hammer taps). However, only 10-month-old infants looked 
longer at a desynchronized audiovisual rhythm. The author concluded that these older infants 
were able to process the audiovisual events as a single perceptual stream rather than two separate 
streams. This developmental milestone would reduce the load and allow for the attention to 
desynchronized rhythm as a novel, meaningful stimulus. Infants from 4 to 10 months increased 
looking times to desynchronized, arrhythmic nonsense speech. The ability of infants to process 
synchronized audiovisual events is particularly important to the development of language and 
speech. 
13

Attention working memory

  • 1.
    THE FLORIDA STATEUNIVERSITY COLLEGE OF MUSIC THE EFFECTS OF MUSIC TRAINING AND SELECTIVE ATTENTION ON WORKING MEMORY DURING BIMODAL PROCESSING OF AUDITORY AND VISUAL STIMULI PREVIEW By JENNIFER D. JONES A Dissertation submitted to the College of Music in partial fulfillment of the requirements for the degree of Doctor of Philosophy Degree Awarded: Summer Semester, 2006
  • 2.
    UMI Number: 3232396 PREVIEW Copyright 2006 by Jones, Jennifer D. All rights reserved. UMI Microform 3232396 2006 by ProQuest Information and Learning Company. Copyright All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest Information and Learning Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, MI 48106-1346
  • 3.
    The members ofthe Committee approve the dissertation of Jennifer D. Jones defended on June PREVIEW ii 15, 2006. ____________________________________________ Jayne M. Standley Professor Directing Dissertation ___________________________________________ Jeffrey James Outside Committee Member ____________________________________________ John M. Geringer Committee Member ____________________________________________ Clifford K. Madsen Committee Member The Office of Graduate Studies has verified and approved the above named committee members.
  • 4.
    ACKNOWLEDGEMENT I wishto thank Dr. Jayne Standley for always having answers to my questions and for supporting my research interests. To Dr. Cliff Madsen, I wish to say thank you for ‘getting me.’ It was an honor to teach with you. To Dr. Geringer, you challenged me to think harder than I even thought was possible! Thanks for the stats-induced headaches. I wish to thank my parents for the roots and wings. Thank you, Mother, for always having time to listen to me. Thank you, Daddy, for determination and great math genes. There is no way to adequately thank my husband, Jon Jones. You have had many roles – data-miner, editor-in-chief, financier, chef, computer technician, cat-feeder, and many more. Mostly, I thank you for always saying, PREVIEW “Yes, you can” every time I claimed I couldn’t, and “Yes, you will” when I claimed I wouldn’t. Better than better! iii
  • 5.
    TABLE OF CONTENTS List of Tables vii List of Figures viii Abstract ix 1. INTRODUCTION 1 PREVIEW 2. REVIEW OF LITERATURE 5 Theoretical Framework 5 Attention Research 8 Measuring Attention and Attention as a Central Resource 8 Endogenous and Exogenous Attention 11 Attention Research with Infants and Children 12 Attention, Intelligence, and Development 14 Selective Attention Research 16 Auditory and Visual Stimuli 17 Commonalities and Synergy 17 Localization and Cueing Differences with Cross-modal Stimuli 19 Visual Attention During Auditory Distraction – Irrelevant Sound 20 iv Paradigm Auditory and Visual Dominance Theories 23 Dichotic Listening Paradigm 25 Dual Audio Tasks – Pitch and Duration Judgments (non-dichotic) 29 Music and Memory 30 Memory and Songs – The Influence of Auditory Structure on Serial 30 Verbal Recall Theories of Music Processing and Memory 34 Melody Recognition 35 Developmental and Training Differences 35 Pitch, Rhythm, Contour, and Timbre Discrimination 36 Searching for Melodic Targets 39 Attention During Multi-Voice Music 40 Error Detection and Expectancy as Evidence of Focus of Attention 42 during Multi-voice music Gestalt in Music – Extracting Parts from Wholes 46 Attention to Music – Focused and Therapeutic Listening 47 Bimodal Experiences with Music – Complimentary and Non-complimentary 50 Audio-Visual Encoding and Decoding Music Through Visual, Auditory and Tactile Senses 52
  • 6.
    Music Training andMemory Research 54 Gender Differences in Music and Memory Research 58 The Present Study 61 3. METHOD 63 Pilot Study Materials 63 Music (Auditory Stimuli) 63 Images (Visual Stimuli) 68 Video Development 70 Posttest Construction 70 Procedure for Pilot Study 72 Results of Pilot Study 73 Changes to Music Stimuli for Main Experiment 80 Changes to Test Construction for Main Experiment 81 PREVIEW Main Experiment 83 Participants 83 Design 86 Procedures 88 4. RESULTS 92 Familiarity – Ratings and Total Correct Scores 92 Perception of Attention Allocation 94 Analyses of Modality of Error Scores 95 Analysis of Question Type under Music Conditions 97 Memory Strategies 100 Analyses for Memory Decay 103 Analyses for Serial Position Effects 104 Analyses of Posttest Questions 107 5. DISCUSSION 109 Summary of Results 109 Music Training Effects 109 Recognition Versus Rejection of Stimuli in Working Memory 111 The Role of Strategy 112 Rhythm, Attention, and Information Processing 113 Contour- Similar or Dissimilar to Target Melody 114 Serial Position Effects – Expectancy Theory 115 Attention States 116 Implications for Music Education and Music Therapy 118 v
  • 7.
    Appendix A: MainExperiment – Composition of Audio Distractors for The Bailiff’s 120 BIOGRAPHICAL SKETCH 241 PREVIEW vi Daughter and Pomp and Circumstance Appendix B: Informed Consent Form and Approval Letter from 127 Human Subjects Committee Appendix C: Pilot Study – Pre-Experiment Questionnaire 130 Appendix D: Main Experiment – Posttest Form for Pomp and Circumstance 132 Appendix E: Main Experiment – Posttest Form for The Bailiff’s Daughter 135 Appendix F: Pilot Study – Post-Experiment Questionnaire 138 Appendix G: Pilot Study – Instructions Script 140 Appendix H: Pilot Study – Practice Test Form 143 Appendix I: Main Experiment – Pre-Experiment Questionnaire 145 Appendix J: Main Experiment – Practice Test Form 147 Appendix K: Main Experiment – Post-Experiment Questionnaire 149 Appendix L: Audio Instructions Accompanying Introductory and Instruction 152 Slides in Experiment Videos Appendix M: Introductory, Instruction, and Practice Test Slides 155 Appendix N: Main Experiment – The Bailiff’s Daughter Posttest 159 Appendix O: Main Experiment – Pomp and Circumstance Posttest 163 Appendix P: The Bailiff’s Daughter Video 167 Appendix Q: Pomp and Circumstance Video 169 Appendix R: Raw Data Spreadsheets 171 REFERENCES 226
  • 8.
    LIST OF TABLES 1. Musicianship as Defined by a Sample of Reviewed Literature 59 2. Descriptive Statistics on Pilot Data Group by Music Type 76 3. Question Analysis for The Bailiff’s Daughter – Pilot Study 77 4. Question Analysis for Pomp and Circumstance – Pilot Study 78 5. Academic Majors of the Participants 85 6. Mean Estimated Hours of Music Heard and Performed By Participants Groups 86 7. Study Design – Participant Distribution by Gender, Major, and Instruction Type 87 8. Perception of Attention Allocation to Music and Pictures for Familiar 94 17. Bailiff’s Daughter Distractor Music Items Failing Criterion 108 PREVIEW vii and Unfamiliar Music 9. Mean Correct for Each Question Type for The Bailiff’s Daughter 97 10. Mean Correct for Each Question Type for Pomp and Circumstance 99 11. Frequency Distribution of Strategies Used by Participants 101 12. Mean Total Score for Familiar and Unfamiliar Music By Strategy Type 102 13. Frequency Distribution of Strategies by Instruction Type for Music Conditions 103 14. Frequency Distribution of Total Correct Responses for Pictures by Serial Position 105 15. The Bailiff’s Daughter Frequency Distribution of Total Correct Responses for 106 Music Measures 16. Pomp and Circumstance Frequency Distribution of Total Correct Responses for 107 Music Measures
  • 9.
    LIST OF FIGURES 1. Pomp and Circumstance – original 64 2. Pomp and Circumstance – pilot study version 65 3. The Bailiff’s Daughter – original 65 4. The Bailiff’s Daughter – pilot study version 66 5. Hail to the Chief – original notation 67 6. Hail to the Chief – pilot study version 67 7. The Farmer’s Boy – original key and notation 67 8. The Farmer’s Boy – pilot study version 68 9. The Bailiff’s Daughter order of visual training stimuli (black images on white screen) 69 10. Pomp and Circumstance order of visual training stimuli (blue images on 69 PREVIEW viii white screen) 11. Hail to the Chief order of images (green on white screen) for training stimulus, 70 practice test 2 12. The Farmer’s Boy order of images (red on white screen) for training stimulus, 70 practice test 1 13. Pomp and Circumstance – main experiment 81 14. The Bailiff’s Daughter – main experiment 81 15. Experimental laboratory set-up 89 16.Familiarity ratings by major interaction 92 17. Total correct scores – interaction between major and instruction 93 18. Picture errors – major by instruction interaction 96 19. Music errors – major by instruction interaction 96 20. The Bailiff’s Daughter – question type by gender interaction 98 21. Pomp and Circumstance – question type by gender interaction 99 22. Pomp and Circumstance – question type by major interaction 100 23. Pomp and Circumstance – test half by order interaction 104 24. Question #22 distractor music 107
  • 10.
    ABSTRACT Researchers haveinvestigated participants’ abilities to recall various auditory and visual stimuli presented simultaneously during conditions of divided and selective attention. These investigations have rarely used actual music as the auditory stimuli. Music researchers have thoroughly investigated melodic recognition, but non-complimentary visual stimuli and attention conditions have rarely been applied during such studies. The purpose of this study was to examine the effects of music training and selective attention on recall of paired melodic and pictorial stimuli in a recognition memory paradigm. PREVIEW A total of 192 music and non-music majors viewed one of six researcher-prepared training videotapes containing eight images sequenced with a highly familiar music selection and an unfamiliar music selection under one of three attention conditions: divided attention, selective attention to music, and selective attention to pictures. A 24-question posttest presented bimodal test items that were paired during the training, paired distractors, a music trainer with a picture distractor, or a picture trainer with a music distractor. Total correct scores, error scores by modality, and scores by question type were obtained and analyzed. Results indicated that there were significant differences between music and non-music majors’ recall of the bimodal stimuli under selective attention conditions. Music majors consistently outperformed non-music majors in divided attention and selective attention to music conditions, while non-music majors outperformed music majors during selective attention to pictures. Music majors were better able to reject distractor music than were non-music majors. Music majors made fewer music errors than non-music majors. However, an unanticipated effect of gender was found. Females were better at recognizing paired trainers and males were better at rejecting distractors for both music conditions. 
Individually selected memory strategies did not significantly impact total scores. Analyses of sample error rates to individual questions revealed memory effects for music due to serial position and rhythmic complexity of stimuli. Participants poorly recalled the final measure of both music conditions. This finding was unusual since this position is generally ix
  • 11.
    memorable in serialrecall tasks. Simple rhythmic contexts were not remembered as well as more complex ones. The measures containing four quarter notes were not well recalled, even when tested two times. This study confirmed that selective attention protocols could be successfully applied to a melodic recognition paradigm with participants possessing various levels of music training. The effect of rhythmic complexity on memory requires further investigation, as does the effect of gender on recognition of melody. A better understanding of what makes a melody memorable would allow music educators and music therapists the opportunity to devise and teach effective strategies. PREVIEW x
  • 12.
    CHAPTER 1 INTRODUCTION Can people do two things at once? When asked, most individuals readily answer “yes” or “no” to this question. Each seems to understand his/her own capacity and preference for ‘multitasking.’ Those who answer “yes” perceive that their performance is not compromised and may be enhanced in highly stimulating environments. Those who answer “no” perceive that they perform best when completing a single task a time. Is either of these groups correct? Researchers have investigated many aspects of this conundrum – are humans single, double, or multi channel thinkers? What conditions influence performance accuracy? What stimulus properties influence the outcome of dual task events? What roles do attention, memory, and experience/training play? Researchers have discovered partial answers to many of these questions, but as questions are answered, technology advances and are faced with new sensory environments that pose new challenges for research. PREVIEW Modern environments are saturated with stimuli; one prevalent environmental stimulus is music, made more readily available to listeners than at any other time in history by the iPod and other portable devices. Researchers have confirmed that music is ever-present in today’s society; we are influenced by music everywhere from work (Lesiuk, 2005) to restaurants (Caldwell & Hibbert, 2002). Many times the hearer chooses the music, other times; listeners have little control over sound environments (North, Hargreaves, & Hargreaves, 2004). Teens and young adults have reported listening to music from 2.5 or 3 hours per day to as much as 40 hours per week (Gardstrom, 1999; North, Hargreaves, & O’Neill, 2000; Radvansky, Fleming, & Simmons, 1995; Schwartz & Fouts, 2003; Tarrant, North, & Hargreaves, 2000). While listening to music is a highly valued leisure activity, it is frequently secondary to another media event, such as watching television and reading (Kubey & Larson, 1990). 
Therefore, listening to music can be researched in the context of dual task experiments. In fact, many aspects of music listening constitute a dual task, even when listening to the music is the primary objective. Most music contains both pitch and rhythm information. 1
  • 13.
    Researchers have investigatedthe degree to which listeners can attend separately to each of these components (Byo, 1997; Demorest & Serlin, 1997; Sink, 1983). Additionally, music can present different timbres and degrees of intensity for listeners’ attentional foci (Madsen & Geringer, 1990; Radvansky et al., 1995; Wolpert, 1990). Songs introduce yet another stimulus by the presence of text (Bonnel, Faita, Peretz, & Besson, 2001). Performing music also provides a number of dual task opportunities, including the dual auditory tasks of listening to one’s own performance while being aware of the performance of others and the inclusion of visual tasks when performing from notation. Investigations of bimodal audio-visual processing have included musical and non-musical stimuli. Research on the effects of soundtracks to movies provides clarity on music’s PREVIEW influence upon mood (Boltz, 2001) in addition to its impact upon memory for mood-related aspects of films (Marshall & Cohen, 1988). Other research involving short tone sequences and common sounds (door bell, duck quack) paired with visual images has revealed developmental differences in the reliance upon our eyes and ears for information (Napolitano & Sloutsky, 2004; Robinson & Sloutsky, 2004; Sloutsky & Napolitano, 2003). Sloutsky (2003, 2004) and colleagues discovered that young children (4-year olds) relied more heavily on auditory information when encoding bimodal stimuli. This was termed an auditory processing bias. Additionally, children were not able to shift their attention to visual aspects when instructed or able to use this information successfully during testing. In contrast, the visually-dominant adults could shift their attention to auditory inputs successfully. Baddeley’s (1986) components of working memory, namely the phonological loop, visuospatial sketchpad, and central executive function, provide a framework for examining recall for visual and auditory events. 
According to this proposal, visual and auditory events are processed in different memory sub-components with central executive function acting like a coordinator for incoming information. Contemporary information processing theory concurs with Baddeley’s ideas of separate stores for different incoming stimuli. Information processing theorists propose that there are filters or buffers that prevent the working memory system from overloading by prioritizing information into sensory traces, data-driven, and process-driven concepts (Klahr & MacWhinney, 1998). Information processing theory also categorizes information as serial, including music, speech, and other events that unfold in time, or parallel, including many visual stimuli. While speech and music are both examples of serial auditory 2
  • 14.
    processing, brain studieshave discovered that different areas of the brain are used when processing these events. Through brain scanning technology, researchers have found that verbal, auditory (Mirz et al., 1999), and musical stimuli are processed differently. Generally, the left hemisphere is specialized for speech (Jeffries, Fritz, & Braun, 2003) and words (Samson & Zatorre, 1991) while the right hemisphere processes melody. Rhythm judgment did not appear to be lateralized to the left or right hemisphere (Dennis & Hopyan, 2001; Plenger et al., 1996). Curiously, though some aspects of music are processed in the opposite hemisphere from verbal data, people with musical training demonstrate superior verbal memory (Ho, Cheung, & Chan, 2003; Kilgour, Jakobson, & Cuddy, 2000). No such advantage was found for visual memory (Ho et al., 2003). Seemingly, musicians’ systematic use of their auditory attention and processing yields superior skills in the general auditory domain. However, few studies have examined how musicians compare to others when bimodal audio (musical)-visual tasks are presented to them. PREVIEW Other research has focused on the differences in the male and female brain. The male brain is characterized as being designed for understanding and building systems (and extracting rules that govern systems) while the female brain is more socially oriented (Baron-Cohen, 2005). Likewise, males and females, both infants and adults, responded to music differently, particularly when under stress (Standley, 1998, 2000), and female infants have more acute hearing (Cassidy & Ditty, 2001). However, studies examining tonal memory (Norris, 2000), attention responses (Richard, Normandau, Brun, & Maillet, 2004), and mental capacity (Johnson, Im-Bolter, & Pascual-Leone, 2003) found no differences between the sexes. Often researchers assign equal numbers of each sex in participant groups without reporting differences. 
It is not conclusive if differences in audio-visual memory between men and women exist at this point. The present study was designed to investigate how musicians versus nonmusicians and males versus females remember paired musical (auditory) and visual components following bimodal encoding examined in a recognition memory paradigm. The effects of attention on the recall of bimodal stimuli among the groups were tested through selective attention instructions. Additionally, this study sought to determine the differences between dual encoding unfamiliar music with unfamiliar images in comparison to retrieval of familiar music and encoding of unfamiliar images. It was expected that differences in the recall of musical events between musicians and nonmusicians would emerge, though a difference between males and females was 3
not projected. The natural patterns of attention to visual and audio/musical events were examined in the group receiving no selective attention instructions. The degree to which participants could manipulate their attention patterns to musical and visual stimuli was also tested. Encoding and memory strategies of the groups were categorized.
CHAPTER 2
REVIEW OF LITERATURE
Theoretical Framework
Researchers have long investigated how humans remember (James, 1902). Numerous forms of memory have been differentiated, including implicit and explicit memory (Schacter, 1993), autobiographical memory, episodic, and semantic memory (Nelson, 1993) among the more generally recognized long-term and short-term memory divisions. Few theorists debate the existence and functions of a long-term memory component, though short-term memory theory has undergone and continues to undergo revision. Alterations in short-term memory theory from the late 1950s to the early 1970s included a shift from the dominant view of memory as a “relatively undifferentiated unitary system” (Baddeley, 1976, p. 187) to one of distinct stores for acoustically based, limited-capacity, short-term storage and a more durable long-term store of considerable capacity. Even in the mid-1970s, growing evidence that sensory systems (visual, auditory, and kinesthetic) may each have a unique memory store compelled further differentiation of short-term memory theory.

Baddeley’s (1986) conceptualization of working memory in three major divisions has proven hardy enough to withstand rigorous research and has spawned years of debate. This concept of the central executive with its slave components, the phonological loop and visuospatial sketchpad, provided a framework for researching not only visual and auditory information processing, but also how information is selected, organized, coordinated, stored, and ultimately remembered (Baddeley & Hitch, 1974).

Broadbent (1971) contributed the concept of information selection. In the initial proposal in 1958, there was a single channel with a limited capacity through which all information funneled. The capacity limit of the single channel was a function of the rate of information flow, meaning that the organism needed time to process the stream of incoming information.
In order to accommodate this limit, selective perceptual processing utilized a buffer store and a filtering system (Broadbent, 1971). Information could be held in the store and selected information filtered out for immediate processing. Broadbent also contributed the
concepts of vigilance and expectancy to early work in information processing. Today, vigilance and expectancy are probably best conceived as functions of attention, including selective attention, sustained attention, and inhibiting or ignoring distracting stimuli. Jones (1999) credited Broadbent with the concept of selective attention, particularly for auditory events, that has been exhaustively researched through the dichotic listening paradigm.

The role of attention during information processing has continued to be important to working memory theory development. Treisman and Davies (1973) proposed the concept of parallel attention and processing, thereby expanding the single channel theory. In parallel processing, incoming stimuli of different modalities or different properties of the same stimuli can be analyzed at the same time because they do not use the same resources. Two studies conducted by Allport, Antonis, and Reynolds (1972) supported the hypothesis. Allport et al. (1972) proposed a multi-channel hypothesis after research showing that participants could accurately recall photographs presented while they were engaged in speech shadowing. A second study involved music majors who were able to sight-read piano music while speech shadowing easy and difficult prose passages. There were no significant differences in the memory posttest for the prose passages under conditions of sight-reading on the piano versus speech shadowing only. Furthermore, error rates during the piano task did not differ significantly in session 2 under divided or focused attention. This research clarified that when the simultaneous tasks are different enough, each can be completed successfully. The allocation of attention to one task, speech shadowing, did not affect visual memory or motor performance.

Cowan (1995) proposed that attention and memory intersected and conceptualized working memory in terms of focus of attention.
Cowan (1998) devised a definition of working memory as follows: “Working memory is the collection of mental processes that permit information to be held temporarily in an accessible state, in the service of some mental activity” (Cowan, 1998, p. 77). He further compared his working memory system with Baddeley’s system (central executive, phonological and visuospatial stores and processors), acknowledging the differing sensory memories for visual and auditory events. Cowan’s working memory system is composed of a capacity-limited focus of attention along with temporarily activated information in permanent memory. According to Cowan, attention to events summons long-term memory
resources that allow for semantic processing (Cowan, 2005). The use of long-term resources, such as chunking and schemata, was not a new concept. Miller (1956) proposed chunking as a means of overcoming capacity limits for short-term memory. The basic concept of chunking was that like information was perceptually grouped such that information was encoded in small groups rather than as a single string. One of the most common examples of chunking is remembering telephone numbers in groups of three or four. Researchers have investigated the degree to which chunks are rendered by stimulus features or by the cognitive processes of the observer (Crawley, Acker-Mills, Pastore, & Weil, 2002; Green & McKeown, 2001). Stimulus-driven chunking would provide evidence of a bottom-up or data-driven approach while scheme-driven chunking would be indicative of a top-down or concept-driven approach, using information technology language (Klahr & MacWhinney, 1998). Chunking obviously occurs; the degree to which it is perceptually (automatic) or conceptually (thought) driven is still under investigation.

Information processing theory has developed alongside the understanding and development of computer programming. Based upon storage units, filters, and schemata proposed by psychological theorists, programmers have designed computer models that imitate human information processing. One particular area of interest has been auditory selective attention, a topic intriguing to cognitive psychologists, neurologists, musicians, and computational engineers alike (Wrigley & Brown, 2004). Psychologists have developed and tested theories with behavioral experiments, and computer engineers have integrated information from electrophysiology and neurology. A model was developed that grouped incoming streams of intentional information and allowed for ‘leaks’ representative of the processed unintentional auditory streams.
The model represented a culmination of the research on the processing of attended and unattended auditory events. Psychologists have tested this phenomenon by using multiple auditory streams, or auditory streams in addition to other input, and asking participants to divide attention across streams or select a single stream. Myriad researchers have developed the selective attention paradigm for both auditory and bimodal (often auditory and visual) tasks.

The attention research for bimodal audio-visual and dual/multiple audio events has been framed by Baddeley’s three components of working memory (Baddeley, 1976, 1986; Baddeley & Hitch, 1974) and contemporary theories of information processing with an elaborate system of sensory-specific filters and stores. Investigations have been designed to better understand the
capacity limits for unimodal information and bimodal information, particularly the unique differences between auditory and visual events. Investigations on the recall of serial and nonserial information as well as paired or associated events have provided understanding of the coordination processes during information processing. Researchers have manipulated encoding methods, rehearsal times and strategies, and output. Measurement of performance has included reaction or response times, accuracy rates, including hit and false alarm rates, and a number of brain scan technologies. Designs have included detection, recognition, and discrimination protocols, each contributing different and, at times, conflicting outcomes. Participants have been instructed to attend to stimuli, ignore stimuli, and divide attention across events. Through these expansive and complex investigations of attention and working memory during dual tasks or multisensory environments, much has been learned about how humans process and retain information.

While educators are particularly invested in understanding how learning is differentially achieved through vision and audition, research on learning through different senses has been of interest to a broad audience. The effects of multisensory input are of special interest to music therapists and music educators. While the researchers studying attention during dual audio tasks have systematically manipulated frequency, duration, and intensity in short tone sequences, fewer researchers have used actual music. Music as a stimulus provides a number of foci for attention, including rhythm, pitch, melody, and harmony, to name a few. Additionally, cognitive memory strategies can be examined using music (Madsen & Madsen, 2002). The use of music as an agent for facilitating the learning of non-musical information has been investigated in addition to the teaching of music understanding and performance.
The effect of music training upon memory for verbal and nonverbal information has provided a fertile ground for examining the role of experience in memory and attention.

Attention Research
Measuring Attention and Attention as a Central Resource
Auditory attention has proven to be a difficult construct to measure, though James (1902) claimed that we all know what attention is. Two tests have claimed to measure auditory selective attention, namely the Goldman-Fristoe-Woodcock Auditory Selective Attention Test and the Flowers Auditory Test of Selective Attention. Glass, Franks, and Potter (1986) compared these tests to determine if they indeed measured the same construct; the researchers found the
tests to correlate (r = .44). Though the correlation was positive, the relative weakness of the relationship indicated that the construct of auditory selective attention was broad. Auditory selective attention ranges from awareness and localization of sounds, to perceptual processing of relevance, to ignoring distraction and maintaining focus of attention over time. Despite the difficulties presented by empirical measures of attention, a study by Kahneman, Ben-Ishai, and Lotan (1973) demonstrated the validity of attention as a construct. Based upon a high number of accidents (a behavior attributed to poor attention), bus drivers were tested for auditory attention. Kahneman et al. (1973) found moderate, positive correlations between the number of accidents per year by professional bus drivers in Israel and a test of selective auditory attention.

Cowan (1995, 2005) has established the importance of attention to working memory; therefore, measures of working memory may provide an estimate of one’s attention. One aspect of attention that has been frequently measured is the ability to resist distraction. Measuring resistance to auditory distraction, referred to as the irrelevant sound paradigm (Jones, 1999), is a common technique. Beaman (2004) gave participants the Operations Span Task (OSPAN) for working memory during conditions of auditory distraction. He then examined the relationship between test scores and behavior. The OSPAN was not predictive of the irrelevant sound effect on serial or free recall of verbal material (Beaman, 2004). Irrelevant speech sounds and words affected both high and low scorers on the OSPAN. Though the relationship was not strong enough to reach significance, low-span individuals were more likely to experience intrusion from previous list trials than high-span individuals. It does appear that these tests are spotlighting the same concept, though they have demonstrated flaws.
Morey and Cowan (2004, 2005) verified attention to be a central resource in working memory. In the 2004 study, participants were asked to recite their 7-digit phone number, a random set of 7 digits, 2 digits, or no digits while examining visual arrays with 4, 6, or 8 colored squares and subsequently making same-different judgments on the visual arrays. The recitation of numbers was designed to prevent verbal rehearsal of the positions and colors of squares in the visual array by directing attention to numerical recitation. There were significant differences in scores by visual array size and recital condition with no significant interactions between the variables. Performance was best for the smallest array size. The participants’ scores were poorest when reciting 7 random digits in comparison to the other three conditions that did not differ
significantly from one another. The authors concluded that some shared space in working memory was used for digits and visual information. The 2005 study provided additional support for the central resource for attention. Morey and Cowan (2005) found that participants were able to make visual array judgments of same or different equally well under no-digit-recall and silent digit rehearsal conditions, but vocal rehearsal of the digit lists significantly affected visual array judgments. The recitation of the to-be-remembered digits, whether before the first visual array or after, interfered with visual memory. Attention was presumably drawn away from the visual task during recall of the list, indicative of a central attentional resource (Morey & Cowan, 2005). The distracting effect of speaking was likely the result of tapping into resources from the phonological loop, particularly since digits were not disruptive to visual array judgments when silently rehearsed. The impact of the distracting spoken digits occurred regardless of the location of the distractor in the sequence (before arrays or during).

Further evidence from both studies that a central attention resource was responsible for auditory and visual information was the relationship between accuracy in both modalities. Morey and Cowan (2004) found that visual array comparisons were significantly more accurate when accompanied by correct digit list recall than when the digit recall was incorrect. The same relationship held for digit recall, as correct lists were accompanied by correct visual array judgments. The co-occurrence of error (and accuracy) confounded the idea of a simple trade-off between stimulus types under dual-task conditions. The same relationship was found in the 2005 study; when the digits were recalled incorrectly, accuracy on the visual array task was lower than when digits were correctly recalled.
Successful attention allocation was evinced by correct recall of both digits and visual arrays.

Research in the auditory distraction paradigm also supports attention as a major component of working memory. Berti and Schroger (2003) had listeners identify the duration of tones, the majority of which were 1000 Hz (90%) and a few of which (10%) were 1050 Hz or 950 Hz, immediately (low-load task) or upon the arrival of the next tone (high-load task). There was a significant interaction between response times and load, and main effects for response times. Under the high-load condition, there was less difference in response times between standard and deviant tones. Response times were lower for both standard and deviant tones in the low-load condition, as would be expected. The greater interruption of the irrelevant pitch change in the low-load condition appeared to be triggered by the preattentive detection system; however,
the task requiring greater attention reduced the sensitivity of the preattentive system. Attention mediated response times until the task load was too large. An automatic response (faster reaction time) indicated a greater ability to ignore the irrelevant dimension of pitch during duration judgments. Additionally, this research provides evidence that the salience of stimulus features can be dependent upon task load.

Endogenous and Exogenous Attention
While attention can be directed toward certain stimuli or specific stimulus attributes, it can also be ‘captured’ by salient aspects of events without intentional attention shifts. Green and McKeown (2001) discussed the differentiation between endogenous attention, the top-down, voluntarily controlled attention, and exogenous attention, the attention that is largely automatic. Their research results provided evidence for stimulus-driven control of frequency selection during informative and uninformative cue trials despite the participants’ intention to ignore frequency. This processing of unintentional auditory features, or in other cases auditory streams, was precisely what challenged the computer engineers’ computational model of auditory selective attention (Wrigley & Brown, 2004).

Another study in auditory perception confirmed the role of stimulus-driven attention. Crawley et al. (2002) conducted research designed to determine differences between musicians’ and nonmusicians’ ability to use primitive (stimulus-driven, bottom-up) and scheme-driven grouping to detect single errors in 3-voice music with homophonic or polyphonic textures. The authors proposed that musicians would demonstrate better flexibility in selecting schemes due to experience, particularly the ability to attend to single lines in homophonic music despite the likely perceptual grouping of chords. However, the data refuted this hypothesis.
Both musicians and nonmusicians were better at identifying subtle melodic changes in a homophonic texture, particularly when the change was chord-unrelated. The performance of both groups suffered when directed to search for melodic changes in a specific voice instead of any change in the overall texture. While musicians were significantly better at the error detection task overall than nonmusicians, Crawley et al. (2002) concluded that music training did not appear to provide musicians with the ability to override perceptual grouping tendencies but did give them better ability to use the information in error detection. Although control of attention was intended, stimulus properties evoked automatic perceptual strategies that overrode the selection of a specific cognitive strategy.
Attention Research with Infants and Young Children
Sustained attention to a stimulus and distraction by other stimuli can be investigated in very young participants. Richard et al. (2004) investigated attention getting (localization) and attention holding (habituation) patterns of infants (5 months old) exposed to simple and complex auditory stimuli (repeating scale pattern) using a looking paradigm. The data on localization and habituation differentiated the two attention processes in response to these auditory stimuli; a progressive decrease in attention-holding but not attention-getting was observed across trials. By simply turning their heads, the infants indicated awareness of the location of presented stimuli and indicated preferences by the length of time they focused on a location. Infants preferred complex tones as indicated by longer looking times. These differences were significant. Distinct acoustic properties influenced the sustained attention of the infants.

Infants attend to verbal and musical sounds differently. Kinney and Kagan (1976) tested the orienting responses of 7-month-old boys and girls for auditory stimuli, specifically short verbal (nonsense syllables) and musical phrases (varying in rhythm and timbre) that were presented along a continuum of variability from none to extreme. Using head turns and heart rate deceleration as indicators of an orienting response and captured attention, the hypothesis that the response would be curvilinear with moderate stimulus changes being more alerting than no change or extreme change was supported for both types of stimuli. However, some distinct response differences were noted. More infants vocalized during musical stimulus presentations with great variability among the boys in the sample. Girls’ fixation responses on variable stimuli were closer to the predicted quadratic trend while boys’ responses fell into an inverted U-shape.
Clearly, infants’ responses discriminated among the variable verbal and musical stimuli, with differences between the sexes emerging.

Infant attention to bimodal stimuli can also be tested. Ruff and Capozzoli (2003) investigated the attention getting properties of audio and visual distractors in children engaged in play with toys. They compared casual, settled, and focused attention disruption by audio only, visual only, or audio-visual distractors on children 10 months, 26 months, and 42 months old. Based upon the number of head turns as an indicator of the attention-getting properties of the distractor, there were significant differences among the age groups by modality of distractor. While the three age groups did not differ in responses to visual only distractors, 10-month-old infants had more head turns in response to audio only (2- or 3-tone sequences) and audio-visual
(tone sequences plus pictures on screen). The differences in complexity of the auditory distractor, 2 tones versus 3 tones, were evident in the 42-month-old group only. In the audio only condition, children looked longer at the monitor after simple (2-tone) tunes, but longer looking times were documented under audio-visual conditions following complex (3-tone) tunes. These findings seem to indicate that sustained attention by resisting distraction has a developmental sequence that is different based upon the modality of the distractor. The meaningfulness of the distractor, the actual tunes versus two tones, was a more salient distractor for the 42-month-old group. Perhaps the longer looking times were a reflection of conceptual processing.

Bahrick and Lickliter (2000) provided convincing evidence that the attention of infants (5 months old) adhered to the intersensory redundancy hypothesis. The hypothesis proposed that information presented synchronously across two sense modalities was processed thoroughly as a result of focused attention to the event versus lesser attention for unimodal presentation. Infants were presented with a bimodal training stimulus where a red hammer pounded out distinct, synchronized auditory rhythms. When presented with a novel rhythm pattern during the test phase, significantly longer looking times were noted in comparison to repetition of the training rhythm. (Infants look longer at novel presentations.) However, when training was either visual or auditory alone, there were no significant differences in looking times during unimodal test phases. Not only were the infants stimulated by bimodal stimuli, the encoding of such events appeared to be more thorough and provided a basis for future decision making in comparison to singular inputs. These data provide further evidence of dual processing capabilities when the encoded stimuli recruit different sensory functions.
Using a protocol similar to Bahrick and Lickliter (2000), Lewkowicz (2003) documented that infants as young as 4 months old detected changes in rhythm following audiovisual encoding of both syllables and sounds (toy hammer taps). However, only 10-month-old infants looked longer at a desynchronized audiovisual rhythm. The author concluded that these older infants were able to process the audiovisual events as a single perceptual stream rather than two separate streams. This developmental milestone would reduce the load and allow for the attention to desynchronized rhythm as a novel, meaningful stimulus. Infants from 4 to 10 months increased looking times to desynchronized, arrhythmic nonsense speech. The ability of infants to process synchronized audiovisual events is particularly important to the development of language and speech.