音楽認知 2002

読書ノート４２
音楽と認知
p131
１つは、ここでは音楽というときもっぱら調性音楽(tonal music)つまり古典的な西
洋音楽を指す、ということである。音楽についての議論が紛糾するひとつの原
因は、われわれが音楽一般を論じているのか、それともわれわれにとって最も
身近であるような特定の音楽（例えば調性音楽）を論じているのかはっきりし
ないというところにあると思われる。もちろん究極的には、音楽の認知的理論
は、調性音楽以外をもカバーすべきものであるが、研究上の方略としては、従
来最も理論化が進んでいる特定の様式の音楽をとりあげ、そこで見出された知
見を他の様式の音楽にも適用してみる、というように進むことが生産的なので
はないか、と筆者は考える。（もう少し限定するなら、調性音楽の楽曲でも単
旋律を扱うのがよいのではあるまいか。旋律は単に音高の系列ではなく、リズ
ム構造や和声的な構造を暗に含んでいる。その意味でこれは単純ながら完結し
た本当の楽曲ということができると思う。）
　もう１つは、認知科学の研究一般にいえることであるが、音楽の認知的研究
のためには、様々な方法を、いわば便宜主義的(oppotunistic)に組み合わせて用い
るのが最も有効だ、という立場をとる。例えば音楽理論は、人間の認知とはひ
とまず独立に、音楽のシンタックスやセマンティクスを扱ったり、それを利用
しつつ特定の楽曲の構造を分析するものであるが、実はその際、暗に聴き手・
演奏者・作曲家の認知を想定していることが少なくない。こうした暗黙の過程
をより明示的にし、しかもそれを心理的に実在性のあるものに近づけていくと
いうことは、音楽の認知的研究の一方法としてきわめて有望であるように思わ
れる。
（中略）
それと同時に、人々がおこなう活動（何らかの意味で観察可能な認知過程の表
われといってもよい）を手がかりとして、その過程を推論するといった伝統的
な実験心理学的手法も、おおいに活用されるべきである。
（中略）
さらに認知科学では、コンピュータを利用することによって、ある認知的なシ
ステムが次のようなことができるためには、それはどのような知識をどのよう
な形でもっていなければならないかを論じることがある。そのような知識をも

つ認知的なシステムが直ちに人間のモデルになるか、ということは疑わしいが
システムに付与される情報処理機構や知識の心理的実在性を高めるように努力
することによって、これもまた重要な方法になりうる
人は楽曲をいかに表象するか
　さて、そうした便宜主義的な方法論に頼りつつ、調性音楽に含まれている情
報処理過程というものを考える時に、いくつか中心になる概念をおかなければ
ならない。まず第一に必要とされるのは表象という概念であろう。つまり、あ
る楽曲がいかに人々の心の中に表象されていると想定するかが、認知的研究の
出発点である。
　編者には、少なくとも楽曲の３種の内的表象を想定することが必要であると
思われる。第１は、楽譜に書かれた形に近い、いわば記譜された形の楽曲の表
象であって、これを記譜法的表象(notational representation)とよぶことにしよう。第 2
は、音響的表象(acoustic representation)である。ここで音響的といってわざと音楽的
といわないのは、いわば心の中で鳴っている未分化な音の流れと、それが適切
に区切られ、意味づけられた時とを区別して、しかも両方を含めたいからであ
る。後者はまさに音楽的表象(musical representetion)とよぶことが適当であると思わ
れる。第３の内的表象は、より運動的なもの(motor representation)であり、通常、楽
器演奏者はこうした形でも楽曲を表象しているとみられる。
p134
楽曲のパージング
　表象の問題のなかで最も興味深いのは、音響的表象がどのように構造化され
ているか（つまり音楽的表象となっているか）ということであろう。自然言語
処理の場合と同様に、われわれが物理的に、あるいは心の中で音の流れを聴い
た時に、その流れは何らかの仕方で区切られ、簡約化され、他の部分と関係づ
けられるであろう。言いかえると、われわれが持っているある楽曲の音響的表
象は通常なんらかのしかたで構造化されているに違いない。その構造がいくつ
かの水準をもつ階層的なものだと考えることも当を得ていると思われる。
　ここで問題になるのは、平賀氏が正しく指摘しているように、音楽の場合に
は単語にあたるような、いわば自然の切れ目がないということ、したがって構
文解析にあたる音楽的パージングの過程がより困難を含むということであろう

村尾氏の主張は、この点で大きな示唆を与えてくれる。彼がマイヤーその他を
下敷きにして説くところによれば、音の流れはいくつかの水準でグループ化さ
れる。このグループは基本的にはアクセントのある音（あるいは音群）１つに
対して、アクセントのない数個の音（あるいは音群）が結びつけられるという
形でおこる。アクセントは、拍子がはっきりしている楽曲の場合には、拍子上
ないしは小節線の区切り上からもきまってくるが、それだけでなく、その音が
持っている長さとか高さにも依存する。さらにまた、進行中だった傾向が停止
する、という意味で構造上重要な音もアクセントを持つ。これら３種のアクセ
ントが時にかさなったり、時に不一致になったりして、様々な程度の明確さを
もったグループ化が生じるのだ、というのが村尾氏の基本的な考え方である。
p135
つまり、文の理解が、構文解析からはじまるのと同じような意味で、楽曲の理
解は、グループ化からはじまるといってもよいのではあるまいか。（この際に
構文解析において意味処理が促進的に働くのと同じように、楽曲にある耳慣れ
たパターンが含まれていることは、適切なグループ化を促進すると考えてよい
であろう。）
　さて、原初的な音響的表象を音楽的表象につくり変えていく上で欠かせない
もう 1 つの認知過程は、音の流れのなかでいくつかの旋律線ないし旋律進行を
識別することである。（グループ化のなかでも、これなしでは成立しないもの
があることはすでに述べた通りである。）
（中略）
１つの単旋律をとりあげてみても、そのなかにはいくつかの隠された進行があ
りうる。それぞれの進行は、次にどのような音高がでてくるのかを暗に意味す
る。しかしそのような音高がすぐにあらわれるとは必ずしもいえず、むしろし
ばらく遅れて実現するのが、魅力的な旋律の常である。聴き手の側からいえば
ひとつの旋律のなかに、いくつかの進行、島岡の表現を借りれば、「よいつな
がり」を形成する「原音」（旋律原型を構成する音）の連鎖を感じとることが
できなくてはならない。このような仕方で、旋律をいくつかの暗意ー実現関係
の合成体として表象することもまた、楽曲の理解の本質的な一側面であると考
えられる。
　スロボーダが嘆いているように(Sloboda, 1985, p.65)、マイヤーにはじまる暗意

ー実現関係の心理的実在性を実験的に吟味した仕事はまだない、といわざるを
えない。実際に彼が示唆しているようなやり方（ある旋律のオリジナルとそれ
を一部改編したものが、それぞれどのような後続を「期待」させるか、続きを
作らせたり、いくつかの続きを与えてどれが最もよく調和するかを判断させ
る）を使って、人々が（どれほど速やかに）どのように楽曲を音楽的に表象す
るかを検討するのが、心理音楽学の中心的な課題であるに違いない。阿部氏が
報告している「終止音導出」の実験を、こうした方向にさらに発展させること
を期待したい。
音楽における理解の諸側面
　音楽が「わかる」とひとくちにいっても、それには様々な側面があることは
まちがいない。いまみたように、音の流れが適切にグループ化されること、そ
の中にいくつかの暗意ー実現関係が見つけられることは、明らかに「わかる」
ことの一部であるが、それだけでゃないであろう。
　こうしたいわば構造的な理解に対して、もっと「内容的」な理解というもの
もありうるように思われる。それは、音楽の場合には、人々がすでによく慣れ
親しんでいるなかば抽象化されたパターン（これはスクリプトといってよいも
のである。マイヤーは原構造 archetype とよんでいる）にあてはめて理解する、と
いったものである。例えば跳躍を含んだ旋律は、その飛び越した部分をうめる
ように、逆の方向に動いてくるという場合が多い。別の表現をとれば、聴き手
は、跳躍があった時に、それがそれに続く反対の方向への進行によってうめら
れる、という期待をもつわけである。そ r がその通り満たされれば、そこで
「わかった」という感じを持つと思われる。（聴いて簡単にわかる音楽、何か
仕事をしながら気楽に聴ける音楽というのは、こうした既知のパターンをたく
さん含んでいる楽曲であり、また隠された暗意ー実現関係が少ないものだとい
えるのかもしれない。）
　高次の理解が常にそうであるように、音楽が「わかる」というのも、すでに
みただけでもけっこう大変な仕事だが、音楽を深く理解するとか、楽しむとい
うためには、そこで起こっていることだけでなく、そこに（様式の制約からい
って）起こるかもしれなかったこと(what might have been presented)がわからなければ
ならないというのにいたっては、これは１回聴いただけでは無理だと思わざる
をえない。

読書ノート４３
The musical mind
The cognitive psychology of music
John A. Sloboda
p11
2. Music , language, and meaning
2. 1. Introduction
This chapter begins by looking at two influential theorists, the linguist Chomsky and the musicologist
Schenker. Their theories have some striking similarities. They both argue, for their own subject matter,
that human behavior must be supported by the ability to form abstract underlying representations. Two of
man's highest and most complex products seem to display something central about his intellect.
（中略）
The major portion of the chapter is organized around the subdivision of language and music into three
components: phonology, syntax, and semantics. Phonology concerns the way in which a potentially
infinite variety of sounds are 'parcelled up' into a finite number of discrete sound caregories which
constitute the basic communitive units. Syntax concerns the way in which these units are combined into
sequences. A major concern of those studying syntax has been the discovery of rules which reliably produce
legal sequences and eliminate illegal ones. Semantics concerns the way in which meaning is carried by the
sequences so constructed.
2.2. Chomsky and Schenker
p16
The demonstration that events quite far apart in both linguistic and musical sequences can have a close
structual relationship has two major consequences. The first is that whatever granners are eventually shown
to be fully adequate for language and music, they must be more powerful than 'finite state' grammers.
Chomsky (1957) provided the classic proof that this type of grammer is inadequate for a natural
language. In such a granner, words are generated one at atime, each word determining the set from which
the next word may be chosen. The rules are thus 'context independent'.It does not matter in which
context a wod is found; exactly the same set of consecutive words is permissible in each case. In a finite

state grammer, for instance, the word 'washes' must allow 'himself' and 'herself' to follow, since
'the boy washes himself' and 'the girl washes herself' are both correct English sentences. Yet such a
grammer would allow 'the boy washes herself', which is unaccetable. To improve on a finite-state
grammar we must introduce 'context sensitive' rules which take into account more than the immediately
preceding word. The same arguments apply by analogy to music.
2.3. Other comparisons between langage and music
p18
(d) The natural medium for both langage and music is auditory-vocal. That is, both langage and
music are primarily received as sequences of sounds and produced as sequences of voval movements which
create sounds.
p20
The first question we must ask is whether there is any entity which bears the same relationship to a
musical sequence as a thought bears to a linguistic sequence. A thought is not, in itself, a linguistic sequence
on the argument we have outlined. It exists independently of language and could be entertained by a non-
linguistic or pre-linguistic human. Is there any form of mental activity which could take place in a mind
without musical knowledge that could be somehow expressed by a musical sequence? Such activity would
be, precisely, one which could find musical expression in such natural but diverse forms as a Tibetan chant
or a nursery rhyme. One suggestion is that the mental substrate of music is something like that which
underlies certain types of story. In these stories a starting postion of equilibrium or rest is specified.Then
some disturbance is introduced into the situation, producing various problems and tentions which must be
resolved. The story ends with a return to equilibrium. The underlying representation for music could be
seen as a highly abstracted blueprint for such stories, retaining only the features they all have in common.
The learning of a musical language could then be seen as the acquisition of a way of representing these
features in sound. Maybe, therefore, we should look more closely at Schenker's Ursatz for insight into the
possible nature of universals: for, as a deep structure, it is likely to have a close resemblance to the
underlying thought representation of music.
p21
If we examine an Ursatz such as that given in Example 2.3, we find that all its notes are contained in
the tonic triad (of G major in this case) except the middle note of the upperline (the Urlinie). At the
midpoint of the Urlinie we thus find a departure from the resting position which is established at either end
of the Ursatz. Tension and discordhave been introduced; but it is motivated tension. One may argue that

in good stories neither the tensions nor the resolutions are arbitrary. We find it unsatisfactory when the
author introduces some deus ex machina to extricate the hero from a seemingly impossible situation.We
prefer it when the kernel of the solutionj is somehow implicit in what has gone before. For instance, the
villain's evil designs have within them the seeds of their own distruction; the internal dynamics of a
relationship lead the partners to the brink of breakdown and also provide the final resources to save it; and
so on. Similarly, the Ursatz satisfies because it is not just any note (say F) which i troduce tension. It is,
in this case, an A which has two highly important pivotal functions. Firstly, it creates a linear progression
in the Urlinie, B-A-G. The line has its own logic or pattern (two consecutive linear descents of one scale
step) so that, in one sense, the A becomes an inevitable consequence of travelling from the B to the G.
Secondly, it creates, together with its accompanying bass note, the elements of a new triad based on D (A
is the third harmonic of D). The tension-inducing element thus operates by attempting to set up a 'rival
' triad. In the final chord of the Ursatz we witness the 'defeat' of this rival system. Let us, then,
hypothesize that one appropriate 'deep' universal for musical thought is to be summerized in the phrases
'creation and resolution of motivated tension'. This notion has a familk resemblance to the
'implicative' theory of L. B. Meyer (see this chapter, section 2.5).
2.4. Musical phonology
p24
It appears that we acquire the categories of our native language very early in life.A set of studies by
Eimas and his colleagues has demonstrated that three-day-old infans already categorize sounds in the same
way as adults. This precocious ability strongly suggests the existence of special learning mechanisms for
speech patterns, since-of cource-infants categorize differently according to the language they are exposed to
in the first few days of life.
These features of language seem to have some rather close musical parallels. The basic 'phoneme' of
music is a 'note'. Like a phoneme a note is characterized by frequency and duration parameters. Within
a particular musical culture, all music is conposed from a small set of these notes, chosen from an
indefinitely large set of possibilities. Different cultures, however, chosse different subsets of possible notes for
their music. The selection takes place along two dimensions of sound: frequency and duration; these merit
separate discussion.
2.4.1. Categorical perception of frequency
p27

This study raises several issues. Firstly, there is the question of the accessibility to listeners of the
uncategorized frequency information. In the speech studies it would appear that this information is not
normally available to conscious perception. In music, however, this information certainly can be made
available to consciousness. If it were not so, then no chord could ever sound badly tuned-the assimilation
to categories would 'complete' the perceptual experience. The music listener, we must conclude, has some
ability to operate both within and outside the categorical mode.
A second question is, therefore, what perceptual contexts encourage categorical perception? For most
listeners the likely answer is that the context must be one which provides a framework will supply, at the
very least, two stimultaneous (or closely consequent) notes. Thus, what music listeners carry in their
memories are not the absolute pitches of any particular scale, but procedures for generating a scale from
any given tonic. In Locke and Keller's study this framework was supplied by the invariant outer notes of
the chord which maintained a 3:2 frequency ratio throughout. This identified them, within diatonic
tonality, as the tonic (first step) and dominant (fifth step) of a diatonic scale. This framework imposed a
categorization on the middle note as submediant (third step) of either a major or a minor diatonic scale.
Had the experiment been carried out using only a single variable note without chordal context, then it is
unlikely that any categorization would have taken place. Data from normal psychophysical studies support
this assertion. There is no evidence of discontinuity in the discrimination functions for single frequencies.
A third question concerns the genesis of categorization within the indivisual. The music data suggest
that, unlike langage, mere exposure to tonal music is insufficient to bring about categorical perception.
Some aspect of musical training heightens the tendency to categorize. For instance, most musical training
involves learning note names and scale terminology. It is possible that possessing verbal labels increases the
likelihood of categorical information being extracted and stored. Another possibility is that, for some
musicians, each category becomes associated with a prototypical 'absolute' frequency band. This is
possible, at least in Western cultures, because there are generally agreed conventions about the precise
frequencies to which instruments should be tuned. Concert A is defined as 440 Hz. Such frequencies could
come to represent the 'central' positions of scale categories for listeners, with deviant pitches being
assigned to the nearest prototype.There is considerable evidence for an ability that could support such
behaviour. It is called 'absolute pitch' or 'perfect pitch' and is possessed by a significant minority of
trained musicians. This is an accurate long-term memory for prototypical pitches and their associated scale
names; and I shall discuss it more fully in Chapter 5.（中略）
p28

A different type of evidence for assimilation of music to scale categories is provided by Dowling
(1978). In this study subjects were required to judge pairs of brief melodies as the same or different. In
some cases the second melody (which was always at a diffrent pitch from the first) was an exact
transposition of the first melody. In other cases, the melodic contour was a 'tonal answer' to the first;
that is, the melodic contour was transposed up or down within the same key. This has the consequent that
the exact intervals of original melody are not preserved. （中略）Dowling found that his subjects could
not consistently discriminate exact transpositions from tonal answers. One plausible explanation for this
finding is that, at least for unfamiliar melodies, subjects code melodies as contours in which the number of
scale ateps between adjacent notes is represented, but not the precise pitch distance.
2.4.2 Categorical perception of duration
p28
In a carefully designed series of experiments, Cutting and his colleagues have demonstrated that variations
in rise time (the time from sound onset until yhe time when waveform amplitude reachesits peak) are
responsible for the perception of ithis quality, rise time of 30 ms or less giving rise to 'plucked' sounds,
and those of 60 ms or more producing 'bowed' sounds. The discrimination function shows a peak at
the caregory boundary (around 40 ms) and troughs within each category. The interest of this
phenpmenon is threefold. First, it exactly matches the functions for the categorical perception of a phonetic
distinction in speech: the one displayed between the words 'chop' and 'shop'. This also depends on rise
times and showes a category boundary at about 40 msec. Secondary, adaptation to a sound well within
one category shifts the category booundary towards the unadaptated category. This exactly matches speech
perception data. Thirdly, infants as young as two months demonstrate categorical perception for these
sounds, just as they do for many speech sounds.
These findings have been used to dispute the view that speech perception involves unique psychological
processes and mechanisms. Cutting et al. state that 'it is evident that the arsena of empirical findings which
once distinguished speech perception as a unique type of auditory perception is steadly depleted.' It seems,
in fact, that we possess some perceptual mechanisms which are present from early age and are deployed in
both speech and music perception to produce categorical perception. The reason why the same physical
attribute (rise time) gives rise to different perceptual experience in speech and music is probably that the
acoustic contexts are different. There is evidence that the nature of the immediatedly following vowel/note
affects the way in which the rise portion is heard. For instance, the sound patterns for stop consonants (like
/t/ ans /p/) are heard as chirps when presenreted in isolation. They require a subsequent vowel in order

to be heard as speech sounds. Similaly, Cutting et al. sowes that the synthetic musical sounds must persist
for at least 250 ms after the initial rise in order to be heard as plicked or bowed.
The issue of context also assumes central importance in the second major aspect of musical duration that
I wish to discuss. This is the perception of the duration of a note. The results of standard psychological
tests tells us that when two successive isolated tones are presented for discrimination of duration (i.e. subjects
must say which is longer), then the longer the sounds are, the greater must be the difference between them
if it is tobe reliably detected. IN music , however, absolute durations are less important than the rhythmic
implications which notes acquire through their immediate context. Fundamental to most Western music is
the concept of a beat, a musical pulse with underlies any melody. In general, notes will either begin on
the beat or at some simple subdivision of yhe beat (half, third, and quarter being common). This fact is
reflected in musical terminology and notation, where there exists a limited set of categories for discribing
durations of notes. These categories are, for the most part, divisions of a longer category into two equal
halves. Thus, there are two crotchets (quarter notes) to every minim (half note); two quavers (eighth
notes) to every crotchet, and so on. In a paticular duration one of these symbols may be defined as having
a paticular duration (for instance, 'crotchet = 120' is a standard way of indicating that there should be
120 crotchet beats per minute).
p30
In the light of these considerations it becomes a little easier to understand some puzzling data provided
by Sternberg, Kroll, and zukorfsky (1982) who showed that three highly trained professional musicians,
including Pierre Boulez, were unable to reproduce non-standard subdivisions of a beat accurately. In the
experiments subjects heard a series of regular beats at one second intervals, one of which was followed by a
click at a delay ranging from one eiyhth and one seventh of a beat. The reproduction of the shorter
subdivisions (less than one third of a beat) were systematically in error, all being overestimates. In contrast,
reproduction was very accurate when the subdivision was half, three quaters, five sixths, or seven eighths of
a beat. A similar pattern of results was obtained when the subjects were asked to estimate the delay of a
click by giving a verbal categorization (e.g. 'between one eighth and one seventh of a beat'), except that
in this situation the errors were under estimates. Why were these subjects so poor? Maybe accurately
reproduced and estimayed delays correspond to frequently occuring rhythmic patterns in music, Which are
readily categorized.
p32
My discussion of musical phonology has been designed to illustrate one fundamental feature of music

behaviour. That is, we tend to caregorize our musical experience along the available dimensions of sound,
giving importance to differences between categories at the expense of differences within categories.The
notions of scale and metre are the fundamental concepts underlying musical phonology (although timbre
and intensity are arguably additional dimensions). The similarities between language and music in this
respect are striking, although categorical perception in music is neither as complet nor as universal as it
appears to be in language. To understand the musical significance of categorization we must now turn to
the way notes are combined with one another. This is the subject matter of musical syntax.

音楽認知 2002

Recommended

Recommended

More Related Content

What's hot

What's hot (8)

Similar to 音楽認知 2002

Similar to 音楽認知 2002 (20)

More from TeruKamogashira

More from TeruKamogashira (20)

Recently uploaded

Recently uploaded (20)

音楽認知 2002