The
Representation of
Speech Sounds
Anita Eka Puspita Sari (11200983)
Introduction!
Language is fundamentally a complex system well capable of regulating
and nurturing our social behavior, actions, and thoughts. This understanding has
brought several significant questions and issues on the relationship between
language and cognition to the forefront. This line of inquiry marks a significant
contribution not only to linguistics and psychology but also to cognitive science,
which aims to develop all-encompassing theories of various human abilities,
including language.
Description of Representation
A phonological representation is the mental representation of the sounds and combinations of
sounds that comprise words in a particular spoken language. Phonological representations can be
described at the acoustic level, the linguistic level, or the cognitive level. At the acoustic level, the
phonological representation for a word form is analyzed in terms of the raw signal, for example,
in terms of pitch, loudness, and duration. At the linguistic level, the word form is described in
terms of the vocal tract and the ways that it constrains the production of speech sounds, for
example, the manner of production and the place of articulation. At the cognitive level, the
phonological representation is described in terms of its assumed constituent elements, namely
consonant phonemes and vowel phonemes.
Description of Speech Sound!
Speech sounds are the vocal sounds we use to make up the words of the English
language. We use them every time we say a word out loud. Saying the right sounds in the right
order is what allows us to communicate with other people and understand what they are saying.
It can help to differentiate speech sounds from the alphabet. For example, in English, the
alphabet is made up of 26 letters. The 44 speech sounds in English are the pure sounds that letters
make when spoken, not related to the name of the letter.
Helping children to understand different speech sounds can be a really beneficial part of
speech development.
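To make the difference between letters and speech sounds concrete, the short Python sketch below counts both for two English words. The sound strings reuse the symbols that the SPR examples later in this deck give for "through" and "shocking"; treating each symbol as exactly one speech sound is an assumption of this sketch, and the `WORDS` table and function name are illustrative only.

```python
# Letters versus speech sounds for two English words.
# The sound lists reuse the symbols from this deck's SPR examples
# (T r u and S a k I G); one symbol = one sound is an assumption.
WORDS = {
    "through": ["T", "r", "u"],
    "shocking": ["S", "a", "k", "I", "G"],
}


def letters_and_sounds(word: str) -> tuple[int, int]:
    """Return (number of letters, number of speech sounds) for a known word."""
    return len(word), len(WORDS[word])


for word in WORDS:
    n_letters, n_sounds = letters_and_sounds(word)
    print(f"{word}: {n_letters} letters, {n_sounds} sounds")
```

Both words have more letters than sounds, which is why spelling alone is a poor guide to pronunciation.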
Structural Phonology
– The sounds of a phonological system within a language constitute the minimal objects
of linguistic representation. With an eye to phonetics, these objects are often
labeled and regarded as segments. (See also Pöchtrager 2012.)
– To determine classes of sounds, recourse is made to phonetics, mainly to
articulation (e.g., velar, palatal, labial). In a structuralist view, these features do
not define the content of a phoneme per se. Phonemes are
still seen principally in contrast to other phonemes (which rules out a universal
conception of the phonological representation of sounds).
Phonological Structures
Phonological structure has three levels:
– constituent structure: onsets, nuclei, post-nuclear rime complement
– melodic structure: the structure of an onset or nucleus
– A-structure: structural representation of the A-element of SGP
Symbolic phonetic representation forms (SPRs)
This topic describes the form of a symbolic phonetic representation (SPR).
An SPR consists of a sequence of allowable SPR symbols for a given language,
enclosed in quotation marks and placed within the phoneme tag. For example, the
following are valid SPRs in English:
through <phoneme alphabet="ibm" ph=".1Tru"> through </phoneme>
shocking <phoneme alphabet="ibm" ph=".1Sa.0kIG"> shocking </phoneme>
A period signals the beginning of a new syllable, the digits 1 and 0 indicate the
stress level of the syllables, and the letters T, r, u, S, a, k, I, and G represent
specific sounds of U.S. English speech. Each of these elements is discussed in
more detail in this topic.
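As a rough illustration of how the three element types fit together, the Python sketch below splits an SPR string into boundary, stress, and sound tokens. The function name and token labels are hypothetical, it handles single-character sound symbols only, and it is not the parser used by the Text-to-Speech technology.

```python
def tokenize_spr(spr: str) -> list[tuple[str, str]]:
    """Split an SPR string into (kind, value) tokens.

    Handles a period (syllable boundary), the stress digits 0, 1, and 2,
    and single-character sound symbols. Quoted two-character symbols are
    deliberately not handled in this sketch.
    """
    tokens = []
    for ch in spr:
        if ch == ".":
            tokens.append(("boundary", ch))
        elif ch in "012":
            tokens.append(("stress", ch))
        else:
            tokens.append(("sound", ch))
    return tokens


for kind, value in tokenize_spr(".1Sa.0kIG"):
    print(kind, value)
```

For ".1Sa.0kIG" this yields two boundaries, two stress marks, and the five sound symbols S, a, k, I, and G.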
Syllable boundaries
A period is used to mark the beginning of each syllable in the speech generated by the Text-to-Speech technology. However,
periods are optional in SPR input in all languages and, except in German, do not affect how the Text-to-Speech rules divide a
word into syllables.
In German, a period can be used in SPR input to trigger a syllable boundary at the specified location.
Syllable stress
Syllables can be marked for stress using the digits 1, 2, or 0 for primary stress, secondary stress, and no stress, respectively.
Some languages do not use secondary stress and thus do not accept the use of the digit 2 in SPRs; see sections on specific
languages. If a word has more than one syllable, at least one of these syllables must be marked for primary stress, or the SPR is
considered invalid and is read out character by character. Other syllables can be marked with either secondary or no stress.
Syllables that are not marked for stress are assumed to have no stress.
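The stress rules can be sketched in Python as follows. This is a simplified check, not the engine's own validation: it assumes that every syllable boundary is written with a period (which the SPR format does not require), and the function names are hypothetical.

```python
STRESS_NAMES = {"1": "primary", "2": "secondary", "0": "none"}


def syllable_stresses(spr: str) -> list[str]:
    """Return one stress digit per period-delimited syllable.

    Assumes all syllable boundaries are marked with periods.
    A syllable without a digit defaults to "0" (no stress).
    """
    stresses = []
    for syllable in spr.split("."):
        if not syllable:
            continue  # skip the empty piece before a leading period
        digits = [ch for ch in syllable if ch in STRESS_NAMES]
        stresses.append(digits[0] if digits else "0")
    return stresses


def has_required_stress(spr: str) -> bool:
    """Check that a word with more than one syllable has a primary stress."""
    stresses = syllable_stresses(spr)
    return len(stresses) <= 1 or "1" in stresses


for spr in (".1Sa.0kIG", ".1Sa.kIG", ".0Sa.0kIG"):
    names = [STRESS_NAMES[s] for s in syllable_stresses(spr)]
    print(spr, names, has_required_stress(spr))
```

The third input has two syllables and no primary stress, so under the rule above it would be treated as invalid and read out character by character.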
Suppose you do not know where the syllable boundaries are located in the word construction. In this example, any of the
following SPRs correctly places the primary stress on the vowel of the second syllable (the u of construction, written "H"):
"construction"
"kXn1strHkSXn"
"kXns1trHkSXn"
"kXnst1rHkSXn"
"kXnstr1HkSXn"
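The four SPRs listed for "construction" differ only in where the digit 1 sits among the consonants that precede the stressed vowel "H". A minimal Python sketch, with a hypothetical helper name, generates them from the unstressed symbol string:

```python
def stress_placements(symbols: str, onset_start: int, vowel_index: int) -> list[str]:
    """Insert "1" at each position from onset_start through vowel_index."""
    return [
        symbols[:pos] + "1" + symbols[pos:]
        for pos in range(onset_start, vowel_index + 1)
    ]


BASE = "kXnstrHkSXn"       # "construction" with the stress digit removed
VOWEL = BASE.index("H")    # position of the stressed vowel symbol
variants = stress_placements(BASE, onset_start=3, vowel_index=VOWEL)

for spr in variants:
    print(spr)
```

Each variant places the digit somewhere between the end of the first syllable's "n" and the vowel "H", which is why all of them stress the same vowel.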
Speech sound symbols
Each language uses its own inventory of SPR symbols for representing the speech sounds of that language. Tables in
the following sections contain valid SPR symbols for the sounds of each language, with examples of words in which
each sound occurs. Letters are case-sensitive, so "e" and "E", for example, represent two different sounds.
Two-character symbols must be enclosed in single quotes; for example, German heim "h'aj'm". SPRs containing sound
symbols that are not allowed in the current language are considered invalid and are spelled out character by character.
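A symbol check along these lines might look like the Python sketch below. The inventory is hypothetical and partial: it contains only the symbols that appear in this deck's English examples, not the full per-language tables, and the function names are illustrative.

```python
# Hypothetical, partial inventory: only symbols seen in this deck's
# English examples. The real per-language tables are not reproduced here.
EXAMPLE_INVENTORY = set("TruSakIGXnstHYFR?N")


def sound_symbols(spr: str) -> list[str]:
    """Extract sound symbols, reading 'xy' in single quotes as one symbol."""
    symbols = []
    i = 0
    while i < len(spr):
        ch = spr[i]
        if ch == "'":
            end = spr.index("'", i + 1)  # raises ValueError if the quote is unclosed
            symbols.append(spr[i + 1:end])
            i = end + 1
        elif ch in ".012":
            i += 1  # syllable boundary or stress digit, not a sound
        else:
            symbols.append(ch)
            i += 1
    return symbols


def invalid_symbols(spr: str, inventory: set[str]) -> list[str]:
    """Return the sound symbols of spr that are missing from inventory."""
    return [s for s in sound_symbols(spr) if s not in inventory]


print(sound_symbols("h'aj'm"))
print(invalid_symbols(".1sIG", EXAMPLE_INVENTORY))
print(invalid_symbols(".1sIg", EXAMPLE_INVENTORY))
```

Because the comparison is case-sensitive, "G" passes against this example inventory while "g" is reported as invalid.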
The sounds of every language have specific distributional patterns within that language. For example, in all dialects of
English, the sound "G" in sing ".1sIG" does not occur at the beginning of a word. Other American English sounds that
have a particularly narrow distribution are the glottal stop "?", the flap "F", and the syllabic nasal "N". If you enter a
sound symbol in a context where it does not normally occur, the resulting speech may sound unnatural.
IBM Text-to-Speech technology applies a sophisticated set of linguistic rules to its input to reflect the processes by
which sounds change in specific contexts in natural language. For example, in American English, the sound "t" of
write ".1rYt" is pronounced as a flap "F" in writer ".1rY.0FR". SPR input undergoes these modifications just as
ordinary input text does. In this example, whether you enter ".1rY.0tR" or ".1rY.0FR" does not affect the generated
speech.
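The idea of a context-dependent sound change can be illustrated with a toy rewrite rule in Python. This is not IBM's rule set, which the slides do not describe; the vowel list is an assumption drawn only from this deck's examples, and the rule covers just the write/writer case.

```python
import re

# Assumed vowel-like symbols, taken only from this deck's examples.
VOWELS = "YIaHuXR"

# Toy rule: "t" becomes the flap "F" when it follows a vowel and begins
# an unstressed (0) syllable whose next symbol is also vowel-like.
FLAP_RULE = re.compile(rf"(?<=[{VOWELS}])(\.?0)t(?=[{VOWELS}])")


def apply_flapping(spr: str) -> str:
    """Rewrite "t" as "F" in the flapping context; leave everything else."""
    return FLAP_RULE.sub(r"\1F", spr)


for spr in (".1rYt", ".1rY.0tR", ".1rY.0FR"):
    print(spr, "->", apply_flapping(spr))
```

Under this toy rule both spellings of writer end up as the same symbol string, while the final "t" of write is left unchanged.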
Conclusion
The aim of laying out a conception of phonology that is as minimal and as abstract as possible seems to have been
fully met.
Most, if not all, insights of SGP can be retained as they were proposed in KLV85, KLV89 and SGP.
Pöchtrager's ideas about phonological structure are incorporated without recourse to arbitrarily active structural
relations.
The preconditions for assigning the sound systems of many languages a sensible, intuitively clear, and phonologically
plausible representation are quite good.
For future research: the interaction of various empty or filled positions may reveal even more sophisticated
insights about the notion of phonological length.
Thanks!
