Vietnamese Acquisition Of English Word Stress

BRIEF REPORTS AND SUMMARIES

TESOL Quarterly invites readers to submit short reports and updates on their work. These summaries may address any areas of interest to Quarterly readers.

Edited by
CATHIE ELDER, University of Auckland
PAULA GOLOMBEK, Pennsylvania State University

Vietnamese Acquisition of English Word Stress

THU T. A. NGUYEN
University of Queensland
St. Lucia, Queensland, Australia

JOHN INGRAM
University of Queensland
St. Lucia, Queensland, Australia

TESOL QUARTERLY Vol. 39, No. 2, June 2005, p. 309

The acquisition of prosody in a second language (L2) is relatively understudied; however, the acquisition of stress, especially in English as an L2, has recently received some attention (e.g., Archibald, 1992, 1993, 1995, 1997, 1998; Pater, 1997; Peperkamp, Dupoux, & Sebastián-Gallés, 1999). Nevertheless, these studies focus on the transfer of the phonological aspect of stress (e.g., stress placement and truncation); the transfer of first language (L1) acoustic features in realizing L2 stress has not been widely investigated. In a recent study on the bidirectional transfer between English and Japanese, Ueyama (2000) found that when producing L2 word accent in the beginning stage of L2 development, L2 speakers tend to import L1 phonetic features, for example, pitch (F0: fundamental frequency) in L1 Japanese. However, speakers can also modify these L1 habits to simulate L2 patterns. In addition, L2 speakers use an acoustic correlate that is already active in the L1 system (e.g., F0 in L1 Japanese) to learn to control a correlate that is not active in L1 (e.g., syllable duration in L1 Japanese).

This study examined the transfer of tonal acoustic correlates in Vietnamese learners’ production of English word stress. More specifically, the study examined acoustic features that native and nonnative speakers (Vietnamese learners of English) use to differentiate stressed from unstressed syllables in noun-verb pairs (e.g., the noun record vs. the verb record). Vietnamese and
English, respectively, represent two broadly contrastive prosodic types: tone languages and stress languages. English has a system of culminative word stress, but Vietnamese, a tonal language, has no system of word stress; rather, it has a system of lexically distinctive tones (D. L. Nguyen, 1970; D. H. Nguyen, 1980).

Stress is different from tone in several ways. First, stress is culminative (head-marking); that is, in stress languages, with few exceptions, every (content) word has at least one stressed syllable. Second, because a prominence hierarchy may occur among multiple stresses (e.g., primary vs. secondary stresses in English), stress is hierarchical. Third, stress can mark edges or boundaries in some systems; for example, some languages prefer iambic feet (stress on the final syllable), but others prefer trochaic ones (stress on the initial syllable). Fourth, stress is rhythmic in systems where stressed and unstressed syllables alternate and where clashes (adjacent stresses) are avoided. Fifth, stress contrasts tend to be enhanced segmentally: Stressed syllables may be lengthened by vowel lengthening or by gemination, and unstressed syllables may be weakened by vowel reduction (Kager, 1996).

Vietnamese, as a tonal language, has no system of culminative word stress but a system of six lexical tones in which pitch is used to contrast individual lexical items or words. As a result, English and Vietnamese differ in terms of how they manipulate the acoustic correlates in word-level prosody. Studies on the acoustic correlates of English stress show that judgments of linguistically significant stress in English are contingent on at least four acoustical parameters: fundamental frequency, duration, amplitude, and vowel quality (Beckman, 1986; Fry, 1955; Lehiste & Peterson, 1959; Lieberman, 1960).
Research on tonal languages (Chao, 1930, 1980; Gandour, 1983; Gandour & Harshman, 1978; Hashimoto, 1986; Ruhlen, 1976; Vance, 1977; Wang, 1967) shows that a limited number of tonal dimensions are used to signal tonal distinctions: pitch height, direction of pitch movement, pitch range, and beginning and ending point of pitch movement. Of these pitch characteristics, two primary dimensions of linguistic tone are most commonly identified: pitch height and the direction of pitch movement (Gandour, 1983; Gandour & Harshman, 1978). In Vietnamese, in addition to direction of pitch movement (tone contour) and pitch height, tones are distinguished by voice quality, intensity, and duration (Nguyen & Edmondson, 1997; Pham, 2000; Vu Thanh Phuong, 1981). Intensity was found to correlate highly with pitch (Vu Thanh Phuong, 1981) and thus is supplementary to pitch. Duration or, particularly, tonal length is not a distinctive feature in Vietnamese (Pham, 2000; Vu Thanh Phuong, 1981) but varies only across segmental contexts. From a study on native perception of Vietnamese tones, Vu Thanh Phuong (1981) concludes that the direction of pitch movement, pitch height, and voice quality play a more important role than other tonal dimensions, such as duration and
intensity, in the identification of tones. Intensity and duration support perception but play no independent role in tone recognition.

The aforementioned studies show that even though both languages employ F0 as a perceptual cue (to tone in Vietnamese and to stress in English), the two languages differ in terms of how acoustic cues are manipulated. In English, stressed syllables are longer than unstressed syllables (i.e., duration is an active correlate in producing word stress), and unstressed vowels tend to be reduced. In contrast, in Vietnamese, a syllable-timed language, no systematic difference in duration or vowel quality among syllables has been found.

The comparison of acoustic features used in the two languages suggests two potential prosodic transfer effects in the ways that Vietnamese learners produce English word stress patterns: (a) The active role of F0 as a tonal cue in Vietnamese probably facilitates the production of F0 (and intensity) contrasts between lexically stressed and unstressed syllables in L2 English. (b) Because duration and vowel reduction do not function in Vietnamese as active cues for tonal contrasts, Vietnamese learners will have difficulties producing the requisite vowel duration and quality contrasts for English word stress.

METHOD

Linguistic Materials

Twenty minimal pairs of nouns and verbs, such as present (noun) and present (verb), were used as the stimulus items. Five minimal pairs were three-syllable words (e.g., document as a noun vs. document as a verb); the remaining 15 pairs were two-syllable words (see Appendix for the complete stimulus set). For each pair, the noun form has word stress on the first syllable, and the verb form has word stress on the second or third syllable. The two forms are segmentally homophonous except for the vowel reduction in the unstressed syllable.
Each noun and verb was embedded in the carrier sentence, “Say the word ______ again.” Stress was marked on each stressed vowel to make sure that Vietnamese speakers produced the correct stress patterns. To ensure that speakers produced the correct contrastive stress patterns, the sentences were presented in pairs with a target noun form immediately followed by a counterpart verb form. For example:

1. a. Say the word “conduct” again. (noun)
   b. Say the word “conduct” again. (verb)
2. a. Say the word “present” again. (noun)
   b. Say the word “present” again. (verb)
Subjects

Three groups of subjects participated in this experiment: beginning-level Vietnamese learners of English, advanced-level Vietnamese speakers of English, and a control group of native speakers of Australian English.

The beginning group consisted of 10 subjects (3 from Hanoi, 3 from Hue, and 4 from Saigon; 5 male and 5 female), and they were paid for their participation. All had just completed their first year as English majors at universities in Hanoi, Hue, and Saigon. They had all started learning English at the age of 12 (in secondary school) with the grammar translation method, which focuses on vocabulary and grammar learning. However, during their first year in university, they were exposed to communicative English learning.

The advanced group consisted of 10 postgraduate students at the University of Queensland (4 Southerners, 3 Northerners, and 3 Hue speakers). They were in the age range 25–32. Their length of residence in Australia varied from 8 months to 10 years. Eight of them had received a bachelor of arts degree in English and had been teaching for 2 to 3 years. They were working toward a master of arts degree in TESOL. Two other subjects had just finished bachelor’s degrees in science studies. Like the beginning-level group, the advanced-level students had started learning English at the age of 12 with the grammar translation method. They were exposed to communicative language teaching methods during their 4 years of undergraduate study. They spoke Vietnamese and English, and they had very limited knowledge of French, which they had learned at university as a second foreign language with a curriculum that strongly emphasized grammar.

Two native speakers of Australian English (a phonetician and a linguist from the University of Queensland linguistics program) served as the control native speaker group.
It is worth noting that the nonnative speaker groups in this study included speakers of the three main Vietnamese dialects: speakers from Hanoi representing the northern dialect, speakers from Hue representing the central dialects, and speakers from Saigon representing the southern dialects. This study was originally designed to examine dialectal differences in prosodic transfer effects, but a preliminary analysis and other related studies showed no significant dialectal differences on the variables investigated. Therefore, the dialect factor was excluded from this report.

Procedures

Before the recording, subjects were presented with the list of contrastive sentence pairs and given sufficient time for familiarization and practice. They then read the sentences aloud three times in their normal speaking manner. Only the third repetition was recorded and used for
analysis. The two native speakers recorded five repetitions, all of which were used for analysis. The recording was made in a quiet room using Speech Station, sound recording and editing software, at a 20-kHz sampling rate and 16-bit precision.

Measurements

The acoustic measurements included fundamental frequency (F0), vowel and syllable duration, and intensity of the accent-bearing elements (the first syllable and the second syllable in a two-syllable word, or the first syllable and the third syllable in a three-syllable word). All the measurements were made using Emu Speech Tools (Cassidy, 1999). Studies of the effects of stress and accent on duration in English have shown that not only the rhymes but also the initial consonants are lengthened relative to their counterparts in unstressed syllables (see, e.g., Ingrisano & Weismer, 1979; Umeda, 1977). Therefore, in this experiment, we measured the duration of the vowel as well as of the whole syllable, including the onset and the rhyme. We also measured the fundamental frequency F0 (in Hz) and intensity (in dB) values from the center point of each vowel.

ANALYSIS

First, to examine the effect of word stress (stressed vs. unstressed) on the acoustic correlates (i.e., to find out whether each speaker group could distinguish between stressed and unstressed syllables based on the four acoustic correlates), a series of two-way ANOVAs (stress × speakers) was conducted on each acoustic parameter (F0, intensity, syllable duration, and vowel duration) for each speaker group. This process yielded 12 separate data sets (4 acoustic parameters × 3 speaker groups). The independent variables were stress (stressed vs. unstressed) and speakers. The dependent variable was the acoustic parameter.
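
As an illustration only, the two-way design described above (stress × speakers, one acoustic parameter at a time) can be sketched in Python. This is not the authors' analysis script: the function name `two_way_anova`, the two-speaker layout, and the F0 values are hypothetical, and the sketch assumes a balanced design with the same number of observations in every cell.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-way ANOVA with replication.

    cells[i][j] holds the observations for stress level i and speaker j;
    every cell must contain the same number of observations (more than one).
    Returns the F ratios for stress, speaker, and their interaction.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([v for row in cells for cell in row for v in cell])
    a_means = [mean([v for cell in row for v in cell]) for row in cells]
    b_means = [mean([v for row in cells for v in row[j]]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    # Sums of squares for the two main effects, the interaction, and error.
    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (v - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for v in cells[i][j]
    )

    df_a, df_b = a - 1, b - 1
    df_ab, df_error = df_a * df_b, a * b * (n - 1)
    ms_error = ss_error / df_error
    return {
        "stress": (ss_a / df_a) / ms_error,
        "speaker": (ss_b / df_b) / ms_error,
        "interaction": (ss_ab / df_ab) / ms_error,
    }


# Hypothetical F0 values in Hz (NOT data from the study):
# rows = stressed, unstressed; columns = speaker 1, speaker 2.
f0_cells = [
    [[220, 224], [180, 184]],
    [[200, 204], [160, 164]],
]
f_ratios = two_way_anova(f0_cells)
print(f_ratios)
```

In this made-up data set both speakers raise F0 by the same amount on stressed syllables, so the sketch yields main effects for stress and speaker and a zero interaction term. Converting the F ratios to p values would additionally require the F distribution, which is left out here.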
Second, to compare the degree of difference in acoustic values between stressed and unstressed syllables among the three speaker groups (e.g., the difference in degree of stressed-syllable lengthening), the ratios of stressed to unstressed vowels in terms of the four acoustic parameters were calculated (e.g., F0 ratios were calculated by dividing the F0 value of the stressed vowel by that of the corresponding unstressed vowel). Then one-way ANOVAs with planned comparison among speaker groups (native vs. advanced, advanced vs. beginner, and native vs. beginner) by the Tukey method were conducted on the ratios of each acoustic parameter (F0 ratios, intensity ratios, syllable duration ratios, and vowel duration ratios).
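
The ratio step and the one-way comparison across groups can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' procedure: the duration values, the group sizes, and the function names `stress_ratios` and `one_way_anova_f` are hypothetical, and the Tukey pairwise comparisons are omitted.

```python
from statistics import mean


def stress_ratios(pairs):
    """Divide each stressed value by its corresponding unstressed value."""
    return [stressed / unstressed for stressed, unstressed in pairs]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical vowel durations in ms as (stressed, unstressed) pairs.
# These numbers are invented for illustration, NOT data from the study.
durations = {
    "native": [(150, 100), (140, 100), (160, 100)],
    "advanced": [(140, 100), (150, 100), (130, 100)],
    "beginner": [(100, 100), (110, 100), (90, 100)],
}
ratios = {group: stress_ratios(pairs) for group, pairs in durations.items()}
f_stat, df_b, df_w = one_way_anova_f(list(ratios.values()))
print(ratios, f_stat, df_b, df_w)
```

A ratio near 1.0 means the stressed and unstressed values are about equal, while a ratio above 1.0 means the stressed syllable has the larger value; the one-way ANOVA then asks whether the mean ratio differs across the three groups.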
RESULTS

Results of ANOVAs on Stress as a Factor (Stressed vs. Unstressed)

The results of 12 separate ANOVAs showed significant main effects for stress (p < 0.001) and speakers (p < 0.001), but no significant interaction effect of stress × speakers. The significant speaker effect merely indicates speaker variation in intrinsic voice characteristics. Therefore, we examine only the main effect of stress (see Table 1).

As shown in Table 1 and Figure 1, the F0 and intensity values on stressed syllables were significantly higher than the same values on the unstressed syllables across the three speaker groups, indicating that all three groups could differentiate stressed from unstressed syllables in terms of F0 and intensity. However, there is a difference among speaker groups on the duration parameter. As shown in Table 1 and Figure 2, the native Australian English speakers and the advanced Vietnamese learners of English produced stressed syllables and vowels that were significantly longer than the corresponding unstressed syllables, but the beginning Vietnamese learners produced stressed and unstressed syllables with no significant difference in duration. This result indicates that beginners failed to encode the duration cue in their production of English word stress.

Results on Acoustic Ratios of Stressed to Unstressed Syllables

This section examines the magnitude of difference in acoustic values between stressed and unstressed syllables among speaker groups (e.g., the difference in degree of stressed-syllable lengthening and vowel-duration reduction among the three speaker groups).
The results of one-way ANOVAs (ratios × speaker groups) on the ratio of each acoustic value of stressed to unstressed syllables all showed significant main effects (p < 0.001). The significant pair comparisons by the Tukey method between speaker groups are given in Table 2 (only those significant at p < 0.01 are flagged).

As shown in Table 2 and Figure 3, the native and advanced groups produced significantly greater duration ratios of stressed to unstressed syllables than beginners did, but advanced and native groups produced no significant difference in duration ratios. The mean ratios of syllable and vowel duration produced by the native and advanced groups range from 1.3 to 1.5, respectively. In contrast, the ratios of beginning speakers are clustered around 1.0, which indicates that they produced no difference in duration between stressed and unstressed syllables, confirming the ANOVA results on duration presented in Table 1. Advanced speakers show ratio magnitudes equivalent to those of native speakers, indicating that they produce native-like duration patterns (i.e., the magnitude of difference in duration between a stressed syllable and an unstressed syllable is the same as that of native speakers).

Even though the difference in F0 ratios and intensity ratios between the advanced and beginner groups is statistically significant, the magnitude of difference is not large. Vietnamese speakers of English and native speakers of English generally do not produce significantly different F0 and intensity ratios (i.e., no significant difference in magnitude of pitch contrast between Vietnamese and native English speakers).

TABLE 1
Results of ANOVAs on Stress

                     Native                       Advanced                      Beginner
Syllable duration    F(1,1) = 14.9, p < 0.001     F(1,1) = 26.4, p < 0.0001     F(1,1) = 2.3, NS
Vowel duration       F(1,1) = 12.8, p < 0.001     F(1,1) = 69.5, p < 0.0001     F(1,1) = 4.6, p 0.04
F0                   F(1,1) = 60, p < 0.0001      F(1,1) = 146, p < 0.0001      F(1,1) = 40.8, p < 0.0001
Intensity            F(1,1) = 14, p < 0.001       F(1,1) = 168, p < 0.0001      F(1,1) = 32.4, p < 0.0001

FIGURE 1
Mean and Standard Deviation of F0 Values (left) and Intensity Values (right)
[Bar charts of mean F0 (Hz) and mean intensity (dB) for stressed and unstressed syllables in the native, advanced, and beginner groups. * significantly different at p < 0.01.]

FIGURE 2
Mean and Standard Deviation of Syllable Duration
[Bar chart of mean duration (ms) for stressed and unstressed syllables in the native, advanced, and beginner groups. * significantly different at p < 0.01.]

TABLE 2
Comparison of Ratio Difference in Acoustic Values Between Stressed and Unstressed Syllables Among Speaker Groups

                                                               F0     Intensity   Syllable duration   Vowel duration
Advanced vs. beginning-level Vietnamese speakers of English    ***    ***         ***                 ***
Advanced-level vs. native speakers of English
Beginning-level vs. native speakers of English                                    ***                 ***

*** Significant at p < 0.01.

FIGURE 3
Average Ratio of Acoustic Parameters of English Stressed/Unstressed Syllables
[Bar chart of stressed/unstressed ratios for F0, intensity, syllable duration, and vowel duration in the advanced, beginning, and native groups.]
CONCLUSION

The results of the experiment generally support the predictions. Both groups of Vietnamese speakers could differentiate stressed from unstressed syllables in terms of F0 and intensity as well as native English speakers did. This finding suggests that the active role of F0 (and intensity) as tonal cues in Vietnamese facilitated the production of F0 (and intensity) contrast between lexically stressed and unstressed syllables in L2 English. It is noted that the increase in intensity between unstressed and stressed syllables tended to correlate with changes of voice pitch but was usually marginal and thus had less perceptual discriminating power, consistent with classical experiments on the phonetics of English word stress (Fry, 1955).

Although advanced speakers could produce native-like duration patterns between stressed and unstressed syllables, beginners failed to differentiate English stressed and unstressed syllables in terms of duration. This finding suggests a negative transfer effect: Because duration does not function as an active cue in Vietnamese tonal distinctions, Vietnamese beginning English learners fail to encode this cue in their L2 production. Nevertheless, advanced speakers’ ability to produce contrasting duration between stressed and unstressed syllables indicates that this feature is learnable.

In conclusion, the results of this study are consistent with findings in our related investigations (Nguyen, 2003; Nguyen & Ingram, 2002) that native speakers and nonnative speakers employ acoustic cues in different ways that are optimally suited to their respective L1 phonologies. Native speakers produced word stress using both pitch and duration cues. In contrast, when compared with advanced Vietnamese learners of English, beginning Vietnamese learners produced word stress that accommodated L2 pitch and intensity targets but not timing parameters such as duration and vowel reduction.
Although beginning Vietnamese learners failed to realize or recognize syllable duration contrast and vowel reduction, phonetic features that are not active in their L1, this result does not mean that they lack the ability to perceive or to encode duration contrast; rather, they need to be explicitly taught at the initial stage to encode this necessary cue. Explicitly teaching learners about these features will help them master the features faster than letting them pick up the features through exposure to the language, particularly in a foreign language context. Therefore, it is necessary for ESL teachers to draw learners’ attention to these features and to provide them with explicit training, particularly in the vowel reduction and syllable duration contrast involved in the acquisition of English word stress.
THE AUTHORS

T. A. T. Nguyen is a postdoctoral research fellow and a lecturer in linguistics at the School of English, Media Studies, and Art History, the University of Queensland, Australia. His research interests include phonetics, phonology, second language phonology, and Vietnamese prosodic phonology. He has been a TESOL lecturer in Vietnam.

John Ingram is a senior lecturer in linguistics at the School of English, Media Studies, and Art History, the University of Queensland, Australia. His research interests include phonetics, phonology, and psycholinguistics. His most recent work includes studies on Parkinson’s Disease and on Vietnamese acquisition of Australian English.

REFERENCES

Archibald, J. (1992). Transfer of L1 parameter settings: Some empirical evidence from Polish metrics. Canadian Journal of Linguistics, 37(3), 301–339.
Archibald, J. (1993). Metrical phonology and the acquisition of L2 stress. In F. R. Eckman (Ed.), Confluence: Linguistics, L2 acquisition and speech pathology (pp. 37–48). Amsterdam: John Benjamins.
Archibald, J. (1995). A longitudinal study of the acquisition of English stress. Calgary working papers in linguistics, 17, 1–10. Calgary, Alberta, Canada: Department of Linguistics, University of Calgary.
Archibald, J. (1997). The acquisition of English stress by speakers of nonaccentual languages: Lexical storage versus computation of stress. Linguistics, 35, 167–181.
Archibald, J. (1998). Second language phonetics, phonology, and typology. Studies in Second Language Acquisition, 20(2), 189–213.
Beckman, M. E. (1986). Stress and non-stress accent. Dordrecht, Holland: Foris.
Cassidy, S. (1999, September). Compiling multi-tiered speech databases into the relational model: Experiments with the Emu System. Eurospeech ’99: Vol. 6 (pp. 2239–2242). Budapest, Hungary: N.p.
Chao, Y. R. (1930). A system of tone letters. Le Maître Phonétique, 45, 283–319.
Chao, Y. R. (1980). Chinese tone and English stress. In L. R. Waugh & C. H. Van Schooneveld (Eds.), The melody of language (pp. 41–44). Baltimore, MD: University Park Press.
Fry, D. B. (1955). Duration and intensity as physical correlates of linguistic stress. Journal of the Acoustical Society of America, 27, 765–768.
Gandour, J. (1983). Tone perception in Far Eastern languages. Journal of Phonetics, 11, 149–175.
Gandour, J., & Harshman, R. A. (1978). Crosslanguage differences in tone perception: A multidimensional scaling investigation. Language and Speech, 21, 1–33.
Hashimoto, A. (1986). Tone sandhi across Chinese dialects. In The Chinese Language Society of Hong Kong (Ed.), Wang Li memorial volumes (pp. 445–474). Hong Kong: Joint Publishing.
Ingrisano, D., & Weismer, G. (1979). s-Duration: Methodological and linguistic factors. Phonetica, 36, 32–43.
Kager, R. (1996). The metrical theory of word stress. In J. A. Goldsmith (Ed.), The handbook of phonological theory (pp. 367–443). Cambridge, MA: Blackwell.
Lehiste, I., & Peterson, G. E. (1959). Vowel amplitudes and phonemic stress in American English. Journal of the Acoustical Society of America, 31, 428.
Lieberman, P. (1960). Some acoustic correlates of word stress in American English. Journal of the Acoustical Society of America, 32, 451.
Nguyen, D. H. (1980). Language in Vietnamese society. Chicago: University of Illinois Press.
Nguyen, D. L. (1970). A contrastive phonological analysis of English and Vietnamese (Pacific Linguistics Series, No. 8). Canberra: Australian National University.
Nguyen, T. A. T. (2003). Prosodic transfer: The tonal constraints on Vietnamese acquisition of English stress and rhythm. Unpublished doctoral dissertation, University of Queensland, Australia.
Nguyen, T. A. T., & Ingram, J. (2002). Native and Vietnamese production of compound and phrasal stress patterns. In J. H. L. Hansen & B. Pellom (Eds.), ICSLP-2002 Conference Proceedings: Vol. 2. 7th International Conference on Spoken Language Processing, September 16–20, 2002, Denver, Colorado (pp. 533–536). Boulder, CO: Center for Spoken Language Research.
Nguyen, V. L., & Edmondson, J. (1997). Tones and voice quality in modern northern Vietnamese: Instrumental case studies. Mon-Khmer Studies, 28, 1–18.
Pater, J. (1997). Metrical parameter missetting in second language acquisition. In S. J. Hannahs & M. Young-Scholten (Eds.), Focus on phonological acquisition (pp. 235–262). Amsterdam: John Benjamins.
Peperkamp, S., Dupoux, E., & Sebastián-Gallés, N. (1999). Perception of stress by French, Spanish, and bilingual subjects. In Proceedings of Eurospeech ’99: Vol. 6 (pp. 2683–2686). Budapest, Hungary: N.p.
Pham, H. (2000). Vietnamese tone: Tone is not pitch. Unpublished doctoral dissertation, University of Toronto, Ontario, Canada.
Ruhlen, M. (1976). A guide to the languages of the world. Palo Alto, CA: Language Universals.
Ueyama, M. (2000). Prosodic transfer: An acoustic study of L2 English vs. L2 Japanese. Unpublished doctoral dissertation, University of California, Los Angeles.
Umeda, N. (1977). Consonant duration in American English. Journal of the Acoustical Society of America, 60, 846–858.
Vance, T. J. (1977). Tonal distinctions in Cantonese. Phonetica, 34, 93–107.
Vu, T. P. (1981). The acoustic and perceptual nature of tone in Vietnamese. Unpublished doctoral dissertation, Australian National University, Canberra.
Wang, W. S.-Y. (1967). Phonological features of tone. International Journal of American Linguistics, 33(2), 93–105.

APPENDIX

List of Test Words

Say the word “______” again.

1. upset (n), upset (v)
2. offset (n), offset (v)
3. segment (n), segment (v)
4. fragment (n), fragment (v)
5. accent (n), accent (v)
6. compress (n), compress (v)
7. conduct (n), conduct (v)
8. proceeds (n), proceed (v)
9. addict (n), addict (v)
10. ally (n), ally (v)
11. relay (n), relay (v)
12. confine (n), confine (v)
13. combine (n), combine (v)
14. present (n), present (v)
15. rebel (n), rebel (v)
16. compliment (n), compliment (v)
17. implement (n), implement (v)
18. document (n), document (v)
19. regiment (n), regiment (v)
20. interlock (n), interlock (v)