More Related Content

More from Lviv Startup Club(20)

Maksym Vakulenko: Positional variations of the location of Ukrainian vowel formants

  1. Maksym Vakulenko Institute of Problems of Artificial Intelligence Kyjiv, Ukraine 1
  2. Abstract The variations in the location of formants of Ukrainian vowels caused by different phonetic contexts are analyzed. The corresponding invariant acoustic parameters being the ratios between frequencies of the formants caused by the tube resonance are determined. It is shown that different phonetic environments cause different formant configurations in vowels which is a case of sound assimilation. In particular, high vowels make the other vowels in the phonetic word move higher, and the front vowels make the back ones move forward. Such formant shifts are more pronounced in unstressed positions and especially in a coda. Key words: Ukrainian vowels, formant frequency ratio, invariant acoustic parameters, formant shift, sound assimilation. 2
  3. Intro Досвід: Fulbright Program Scholarship (2003-2004) – єдиний український фонетичний проєкт; Project of Lionbridge Technologies (2016-2018); 17-а Славістична конференція (Саппоро, Японія, вересень 2022); Публікації ... Фонетичні дані: • розпізнавання мовлення; • озвучування тексту; • багатомовний пошук інформації; ……………………………. 3
  4. Intro Відображення фон: розпізнавання через іншу мову (Lionbridge). Vakulenko, Maksym. 2021. “Calculation of Phonetic Distances between Speech Sounds.” In: Journal of Quantitative Linguistics 28 (3): 223–236. Published online: October 23, 2019. DOI: 10.1080/09296174.2019.1678709. Фонемізатор: автоматичне переведення тексту у фонематичну форму. Vakulenko, M. O. 2022. “A Tool for Scholarly Phonemic Transcription of Ukrainian Texts.” In: Proceedings of the 2022 12th International Conference on Advanced Computer Information Technologies (ACIT). 26-28 September 2020, Spišska Kapitula, Slovakia. Pp. 516–519. DOI: 10.1109/ACIT54803.2022.9913111. Github: https://github.com/Mova-2020/Phonemic-transcription-of-Ukrainian- texts. 4
  5. Intro Нуль-прохідні моделі (zero- shot models): словесний опис об’єкта. Потрібно точно описати відмінності від того об’єкта, на який мережу вже навчено. 5
  6. Intro Потреба у фонетичній інформації. Ультразвук (США), артикулограф (Німеччина). Фонетика в Україні: обняти й плакати. • Ніна Тоцька (ЛФШ): 1981. 1 мовець. Два покоління – рух у болото. • 2008: закриття відділу фонетики в Інституті української мови. • ЛЕФ КНУ (Тоцька): тільки навчальний процес. • Саппоро-2022: від України тільки двоє (+ проф. О. Стеріополо). Вікіпедія, YouTube: ненаукова фантастика (хибні дані + хибна інтерпретація). Фонетичні дослідження української мови: США, Польща. !? 6
  7. Intro 7
  8. Intro 8
  9. Prerequisites • Informants: three male native Ukrainian speakers. • Dialect spoken: basic (South-Eastern) Ukrainian dialect. • Recording device: Microphone for Xiaomi Redmi Note 10. • Stimuli: isolated sounds pronounced in the normal tone, in whisper, and in ascending and descending tone; individual words containing various combinations of these available in (Bilodid 1969) augmented with extra examples. • Software used for wave analysis: WaveLab 2.1, SoundForge 4.0, iZotope RX 8 Audio Editor. 9
  10. /a/ Stressed /a/: F1 = 670 Hz (“ǧja”) – 833 Hz (isolated), F2 = 994 Hz (“vsjake”) – 1353 Hz (“tacja”); r21 = F2/F1 = 1.34 - 1.82. Unstressed /a/: F1 = 520 Hz (“rjabá”) – 819 Hz (“malá”), F2 = 994 Hz (“paljú”) – 2054 Hz (“rjabí”); r21 = F2/F1 = 1.34 - 3.69. After a [j] and soft consonants, and in an unstressed position, the more central and elevated allophones appear. 10
  11. /a/ The unstressed /a/ can acquire the properties of the preceding consonant: small amplitude of the fundamental frequency after the voiceless consonants, an additional Helmholtz formant (as in “sami”, “xalva”, etc.). The amplitude of this formant is smaller than the amplitude of the corresponding overtone of the harmonic signal corresponding to the configuration of a voiceless consonant. This is probably the effect of the progressive assimilation by the preceding consonant. After consonants, when the Helmholtz formant is prominent, the second formant of an /a/ is replaced by a formant that is a multiple of the Helmholtz formant. 11
  12. Fig. 1. A 2d spectrogram of a final unstressed /a/ in “xata”. F1 = 4F0 = 713 Hz, F2 = 7F0 = 1248 Hz; r21 = F2/F1 = 1.75. Sound [ɐ]. Effect: coda. 12
  13. Fig. 2. A 2d spectrogram of an unstressed /a/ in “davy”. F1 = 301 Hz, F2 = 609 Hz, F3 = 1519 Hz; r32 = F3/F2 = 2.49; r31 = F3/F1 = 5.05; r21 = F2/F1 = 2.02. Sound [ɘ]. Effect: distant assimilation by the next vowel. 13
  14. Fig. 3. A 2d spectrogram of an unstressed /a/ after a [rʲ] in “rjaba”. F1 = 520 Hz, F2 = 1835 Hz. r21 = F2/F1 = 3.53. Sound [ə]. Effect: progressive assimilation by the preceding consonant. 14
  15. /o/ Stressed /o/: F1 = 241 Hz (“vikno”) – 504 (“oko”), F2 = 477 Hz (“djoǧotj”) – 906 Hz (“dolju”); r21 = 1.33 - 3.00. Unstressed /o/: F1 = 289 Hz (“kožux”) – 527 Hz (“otara”, “ǧolubka”), F2 = 512 Hz (“ne tobi”) – 1082 Hz (“otoka”); r21 = 1.34 - 3.13. Isolated /o/: r21 = 1.50. 15
  16. /o/ The stressed /o/ in the words “pole”, “volja”, “kolo”, “kola”, “vikno”, and “sjoǧodni” is close to an [ʊ]. “Mova”: [o]. The vowel assimilation effect: “ne tobi”, “ta xodim”, some realizations of “kožux” where the F1 decreases => upward movement of this allophone. F1 does not noticeably decrease in “zozulja”, “ǧolubka”, and in “na sobi”. After a [j] and palatalized consonants, there appear allophones close to an [ʊ]. The unstressed allophones of an /o/ are not only more advanced but also lower than the stressed ones: midheight character of this phoneme (Vakulenko 2018) . In the word “koxaty” and in some other words, the unstressed /o/ approaches an [ʊ] that may result from coarticulation with the preceding consonant. 16
  17. Fig. 4. A 2d spectrogram of an isolated /o/. F1 = 499 Hz, F2 = 710 Hz; r21 = F2/F1 = 1.42. Sound [o]. 17
  18. Fig. 5. A 2d spectrogram of a rising-tone /o/. F1 = 469 Hz, F2 = 705 Hz; r21 = F2/F1 = 1.50. Sound [o]. 18
  19. Fig. 6. A 2d spectrogram of a stressed /o/ in “vikno”. F1 = 3F0 = 328 Hz, F2 = 6F0 = 661 Hz; r21 = F2/F1 = 2.02. Sound [ʊ]. Effect: distant assimilation by the preceding vowel, coda. 19
  20. Fig. 7. A 2d spectrogram of an unstressed /o/ in “djoǧotj”. F1 = 5F0 = 521 Hz, F2 = 9F0 = 959 Hz; r21 = F2/F1 = 1.84. Sound [o+]. 20
  21. Fig. 8. A 2d spectrogram of an unstressed /o/ in “kožux”. F1 = 3F0 = 491 Hz, F2 = 6F0 = 985 Hz; r21 = F2/F1 = 2.01. Sound [o+]. 21
  22. Fig. 9. A 2d spectrogram of an unstressed /o/ in “otoka”. F1 = 3F0 = 533 Hz, F2 = 6F0 = 1082 Hz; r21 = F2/F1 = 2.03. Sound [o+]. 22
  23. Fig. 10. A 2d spectrogram of an unstressed /o/ in “ta_xodim”. F1 = 400 Hz, F2 = 3F0 = 600 Hz, r21 = F2/F1 = 1.50. Sound [o]. 23
  24. Fig. 11. A 2d spectrogram of an unstressed /o/ in “zozulja”. F1 = 3F0 = 478 Hz, F2 = 6F0 = 968 Hz; r21 = F2/F1 = 2.02. Sound [o+]. 24
  25. /u/ Stressed /u/: F1 = 206 Hz (“manjunja”) – 331 (“vudu”, “dumy”), F2 = 311 Hz (“manjunja”) – 556 Hz (isolated); r21 = 1.48 - 2.00. Unstressed /u/: F1 = 241 Hz (“maljuvaty”) – 348 Hz (“maljuvaty”), F2 = 372 Hz (“maljuvaty”) – 766 Hz (“temu”) ; r21 = 1.48 - 2.00. Isolated /u/: r21 = 1.67. Observation: F3 and F4 are not proportional to F0 (isolated). After palatalized consonants, combined formants were observed with similar formant ratios. 25
  26. Fig. 12. A 2d spectrogram of a stressed /u/ in “buv”. F1 = 372 Hz, F2 = 617 Hz; r21 = F2/F1 = 1.66. Sound [u]. 26
  27. Fig. 13. A 2d spectrogram of a stressed /u/ in “ljuba”. F1 = 2F0 = 279 Hz, F2 = 3F0 = 425 Hz, F3 = 7F0 = 959 Hz, F4 = 8F0 = 1117 Hz, F5 = 1257 Hz; r54 = F5/F4 = 1.25; r53 = F5/F3 = 1.31; r52 = F5/F2 = 2.96; r43 = F4/F3 = 1.16; r41 = F4/F1 = 4.00; r21 = F2/F1 = 1.52. Sound [u] mixed with an [ʊ]. Effect: progressive assimilation by the preceding palatalized consonant having combined formants. 27
  28. /e/ Stressed /e/: F1 = 495 Hz (“temu”) – 740 (“perše”), F2 = 1808 Hz (“perše”) – 2903 Hz (“llje”); r21 = 2.46 - 5.57. Unstressed /e/: F1 = 282 Hz (“nesty”) – 740 Hz (“beze”), F2 = 600 Hz (“nesty”, “beru”) – 1835 Hz (“beze”); r21 = 2.46 - 5.57. 28
  29. /e/ In some realizations of the isolated phoneme, the formant frequencies are not proportional to F0. The preceding voiceless consonant does not result in the decreased amplitude of the fundamental frequency: the absence of progressive assimilation in sound intensity. The distant assimilation by a high vowel => a significant lowering of F1. The allophones have high variance in formant frequencies. 29
  30. Fig. 14. A 2d spectrogram of an isolated /e/. F1 = 512 Hz, F2 = 1966 Hz; r21 = F2/F1 = 3.84. Sound [ɛ̝]. 30
  31. Fig. 15. A 2d spectrogram of an isolated whispered /e/. F1 = 775 Hz, F2 = 2045 Hz, F3 = 2557 Hz; r32 = F3/F2 = 1.25, r31 = F3/F1 = 3.30, r21 = F2/F1 = 2.64. Sound [ɛ]. 31
  32. Fig. 16. A 2d spectrogram of a stressed /e/ in “perše”. F1 = 4F0 = 705 Hz (-68 дБ), F2 = 1835 Hz (-90 дБ); r21 = F2/F1 = 2.60. Sound [ɛ]. 32
  33. Fig. 17. A 2d spectrogram of an unstressed /e/ in “beze”. F1 = 4F0 = 504 Hz, F2 = 14F0 = 1782 Hz; r21 = F2/F1 = 3.54. Sound [ɛ̝-]. 33
  34. /y/ Stressed /y/: F1 = 248 Hz (“sydy”) – 390 (“vyrij”), F2 = 1852 Hz (descending) – 2832 Hz (“tyša”), F3 = 2547 Hz (isolated) – 3508 (“tyša”); r32 = 1.16 - 1.27. Unstressed /y/: F1 = 267 Hz (“sydy”) – 305 Hz (“nyzjkyj”), F2 = 2232 Hz (“sydy”) – 2293 Hz (“nyzjke”), F3 = 2621 Hz (“sydy”) – 2964 (“nyzjku”); r32 = 1.16 - 1.27. Isolated /y/: r32 = 1.20. 34
  35. /y/ F2 for unstressed allophones: within that of the stressed ones. A lax vowel moves towards the central position on the Jones chart (cf. Stevens 1998, pp. 294–299) => the central character of an /y/ (cf. Vakulenko 2018). Central phonemes display also a small difference in F2 for tense and lax allophones. At the same time, some observed spectral characteristics indicate possible front realizations of this phoneme. 35
  36. Fig. 18. A 2d spectrogram of an isolated /y/. F1 = 372 Hz, F2 = 2125 Hz, F3 = 2626 Hz, F4 = 3388 Hz, F5 = 3909 Hz, F6 = 4689 Hz, F7 = 5869 Hz, F8 = 8132 Hz; r32 = F3/F2 = 1.24, r42 = F4/F2 = 1.59, r43 = F4/F3 = 1.29, r52 = F5/F2 = 1.84, r53 = F5/F3 = 1.49, r54 = F5/F4 = 1.15, r62 = F6/F2 = 2.21, r63 = F6/F3 = 1.79, r64 = F6/F4 = 1.38, r65 = F6/F5 = 1.20, r72 = F7/F2 = 2.76, r73 = F7/F3 = 2.23, r74 = F7/F4 = 1.73, r75 = F7/F5 = 1.50, r76 = F7/F6 = 1.25, r82 = F8/F2 = 3.83, r83 = F8/F3 = 3.10, r84 = F8/F4 = 2.40, r85 = F8/F5 = 2.08, r86 = F8/F6 = 1.73, r87 = F8/F7 = 1.39. Sound [ɪ]. 36
  37. Fig. 19. A 2d spectrogram of an unstressed /y/ in “nyzjka”. F1 = 273 Hz, F2 = 2325 Hz, F3 = 2799 Hz; r32 = F3/F2 = 1.20. Sound [ɪ]. 37
  38. /i/ Stressed /i/: F1 = 213 Hz (isolated) – 317 (isolated), F2 = 2202 Hz (isolated) – 3240 Hz (isolated), F3 = 2621 Hz (“jiža”) – 3974 (isolated); r32 = 1.19 - 1.47. Unstressed /i/: F1 = 246 Hz (“peredistorija”) – 407 Hz (“inši”), F2 = 1818 Hz (“peredistorija”) – 2220 Hz (“peredistorija”), F3 = 2584 Hz (“bezimennyj”) – 3146 (“bezimennyj”); r32 = 1.27 - 1.47. Isolated /i/: r32 = 1.25. 38
  39. /i/ • An unstressed allophone, as compared to its stressed cognate, has a somewhat higher F1 and lower F2, so it is lower and more retracted. • An unstressed allophone that does not result in palatalization of a preceding consonant (after prefixes and prefixoids) is similar to a normal unstressed one. It has slightly lower formants than the routine unaccented allophone, i. e. it is higher and more central. 39
  40. Fig. 20. A 2d spectrogram of an isolated /i/ (pre-ending). F1 = 317 Hz, F2 = 2224 Hz, F3 = 2859 Hz, F4 = 3507 Hz, F5 = 3981 Hz, F6 = 4456 Hz, F7 = 7860 Hz; r32 = F3/F2 = 1.29, r42 = F4/F2 = 1.58, r43 = F4/F3 = 1.23, r54 = F5/F4 = 1.14, r53 = F5/F3 = 1.39, r52 = F5/F2 = 1.79, r62 = F6/F2 = 2.00, r63 = F6/F3 = 1.56, r64 = F6/F4 = 1.27, r65 = F6/F5 = 1.12, r72 = F7/F2 = 3.53, r73 = F7/F3 = 2.75, r74 = F7/F4 = 2.24, r75 = F7/F5 = 1.97, r76 = F7/F6 = 1.76. Sound [i̞]. 40
  41. Fig. 21. A 2d spectrogram of an unstressed /i/ after a prefix in “peredistorija”. F1 = 246 Hz, F2 = 2220 Hz, F3 = 2834 Hz; r32 = F3/F2 = 1.28. Sound [i-], or [ɪ̝]. 41
  42. Table 1. Formant frequencies and ratios of the Ukrainian vowels Vowel F1, Hz F2, Hz F3, Hz Formant ratio a (stressed) 670 – 833 994 – 1353 - 1.34 – 1.82 (4/3) a (unstressed) 520 – 819 994 – 2054 - 1.34 – 3.69 o (stressed) 241 – 504 477 – 906 - 1.33 – 3.00 (3/2) o (unstressed) 289 – 527 512 – 1082 - 1.34 – 3.13 u (stressed) 206 – 331 311 – 556 - 1.48 – 2.00 (5/3) u (unstressed) 241 – 348 372 – 766 - 1.47 – 2.53 e (stressed) 495 – 740 1808 – 2903 - 2.46 – 5.57 (3) e (unstressed) 282 – 740 600 – 1835 - 2.00 – 3.54 y (stressed) 248 – 390 1852 – 2832 2547 – 3508 1.16 – 1.27 (6/5) y (unstressed) 267 – 305 2232 – 2293 2621 – 2964 1.17 – 1.31 i (stressed) 213 – 317 2202 – 3240 2621 – 3974 1.19–1.47 (5/4) i (unstressed) 246 – 407 1818 – 2033 2584 – 3146 1.27–1.47 42
  43. Discussion The obtained results support, in general, the earlier classification of Ukrainian vowels proposed in (Vakulenko 2018), supplementing and amending the data on formant frequencies presented in (Bilodid 1969; Tocjka 1981). In particular, the formant shifts in unaccented allophones of an /o/ support our earlier conclusion about its midheight character (cf. Vakulenko 2018). The variations in acoustic invariants for different allophones generalize the results obtained in (Vakulenko 2011). At the same time, high variance and strong overlapping of F2 values for the allophones of phonemes /i/ and /y/ still leave room for possible classification of a modern Ukrainian /y/ as a retracted front phoneme ɪ-. To be able to provide more definite decisions, we need to involve more informants and more innovative instrumental techniques, such as articulography. 43
  44. Discussion The task to determine the ranges of fundamental frequency and formant frequencies for the group of speakers should be carried out accurately and carefully because each speaker has their own frequency range, and the same sounds produced by different informants usually have different frequency spectra. We should build vowel diagrams for each individual speaker and then generalize them. Ukrainian: tense/lax (sounds, phonemes) -> stressed/unstressed (allophones). 44
  45. Discussion How to separate front and central vowels? On the phonemic level, this task may be resolved by comparing shifts of F2 for tense and lax allophones. For individual sounds, we need other approaches, such as the determination of the neutral sound formant frequencies for each speaker and the consequent comparison of formant differences. The Jones diagram is not sufficient for precise vowel classification. We need to involve additional acoustic characteristics, such as higher formants and invariant formant ratios. The sound displayed in Fig. 20, was classified as a lowered high front vowel i̞ instead of an ɪ due to the invariant ratio F3/F2 = 1.29. 45
  46. Literature • Bilodid, I. K. (Ed.). 1969. Suchasna ukrajinsjka literaturna mova. Fonetyka [Contemporary standard Ukrainian. Phonetics]. Kyjiv: Naukova dumka (in Ukrainian). • DSTU 9112:2021 (NEQ ISO 9: 1995). 2022. Cyrillic-Latin transliteration and Latin-Cyrillic retransliteration of Ukrainian texts. Writing rules. URL: https://zakon.rada.gov.ua/rada/show/v0260774-21#Text. • Stevens, K. N. Acoustic phonetics. Cambridge, Mass.: MIT Press. • Tocjka, N. I. 1981. Suchasna ukrajinsjka literaturna mova: fonetyka, orfoepija, ghrafika, orfoghrafija [Contemporary standard Ukrainian: Phonetics, orthoepy, graphemics, orthography]. Kyjiv: Vyshcha shkola (in Ukrainian). • Vakulenko, Maksim O. 2011. “Metod akustychnykh invariantiv u doslidzhenni gholosnykh zvukiv ukrajinsjkoji movy [The method of acoustic invariants in the study of Ukrainian vowels].” In: Slavia 4: 431-442 (in Ukrainian). • Vakulenko, Maksym O. 2015. “Practical transcription and transliteration: Eastern-Slavonic view.” In: Govor 32(1): 35-56. URL: https://hrcak.srce.hr/file/244983 [Retrieved August 11. 2022]. • Vakulenko, Maksym O. 2018. “Ukrainian vowel phones in the IPA context.” In: Govor 35 (2): 189- 214. URL: https://www.hfiloloskod.hr/images/HFD/Govor/Govor-2-2018-web.pdf [Retrieved August 11. 2022]. DOI: 10.22210/govor.2018.35.11. 46
  47. Thank you for your attention! 47

Editor's Notes

  1. Можливі підходи. Тут використано наявну інформацію про звуки українського мовлення.
  2. Перспективні на сьогодні підходи базуються на так званих нуль-прохідних моделях. Тут потрібна достеменна фонетична інформація.
  3. Належна частка критичного мислення.
  4. Діаграма голосних, або діаграма Джонса. По осях – формантні частоти. Слабкі голосні (lax vowels) - ненаголошені для української мови - містяться ближче до центра діаграми відносно сильних. Середня лінія – межа (без асиміляції). Для кожного мовця своя діаграма (різні ЧОТ, різні форманти). Співвідношення => спочатку побудувати діаграми для кожного, потім усереднювати за мовцями.
  5. Українські голосні. [o]: Ближче до [а] чи до [у]? Бемольність (пониження другої форманти попереднього приголосного) => напівзакритий (напівпіднесений). [и]: десь у жовтому. Ненаголошені [и] та [е] подібні => рудий прямокутник. Де межа між передніми та середніми? Симетрія для [i] та [u].
  6. Активні військові дії => віддалено. We collected phonetic data from …
  7. The formant frequencies for a stressed and an unstressed /a/ are given. The formant ratio F2/F1 ranges from 1.34 to 1.82 for stressed allophones and from 1.34 to 3.69 for unstressed ones. The isolated /a/ has the formant ratio of four to three. After a [j] and soft consonants, as well as in an unstressed position, the more central and elevated allophones appear.
  8. The formant ratio r21 ranges from 1.33 to 3.00 for stressed allophones and from 1.34 to 3.13 for unstressed. Isolated /o/: r21 = 1.50.
  9. The range of F2 for lax allophones of an /y/ lies within that of the stressed ones. Given that an unstressed vowel moves towards the central (indifferent) position on the Jones chart (cf. Stevens 1998, pp. 294–299), this fact indicates the central character of an /y/ as reported in (Vakulenko 2018). It should be added that central phonemes display also a small difference in F2 for tense and lax allophones. At the same time, some observed spectral characteristics indicate possible front realizations of this phoneme.