2. These slides are a brief translation of
this Japanese blog article.
3. The main problem is the difference
among /p/, /t/ and /k/. In this field, many older
papers say that the difference lies in the
change over time of the frequency of the second
formant. However, these time changes are hard
to find, so I doubt this old hypothesis.
4. To explore the difference, I investigated
how to synthesize these consonants. As a
result, I found methods for synthesizing
/pi/, /ti/ and /ki/ (specific to the
vowel /i/).
5. The problem is defined as follows: the goal is
to synthesize /pi/, /ti/ and /ki/ from the vowel
/i/ by signal processing. Since this problem
concerns only the place of articulation,
a result that sounds like /bi/, /di/ or /gi/
is also regarded as a success.
6. These slides present the synthesis methods and the
opinions of two listeners, one male and one female, both
in their sixties, on the synthesized utterances. There are
so few listeners because I do not belong to any laboratory
or company. The listeners heard only my voice (because they
do not have the stamina to listen to many utterances), although
I used the corpus "'Spoken Language' and the DSR
Projects Speech Corpus (PASL-DSR)" [1] for this
research. I thank the researchers at the National Institute
of Informatics Speech Resources Consortium.
7. Fig. 1 shows the spectrogram of the /i/
to be processed. The sampling
frequency was 16 kHz. The maximum
frequency (vertical axis) is 8 kHz. The
time span (horizontal axis) is
approximately 0.3 seconds. Red represents
high power and blue represents low
power.
9. The method for synthesizing /ki/ is to generate
narrow-band colored noise at the second
formant, indicated by the black arrow in Fig. 2.
The narrowness of the band is important for this
method, but the center frequency of the
noise is not.
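
As a rough illustration of what "narrow-band noise" can mean in code, the sketch below builds a short noise burst as a sum of random-phase sinusoids confined to a narrow band. It is not taken from the distributed Octave scripts. The function names, the 2200 Hz center (standing in for the second formant of /i/), the 200 Hz bandwidth, and the 30 ms duration are all assumptions chosen for illustration; only the 16 kHz sampling frequency comes from the slides.

```python
import math
import random


def narrow_band_noise(center_hz, bandwidth_hz, duration_s,
                      sample_rate=16000, n_components=32, seed=0):
    """Return noise confined to a narrow band around center_hz.

    The noise is a sum of sinusoids with random frequencies inside the
    band and random phases, scaled so every sample lies in [-1, 1].
    """
    rng = random.Random(seed)
    low_hz = center_hz - bandwidth_hz / 2.0
    components = [
        (rng.uniform(low_hz, low_hz + bandwidth_hz),
         rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(n_components)
    ]
    n_samples = int(round(duration_s * sample_rate))
    return [
        sum(math.sin(2.0 * math.pi * f * i / sample_rate + p)
            for f, p in components) / n_components
        for i in range(n_samples)
    ]


def power_at(signal, freq_hz, sample_rate=16000):
    """Power of a single DFT probe at freq_hz (used only to check the band)."""
    re = sum(x * math.cos(2.0 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return re * re + im * im


# Hypothetical values: 2200 Hz, 200 Hz and 30 ms are illustrative only.
burst = narrow_band_noise(center_hz=2200.0, bandwidth_hz=200.0,
                          duration_s=0.03)

# Compare the power inside the band with the power in a distant band.
in_band = sum(power_at(burst, f) for f in range(2100, 2301, 20))
out_of_band = sum(power_at(burst, f) for f in range(400, 601, 20))
```

Where such a burst is placed relative to the vowel depends on the position marked in Fig. 2, so the mixing step is left out here.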
11. The two listeners said that this synthesized
/ki/ was more intelligible than the /pi/ and /ti/
shown below.
12. The method for synthesizing /pi/ has two or
three steps. First, shift the part indicated by
the black rectangle toward lower frequencies.
Second, add noise to the part indicated by the
blue rectangle. Although these two steps changed
many utterances of /i/ to /pi/, they did not
change the /i/ of my voice. Third, set the
amplitude of the part indicated by the green
rectangle to zero, which turned my /i/ into /pi/.
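
The three steps can be sketched as operations on a magnitude spectrogram stored as a list of frames, each a list of frequency-bin magnitudes. This is only a minimal sketch under stated assumptions, not the distributed Octave scripts: the function names are hypothetical, the rectangle coordinates are placeholders (the real ones are marked in the figure), and the shift is assumed to be a uniform move by a fixed number of bins.

```python
import random


def shift_region_down(spec, t0, t1, f0, f1, shift_bins):
    """Step 1: move bins f0..f1-1 of frames t0..t1-1 down by shift_bins."""
    out = [frame[:] for frame in spec]
    for t in range(t0, t1):
        for f in range(f0, f1):
            out[t][f] = 0.0                          # clear the source region
        for f in range(f0, f1):
            if f - shift_bins >= 0:
                out[t][f - shift_bins] = spec[t][f]  # rewrite it lower down
    return out


def add_noise_region(spec, t0, t1, f0, f1, level, seed=0):
    """Step 2: add non-negative random magnitude inside the rectangle."""
    rng = random.Random(seed)
    out = [frame[:] for frame in spec]
    for t in range(t0, t1):
        for f in range(f0, f1):
            out[t][f] += level * rng.random()
    return out


def zero_region(spec, t0, t1, f0, f1):
    """Step 3: set the magnitude inside the rectangle to zero."""
    out = [frame[:] for frame in spec]
    for t in range(t0, t1):
        for f in range(f0, f1):
            out[t][f] = 0.0
    return out


# Toy spectrogram: 4 frames x 8 bins. All coordinates are placeholders.
spec = [[float(10 * t + f + 1) for f in range(8)] for t in range(4)]
shifted = shift_region_down(spec, 0, 2, 4, 6, shift_bins=2)
noisy = add_noise_region(shifted, 0, 2, 6, 8, level=0.5)
result = zero_region(noisy, 2, 4, 0, 3)
```

Each function returns a modified copy, so the optional third step can be applied or skipped without recomputing the first two.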
14. This process for /pi/ almost agrees with the
conventional hypothesis about the change over
time of the center frequency of the second
formant.
While the male listener said that this /pi/ does
not sound intelligible, the female listener said
that it sounds very intelligible.
15. The method for synthesizing /ti/ is not clear-cut,
because /i/ resynthesized by several different
methods sounded like /ti/. One method is to
set the amplitude of the part indicated by the
green rectangle to a constant value and to
set the phase of this part to random values.
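
One way to picture this operation is on the complex bins of a short-time Fourier transform: inside a time-frequency rectangle, every bin gets the same magnitude and an independent random phase. The sketch below is not the author's script. The function name, the rectangle coordinates, the amplitude value, and the toy input are assumptions for illustration.

```python
import cmath
import math
import random


def flatten_and_randomize(stft, t0, t1, f0, f1, amplitude, seed=0):
    """Set bins f0..f1-1 of frames t0..t1-1 to constant magnitude, random phase.

    stft is a list of frames, each a list of complex bins.
    A modified copy is returned; the input is left untouched.
    """
    rng = random.Random(seed)
    out = [frame[:] for frame in stft]
    for t in range(t0, t1):
        for f in range(f0, f1):
            phase = rng.uniform(-math.pi, math.pi)
            out[t][f] = cmath.rect(amplitude, phase)  # amplitude * e^(j*phase)
    return out


# Toy input: 5 frames x 6 bins. Coordinates and amplitude are placeholders.
stft = [[complex(t + 1, f) for f in range(6)] for t in range(5)]
modified = flatten_and_randomize(stft, 0, 2, 1, 4, amplitude=0.5)
```

To listen to the result, the modified frames would still have to pass through an inverse transform with overlap-add, which is omitted here.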
17. While the male listener said that this /ti/
sounds very intelligible, the female
listener said that it sounds sometimes like /ti/
and sometimes like /pi/.
18. Although I also made /po/, /to/ and /ko/, the
male listener said that they sound like /o/
without consonants. This article does not
discuss them. However, these synthesized
utterances are distributed together with the
other utterances and scripts.
19. The Octave scripts and the synthesized
utterances are distributed on SkyDrive. You
may use them. I permit you to modify
and redistribute them. I will be happy if
you develop this research further, write papers,
and publish them under your own name.
20. Reference
• [1] S. Itahashi, "Creating Speech Corpora
for Speech Science and Technology,"
IEICE Trans., Vol. E74, No. 7, pp. 1906-1910,
1991.