1. Work Done till First Seminar Work Done after First Seminar Conclusion Future Work Publications References
Prosody Modeling for Synthesis of
Storytelling Style Speech
Second Seminar
by
Parakrant Sarkar
Roll No: 12IT72P08
Under the Supervision of
Dr. K. Sreenivasa Rao
Department of Computer Science and Engineering
Indian Institute of Technology Kharagpur
March 4, 2016
OUTLINE
1. Work Done till First Seminar
2. Work Done after First Seminar
2.1 Modeling of pauses using FFNN, SVM and ELM
2.2 Unsupervised Pause Position Prediction
2.3 Modeling of pauses based on Discourse modes
2.4 Duration Modeling for Storytelling Style Speech
3. Conclusion
4. Future Work
5. Publications
6. References
WORK DONE TILL FIRST SEMINAR
LIST OF PROBLEMS ADDRESSED
1. A Hindi story synthesis framework was proposed, comprising:
SSED: Story-specific Emotion Detection Module
SSPG: Story-specific Prosody Generation Module
SSPI: Story-specific Prosody Incorporation Module
2. A three-stage pause prediction model was proposed, considering:
Position of the pause.
Duration of the pause.
PROPOSED PAUSE PREDICTION MODEL
Figure: Proposed pause prediction model. Story text and the story speech corpus feed the pause position prediction model (first stage: pause / non-pause). Predicted pauses pass to the pause duration prediction model (second stage: short / medium / long pause), where a separate duration predictor handles each pause type.
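The cascade in the figure can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the function name `predict_pause_duration` and the three callables (`is_pause`, `pause_type`, `duration_models`) are hypothetical stand-ins for whichever trained models (CART, FFNN, SVM or ELM) are plugged into each stage.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]


def predict_pause_duration(
    features: Features,
    is_pause: Callable[[Features], bool],
    pause_type: Callable[[Features], str],
    duration_models: Dict[str, Callable[[Features], float]],
) -> float:
    """Return the predicted pause duration in ms (0.0 means no pause).

    All three model arguments are hypothetical placeholders for trained models.
    """
    # Stage 1: pause / non-pause decision at the current word boundary.
    if not is_pause(features):
        return 0.0
    # Stage 2: classify the pause as "short", "medium" or "long".
    kind = pause_type(features)
    # Stage 3: the duration predictor dedicated to that pause type.
    return duration_models[kind](features)
```

Keeping one duration predictor per pause type lets each regressor work over a narrow range of durations, which is the motivation for the multi-stage design.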
WORK DONE AFTER FIRST SEMINAR
LIST OF PROBLEMS:
1. Modeling of pauses using FFNN, SVM and ELM.
2. Unsupervised Pause Position Prediction.
3. Modeling of pauses based on Discourse modes.
4. Duration Modeling for Storytelling Style Speech.
STORY SPEECH CORPUS
100 stories collected: Panchatantra and Akbar-Birbal.
Sentences per story: 20-25
Total words: 24,400
Duration of the speech corpus: approx. 3 hours
PREDICTION OF PAUSE POSITION
LIST OF FEATURES
1. Positional features:
Position of the current word from the beginning and
ending of the utterance.
Total number of words in the utterance.
2. Structural features:
Total number of phones in the current word, previous two
words and following two words.
Total number of syllables in the current word, previous two
words and following two words.
Total number of phones in the utterance.
3. Morphological features:
Part-of-Speech (POS) of current word, previous two words
and following two words.
4. Story-semantic features:
Emotion associated with the current word.
Phonetic strength of the current word.
Genre of the story.
RESULTS FOR PREDICTING THE PAUSE POSITION
Table: Performance of data-driven models (CART, FFNN, SVM and ELM) for pause position prediction.
Model  Class      Recall  Precision  F1
CART   Non-pause  0.89    0.94       0.91
CART   Pause      0.68    0.81       0.74
FFNN   Non-pause  0.90    0.94       0.92
FFNN   Pause      0.71    0.83       0.77
SVM    Non-pause  0.91    0.93       0.92
SVM    Pause      0.78    0.81       0.79
ELM    Non-pause  0.89    0.92       0.90
ELM    Pause      0.71    0.82       0.76
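For reference, the F1 column is the harmonic mean of precision and recall. The short sketch below (the helper name `f1_score` is chosen here for illustration) reproduces the CART entries of the table from their recall and precision values.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CART rows of the pause position table.
cart_non_pause = round(f1_score(precision=0.94, recall=0.89), 2)
cart_pause = round(f1_score(precision=0.81, recall=0.68), 2)
```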
ACCURACY OF THE PAUSE POSITION PREDICTION MODEL
PREDICTION OF PAUSE DURATION
FEATURES USED FOR DETERMINING THE PAUSE DURATION
1. Morphological features:
Terminal syllable of the current word, previous two words and
following two words.
2. Structural features:
Position of the vowel in the terminal syllable.
Number of segments (i.e., consonants) before and after the
nucleus (i.e., vowel) in the terminal syllable.
3. Positional features:
Total number of phones in the terminal syllable of the
current word, previous two words and following two words.
Position of the current word from the beginning and
ending of the utterance.
PERFORMANCE OF CART, FFNN, SVM AND ELM MODELS FOR PREDICTING THE PAUSE DURATION
Model  x̄ (ms)  ȳ (ms)  µ (ms)  σ (ms)  γx,y
CART   241.60  247.21  107.12  155.87  0.67
FFNN   241.60  251.25  75.13   70.68   0.87
SVM    241.60  251.70  117.69  138.28  0.71
ELM    241.60  245.98  107.25  110.11  0.67
x̄: Average of the actual pause duration values.
ȳ: Average of the predicted pause duration values.
µ: Average prediction error.
σ: Standard deviation of the prediction error.
γx,y: Correlation coefficient between actual and predicted values.
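As a rough illustration of how these objective measures can be computed, the sketch below assumes that the prediction error is the absolute difference between actual and predicted duration and that γx,y is the Pearson correlation; the slides do not state either choice explicitly. The function name `objective_measures` and the dictionary keys are illustrative.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Sequence


def objective_measures(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute x_bar, y_bar, mu, sigma and gamma for paired durations in ms.

    Assumptions (not confirmed by the source): absolute error, population
    standard deviation, Pearson correlation.
    """
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    x_bar, y_bar = mean(actual), mean(predicted)
    # Pearson correlation between actual and predicted values.
    cov = sum((a - x_bar) * (p - y_bar) for a, p in zip(actual, predicted))
    var_x = sum((a - x_bar) ** 2 for a in actual)
    var_y = sum((p - y_bar) ** 2 for p in predicted)
    gamma = cov / math.sqrt(var_x * var_y)
    return {
        "x_bar": x_bar,
        "y_bar": y_bar,
        "mu": mean(errors),
        "sigma": pstdev(errors),
        "gamma": gamma,
    }
```

A model with low µ and σ and a γx,y close to 1 tracks the actual durations closely, which is how the FFNN row above stands out.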
SCATTER PLOTS OF CART, FFNN, SVM AND ELM MODELS
Figure: Scatter plots of predicted pause duration (in ms) against actual pause duration (in ms), both axes from 0 to 500 ms, for (a) CART, (b) FFNN, (c) SVM and (d) ELM.
MULTI-STAGE PAUSE DURATION PREDICTION
RESULTS OF THE CLASSIFICATION OF PAUSES BASED ON LIMITED INTERVALS
Model  Pause type  Recall  Precision  F1
CART   long        0.73    0.62       0.67
CART   medium      0.50    0.52       0.51
CART   short       0.53    0.63       0.58
FFNN   long        0.72    0.65       0.68
FFNN   medium      0.54    0.57       0.55
FFNN   short       0.62    0.59       0.60
SVM    long        0.68    0.65       0.66
SVM    medium      0.51    0.58       0.54
SVM    short       0.61    0.58       0.59
ELM    long        0.72    0.58       0.64
ELM    medium      0.55    0.52       0.53
ELM    short       0.65    0.62       0.63
ACCURACY OF CLASSIFYING A PAUSE INTO ONE OF THE PAUSE TYPES: SHORT, MEDIUM AND LONG
RESULTS FOR PREDICTING PAUSE DURATION
Table: Performance of the multi-model framework using CART, FFNN, SVM and ELM models based on objective measures.
Model  Pause type  x̄ (ms)  ȳ (ms)  µ (ms)  σ (ms)  γx,y
CART   Long        347.96  372.92  78.46   57.31   0.73
CART   Medium      208.43  199.30  26.19   17.02   0.67
CART   Short       87.33   84.99   24.51   17.81   0.72
CART   Overall     181.16  184.18  39.23   40.17   0.70
FFNN   Long        350.96  342.70  62.59   46.87   0.67
FFNN   Medium      203.15  195.05  25.87   18.92   0.75
FFNN   Short       87.32   82.55   28.29   17.69   0.68
FFNN   Overall     148.98  143.13  33.70   28.39   0.73
SVM    Long        353.16  299.10  72.40   63.75   0.73
SVM    Medium      203.15  190.98  29.96   16.79   0.52
SVM    Short       90.70   77.41   22.96   17.88   0.63
SVM    Overall     189.49  164.88  38.50   42.85   0.65
ELM    Long        353.16  340.83  69.32   44.51   0.66
ELM    Medium      203.15  195.09  27.35   14.64   0.65
ELM    Short       90.38   80.91   22.34   16.24   0.81
ELM    Overall     190     180.02  36.85   34.19   0.71
SCATTER PLOTS OF MULTI MODEL FRAMEWORK
Figure: Scatter plots of predicted pause duration (in ms) against actual pause duration (in ms) for the multi-model framework: (e) CART, (f) FFNN, (g) SVM and (h) ELM.
ANALYSIS OF PAUSE MODEL BASED ON SINGLE AND
MULTI MODEL FRAMEWORK
Figure: Average prediction error (in ms)
UNSUPERVISED PAUSE POSITION PREDICTION MODEL
UNSUPERVISED FEATURE EXTRACTION METHODOLOGY
Figure: Feature extraction method. The unique words and the most frequently occurring words of the story speech corpus define an m × n co-occurrence matrix (feature extraction), which SVD reduces to an m × r matrix stored as a dictionary.
m = 3579 unique words.
n = 300 most frequently occurring words.
r = 50, the reduced dimension of the co-occurrence matrix.
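The pipeline above can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the exact procedure of this work: the slides do not specify how co-occurrence is counted, so a symmetric window of one neighbouring word on each side is assumed here, and the function name `word_features` and its parameters are illustrative.

```python
from collections import Counter
from typing import List, Sequence, Tuple

import numpy as np


def word_features(
    sentences: Sequence[Sequence[str]],
    n_frequent: int = 300,
    r: int = 50,
    window: int = 1,  # assumed context size; not given in the source
) -> Tuple[List[str], np.ndarray]:
    """Return the vocabulary and an (m x r) matrix of reduced word features."""
    counts = Counter(w for s in sentences for w in s)
    vocab = sorted(counts)  # m unique words
    frequent = [w for w, _ in counts.most_common(n_frequent)]  # n context words
    row = {w: i for i, w in enumerate(vocab)}
    col = {w: j for j, w in enumerate(frequent)}

    # m x n co-occurrence counts of each word with the frequent words nearby.
    cooc = np.zeros((len(vocab), len(frequent)))
    for s in sentences:
        for i, w in enumerate(s):
            for k in range(max(0, i - window), min(len(s), i + window + 1)):
                if k != i and s[k] in col:
                    cooc[row[w], col[s[k]]] += 1

    # Truncated SVD: keep the r strongest directions (fewer if the matrix is small).
    u, sing, _ = np.linalg.svd(cooc, full_matrices=False)
    r = min(r, len(sing))
    return vocab, u[:, :r] * sing[:r]
```

With the figures quoted above (m = 3579, n = 300, r = 50), each word would be represented by a 50-dimensional vector that can be looked up when building the input to the pause position classifier.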
RESULTS
Table: Performance measures of unsupervised data-driven models (CART, FFNN, SVM and ELM) for pause position prediction.
Model  Class      Recall  Precision  F1
CART   Non-pause  0.86    0.94       0.89
CART   Pause      0.79    0.58       0.66
FFNN   Non-pause  0.85    0.91       0.88
FFNN   Pause      0.82    0.62       0.70
SVM    Non-pause  0.81    0.88       0.84
SVM    Pause      0.77    0.68       0.72
ELM    Non-pause  0.84    0.90       0.86
ELM    Pause      0.82    0.63       0.71
MODELING OF PAUSES BASED ON DISCOURSE MODES
DISCOURSE MODE
Three discourse modes of a story are considered:
Descriptive: 547 sentences
Dialogue: 279 sentences
Narrative: 1134 sentences
STATISTICS OF THE PAUSES FOR VARIOUS MODES OF DISCOURSE
Table: Statistics of the pauses for various modes of discourse based on limited intervals.
Mode         Pause type  Mean (ms)  StdDev (ms)  % in original
Descriptive  Long        447.59     240.18       6.15
Descriptive  Medium      116.99     52.59        6.94
Descriptive  Short       128.68     59.73        6.96
Narrative    Long        413.00     179.89       13.14
Narrative    Medium      197.61     27.89        11.48
Narrative    Short       93.97      30.19        23.84
Dialogue     Long        492.28     211.00       4.85
Dialogue     Medium      196.44     27.22        2.44
Dialogue     Short       92.24      28.97        4.20
HISTOGRAM PLOTS
Figure: Histograms of pause duration (in ms, 0 to 700) against frequency for (a) Descriptive mode, (b) Narrative mode and (c) Dialogue mode.
PAUSE PREDICTION MODEL
Figure: Classifying story text into three modes of discourse. Story text from the story-speech corpus is classified as Descriptive, Dialogue or Narrative.
Figure: Proposed pause prediction model. The pause position prediction model (first stage: pause / non-pause) is followed by the pause duration prediction model (second stage: short / medium / long pause), with a separate duration predictor for each pause type.
ACCURACY OF FIRST STAGE OF PAUSE POSITION PREDICTION MODEL
Table: Performance of the CART model for predicting pause position.
Mode         Class      Recall  Precision  F1
Descriptive  Non-pause  0.978   0.837      0.902
Descriptive  Pause      0.454   0.88       0.60
Dialogue     Non-pause  0.952   0.872      0.91
Dialogue     Pause      0.569   0.793      0.663
Narrative    Non-pause  0.953   0.856      0.902
Narrative    Pause      0.552   0.806      0.655
ACCURACY OF THE PAUSE POSITION PREDICTION MODEL
Descriptive Mode: 72%
Dialogue Mode: 76.05%
Narrative Mode: 75.25%
ACCURACY OF SECOND STAGE OF PAUSE PREDICTION MODEL
Table: Performance of CART for long, medium and short pause classification.
Mode         Pause type  Recall  Precision  F1
Descriptive  long        0.56    0.46       0.51
Descriptive  medium      0.48    0.39       0.43
Descriptive  short       0.56    0.47       0.50
Dialogue     long        0.50    0.72       0.59
Dialogue     medium      0.53    0.65       0.58
Dialogue     short       0.40    0.55       0.46
Narrative    long        0.37    0.48       0.42
Narrative    medium      0.53    0.46       0.49
Narrative    short       0.73    0.46       0.56
ACCURACY OF CLASSIFYING A PAUSE INTO ONE OF THE PAUSE TYPES: SHORT, MEDIUM AND LONG
Descriptive Mode: 53%
Dialogue Mode: 48%
Narrative Mode: 54%
ACCURACY OF THIRD STAGE OF PAUSE PREDICTION MODEL
Table: Performance of CART for pause duration prediction.
Mode         Model        x̄ (ms)  ȳ (ms)  µ (ms)  σ (ms)  γx,y
Descriptive  CART long    468.24  486.93  90.79   90.92   0.58
Descriptive  CART medium  201.03  189.97  24.59   18.95   0.56
Descriptive  CART short   88.26   93.48   34.86   12.37   0.53
Descriptive  Overall      228.84  234.72  40.42   41.42   0.55
Narrative    CART long    402.87  397.99  74.90   77.26   0.69
Narrative    CART medium  198.09  198.87  10.60   10.79   0.66
Narrative    CART short   93.73   92.69   9.65    9.40    0.69
Narrative    Overall      244.98  242.71  44.30   37.04   0.68
Dialogue     CART long    472.47  463.31  60.96   61.93   0.60
Dialogue     CART medium  193.31  200.33  12.02   10.01   0.75
Dialogue     CART short   87.56   89.03   13.74   12.07   0.66
Dialogue     Overall      214.52  214.60  23.70   22.92   0.77
IDEAL CASE: ACCURACY OF THIRD STAGE OF PAUSE PREDICTION MODEL
Table: Performance of CART for pause duration prediction.
Mode         Model        x̄ (ms)  ȳ (ms)  µ (ms)  σ (ms)  γx,y
Descriptive  CART long    468.24  486.93  80.89   100.92  0.78
Descriptive  CART medium  201.03  189.97  14.39   12.95   0.76
Descriptive  CART short   88.26   93.48   13.76   7.37    0.73
Descriptive  Overall      228.84  234.72  34.48   37.24   0.75
Narrative    CART long    402.87  397.99  74.90   77.26   0.69
Narrative    CART medium  198.09  198.87  10.60   9.79    0.66
Narrative    CART short   93.73   92.69   9.65    7.14    0.69
Narrative    Overall      244.98  242.71  37.16   37.04   0.68
Dialogue     CART long    472.47  463.31  52.96   61.93   0.71
Dialogue     CART medium  193.31  200.33  7.02    7.01    0.85
Dialogue     CART short   87.56   89.03   9.74    10.07   0.76
Dialogue     Overall      214.52  214.60  20.41   22.92   0.77
ANALYSIS OF PAUSE PREDICTION MODEL IN
DISCOURSE MODE
Figure: Average Prediction Error (in ms)
DURATION MODELING FOR STORYTELLING STYLE SPEECH
FEATURES USED FOR TRAINING
1 Positional Features (Baseline):
Position of the current word from the beginning and
ending of the utterance.
Position of the current syllable from the beginning and
ending of the utterance.
Position of a syllable in the word.
Position of the vowel in the syllable.
Syllable Identity: Segments of the syllable (consonants and
vowels) for current syllable
Syllable Identity of previous two syllables and following
two syllables.
2 Structural Features (Baseline):
Total number of words in the utterance.
Total number of phones in the utterance.
Total number of syllables in the utterance.
Total number of syllables in the current word, previous two
words and following two words.
Total number of phones in the current word, previous two
words and following two words.
Number of segments (i.e. consonants) before the nucleus
(i.e. vowel) in the syllable.
Number of segments after the nucleus (i.e. vowel) in the
syllable.
3 Story-specific Features
Emotion (sad, anger, happy, fear) of the current word in the
utterance.
Genre of the story (fable, legendary, folk-tales)
Whether the word is a content or functional word.
Whether the word is stressed or not.
RESULTS
Table: Performance of the CART model for predicting the syllable duration.
Model           x̄ (ms)  ȳ (ms)  µ (ms)  σ (ms)  γx,y
Baseline        208.97  212     56.02   52.79   0.58
Story-specific  208.97  211.91  46.72   39.72   0.70
x̄: Average of the actual syllable duration values.
ȳ: Average of the predicted syllable duration values.
µ: Average prediction error.
σ: Standard deviation of the prediction error.
γx,y: Correlation coefficient between actual and predicted values.
PROPOSED METHOD
Figure: Each story is categorized by genre: Fable, Folk-tale or Legendary.
RESULTS CONTD..
Table: Accuracy of prediction by the CART model based on story genre.
Genre      x̄ (ms)  ȳ (ms)  µ (ms)  σ (ms)  γx,y
Fable      209.97  208.89  38.20   31.70   0.80
Folk-tale  204.86  209.71  36.88   31.06   0.77
Legendary  212.58  209.57  37.81   39.51   0.83
Overall    209.13  209.39  37.63   34.09   0.80
SUMMARY AND CONCLUSION
1. Modeling of pauses using FFNN, SVM and ELM is carried out.
2. Unsupervised Pause Position Prediction is proposed.
3. Modeling of pauses based on Discourse modes is studied.
4. Duration Modeling for Storytelling Style Speech is carried out.
FUTURE WORK
In future, the current work will be extended to include the following:
1. Subjective listening tests for the proposed pause prediction model.
2. Modeling of word prominence for story text.
3. Analysis and modeling of pitch for storytelling style
speech based on three modes of discourse.
4. Analysis and modeling of intensity for storytelling style
speech.
Acknowledgments
The authors would like to thank the Department of
Information Technology, Government of India, for funding
the project, Development of Text-to-Speech synthesis for Indian
Languages Phase II, Ref. no. 11(7)/2011HCC(TDIL). The authors
would also like to thank all the DAC committee members,
the supervisor, and all seminar attendees.
DISSEMINATION OF RESEARCH
Conference:
1. Parakrant Sarkar, K. Sreenivasa Rao, “Analysis and Modeling Pauses for
Synthesis of Storytelling Speech based on Discourse modes”, in Proceedings of the
IEEE International Conference on Contemporary Computing (IC3 2015), JIIT Noida,
11-13 August India.
2. Parakrant Sarkar, K. Sreenivasa Rao, “Data-Driven Pause Prediction for
Synthesis of Storytelling Style Speech based on Discourse Modes”, in Proceedings
of the IEEE International Conference on Electronics, Computing and Communication
Technologies (CONECCT 2015), IIIT Bangalore, 10-11 July India.
3. Parakrant Sarkar, K. Sreenivasa Rao, “Modeling Pauses for Synthesis of
Storytelling Style Speech Using Unsupervised Word Features”, Procedia Computer
Science, Volume 58, 2015, pages 42-49, 10-13 Aug 2015.
4. P Sarkar, K. S Rao, “Data-driven pause prediction for speech synthesis in
storytelling style speech,” in 2015 Twenty First National Conference on
Communications (NCC) , pages 1-5, 27 Feb. - 1 Mar, IIT Bombay, 2015.
Journal:
1. Parakrant Sarkar, K. Sreenivasa Rao, “Modeling of pauses for Storytelling Style
Speech Synthesis”, Computer Speech and Language [Under Revision]
REFERENCES I
[1] P. Taylor and A. W. Black, “Assigning phrase breaks from part-of-speech
sequences,” Computer Speech & Language, vol. 12, no. 2, pp. 99–117, 1998.
[2] P. Zervas, M. Maragoudakis, N. Fakotakis, and G. Kokkinakis, “Bayesian
Induction of Intonational Phrase Breaks,” Eurospeech, 2003.
[3] K. Yoon, “A Prosodic Phrasing Model for a Korean Text-to-speech Synthesis
System ,” Computer Speech & Language, vol. 20, no. 1, pp. 69 – 79, 2006.
[4] S. Kim, J. Lee, B. Kim, and G. G. Lee, “Incorporating second-order information into
two-step major phrase break prediction for Korean,” in INTERSPEECH 2006 -
ICSLP, Ninth International Conference on Spoken Language Processing, Pittsburgh, PA,
USA, September 17-21, 2006, 2006.
[5] A. Parlikar and A. W. Black, “A grammar based approach to style specific phrase
prediction,” in Interspeech, 2011, pp. 2149–2152.
[6] A. Vadapalli, P. Bhaskararao, and K. Prahallad, “Significance of word-terminal
syllables for prediction of phrase breaks in Text-to-Speech systems for Indian
Languages,” in 8th ISCA Speech Synthesis Workshop. Barcelona, Spain: ISCA,
August 31– September 2 2013, pp. 189 – 194.
[7] N. S. Krishna and H. A. Murthy, “A New Prosodic Phrasing Model for Indian
Language Telugu,” in INTERSPEECH. ISCA, 2004.
REFERENCES II
[8] K. Ghosh and K. Sreenivasa Rao, “Data-Driven Phrase Break Prediction for Bengali
Text-to-Speech System,” in Contemporary Computing - 5th International Conference,
IC3 2012, Noida, India, August 6-8, 2012. Proceedings, ser. Communications in
Computer and Information Science. Springer Berlin Heidelberg, 2012, vol. 306,
pp. 118 – 129.
[9] A. W. Black and P. Taylor, “The Festival Speech Synthesis System: System
Documentation,” Human Communication Research Centre, University of
Edinburgh, Scotland, UK, Tech. Rep. HCRC/TR-83, 1997.
Thank You