ylchen
Yunlin Chen
Master of Engineering in Software Engineering
Chang’an Campus
Northwestern Polytechnical University, 710129
Phone: (+86) 18702963896
Email: ylchen@mobvoi.com
LinkedIn: chen yunlin
Education
2014-Present Master of Engineering, Northwestern Polytechnical University, China.
Majored in Software Engineering
Supervisor: Prof. Lei Xie
2010-2014 Bachelor of Engineering, Northwestern Polytechnical University, China.
Majored in Software Engineering
Supervisor: Prof. Lei Xie
Research Topics
Statistical Parametric Speech Synthesis
Unit Selection
Speaker Adaptation
Deep Learning
Pattern Recognition and Machine Learning
Project Experience
201506-Present Intern at Mobvoi (Chumenwenwen, Beijing, China)
- Work as a speech synthesis engineer (co-supervisor: Xin Lei)
- Develop and set up an HMM-based speech synthesis system
- Design and build an LSTM speech synthesis system that achieves better speech quality and
prosody than the HMM-based TTS
- Use Phonetisaurus, an open-source FST and n-gram based G2P tool, for grapheme-to-phoneme
conversion, and use a CRF for pause-level prediction
201511-201512 HMM Based Trajectory Tiling Unit Selection for TTS
- Implement HMM trajectory tiling (HTT) for TTS, adopting phonemes and states as concatenation units
- Apply the HMM-based speech parameter generation algorithm to generate a trajectory that guides
unit selection; use the LSP and F0 distance between target units and candidate units as the target
score and the Kullback-Leibler divergence between two units as the concatenation score, then
construct a unit sausage
- Implement the Viterbi algorithm to search for the best path through the sausage and use
normalized cross-correlation (NCC) to find the best concatenation point; adjacent waveform units
along the optimal path are then shifted by the best offset and concatenated with triangular
cross-fading
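The search and concatenation steps above can be illustrated with a minimal sketch. It is not the project's code: it assumes both scores are treated as costs (lower is better), and the names `viterbi_unit_selection`, `best_ncc_offset` and `triangular_crossfade`, their parameters, and the toy numbers are all hypothetical.

```python
import math


def viterbi_unit_selection(target_costs, concat_cost):
    """Find the cheapest path through a unit sausage.

    target_costs[t][j] is the target cost of candidate j at position t.
    concat_cost(t, i, j) is the cost of joining candidate i at position
    t - 1 to candidate j at position t (hypothetical callback).
    Returns (path, total_cost), where path[t] is the chosen candidate index.
    """
    best = [list(target_costs[0])]
    back = [[None] * len(target_costs[0])]
    for t in range(1, len(target_costs)):
        row, ptr = [], []
        for j, tc in enumerate(target_costs[t]):
            scores = [best[t - 1][i] + concat_cost(t, i, j)
                      for i in range(len(target_costs[t - 1]))]
            i_best = min(range(len(scores)), key=scores.__getitem__)
            row.append(scores[i_best] + tc)
            ptr.append(i_best)
        best.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for t in range(len(target_costs) - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path, total


def best_ncc_offset(ref, cand, max_shift):
    """Return (shift, score) maximising normalized cross-correlation
    between ref and the equally long window cand[shift:shift + len(ref)]."""
    best_shift, best_score = 0, float("-inf")
    n = len(ref)
    ref_energy = sum(a * a for a in ref)
    for s in range(max_shift + 1):
        seg = cand[s:s + n]
        if len(seg) < n:
            break
        num = sum(a * b for a, b in zip(ref, seg))
        den = math.sqrt(ref_energy * sum(b * b for b in seg))
        score = num / den if den > 0 else 0.0
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift, best_score


def triangular_crossfade(left, right, overlap):
    """Join two sample lists, blending the last `overlap` samples of
    `left` with the first `overlap` samples of `right` using linear
    (triangular) weights. Assumes overlap <= len(left) and len(right)."""
    if overlap <= 0:
        return list(left) + list(right)
    start = len(left) - overlap
    mixed = []
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)  # weight of the incoming unit
        mixed.append((1 - w) * left[start + k] + w * right[k])
    return list(left[:start]) + mixed + list(right[overlap:])


# Toy usage: three positions, two candidates each, and a join cost
# that penalises switching candidate index.
costs = [[1.0, 0.2], [0.5, 0.1], [0.3, 0.9]]
path, total = viterbi_unit_selection(
    costs, lambda t, i, j: 0.0 if i == j else 1.0)
shift, score = best_ncc_offset([1, -1, 1], [0, 1, -1, 1, 0], max_shift=2)
joined = triangular_crossfade([1, 1, 1, 1], [0, 0, 0, 0], overlap=3)
```

In a real system the target cost would come from the LSP/F0 distance to the generated trajectory and the join cost from the KL divergence between units; the sketch leaves both as plain numbers.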
201503-201506 LSTM Based Speech Synthesis
- Apply the bidirectional long short-term memory recurrent neural network (BLSTM-RNN)
approach to the acoustic model: extract LSP and F0 as output features and convert full labels to
one-hot features as network input; apply the same approach to the duration model
- Design a network of two feed-forward layers and two BLSTM layers, used by both the acoustic
model and the duration model
- Compare different input features, such as adding state information and context; the results
indicate that the LSTM can learn a better model without context
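The label-to-input conversion mentioned above can be sketched as follows. This is only an illustration under assumptions: the field names `phone` and `tone`, the toy vocabularies, and the helper names are hypothetical and do not reflect the label format actually used.

```python
def build_vocab(symbols):
    """Map each distinct symbol to a fixed index (sorted for determinism)."""
    return {sym: idx for idx, sym in enumerate(sorted(set(symbols)))}


def one_hot(symbol, vocab):
    """Return a 0/1 list with a single 1 at the symbol's index."""
    vec = [0] * len(vocab)
    vec[vocab[symbol]] = 1
    return vec


def encode_label(fields, vocabs):
    """Concatenate the one-hot vectors of all categorical fields,
    in the order the fields appear in `vocabs`."""
    features = []
    for name, vocab in vocabs.items():
        features.extend(one_hot(fields[name], vocab))
    return features


# Toy usage with two hypothetical label fields.
vocabs = {
    "phone": build_vocab(["a", "b", "sil"]),
    "tone": build_vocab(["1", "2"]),
}
vector = encode_label({"phone": "b", "tone": "2"}, vocabs)
```

Each frame's vector would then be fed to the network; numeric fields such as positions or state information would be appended as extra dimensions rather than one-hot encoded.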
201410-201503 HMM Based Speech Synthesis
- Try different orders for different parameters, such as 34th-order MGC, 41st-order LSP and
12th-order MGC-LSP, and find that 40-dimensional LSP plus 1-dimensional gain works best in
our Chinese HMM-based TTS system
- Build an English speech synthesis system that uses Flite to generate full labels and
hts_engine to generate parameters and speech; in training, Festival is used to generate full and
mono labels automatically
201403-201408 Talking Avatar Based on Android platform
- Investigate and implement an Android talking-avatar application that uses HMM-based
speech synthesis and the MPEG-4 standard to drive a 3D avatar to open its mouth to speak, smile,
and move its eyes and head
Skills
Programming C/C++, Python, Java, Shell, CUDA, Thrust, Boost, Blade (Bazel), GTest, GFlags, MATLAB, LevelDB,
Kaldi
Language Chinese (mother tongue), English (fluent)
Awards
2014 Excellent Graduate, Northwestern Polytechnical University
2013 National Endeavor Scholarship, Northwestern Polytechnical University
2012 National Endeavor Scholarship, Northwestern Polytechnical University
2011 National Endeavor Scholarship, Northwestern Polytechnical University
2011 First Prize of Scholarship, Northwestern Polytechnical University
2010 Second Prize of Mathematical Modeling Contest, Northwestern Polytechnical University
2010 First Prize of Scholarship, Northwestern Polytechnical University
Activities
201110-Present Teaching Assistant
2014 Volunteer of the 2014 International Conference on Orange Technologies (ICOT), September 20-23,
Xi’an, China
2014 Volunteer of the 17th Forum of Department Heads of Computing Discipline, October 20-22, Xi’an,
China
2014 Volunteer of the 2014 International Doctoral Forum, December 5-7, Xi’an, China
References
Lei Xie xielei21st@gmail.com
Xin Lei mikelei@mobvoi.com
Publications
[1] Pengcheng Zhu, Lei Xie, and Yunlin Chen. “Articulatory Movement Prediction Using Deep Bidirectional
Long Short-Term Memory Based Recurrent Neural Networks and Word/Phone Embeddings”. In: 16th Annual
Conference of the International Speech Communication Association (INTERSPEECH). Dresden, Germany,
2015.