Huaiping Ming
Personal Details
S Ming
F N Huaiping
D  B October 07, 1986
O Research Engineer
M (65)-91328088
Contacts
A Institute for Infocomm Research, A*STAR,
Singapore.
A 1 Fusionopolis Way, #08-01 Connexis
(South Tower), Singapore 138632.
E minghuaiping@gmail.com
Research Interests
Audio & Speech Signal Processing, Voice Conversion, Text to Speech Synthesis, Machine Learning
Qualifications
M  - P RA I  I R A*STAR, Singapore
S.  - M.  M.E. C S  T Northwestern Polytechnical University
S.  - J.  B.S. C S  T Northwestern Polytechnical University
Research & Project Experience
S.

∼
P
Exemplar-based Voice Conversion & Emotional Voice Conversion
We proposed an exemplar-based sparse representation of timbre and
prosody for voice conversion that does not require separate timbre
and prosody conversion steps. The spectrum, energy contour, and
fundamental frequency are converted simultaneously in a
sparsity-constrained, exemplar-based voice conversion framework. The
proposed method achieves good results for both speaker conversion and
emotional voice conversion.
M.

∼
D.

Phase Retrieval
We addressed the phase retrieval problem: recovering a signal from
the magnitude of its Fourier transform. A simple iterative
minimization algorithm recovers a sparse signal from magnitude
measurements of its Fourier transform (or another linear transform)
by minimizing a block ℓ1 norm. The proposed algorithm is robust to
noise and scales to practical implementations. For speech signals,
the quality of the reconstructed speech is almost as good as that of
the original speech.
S.

∼
M.

Automatic Music Transcription
We evaluated different time-frequency representations for automatic
music transcription. Three representations were designed and tested:
IIR and FIR filter bank semigrams and a constant-Q transform (CQT)
semigram. The experimental results show that the filter-bank-based
representations are suitable for multiple-instrument recordings,
while the CQT-based representations are suitable for solo-instrument
recordings.
J.

∼
A.

Audio Enhancement System
This was a commercial project carried out in cooperation with a
leading telecom company in China, in which I served as team leader.
We designed an audio effect enhancement system including 3D Virtual
Sound, Equalizer, Voice Clarity Enhancement, etc. I designed a 3D
virtual surround sound subsystem based on head-related transfer
functions (HRTFs) and a 10-band music equalizer. The company's
experts considered the output of our 3D virtual surround sound
comparable to that of Dolby Pro Logic.
S.

∼
O.

Remote Control System for Household Appliances
This is a remote control system for household appliances such as TVs,
DVD players, and air conditioners. I designed an embedded system that
converts Bluetooth signals into infrared signals compatible with the
appliances' remote controllers. Based on this device, we built a
remote controller application for cellphones. The remote control
system was integrated into a Smart Home system we built.
Skills
• C/C++
• Python
• MATLAB
• IELTS 6.5
• Linux/Ubuntu
• LaTeX
Honors & Awards
2013 Awarded funding by Nanyang Technological University for a six-month visiting researcher position
2012 Final-year project supported by school funding; final-year thesis rated outstanding
2011 First Prize, “TI CUP” Internet of Things Innovation Design Contest for College Students
2011 Named “Merit Student of School Level” and won a first-class scholarship
2010 My class was named the “Excellent Class” of NWPU
2010 Named “Merit Student of School Level” and won a first-class scholarship
2009 Named Outstanding Member of the “Jing Jin” Students League
2009 Named “Merit Student of School Level” and won a first-class scholarship
2008 Endeavor Scholarship of Virya Foundation, Hong Kong
Activities
Mar. 24, 2016 Oral presentation at ICASSP 2016
Sept. 21, 2015 Oral presentation at the Workshop on Affective Social Multimedia Computing 2015
Sept. 10, 2015 Oral presentation at INTERSPEECH 2015
Dec. 6, 2014 Oral presentation at the 7th International Doctoral Forum
Jul. 10, 2014 Poster presentation at ChinaSIP 2014
Sept. 2013 - Mar. 2014 Visiting Researcher at Institute for Infocomm Research, A*STAR, Singapore
Nov. 8, 2013 Oral presentation at YES 2013
Apr. 2011 - Oct. 2011 Conference volunteer for APSIPA ASC 2011
Sept. 1 - 3, 2009/2010 Freshmen Reception for NWPU
Publications & Patents
[1] Dongyan Huang, Lei Xie, Yvonne Siu Wa Lee, Jie Wu, Huaiping Ming, Xiaohai Tian, Shaofei Zhang, Chuang Ding, Mei Li, Quy Hy Nguyen,
Minghui Dong, and Haizhou Li. “An automatic voice conversion evaluation strategy based on perceptual background noise distortion and speaker
similarity”. In 9th ISCA Speech Synthesis Workshop, INTERSPEECH, 2016.
[2] Huaiping Ming, Dongyan Huang, Lei Xie, Jie Wu, Minghui Dong, and Haizhou Li. “Deep bidirectional LSTM modeling of timbre and prosody for
emotional voice conversion”. In INTERSPEECH, 2016.
[3] Huaiping Ming, Dongyan Huang, Lei Xie, Shaofei Zhang, Minghui Dong, and Haizhou Li. “Exemplar-based sparse representation of timbre and
prosody for voice conversion”. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016.
[4] Minghui Dong, Chenyu Yang, Yanfeng Lu, Jochen Walter Ennes, Dongyan Huang, Huaiping Ming, Rong Tong, Siu Wa Lee, and Haizhou Li.
“Mapping frames with DNN-HMM recognizer for nonparallel voice conversion”. In Asia-Pacific Signal and Information Processing Association Annual
Summit and Conference (APSIPA ASC), 2015.
[5] Huaiping Ming, Dongyan Huang, Lei Xie, Haizhou Li, and Minghui Dong. “An alternating optimization approach for phase retrieval”. In
INTERSPEECH, 2015.
[6] Huaiping Ming, Dongyan Huang, Lei Xie, Shaofei Zhang, Minghui Dong, and Haizhou Li. “Fundamental frequency modeling using wavelets for
emotional voice conversion”. In 6th Affective Computing and Intelligent Interaction (ACII) Workshop on Affective Social Multimedia Computing, 2015.
[7] Huaiping Ming, Dongyan Huang, Lei Xie, and Haizhou Li. “Learning optimal features for music transcription”. In the 2nd IEEE China Summit
and International Conference on Signal and Information Processing (ChinaSIP), 2014.
[8] Huaiping Ming, Dongyan Huang, Lei Xie, and Haizhou Li. “Filter bank design for automatic music transcription”. In the 2013 Young Engineers
and Scientists Conference on Multimedia, Communication and Mobile Application Technologies (YES), 2013.
[9] China Patent: ZL 201220142956.1. “Multifunction remote control system on smart phone”. 2012.
[10] China Software Copyright Registration No. 2013SR053130. “A Bluetooth signal to infrared signal software on singlechip”. 2012.
[11] China Software Copyright Registration No. 2013SR053406. “A 10-band music equalizer on TMS320C6416 DSP platform”. 2012.
