張正一 (Chou Shouichi)/ MGdesigner
Paul Liu and I organize 徴音梅林開発委員會 (the 徴音梅林 development committee)
● Wikimedia.tw: member of the board of directors
(also directs tech development)
● A programmer
● A musician (Jazz ukulele, DTM)
In every Vocaloid product's EULA
You don't get full rights
● No “anti-social” (反社会) works
(So, which works count as “anti-social”?)
● Trademark protection (images, keywords)
(e.g. 'Vocaloid', '初音ミク', and 初音ミク's image)
A free vocaloid-like
DIY a 『 vocaloid 』
● Programs: editor (frontend) + resampler + wavtool
● Data: vocal DB = oto.ini + WAV samples
● The vocal DB is an open spec, so many people make their own
Workflow of the vocaloid-like programs
1. Editor: compose the melody (many notes)
2. Resampler: modulate a sample to the specified pitch or other parameters (velocity...).
3. Wavtool: combine the modulated WAVs
Finally, we get the song's vocal as a WAV file and mix it into a
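The three steps above can be sketched as a toy pipeline in Python. This is only an illustration under assumptions: the "voice sample" is a generated sine wave, the resampler uses naive playback-speed change with linear interpolation (not the World algorithm and not UTAU's resampler), and all names (`midi_to_hz`, `resample`, `wavtool`, `voice_db`) are hypothetical.

```python
import math

SAMPLE_RATE = 8000  # Hz, kept low so the toy example stays small

def midi_to_hz(note):
    """Equal-temperament frequency of a MIDI note number (69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def make_sample(freq_hz, seconds):
    """Stand-in for a recorded voice sample: a plain sine wave."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]

def resample(sample, source_hz, target_hz, seconds):
    """Naive resampler: read the sample faster or slower (linear
    interpolation), looping it to fill the requested duration."""
    ratio = target_hz / source_hz
    out = []
    for i in range(int(SAMPLE_RATE * seconds)):
        pos = (i * ratio) % (len(sample) - 1)
        lo = int(pos)
        frac = pos - lo
        out.append(sample[lo] * (1 - frac) + sample[lo + 1] * frac)
    return out

def wavtool(segments):
    """Join the rendered notes end to end into one track."""
    track = []
    for seg in segments:
        track.extend(seg)
    return track

# 1. Editor: the melody as (MIDI note, duration in seconds) pairs.
melody = [(60, 0.25), (62, 0.25), (64, 0.5)]

# Vocal DB stand-in: one sample, with the pitch it was "recorded" at.
voice_db = {"a": (make_sample(220.0, 0.5), 220.0)}

# 2. Resampler: render every note from the same sample.
sample, base_hz = voice_db["a"]
segments = [resample(sample, base_hz, midi_to_hz(n), d) for n, d in melody]

# 3. Wavtool: combine the rendered notes into the vocal track.
vocal = wavtool(segments)
```

Changing playback speed also shifts the timbre, which is one reason a real resampler needs a better algorithm than this.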
● Free of charge, but not free as in freedom
● The default resampler works badly
● The DB has poor international support (Shift-JIS)
● oto.ini does not implement INI comments (“;”)
● UTAU always auto-sorts oto.ini (makes collaboration hard)
● The UI is hard to control
● Not open source
● Its development is very private
One day, Paul Liu told me about:
● A new algorithm, 'World', better than Vocaloid2
Author: Dr. 森勢将雅 (Masanori Morise), University of Yamanashi, Japan
● Patent-free
● EFB-GW (synthesizer) for UTAU
● Open source (old versions GPL, newer ones BSD)
In December, Dr. 森勢 will release another major upgrade
How good is the World algorithm?
● Excellent 'autotune'
(The original official test is a real-time karaoke autotune for tone-deaf singers (音痴).)
● Modulates a sample to any pitch without distortion
(keeps F0 well)
(Vocaloid2 can't, so Miku needs 3 different range versions of each sample)
● Very fast, no need to prepare frequency tables in advance
(it just works in real time)
● On x86, it even works well on older machines (maybe
1: 14 special effects
Defined in oto.ini
● 3 breaths: br1, br2, br3 (e.g. Miku only has these breaths.)
● Spanish rolled 'R': trill
● Cough: cough
● Crying, dry tears: drytears
● Blowing nose: blownose
● Sucking: suck
● Sigh (嘆): sgn1, sgn2, sgn3, sgn4
● Whistle: whsl
● Clearing throat: clnt
2: Japanese dialects (日本方言) possible
E.g. the rounded vowel (円唇母音) 'う' in Kansai dialect (関西弁) (video)
● Mandarin has the same 'u'
● Just borrow what we recorded.
● Other Mandarin samples can also be borrowed for
synthesizing dialects (方言) or some foreign languages.
(e.g. a line or two of foreign lyrics in a Japanese song)
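The borrowing idea above can be sketched as a lookup keyed by phonetic symbol rather than by language. This is a minimal Python sketch under assumptions: the language tags, table contents, and file names (`u.wav`, `ɯ.wav`) are hypothetical, and standard Japanese 'う' is taken to be the unrounded vowel commonly transcribed [ɯ].

```python
# Hypothetical table: (language tag, written syllable) -> IPA symbol.
TO_IPA = {
    ("ja-kansai", "う"): "u",  # rounded vowel, as described for 関西弁
    ("zh", "wu"): "u",         # Mandarin 'u' (written "wu" in pinyin)
    ("ja", "う"): "ɯ",         # standard Japanese, assumed unrounded
}

# Hypothetical sample index: one recording per IPA symbol.
SAMPLES = {"u": "u.wav", "ɯ": "ɯ.wav"}

def sample_for(lang, syllable):
    """Return the recording for a syllable, whichever language asks for it."""
    return SAMPLES[TO_IPA[(lang, syllable)]]

# The Kansai 'う' and the Mandarin 'u' resolve to the same recording.
shared = sample_for("ja-kansai", "う") == sample_for("zh", "wu")
```

With this layout, adding a dialect means adding table rows, not recording new samples, as long as the sounds already exist in the DB.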
A 'v'ocaloid can also do speech
Better than traditional speech synthesis
● Accent (= pitch, velocity, rhythm, speed) is controllable
● Can express many emotions (melody lines): crying, anger...
TTS, storytelling, and an emotional 'うかがか' (Ukagaka) are possible
● Some tests I have done with Miku: 1, 2, 3, 4, based on my scale
algorithm. 'Auto render' is possible, but….
● To do this with Vocaloid, you would need to beg YAMAHA to open its API. But
our software stack is open source. She could do more than singing.
About the vocalist
Her name is 羅竺 (Lo Chu).
● We chose her voice from 20 girls on the internet.
● She is a singer in a jazz / anime cover song band.
● She is also trained in voice acting.
● Her Japanese accent is not bad.
Our Japanese friend ATsushi gave pronunciation coaching (發音指導)
But it is very hard work
Japanese recording needs 3~4 hours.
Complete Mandarin (all mathematically possible combinations, minus
the samples that phonology shows are repeated)
Mandarin recording needs days.
Our Oto.ini DB spec
● You can use ';' for comments
● Editor programs shouldn't re-sort the file
● IPA-based (International Phonetic Alphabet)
● With IPA, different languages can share common samples
(no more re-recording, a smaller DB, and more efficient storage)
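The first two points of the spec can be illustrated with a small parser. This is a minimal Python sketch, assuming the common UTAU line layout `file.wav=alias,offset,consonant,cutoff,preutterance,overlap`; the project's own spec may differ in details, and `OtoEntry`, `parse_oto`, and the sample entries are hypothetical. It also assumes file names never contain ';'.

```python
from collections import namedtuple

OtoEntry = namedtuple(
    "OtoEntry", "wav alias offset consonant cutoff preutterance overlap")

def parse_oto(text):
    """Parse oto.ini text, skipping ';' comments and keeping file order."""
    entries = []
    for raw in text.splitlines():
        # Drop full-line and trailing comments.
        line = raw.split(";", 1)[0].strip()
        if not line:
            continue
        wav, _, params = line.partition("=")
        fields = [f.strip() for f in params.split(",")]
        fields += [""] * (6 - len(fields))  # tolerate missing trailing fields
        # An empty alias falls back to the file name without its extension.
        alias = fields[0] or wav.strip().rsplit(".", 1)[0]
        numbers = [float(f) if f else 0.0 for f in fields[1:6]]
        entries.append(OtoEntry(wav.strip(), alias, *numbers))
    return entries

SAMPLE = """\
; breath effects
br1.wav=br1,0,50,-200,20,10
a.wav=a,15,80,-300,40,20 ; plain vowel
"""

entries = parse_oto(SAMPLE)
```

Because the entries are returned as a list in file order, an editor that writes them back unchanged would not re-sort the file, which keeps diffs small when several people edit the same DB.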
Engine (currently xvsqExec, may need to
Linne-editor (in dev)
(song editor, frontend)
(resampler, EFB-GW variant)
Other programs in the future
Problem now: the editor (frontend)
● Cadencii is written in .NET and binds too
many Windows native calls
● jCadencii (the Java port of Cadencii) is very slow
● Upstream development has stopped. We have also given up on it.
● Another open UTAU frontend:
http://fluidvocalsynth.weebly.com/ (also .NET)
● Open source community
● OSS programmers, musicians, a
● Members are international (TW, JP)
● Official Site
● Github: https://github.com/ProjectMeilin/
● Slack (tech talk): https://meilin.slack.com/
(email me for an invitation: email@example.com)
● FB fan page
● FB group (more about DB making and musician)
● Youtube channel