Slide author
● 張正一 (Chou Shouichi) / MGdesigner
● Paul Liu and I organize 徴音梅林開発委員會 (the Project Meilin Development Committee)
● Wikimedia.tw: member of the board of directors
(also directing technical development)
● A programmer
● A musician (Jazz ukulele, DTM)
● Shoichi.chou@gmail.com
Under any Vocaloid product EULA,
you do not get full rights
● No "anti-society" (反社会) works
(So which works count as "anti-society"?)
● Trademark protection (images, keywords)
(e.g. 'Vocaloid', '初音ミク', and 初音ミク's image)
Not using Miku's image = not popular
Musicians are controlled
No freedom
They are ruled
● With a Gibson guitar, you are its master.
● With Vocaloid products, you are their slave.
A free Vocaloid-like program
● DIY a 『vocaloid』
● Programs: editor (frontend) + resampler + wavtool
● Data: vocal DB = oto.ini + wav samples
● The vocal DB is an open spec, so many people build their own
Vocaloid-like programs: workflow
1. Editor: compose the melody (a sequence of notes)
2. Resampler: modulate each sample to the specified pitch and other parameters (velocity, ...)
3. Wavtool: join the modulated wavs
Finally, we get a vocal wav file for the song and mix it into the song
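The three stages above can be sketched in a few lines of Python. This is only a toy illustration, not the code of UTAU or of any real resampler: the function names (`make_sample`, `resample`, `wavtool`), the sine wave standing in for a recorded sample, and the 16-sample crossfade are all assumptions made for the example. The "resampler" here simply reads the sample faster or slower, which also changes its length.

```python
import math

def make_sample(freq_hz, n_samples, rate=8000):
    """Stand-in for a recorded vocal sample: a plain sine wave."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n_samples)]

def resample(sample, semitones):
    """Naive 'resampler': shift pitch by reading the sample faster or slower.

    Linear interpolation between neighbouring samples. The output length
    changes with the pitch, which a real resampler would avoid.
    """
    ratio = 2 ** (semitones / 12)
    out_len = int(len(sample) / ratio)
    out = []
    for i in range(out_len):
        pos = i * ratio
        lo = int(pos)
        hi = min(lo + 1, len(sample) - 1)
        frac = pos - lo
        out.append(sample[lo] * (1 - frac) + sample[hi] * frac)
    return out

def wavtool(chunks, overlap=16):
    """Naive 'wavtool': join chunks with a short linear crossfade."""
    song = []
    for chunk in chunks:
        n = min(overlap, len(song), len(chunk))
        start = len(song) - n
        for k in range(n):
            w = (k + 1) / (n + 1)  # fade-in weight of the new chunk
            song[start + k] = song[start + k] * (1 - w) + chunk[k] * w
        song.extend(chunk[n:])
    return song

# "Editor" output: a melody as semitone offsets from the recorded pitch.
melody = [0, 2, 4]
base = make_sample(220.0, 800)
vocal = wavtool([resample(base, s) for s in melody])
```

In a real stack, each stage is a separate program and the data passed between them are wav files rather than Python lists.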
But
● Free of charge, but not free as in freedom
● The default resampler works badly
● The DB has poor international support (Shift-JIS)
● oto.ini does not implement ini comments (";")
● UTAU always auto-sorts oto.ini (which makes collaboration hard)
● The UI is hard to control
● Not open source
● Its development is very private
During 2011-2012
One day, Paul Liu talked to me
● A new algorithm, 'World', better than Vocaloid2
● Author: Dr. 森勢将雅 (Masanori Morise), University of Yamanashi, Japan
● Patent free
● EFB-GW (synthesizer) for UTAU
● Open source (old versions GPL, newer ones BSD)
● https://github.com/mmorise/World
● In December, Dr. 森勢 will release another major upgrade
How good is the World algorithm?
● An excellent 'autotune'
(the original official demo was a real-time karaoke autotune for tone-deaf (音痴) singers)
● Modulates a sample to any pitch without distortion
(F0 is kept well)
(Vocaloid2 can't, so Miku needs 3 versions of each sample for different ranges)
● Very fast, no need to pre-prepare frequency tables
(just do it in real time)
● On x86, it works well even on older machines (maybe also on ARM)
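To make "modulate a sample to any pitch" concrete, here is the arithmetic a resampler needs before it can ask a synthesizer for a new pitch. This is not the World algorithm itself, only the standard equal-temperament conversion from note numbers to frequencies. The function names are made up for this sketch, and treating 0.0 as "unvoiced frame" in the F0 contour is an assumption about how the contour is represented.

```python
def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

def f0_scale(recorded_note, target_note):
    """Factor by which an analysed F0 contour must be multiplied."""
    return midi_to_hz(target_note) / midi_to_hz(recorded_note)

def shift_contour(f0_contour, recorded_note, target_note):
    """Scale every frame of an F0 contour (in Hz) to the target note.

    Frames marked 0.0 (assumed here to mean 'unvoiced') stay 0.0.
    """
    k = f0_scale(recorded_note, target_note)
    return [f * k for f in f0_contour]

# A sample recorded around A3 (note 57), requested one octave higher (note 69).
shifted = shift_contour([0.0, 220.0, 222.0], 57, 69)
```

Because only the F0 contour is rescaled, a single recording per sample can in principle serve the whole range, which is the contrast with the three-range approach mentioned above.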
徵音梅林
「徵音」: a note of the ancient Chinese and Japanese
pentatonic scale (Do Re Mi Sol La)
宮(きゅう)、商(しょう)、角(かく)、徴(ち)、羽(う)
It also means that we 'recruit' a voice actor (who is
also a jazz singer) from the Internet
「梅林」: Merlin (super wizard)
林: Linux
Project Meilin Features
● CC-BY
● UTAU compatible
● Professional recording (in a studio)
● Source: 24-bit, 48000 Hz wavs
● VCV 連続音 (continuous sound), VC 単独音 (single sound)
(V = vowel, C = consonant)
● Recorded: Japanese, Mandarin (Taiwanese style)
How good? A test
● Commercial Miku vs. open-content Meilin
● V2 Miku: each sample recorded in high, middle, and low versions
vs.
Meilin: each sample recorded in just 1 version.
Listen to the comparison video…
(song: 歌い手様総合テスト, starting from 0:44)
In particular, check whether the very low and very high pitches are distorted (失真).
Facts
Miku DB:
● 1 GB+
● Only Japanese
Meilin DB:
● 627 MB
● Japanese + Mandarin
● The Mandarin DB is 3 times the size of the JP DB
Thanks to Dr. 森勢
Without his effort and kindness,
a good FOSS virtual singer would be
impossible
1: 14 special effects
Defined in oto.ini
● 3 breaths: br1, br2, br3 (for comparison, Miku only has these breaths)
● Spanish rolled 'R': trill
● Cough: cough
● Crying, dry tears: drytears
● Blowing the nose: blownose
● Sucking: suck
● Sigh (嘆): sgn1, sgn2, sgn3, sgn4
● Whistle: whsl
● Clearing the throat: clnt
2: Japanese dialects (方言) are possible
● e.g. the rounded vowel (円唇母音) 'う' in the Kansai dialect (関西弁) (video)
● Mandarin has the same 'u'
● Just borrow what we recorded.
● We can also borrow other Mandarin samples to
synthesize dialects or some foreign languages.
(e.g. 1 or 2 foreign lyric lines in a Japanese song)
A 'v'ocaloid can also do speech
synthesis
Better than traditional speech synthesis
● Accent (= pitch, velocity, rhythm, speed) is controllable
● Can express many emotions (melody lines): crying, anger...
● TTS, storytelling, and an emotional 'うかがか' (ukagaka) are possible
● Some tests I have done with Miku: 1, 2, 3, 4, based on my scale
algorithm. 'Auto render' is possible, but….
● To do this with Vocaloid, you need to beg YAMAHA to open its API. But
our software stack is open source. She could do more than singing.
Thanks to our sponsor 阿怪 (Aguai), my master
(a famous pop song producer in Taiwan)
About the vocalist
● Her name is 羅竺 (Lo Chu).
● We chose her voice from among 20 girls on the Internet.
● She is a singer in a jazz / anime cover song band.
● She is also trained in voice acting.
● Her Japanese accent is not bad.
● Our Japanese friend Atsushi coached the pronunciation (發音指導)
But it was very hard work
The Japanese recording needed 3~4 hours.
But
complete Mandarin (every mathematically possible combination, minus
the samples that phonology shows to be duplicates):
the Mandarin recording needs days.
LINNE platform
● We defined the FOSS 'v'ocaloid stack
● Open source, of course
● Compatible with UTAU DBs (but UTF-8)
● resampler + wavtool + editor (interface) + DB-making tools
● May include 'hardware'
Our oto.ini DB spec
● You can use ';' for comments
● Editor programs must not re-sort the file
● UTF-8
● IPA based (International Phonetic Alphabet)
● With IPA, different languages can share common
pronunciation samples
(no re-recording, a smaller DB, more efficient storage)
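A reader that follows the rules above could look roughly like this. It is a sketch, not the project's parser: `OtoEntry` and `parse_oto` are names invented for the example, the line layout `file.wav=alias,offset,consonant,cutoff,preutterance,overlap` is the layout commonly used by UTAU voicebanks and is assumed here, and allowing a ';' comment at the end of a data line is also an assumption.

```python
from typing import List, NamedTuple

class OtoEntry(NamedTuple):
    wav: str             # sample file name
    alias: str           # name used by the editor
    params: List[float]  # numeric fields, kept in file order

def parse_oto(text):
    """Parse oto.ini text, skipping ';' comments and keeping file order."""
    entries = []
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop comment part
        if not line or "=" not in line:
            continue
        wav, rest = line.split("=", 1)
        fields = rest.split(",")
        alias = fields[0].strip()
        params = [float(f) for f in fields[1:] if f.strip()]
        entries.append(OtoEntry(wav.strip(), alias, params))
    return entries

SAMPLE = """\
; special effects
br1.wav=br1,0,50,-100,20,10
ka.wav=ka,5,60,-120,30,15
a.wav=a,5,60,-120,30,15 ; plain vowel
"""

entries = parse_oto(SAMPLE)
aliases = [e.alias for e in entries]  # file order, never re-sorted
```

When reading from disk, the file would be opened with `open(path, encoding="utf-8")` so that aliases written in IPA or any other script survive. Keeping the entries in a list, in file order, is what lets several people edit the same file without needless diffs.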
Engine (currently xvsqExec, may need to
be changed)
Jcadencii
Linne-editor (in development)
(song editor, frontend)
Wavtool-pl
(GPL wavtool)
tn_fnds_yc (GPL)
(resampler, EFB-GW variant)
World lib
Other programs in the future
e.g. linne-TTS
Current problem: the editor (frontend)
● Cadencii is written in .NET and binds to too
many Windows native calls
● Jcadencii (the Java port of Cadencii) is very slow
● Upstream development has stopped. We have given up on it too.
● Another open UTAU frontend:
http://fluidvocalsynth.weebly.com/ (also .NET)
徴音梅林開発委員会
● Open source community
● OSS programmers, musicians, a
physicist, phonologists, artists...
● Members are international (TW, JP)
Welcome
● Official Site
● Github: https://github.com/ProjectMeilin/
● Slack (tech talk): https://meilin.slack.com/
(email me for an invitation: shoichi.chou@gmail.com)
● FB fan page
● FB group (more about DB making and musicians)
● Youtube channel