MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training

2022.01.07
Mingliang Zeng, Xu Tan, Rui Wang, Zeqian Ju, Tao Qin, Tie-Yan Liu
ACL 2021
Hyeshin Chu

Contents
• Overview
• Introduction
• Related Work
• Methodology
• Experiments & Results
• Conclusion

3
Overview
Propose novel methods for applying NLP approaches to the music domain
Introduce MusicBERT, a large-scale pre-trained model for symbolic music understanding
Evaluate its performance on four downstream tasks

5
Contributions
• Construct a large-scale symbolic music corpus: the Million MIDI Dataset (MMD)
• Design mechanisms to enhance pre-training with symbolic music data (OctupleMIDI encoding & masking strategies)
• Achieve state-of-the-art results on four music understanding tasks: Melody Completion, Accompaniment Suggestion, Genre Classification, and Style Classification

6
Related Work
Symbolic Music Understanding
• Word2vec models for music: Huang et al., 2016; Madjiheurem et al., 2016
• Divide music pieces into fixed-duration music slices: Herremans et al., 2017; Chuan et al., 2020
• Limitation: small NN models & only a few music tokens as inputs
Symbolic Music Encoding
• MIDI-based: MIDI; REMI (Huang and Yang, 2020); CP (Hsiao et al., 2021)
• Pianoroll-based: Brunner et al., 2018; Ji et al., 2020
• Limitation: still need long input token sequences
Masking Strategies in Pre-training
• Application of masking strategies to the music domain: MASS (Song et al., 2019); SpanBERT (Joshi et al., 2020)
• Limitation: do not consider the difference between NLP & music

7
Model Overview
MusicBERT, a large-scale Transformer model for symbolic music understanding
• Based on the Transformer encoder (Vaswani et al., 2017)
• Uses a novel encoding method, OctupleMIDI, to encode the music sequence more efficiently
• Predicts music tokens as output

11
OctupleMIDI Encoding
Figure 2. Different encoding methods for symbolic music

12
Previous MIDI-based representations: token sequences are still too long for the Transformer structure
(computational complexity & learning inefficiency)

13
OctupleMIDI: a compact symbolic music encoding method
• Encodes 6 notes into 6 tokens
• Much shorter than REMI & CP
• Applies to various kinds of music

15
OctupleMIDI
Each octuple token:
• Corresponds to a note
• Contains 8 elements: time signature, tempo, bar, position, instrument, pitch, duration, and velocity

16
Time Signature
• A fraction (e.g., 2/4)
• Denominator: length of a beat as a note duration (e.g., a quarter note in 2/4)
• Numerator: number of beats in a bar (e.g., 2 beats in 2/4)
Tempo
• Beats per minute (BPM): the pace of the music
• From 16 to 256 for OctupleMIDI
Bar and Position
• Together give the on-set time of a note
• Bar: 256 bars in a music piece (0 to 255)
• Position: on-set time within the bar, in units of a 1/64 note (counted from 0)
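As a rough illustration of the bar and position elements, the sketch below splits an absolute onset time, measured in 1/64 notes from the start of the piece, into a (bar, position) pair. The function name and the assumption of a single, unchanging time signature are illustrative choices, not something stated on the slides.

```python
def bar_and_position(onset_64ths: int, numerator: int, denominator: int) -> tuple[int, int]:
    """Split an absolute onset (in 1/64 notes) into (bar index, position in bar)."""
    # A bar holds `numerator` beats, each lasting 1/denominator of a whole note,
    # so its length in 1/64 notes is 64 * numerator / denominator.
    bar_length = 64 * numerator // denominator
    bar, position = divmod(onset_64ths, bar_length)
    if not 0 <= bar <= 255:
        raise ValueError("bar index must lie in 0..255")
    return bar, position


# In 2/4 a bar spans 32 positions, so onset 40 falls in bar 1 at position 8.
print(bar_and_position(40, 2, 4))  # (1, 8)
```

Onsets that would land beyond bar 255 are rejected here; how such pieces are actually handled (for example by truncation) is not covered on the slides.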

17
Instrument
• Follows the MIDI format
• 129 tokens to represent instruments
• 0 to 127: different general instruments (e.g., piano and bass)
• 128: special percussion instrument (e.g., drum)
Pitch
• General instruments: 128 tokens to represent pitch values (following the MIDI format)
• Percussion instruments: 128 pitch tokens to represent the percussion type
Duration
• Note duration: 128 tokens (percussion: all set to 0)
Velocity
• The velocity of a note is quantized into 32 different values
• Interval of 4 (e.g., 2, 6, 10, … , 122, 126)
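To make the eight elements concrete, here is a minimal sketch of one octuple token as a Python dataclass, together with a velocity quantizer that produces the 32 values 2, 6, ..., 126. The class name, field order, and the floor-division binning are assumptions made for illustration; the slides only give the value ranges.

```python
from dataclasses import dataclass

PERCUSSION = 128  # instrument id reserved for percussion, per the slide


def quantize_velocity(velocity: int) -> int:
    """Map a MIDI velocity (0..127) to one of 32 values: 2, 6, ..., 126."""
    if not 0 <= velocity <= 127:
        raise ValueError("MIDI velocity must lie in 0..127")
    # Assumed binning: bins of width 4, each represented by its centre.
    return (velocity // 4) * 4 + 2


@dataclass(frozen=True)
class OctupleToken:
    time_signature: str  # e.g. "2/4"
    tempo: int           # BPM, 16..256
    bar: int             # 0..255
    position: int        # onset within the bar, in 1/64 notes
    instrument: int      # 0..127 general instruments, 128 percussion
    pitch: int           # 0..127 (percussion type for percussion notes)
    duration: int        # token id 0..127; 0 for percussion
    velocity: int        # one of 2, 6, ..., 126


note = OctupleToken("2/4", 120, 1, 8, 0, 60, 16, quantize_velocity(101))
print(note.velocity)  # 102
```

A sequence of notes then becomes a list of such tokens, one per note, which is why the encoded sequence is as long as the number of notes.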

18
Masking Strategy
Bar-level masking strategy:
Elements of the same type in the same bar are masked simultaneously
→ Avoids information leakage & helps the model learn contextual representations
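The idea of bar-level masking can be sketched in a few lines. In this hedged example each note is a plain list of eight elements, and the masking decision is made once per (bar, element type) group rather than once per note. The `MASK` sentinel, the 0.15 default probability, the field order, and the function name are all assumptions for illustration and are not taken from the slides.

```python
import random

MASK = -1  # hypothetical sentinel standing in for a [MASK] id


def bar_level_mask(tokens, mask_prob=0.15, bar_field=2, seed=None):
    """Mask whole (bar, element type) groups in a list of octuple tokens.

    tokens: list of equal-length lists, one per note; bar_field is the index
    of the bar number inside each token (an assumed layout).
    Returns the masked copy and the set of (bar, field) groups that were masked.
    """
    if not tokens:
        return [], set()
    rng = random.Random(seed)
    n_fields = len(tokens[0])
    bars = sorted({tok[bar_field] for tok in tokens})
    # Decide once per (bar, element type), not once per note.
    chosen = {(bar, field)
              for bar in bars
              for field in range(n_fields)
              if rng.random() < mask_prob}
    masked = [[MASK if (tok[bar_field], field) in chosen else value
               for field, value in enumerate(tok)]
              for tok in tokens]
    return masked, chosen


# Three notes: two in bar 0, one in bar 1.
notes = [
    ["2/4", 120, 0, 0, 0, 60, 8, 62],
    ["2/4", 120, 0, 16, 0, 64, 8, 62],
    ["2/4", 120, 1, 0, 0, 67, 8, 62],
]
masked, groups = bar_level_mask(notes, mask_prob=0.3, seed=0)
for row in masked:
    print(row)
```

Because every note in a bar loses the same element type at once, the model cannot simply copy a masked tempo or time signature from a neighbouring note in the same bar, which is the leakage the slide refers to.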

19
Pre-training Corpus
Table 2. Size of different music datasets
OctupleMIDI encoding is universal
→ Most MIDI files can be converted without noticeable loss of musical information
→ Cleaning and deduplication
→ Obtain the Million MIDI Dataset (MMD): 1.5 million songs with 2 billion octuple tokens (musical notes)

20
Experiments & Results
Pre-training Setup | Fine-tuning MusicBERT | Method Analysis
Table 4. Model configurations of MusicBERT
• Small MusicBERT: to compare with baselines (similar data size)
• Base MusicBERT: to achieve the SOTA results

21
Four downstream tasks
• Melody Completion
• Accompaniment Suggestion
• Genre & Style Classification
Table 3. Results of different models on the four downstream tasks

22
Melody Completion
Table 3. Results of different models on the four downstream tasks
Task: Find the best-matching consecutive phrase for a given melodic phrase from a given set of candidates
Evaluation: The rate at which the correct phrase is ranked within the top k candidates
Best performance: MusicBERT_small, MusicBERT_base
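The top-k rate described above is commonly reported as HITS@k. The sketch below computes it for a handful of made-up queries; the function name and the toy rankings are illustrative assumptions and do not reproduce any result from Table 3.

```python
def hits_at_k(ranked_candidates, correct, k):
    """Fraction of queries whose correct candidate appears in the top k."""
    hits = sum(1 for ranking, answer in zip(ranked_candidates, correct)
               if answer in ranking[:k])
    return hits / len(correct)


# Four toy queries; each ranking lists candidate phrases from best to worst.
rankings = [
    ["b", "a", "c"],
    ["c", "b", "a"],
    ["a", "c", "b"],
    ["c", "a", "b"],
]
answers = ["a", "a", "a", "a"]

print(hits_at_k(rankings, answers, 1))  # 0.25
print(hits_at_k(rankings, answers, 2))  # 0.75
```

In practice the rankings would come from sorting candidates by the score the fine-tuned model assigns to each (query, candidate) pair.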

23
Accompaniment Suggestion
Task: Find the most related accompaniment phrase for a given melodic phrase from a given set of harmonic phrase candidates
Evaluation: The rate at which the correct phrase is ranked within the top k candidates

24
Genre & Style Classification
Task: Classify the genre and style of a music piece
Dataset: TOP-MAGD for genre, MASD for style
Evaluation: F1-micro score
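For reference, a micro-averaged F1 score pools true positives, false positives, and false negatives over all pieces and labels before computing F1 = 2TP / (2TP + FP + FN). The following sketch assumes each piece carries a set of labels; the function name and the genre labels are illustrative and are not the actual TOP-MAGD or MASD label sets.

```python
def f1_micro(y_true, y_pred):
    """Micro-averaged F1 over lists of label sets (one set per piece)."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # correctly predicted labels
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # predicted but wrong
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # missed labels
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


truth = [{"rock"}, {"pop", "jazz"}, {"folk"}]
pred = [{"rock"}, {"pop"}, {"folk", "pop"}]

# TP = 3, FP = 1, FN = 1, so F1 = 6 / 8.
print(f1_micro(truth, pred))  # 0.75
```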

25
Experiments on MusicBERT_small
Effectiveness of OctupleMIDI | Effectiveness of Bar-Level Masking | Effectiveness of Pre-training
OctupleMIDI significantly outperforms REMI and CP: with the compact OctupleMIDI encoding, the model can learn from a larger proportion of each song
Table 5. Results of different encoding methods

26
Effectiveness of Bar-Level Masking
Random: randomly mask individual elements in the octuple tokens
Octuple: randomly mask some octuple tokens (all the elements in an octuple token are masked)
Bar: the elements of the same type in the same bar are masked simultaneously

27
Effectiveness of Pre-training
Pre-training is critical for symbolic music understanding

28
Conclusion
Propose OctupleMIDI encoding & a bar-level masking strategy for the music domain
Develop MusicBERT, a large-scale pre-trained model for symbolic music understanding
Achieve state-of-the-art performance on all four evaluated symbolic music understanding tasks

29
For my research
Acquire baseline models & datasets to review
Understand a new symbolic music representation method
Learn how to design experiments that measure the contribution of each component of a model
