MDX challenge 2021 town hall presentation

•

0 likes•212 views

Chin Yun Yu

My slides use in mdx 2021 town hall online meeting.

Engineering

MDX 2021 Town Hall
Presentation
Chin-Yun Yu, Kin-Wai Cheuk
8/22
1

About Us
Chin-Yun Yu
● Organizer of the team Kazane_Ryo_no_Danna
● Software Engineer, Rayark (current position)
● Engineer, Vive R&D/Multi Media, HTC
● Research Assistant, Music and Culture Technology Lab, Institute of Information
Science, Academia Sinica
Kin-Wai Chuek
● PhD candidate, Singapore University of Technology and Design
2

Motivation
● Try out new ideas
● Get familiar with diﬀerent types of MIR topics
● Test our limits
● GET THE PRIZE
3

Final rank
● 4th place on leaderboard A
● 6th place on leaderboard B
4

Model Fusion
● Blending outputs from three diﬀerent model linearly
● Round 2 score: 6.85
Drums Bass Other Vocals
Model 1 (frequency masking) 0.2 0.2 0 0.2
Model 2 (frequency masking) 0.2 0.17 0.5 0.4
Model 3 (time domain) 0.6 0.73 0.5 0.4
6

Model 1: Improving X-UMX
● Two tiny tweaks on the loss function
○ Calculate the frequency-domain mse loss (Eq. 4a in [1]) using complex value instead of absolute
value
○ Apply one iteration of Multichannel Wiener Filtering (MWF) before calculating the loss
■ Our diﬀerentiable norbert package:
https://github.com/yoyololicon/norbert/tree/batch-support
○ Round 2 score: 5.5(baseline) => 5.64 (complex mse + MWF)
[1] Sawata, Ryosuke, et al. "All for One and One for All: Improving Music Separation by Bridging Networks." ICASSP 2021-2021
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. 7

Model 2: U-Net
● Adopt the U-Net structure from Spleeter
● Replace each convolution block with a D3
Block from D3Net
● Add two local attention layers as the
bottleneck layer
● Round 2 score: 6.03
● We also experimented with biaxial biLSTMs
(time and frequency axes)
● Round 2 score: 5.85
2D local
attention x2
8

Model 3: Improving Demus
● Cropping mechanism
○ The original Demucs uses center-crop to connect skipped layers when doing upsampling,
which might introduce some time shifting between layers
○ We neglectthose extra values at the end of the sequences
skipped layer
hidden layer
skipped layer
hidden layer
9

Model 3: Improving Demus
● Using 4 independent decoders, each of which corresponds to one source
○ We adjust the channel size of the decoders to have a comparable number of parameters
compared to the original version
● Round 2 score with these two modiﬁcations: 6.07 (baseline) => 6.27
LSTM
x2
LSTM
x2
Enc Dec
Enc
10

What we liked
● musdb dataloader is pretty slow
● Unexpected server error
● Cannot change team name after creation
12
● Friendly community
● Instant help from the support team
What could be improved

Acknowledgement
● Sung-Lin Yeh
● Showmin Wang
● Yu-Te Wu
● Yin-Jyun Luo
13

What's hot

cbs_sips2005Jordi Arnabat

Convolutional neural network from VGG to DenseNetSungminYou

2021 03-02-transformer interpretabilityJAEMINJEONG5

Dividing and Aggregating Network for Multi-view Action Recognition [Poster in...Dongang (Sean) Wang

Cnn methodAmirSajedi1

HardNet: Convolutional Network for Local Image DescriptionDmytro Mishkin

Learning Convolutional Neural Networks for GraphsMathias Niepert

Modern Convolutional Neural Network techniques for image segmentationGioele Ciaparrone

CNNUkjae Jeong

Convolutional Neural Network and Its ApplicationsKasun Chinthaka Piyarathna

Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...Universitat Politècnica de Catalunya

PR-284: End-to-End Object Detection with Transformers(DETR)Jinwon Lee

AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.GeeksLab Odessa

Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015Jia-Bin Huang

Deep Learning - RNN and CNNPradnya Saval

Image classification with Deep Neural NetworksYogendra Tamang

Neural Network as a functionTaisuke Oe

Traffic Demand Prediction Based Dynamic Transition Convolutional Neural Networkivaderivader

Machine Learning - Convolutional Neural NetworkRichard Kuo

Image classification using CNNNoura Hussein

What's hot (20)

cbs_sips2005

Convolutional neural network from VGG to DenseNet

2021 03-02-transformer interpretability

Dividing and Aggregating Network for Multi-view Action Recognition [Poster in...

Cnn method

HardNet: Convolutional Network for Local Image Description

Learning Convolutional Neural Networks for Graphs

Modern Convolutional Neural Network techniques for image segmentation

CNN

Convolutional Neural Network and Its Applications

Multilayer Perceptron (DLAI D1L2 2017 UPC Deep Learning for Artificial Intell...

PR-284: End-to-End Object Detection with Transformers(DETR)

AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.

Lecture 29 Convolutional Neural Networks - Computer Vision Spring2015

Deep Learning - RNN and CNN

Image classification with Deep Neural Networks

Neural Network as a function

Traffic Demand Prediction Based Dynamic Transition Convolutional Neural Network

Machine Learning - Convolutional Neural Network

Image classification using CNN

Similar to MDX challenge 2021 town hall presentation

Master defence 2020 - Borys Olshanetskyi -Context Independent Speaker Classif...Lviv Data Science Summer School

Once-for-All: Train One Network and Specialize it for Efficient Deploymenttaeseon ryu

Diametrical Mesh of Tree (D2D-MoT) Architecture: A Novel Routing Solution for...IDES Editor

Vlsiphysicaldesignautomationonpartitioning 120219012744-phpapp01Hemant Jha

Ek35775781IJERA Editor

IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline

Sudormrf.pdfssuser849b73

Survey on optical flow estimation with DLLeapMind Inc

15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptxNibrasulIslam

Future semantic segmentation with convolutional LSTMKyuri Kim

Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...NVIDIA Taiwan

Semantic Segmentation on Satellite ImageryRAHUL BHOJWANI

Efficient Implementation of Low Power 2-D DCT ArchitectureIJMER

Optimized Layout Design of Priority Encoder using 65nm TechnologyIJEEE

TMPA-2017: Distributed Analysis of the BMC Kind: Making It Fit the Tornado Su...Iosif Itkin

240318_JW_labseminar[Attention Is All You Need].pptxthanhdowork

Resume@NarasimhaReddyNarasimha Reddy Kukkam

Seq2Seq (encoder decoder) model佳蓉倪

PR-144: SqueezeNext: Hardware-Aware Neural Network DesignJinwon Lee

Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018Universitat Politècnica de Catalunya

Similar to MDX challenge 2021 town hall presentation (20)

Master defence 2020 - Borys Olshanetskyi -Context Independent Speaker Classif...

Once-for-All: Train One Network and Specialize it for Efficient Deployment

Diametrical Mesh of Tree (D2D-MoT) Architecture: A Novel Routing Solution for...

Vlsiphysicaldesignautomationonpartitioning 120219012744-phpapp01

Ek35775781

IJCER (www.ijceronline.com) International Journal of computational Engineerin...

Sudormrf.pdf

Survey on optical flow estimation with DL

15_NEW-2020-ATTENTION-ENC-DEC-TRANSFORMERS-Lect15.pptx

Future semantic segmentation with convolutional LSTM

Recent Progress in SCCS on GPU Simulation of Biomedical and Hydrodynamic Prob...

Semantic Segmentation on Satellite Imagery

Efficient Implementation of Low Power 2-D DCT Architecture

Optimized Layout Design of Priority Encoder using 65nm Technology

TMPA-2017: Distributed Analysis of the BMC Kind: Making It Fit the Tornado Su...

240318_JW_labseminar[Attention Is All You Need].pptx

Resume@NarasimhaReddy

Seq2Seq (encoder decoder) model

PR-144: SqueezeNext: Hardware-Aware Neural Network Design

Transfer Learning and Domain Adaptation - Ramon Morros - UPC Barcelona 2018

Recently uploaded

Design For Accessibility: Getting it right from the startQuintin Balsdon

Unleashing the Power of the SORA AI lastest leapRishantSharmaFr

457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptxrouholahahmadi9876

Theory of Time 2024 (Universal Theory for Everything)Ramkumar k

FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsArindam Chakraborty, Ph.D., P.E. (CA, TX)

Computer Networks Basics of Network DevicesChandrakantDivate1

COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA

Online electricity billing project report..pdfKamal Acharya

Introduction to Serverless with AWS LambdaOmar Fathy

Introduction to Data Visualization,Matplotlib.pdfsumitt6_25730773

Online food ordering system project report.pdfKamal Acharya

Employee leave management system project.Kamal Acharya

DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal

Double Revolving field theory-how the rotor develops torqueBhangaleSonal

Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...drmkjayanthikannan

Learn the concepts of Thermodynamics on Magic MarksMagic Marks

Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptxMuhammadAsimMuhammad6

Thermal Engineering -unit - III & IV.pptDineshKumar4165

Generative AI or GenAI technology based PPTbhaskargani46

Hostel management system project report..pdfKamal Acharya

Recently uploaded (20)

Design For Accessibility: Getting it right from the start

Unleashing the Power of the SORA AI lastest leap

457503602-5-Gas-Well-Testing-and-Analysis-pptx.pptx

Theory of Time 2024 (Universal Theory for Everything)

FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads

Computer Networks Basics of Network Devices

COST-EFFETIVE and Energy Efficient BUILDINGS ptx

Online electricity billing project report..pdf

Introduction to Serverless with AWS Lambda

Introduction to Data Visualization,Matplotlib.pdf

Online food ordering system project report.pdf

Employee leave management system project.

DC MACHINE-Motoring and generation, Armature circuit equation

Double Revolving field theory-how the rotor develops torque

Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...

Learn the concepts of Thermodynamics on Magic Marks

Orlando’s Arnold Palmer Hospital Layout Strategy-1.pptx

Thermal Engineering -unit - III & IV.ppt

Generative AI or GenAI technology based PPT

Hostel management system project report..pdf

MDX challenge 2021 town hall presentation

1. MDX 2021 Town Hall Presentation Chin-Yun Yu, Kin-Wai Cheuk 8/22 1

2. About Us Chin-Yun Yu ● Organizer of the team Kazane_Ryo_no_Danna ● Software Engineer, Rayark (current position) ● Engineer, Vive R&D/Multi Media, HTC ● Research Assistant, Music and Culture Technology Lab, Institute of Information Science, Academia Sinica Kin-Wai Chuek ● PhD candidate, Singapore University of Technology and Design 2

3. Motivation ● Try out new ideas ● Get familiar with diﬀerent types of MIR topics ● Test our limits ● GET THE PRIZE 3

4. Final rank ● 4th place on leaderboard A ● 6th place on leaderboard B 4

5. Our Approach 5

6. Model Fusion ● Blending outputs from three diﬀerent model linearly ● Round 2 score: 6.85 Drums Bass Other Vocals Model 1 (frequency masking) 0.2 0.2 0 0.2 Model 2 (frequency masking) 0.2 0.17 0.5 0.4 Model 3 (time domain) 0.6 0.73 0.5 0.4 6

7. Model 1: Improving X-UMX ● Two tiny tweaks on the loss function ○ Calculate the frequency-domain mse loss (Eq. 4a in [1]) using complex value instead of absolute value ○ Apply one iteration of Multichannel Wiener Filtering (MWF) before calculating the loss ■ Our diﬀerentiable norbert package: https://github.com/yoyololicon/norbert/tree/batch-support ○ Round 2 score: 5.5(baseline) => 5.64 (complex mse + MWF) [1] Sawata, Ryosuke, et al. "All for One and One for All: Improving Music Separation by Bridging Networks." ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. 7

8. Model 2: U-Net ● Adopt the U-Net structure from Spleeter ● Replace each convolution block with a D3 Block from D3Net ● Add two local attention layers as the bottleneck layer ● Round 2 score: 6.03 ● We also experimented with biaxial biLSTMs (time and frequency axes) ● Round 2 score: 5.85 2D local attention x2 8

9. Model 3: Improving Demus ● Cropping mechanism ○ The original Demucs uses center-crop to connect skipped layers when doing upsampling, which might introduce some time shifting between layers ○ We neglectthose extra values at the end of the sequences skipped layer hidden layer skipped layer hidden layer 9

10. Model 3: Improving Demus ● Using 4 independent decoders, each of which corresponds to one source ○ We adjust the channel size of the decoders to have a comparable number of parameters compared to the original version ● Round 2 score with these two modiﬁcations: 6.07 (baseline) => 6.27 LSTM x2 LSTM x2 Enc Dec Enc 10

11. General Impressions 11

12. What we liked ● musdb dataloader is pretty slow ● Unexpected server error ● Cannot change team name after creation 12 ● Friendly community ● Instant help from the support team What could be improved

13. Acknowledgement ● Sung-Lin Yeh ● Showmin Wang ● Yu-Te Wu ● Yin-Jyun Luo 13

MDX challenge 2021 town hall presentation

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to MDX challenge 2021 town hall presentation

Similar to MDX challenge 2021 town hall presentation (20)

Recently uploaded

Recently uploaded (20)

MDX challenge 2021 town hall presentation