NMF in PyTorch
yoyololicon/pytorch-NMF
By Chin Yun Yu
Workshop in Marine Ecoacoustics and Informatics Lab, 9/19
About me
● Background
○ Bachelor of CS, NCTU
○ Research Assistant in Music and Culture Technology Lab, IIS, AS
○ Engineer in Vive R&D/Multi Media, HTC (current position)
○ MIR research/open source project enthusiast
○ Music producer
○ Guitarist of Catalyst (Taipei)
Outline
● What is NMF?
● What is PyTorch?
● How I developed PyTorch NMF
● What NMF can do with PyTorch (particularly in the era of deep learning)?
What is NMF?
● Non-negative Matrix Factorization
● V ≈ WH, with V, W, H ≥ 0 (all entries non-negative)
● Useful for analyzing audio spectrograms
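A minimal sketch of the factorization, assuming a magnitude spectrogram V of shape (frequency bins, frames). The sizes and variable names are illustrative only and do not come from the slides. Columns of W can be read as spectral templates and rows of H as their activations over time.

```python
import torch

torch.manual_seed(0)

n_bins, n_frames, n_components = 6, 8, 2  # illustrative sizes (assumption)

# Non-negative factors: W holds spectral templates, H holds activations.
W = torch.rand(n_bins, n_components)
H = torch.rand(n_components, n_frames)

# A matrix that NMF can represent exactly: every entry of W @ H is non-negative.
V = W @ H

print(V.shape)  # torch.Size([6, 8])
print(bool((V >= 0).all()))
```

In practice V is given and W, H are unknown; the update rules on the next slide estimate them.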
Parameter update
● Févotte, Cédric, and Jérôme Idier. "Algorithms for nonnegative matrix
factorization with the β-divergence." Neural computation 23.9 (2011):
2421-2456.
● multiplicative update
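As a rough illustration of a multiplicative update, the sketch below uses the Euclidean cost, which is the β = 2 special case of the β-divergence. The function name `mu_step`, the matrix sizes, and the small `eps` guard against division by zero are assumptions made for this example, not part of the slides.

```python
import torch

def mu_step(V, W, H, eps=1e-8):
    """One multiplicative update of H, then W, for the Euclidean (beta = 2) cost."""
    H = H * (W.t() @ V) / (W.t() @ W @ H + eps)
    W = W * (V @ H.t()) / (W @ H @ H.t() + eps)
    return W, H

torch.manual_seed(0)
V = torch.rand(20, 30)   # toy non-negative data
W = torch.rand(20, 4)    # random non-negative initialisation
H = torch.rand(4, 30)

losses = []
for _ in range(50):
    losses.append(torch.norm(V - W @ H).item())
    W, H = mu_step(V, W, H)
losses.append(torch.norm(V - W @ H).item())

print(losses[0], losses[-1])  # reconstruction error before and after
```

Because each factor is only ever multiplied by a non-negative ratio, W and H stay non-negative without any explicit projection step.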
Drums Transcription
● Wu, Chih-Wei, et al. "A review of automatic drum transcription." IEEE/ACM
Transactions on Audio, Speech, and Language Processing 26.9 (2018):
1457-1483.
Extend to convolutional case
● Smaragdis, Paris. "Non-negative matrix factor deconvolution; extraction of
multiple sound sources from monophonic inputs." International Conference on
Independent Component Analysis and Signal Separation. Springer, Berlin,
Heidelberg, 2004.
Drums Transcription (with NMFD)
● Wu, Chih-Wei, et al. "A review of automatic drum transcription." IEEE/ACM
Transactions on Audio, Speech, and Language Processing 26.9 (2018):
1457-1483.
Drums Transcription (with NMFD)
● Dittmar, Christian. "Source Separation and Restoration of Drum Sounds in
Music Recordings." (2018).
Reverse Engineering the Amen Break
● Dittmar, Christian. "Source Separation and Restoration of Drum Sounds in
Music Recordings." (2018).
Music Structure Analysis
● López-Serrano, Patricio, et al. "NMF TOOLBOX: MUSIC PROCESSING
APPLICATIONS OF NONNEGATIVE MATRIX FACTORIZATION." (2019).
2D deconvolutional case
● Schmidt, Mikkel N., and Morten Mørup. "Nonnegative matrix factor 2-D
deconvolution for blind single channel source separation." International
Conference on Independent Component Analysis and Signal Separation.
Springer, Berlin, Heidelberg, 2006.
Let it BEE - replace sound source
● Driedger, Jonathan, Thomas Prätzlich, and Meinard Müller. "Let it Bee-Towards
NMF-Inspired Audio Mosaicing." ISMIR. 2015.
https://www.audiolabs-erlangen.de/resources/MIR/2015-ISMIR-LetItBee
PyTorch
● One of the best-known deep learning frameworks
● Launched in 2016 by Facebook
● Features
○ Easy-to-use Python API
○ Dynamic computation graphs
○ GPU acceleration support
○ Automatic gradient calculation
○ Easy for prototyping
● Has been quickly adopted by researchers from many fields
CONFERENCE  PT 2018  PT 2019  PT GROWTH  TF 2018  TF 2019  TF GROWTH
CVPR        82       280      240%       116      125      7.7%
NAACL       12       66       450%       34       21       -38.2%
ACL         26       103      296%       34       33       -2.9%
ICLR        24       70       192%       54       53       -1.9%
ICML        23       69       200%       40       53       32.5%
(PT = PyTorch, TF = TensorFlow)
https://thegradient.pub/state-of-ml-frameworks-2019-pytorch-dominates-research-tensorflow-dominates-industry/
Development of torchnmf
● Multiplicative update
● Deconvolutional class inheritance
Revisit the multiplicative update rules
Gradient descent in Deep Learning
new parameters = old parameters - learning rate × gradient
(the gradient can be obtained via torch.autograd)
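A minimal sketch of one such step, using a toy quadratic loss chosen only for illustration. The names `theta`, `lr`, and `theta_new` are assumptions that mirror the labels on the slide.

```python
import torch

theta = torch.tensor([3.0], requires_grad=True)  # old parameters
lr = 0.1                                         # learning rate

loss = (theta ** 2).sum()                        # toy loss with its minimum at 0
(grad,) = torch.autograd.grad(loss, theta)       # gradient obtained via autograd

theta_new = theta.detach() - lr * grad           # new parameters
print(theta_new)  # tensor([2.4000])
```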
Re-factor the multiplicative update coefficients
(the gradient they depend on can be obtained via torch.autograd)
Regular NMF vs. PyTorch NMF
➔ Weight update in regular NMF
a. Compute positive component
b. Compute negative component
c. Derive multiplicative update coefficients
d. Multiplication
➔ Weight update in PyTorch NMF
a. Compute loss value
b. Derive gradients via backpropagation
c. Compute positive component
d. Compute negative component by subtraction
e. Derive multiplicative update coefficients
f. Multiplication
➔ Cuts the code almost in half, so it is easier to maintain
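One way to read the six PyTorch steps is sketched below for the H update under a Euclidean cost. This is an illustration under stated assumptions, not the actual torchnmf code: the function names, the choice of cost, and the `eps` guard are all hypothetical. It relies on the gradient splitting into a positive and a negative part (gradient = positive - negative), so only the positive part has to be written by hand.

```python
import torch

def autograd_mu_update_H(V, W, H, eps=1e-8):
    """Multiplicative update of H where the negative part comes from autograd."""
    H = H.detach().requires_grad_(True)
    # a. loss value (Euclidean cost, an assumption for this sketch)
    loss = 0.5 * torch.sum((V - W @ H) ** 2)
    # b. gradient via backpropagation
    (grad,) = torch.autograd.grad(loss, H)
    with torch.no_grad():
        # c. positive component of the gradient, written by hand
        pos = W.t() @ W @ H
        # d. negative component by subtraction, since grad = pos - neg
        neg = pos - grad
        # e. coefficients neg / pos, f. multiplication
        return H * neg / (pos + eps)

def classic_mu_update_H(V, W, H, eps=1e-8):
    """Reference: both components written by hand."""
    return H * (W.t() @ V) / (W.t() @ W @ H + eps)

torch.manual_seed(0)
V = torch.rand(20, 30)
W = torch.rand(20, 4)
H = torch.rand(4, 30)

H_autograd = autograd_mu_update_H(V, W, H)
H_classic = classic_mu_update_H(V, W, H)
print(torch.allclose(H_autograd, H_classic, rtol=1e-3, atol=1e-5))
```

The two functions should agree up to floating-point error. The saving grows for costs and models where the negative component is tedious to derive by hand.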
Extended to convolutional cases
Using torch.nn.functional.conv1d/2d/3d and inheriting from NMF base class
[Diagram: NMF base class, with NMFD, NMF2D, and NMF3D inheriting from it]
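A hedged sketch of how the 1-D deconvolutional (NMFD) reconstruction could be expressed with torch.nn.functional.conv1d. The tensor layout (W as frequency × components × template length, H as components × frames) and both function names are assumptions for illustration and may differ from torchnmf. Since conv1d computes a cross-correlation, the kernel is flipped and H is left-padded to obtain a causal convolution.

```python
import torch
import torch.nn.functional as F

def nmfd_reconstruct(W, H):
    """V[f, n] = sum_k sum_tau W[f, k, tau] * H[k, n - tau], via conv1d."""
    T = W.shape[-1]
    H_padded = F.pad(H.unsqueeze(0), (T - 1, 0))  # (1, K, frames + T - 1)
    kernel = W.flip(-1)                           # flip: correlation -> convolution
    return F.conv1d(H_padded, kernel).squeeze(0)  # (freq, frames)

def nmfd_reconstruct_loop(W, H):
    """Reference: explicit sum over time-shifted copies of H."""
    n_freq, _, T = W.shape
    frames = H.shape[1]
    V = torch.zeros(n_freq, frames)
    for tau in range(T):
        shifted = torch.zeros_like(H)
        shifted[:, tau:] = H[:, :frames - tau]    # shift H right by tau frames
        V += W[:, :, tau] @ shifted
    return V

torch.manual_seed(0)
W = torch.rand(12, 3, 5)   # 12 frequency bins, 3 components, templates of 5 frames
H = torch.rand(3, 40)      # 3 components, 40 frames

V_conv = nmfd_reconstruct(W, H)
V_loop = nmfd_reconstruct_loop(W, H)
print(V_conv.shape, torch.allclose(V_conv, V_loop, atol=1e-5))
```

With a template length of 1 this reduces to the plain product W @ H, which is consistent with the convolutional variants sharing a common NMF base class.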
Utilize the power of GPU
Demo Time
Into the era of Deep Learning?
https://github.com/Pold87/academic-keyword-occurrence
What if we combine DL with traditional methods?
● Ravanelli, Mirco, and Yoshua Bengio.
"Speaker recognition from raw waveform with
sincnet." 2018 IEEE Spoken Language
Technology Workshop (SLT). IEEE, 2018.
● Uses band-passed signals as input features
● Fewer parameters to learn, more robust, faster convergence, lower error
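To make the "fewer parameters" point concrete, here is a rough sketch of a band-pass filter in the SincNet style: the whole kernel is determined by two cutoff frequencies instead of one free weight per tap. The function name, the kernel length, and the cutoff values are assumptions for illustration only; cutoffs are given in cycles per sample.

```python
import torch

def sinc_bandpass(f_low, f_high, kernel_size=101):
    """Band-pass FIR kernel defined by two cutoffs (in cycles per sample)."""
    n = torch.arange(kernel_size) - (kernel_size - 1) / 2
    # Difference of two ideal low-pass filters; torch.sinc(x) is sin(pi x) / (pi x).
    kernel = (2 * f_high * torch.sinc(2 * f_high * n)
              - 2 * f_low * torch.sinc(2 * f_low * n))
    # A Hamming window smooths the truncation of the ideal filter.
    return kernel * torch.hamming_window(kernel_size, periodic=False)

kernel = sinc_bandpass(0.1, 0.2)
response = torch.fft.rfft(kernel, n=1024).abs()  # bin k corresponds to k / 1024

print(kernel.numel())                             # taps in the kernel
print(response[154].item(), response[0].item())   # passband (0.15) vs. DC
```

A freely learned kernel of the same length would have 101 parameters, while this one has two, which is one plausible reading of why such a layer is cheaper to train.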
Conclusion of torchnmf
● Advantages compared to others
○ Easier to maintain
○ Better support for convolutional cases (especially 3D)
○ Can run on GPU for faster convergence
● News
○ Features such as batching and sparsity control are on the way!
● Next step
○ Full autograd support so it can be integrated in other DL models
○ Documentation
○ Upload to PyPI
● Feel free to create PRs or issues!
name = "torchnmf"
__version__ = '0.2'
__maintainer__ = 'Chin-Yun Yu'
__email__ = 'ya70201@gmail.com'
Q&A
