High Performance
Computing & Systems LAB
Unsupervised feature learning for audio classification using convolutional deep belief networks
Honglak Lee Yan Largman Peter Pham Andrew Y. Ng
Presenter: Chung il Kim (Paper Seminar, 31 Aug 2017)
Computer Science Department, Stanford University
Stanford, CA 94305
Advances in Neural Information Processing Systems 22 (NIPS 2009)
Contents
 Abstract & Introduction
 Theory & Algorithm
 Convolutional Deep Belief Networks (CDBN)
 Shift-Invariant Sparse Coding (SISC)
 Unsupervised Feature Learning
 Application to Audio Recognition Tasks
 Speech Recognition
 Music Classification
 Discussion and Conclusion
1. Abstract & Introduction (1)
 Abstract
 Deep learning approaches
 Build hierarchical representations from unlabeled data
 Focusing on unlabeled auditory data
 Using a convolutional deep belief network (CDBN)
 Evaluate the learned features on various audio classification tasks, comparing:
 RAW
 MFCC
 CDBN (L1, L2)
1. Abstract & Introduction (2)
 Introduction
 Issues in audio data recognition
 Audio data are high-dimensional and complex
 Previous work [1, 2]
 Sparse coding learns filters that correspond to cochlear filters
 Related work [3]
 Efficient sparse coding algorithms for audio classification tasks
– Feature-sign search algorithm (FS-EXACT, FS-Window)
– Basis learning via the Lagrangian in the DFT domain
[1] E. C. Smith and M. S. Lewicki. Efficient auditory coding. Nature, 439:978–982, 2006.
[2] B. A. Olshausen and D. J. Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 381:607–609, 1996.
[3] R. Grosse, R. Raina, H. Kwong, and A.Y. Ng. Shift-invariant sparse coding for audio classification. In UAI, 2007.
1. Abstract & Introduction (3)
 Introduction
 The limit of those methods
 Applied only to learn relatively shallow,
 1-layer representations
 Many promising deep learning approaches [4, 5, 6, 7, 8], mostly applied to images
 Fast
 With energy-based model
 Greedy
 Empirical evaluation
 But deep learning had not yet been applied to auditory data
[4]G. E. Hinton, S. Osindero, and Y.-W. Teh. A fast learning algorithm for deep belief nets. Neural Computation, 18(7):1527–1554, 2006.
[5]M. Ranzato, C. Poultney, S. Chopra, and Y. LeCun. Efficient learning of sparse representations with an energy-based model. In NIPS, 2006.
[6]Y. Bengio, P. Lamblin, D. Popovici, and H. Larochelle. Greedy layer-wise training of deep networks. In NIPS, 2006.
[7]H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In ICML, 2007.
[8]H. Lee, C. Ekanadham, and A. Y. Ng. Sparse deep belief network model for visual area V2. In NIPS, 2008.
1. Abstract & Introduction (4)
 Introduction
 Deep belief network
 Generative probabilistic model
– Composed of one visible layer and many hidden layers
 Trained well using ‘greedy layer-wise training’
 Convolutional deep belief network(CDBN) [9]
 Also trained in a greedy, bottom-up fashion
 Good performance in several visual recognition tasks
 CDBN on unlabeled audio data
 evaluate the learned feature representations
– several audio classification tasks
[9]H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In ICML, 2009.
2. Convolutional Deep Belief Network (1)
 Convolutional Restricted Boltzmann Machines(CRBMs)
 A CDBN consists of stacked CRBM building blocks
<Figure 1> Image of Convolutional Deep Belief Networks
1. Select a local patch of the input
2. Compute detection responses by convolving with filters
(highly overcomplete, so sparsity is needed)
3. Pooling (usually max-pooling)
4. Greedy layer-wise training
- more than one layer
5. Obtain the patterns of the visible data
2. Convolutional Deep Belief Network (2)
 Convolutional Restricted Boltzmann Machines(CRBMs)
 Extension of ‘regular’ Restricted Boltzmann Machines(RBMs)
 Convolutional weight sharing decreases the effective dimensionality
 But the representation becomes highly overcomplete, which creates a sparsity problem
<Figure 2> Dimensionality reduction and sparsity
2. Convolutional Deep Belief Network (3)
 CDBNs
 Energy function
 The CRBM probability distribution is defined from this energy (next page)
<Formula 1> Energy function of CRBMs for binary (top) and real-valued (bottom) visible units
nV : length of the (1-D) array of visible units
nW : length of the (1-D) filter
K : number of filters
nH : length of each (1-D) array of hidden units
(nH = nV – nW + 1)
bk : shared bias for each hidden group
c : single bias shared by the visible units
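As a reference, the energy functions summarized by <Formula 1> can be written out from the definitions above and the original paper [9]; this is a reconstruction and should be checked against the paper, but the binary and real-valued cases take the forms

E(v, h) = - \sum_{k=1}^{K} \sum_{j=1}^{n_H} \sum_{r=1}^{n_W} h_j^k W_r^k v_{j+r-1} - \sum_{k=1}^{K} b_k \sum_{j=1}^{n_H} h_j^k - c \sum_{i=1}^{n_V} v_i

E(v, h) = \frac{1}{2} \sum_{i=1}^{n_V} v_i^2 - \sum_{k=1}^{K} \sum_{j=1}^{n_H} \sum_{r=1}^{n_W} h_j^k W_r^k v_{j+r-1} - \sum_{k=1}^{K} b_k \sum_{j=1}^{n_H} h_j^k - c \sum_{i=1}^{n_V} v_i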
2. Convolutional Deep Belief Network (4)
 CDBNs
 Probability distribution
 The CRBM joint and conditional distributions are defined from the energy function
<Formula 2> joint and conditional probability distributions
*v : valid convolution
*f : full convolution
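A reference reconstruction of <Formula 2> from the paper's conventions (binary visible units; \tilde{W}^k denotes W^k flipped; for real-valued visible units the conditional of v given h is a unit-variance Gaussian with the same mean):

P(v, h) = \frac{1}{Z} \exp\big(-E(v, h)\big)

P(h_j^k = 1 \mid v) = \sigma\big( (\tilde{W}^k *_v v)_j + b_k \big)

P(v_i = 1 \mid h) = \sigma\big( (\textstyle\sum_k W^k *_f h^k)_i + c \big)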
2. Convolutional Deep Belief Network (5)
 Pooling layer
 Shrinks the feature (detection) map
 For classification, max-pooling is typically used
Input (4 x 4):
0    0.5  0.5  0.4
0.7  0.1  0.2  0.4
0.9  0.3  0.7  0.5
0.5  0.8  0.2  0
Output of 2 x 2 max-pooling with stride 1 (3 x 3):
0.7  0.5  0.5
0.9  0.7  0.7
0.9  0.8  0.7
<Picture 3> Example of max-pooling (reproduced by the sketch below)
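A minimal NumPy sketch that reproduces the table above; the 2 x 2 window and stride of 1 are inferred from the numbers and are assumptions about this illustration, not parameters taken from the paper.

import numpy as np

x = np.array([[0.0, 0.5, 0.5, 0.4],
              [0.7, 0.1, 0.2, 0.4],
              [0.9, 0.3, 0.7, 0.5],
              [0.5, 0.8, 0.2, 0.0]])

def max_pool(a, size=2, stride=1):
    # Slide a size x size window over the 2-D array and keep the maximum of each window.
    rows = (a.shape[0] - size) // stride + 1
    cols = (a.shape[1] - size) // stride + 1
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            out[i, j] = a[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

print(max_pool(x))   # prints the 3 x 3 pooled matrix shown above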
2. Convolutional Deep Belief Network (6)
 Process of CDBNs
 Select a local patch of the input
 Compute detection responses by convolving with filters (highly overcomplete, so sparsity is needed)
 Pooling (usually max-pooling)
 Greedy layer-wise training, more than one layer
 Obtain the patterns of the visible data (see the data-flow sketch after this list)
<Picture4> Process of CDBNs
https://deeplearning4j.org/kr/convolutionnets
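A minimal NumPy sketch of this data flow for a 1-D (audio) input: convolve with each filter, apply the sigmoid with the shared group bias, then max-pool. The filter values and sizes here are random placeholders, and the sketch only illustrates the feed-forward pass; the actual greedy layer-wise training additionally fits W, b and c (e.g. by contrastive divergence with the sparsity penalty of the next section).

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def crbm_layer(v, W, b, pool=3):
    # Detection: valid 1-D convolution of the input with each of the K filters,
    # plus the shared group bias, passed through a sigmoid.
    K, n_w = W.shape
    acts = np.stack([sigmoid(np.convolve(v, W[k], mode="valid") + b[k])
                     for k in range(K)])              # K x (n_V - n_W + 1)
    # Max-pooling: shrink each detection map by the pooling ratio.
    n_p = acts.shape[1] // pool
    return acts[:, :n_p * pool].reshape(K, n_p, pool).max(axis=2)

rng = np.random.default_rng(0)
v = rng.standard_normal(80)                           # placeholder whitened input window
W1 = 0.1 * rng.standard_normal((300, 6))              # 300 bases, filter length 6
h1 = crbm_layer(v, W1, np.zeros(300), pool=3)         # pooled first-layer activations (300 x 25)
# A second layer would treat the 300 pooled maps as input channels and repeat the same steps.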
3. Shift-Invariant Sparse Coding (1)
 Sparsity
 A typical CRBM is highly overcomplete
 A sparsity penalty term is added to the log-likelihood
 To address the overfitting problem in deep neural networks
 And to avoid full connectivity
 This algorithm uses a LASSO-style penalty (the Least Absolute Shrinkage and Selection Operator)
<Formula 4> the objective of sparsity
<Formula 3> the training objective
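For reference, reconstructed from the cited papers rather than from the slide images (so constants and exact notation may differ): the sparsity-regularized training objective used for sparse (C)RBMs in [8, 9] adds a penalty that pushes the average hidden activation toward a small target p,

minimize_{W, b, c}  - \sum_{l} \log P(v^{(l)})  +  \lambda \sum_{j} \Big( p - \frac{1}{m} \sum_{l} \mathbb{E}\big[ h_j^{(l)} \mid v^{(l)} \big] \Big)^2

while the shift-invariant sparse coding objective of [3] carries the LASSO-style L1 term on the activations,

minimize_{b, s}  \sum_{i} \Big\| x^{(i)} - \sum_{j} b_j * s_j^{(i)} \Big\|_2^2 + \beta \sum_{i, j} \big\| s_j^{(i)} \big\|_1  \quad \text{subject to } \|b_j\|_2^2 \le c .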
3. Shift-Invariant Sparse Coding (2)
 Two algorithms to solve SISC on audio data
 Coefficients: the feature-sign search algorithm (a stand-in solver sketch follows below)
 Efficient for short signals (low-dimensional x)
 Not efficient for signals longer than about one minute
<Pseudo 1> Feature-sign search algorithm 1
R. Grosse, R. Raina, H. Kwong, and A.Y. Ng. Shift-invariant sparse coding for audio classification. In UAI, 2007
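The feature-sign search itself is given in the pseudo-code figure above. As a runnable stand-in for the same coefficient problem, the L1-regularized least-squares objective can be solved with scikit-learn's coordinate-descent Lasso on a toy dictionary (this is explicitly not the feature-sign search algorithm, and the dictionary A and signal y below are synthetic placeholders):

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))                    # synthetic dictionary: 50 basis vectors
s_true = np.zeros(50)
s_true[[3, 17, 42]] = [1.5, -2.0, 0.7]                # a few active coefficients
y = A @ s_true + 0.01 * rng.standard_normal(200)      # short observed signal

# scikit-learn's Lasso minimizes (1 / (2 n)) * ||y - A s||^2 + alpha * ||s||_1
lasso = Lasso(alpha=0.05, max_iter=10000)
lasso.fit(A, y)
print(np.nonzero(lasso.coef_)[0])                     # indices of the recovered sparse coefficients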
3. Shift-Invariant Sparse Coding (3)
 Two algorithms to solve SISC on audio data
 Bases: using the Lagrangian and the DFT
 1st, Discrete Fourier Transform
– to decompose the signal
 2nd, set up the Lagrangian
– to solve the constrained optimization
 3rd, solve it using Newton’s method (as in this paper)
3. Shift-Invariant Sparse Coding (4)
 Approach Tasks
 Using LASSO
 Take partial derivatives of the objective
 Bias ↑, variance ↓ (regularization trade-off)
Liang Sun Arizona State University, Efficient Sparse Coding Algorithms, http://slideplayer.com/slide/4953202/
<Pseudo 2> Feature-sign search algorithm 2
3. Shift-Invariant Sparse Coding (5)
 Approach Tasks
 From the resulting ‘unconstrained QP’
 Compute its analytical (closed-form) solution (see the sketch below)
 This is over a subvector of x (the active coefficients)
 Using a discrete line search (LS), update x toward that point
 Check the points where any coefficient changes sign, and keep the one with the lowest objective
Liang Sun Arizona State University, Efficient Sparse Coding Algorithms, http://slideplayer.com/slide/4953202/
<Pseudo 3> Feature-sign search algorithm 2
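A minimal NumPy sketch of just the analytical step described above: with the signs theta of the active coefficients held fixed, the objective ||y - A x||^2 + gamma * theta^T x is an unconstrained QP whose closed-form minimizer is x = (A^T A)^{-1} (A^T y - gamma * theta / 2). The data below are synthetic placeholders, and the sign bookkeeping and discrete line search of the full algorithm are omitted.

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 4))         # columns = currently active basis vectors
y = rng.standard_normal(100)              # signal being coded
gamma = 0.1                               # L1 penalty weight
theta = np.array([1.0, -1.0, 1.0, 1.0])   # fixed signs of the active coefficients

# Closed-form minimizer of ||y - A x||^2 + gamma * theta.T @ x over the active set
x_new = np.linalg.solve(A.T @ A, A.T @ y - gamma * theta / 2.0)
print(x_new)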
3. Shift-Invariant Sparse Coding (6)
 Approach Tasks
 Finally, check the optimality conditions and repeat until they are satisfied.
Liang Sun Arizona State University, Efficient Sparse Coding Algorithms, http://slideplayer.com/slide/4953202/
<Pseudo 2> Feature-sign search algorithm 2
3. Shift-Invariant Sparse Coding (7)
 Results of feature-sign search (learning speed)
3. Shift-Invariant Sparse Coding (8)
 Results of feature-sign search (speech)
 Speech data (TIMIT)
 1-second-long speech signals, 32 basis functions
 Features (filters) compared
 SISC (with FS), MFCC (Mel-Frequency Cepstral Coefficients), RAW
3. Shift-Invariant Sparse Coding (9)
 Results of feature-sign search (musical genre)
 2-second segments, 5-way musical genre classification
 Features (filters) compared
 SISC (with FS), TC (Tzanetakis & Cook)
 MFCC (Mel-Frequency Cepstral Coefficients), RAW
4. Unsupervised Feature Learning (1)
 Description of TIMIT Data
 For researching speech recognition systems
 American English
 In This Research
 Spectrogram representation
 Window size: 20 ms
 Overlap: 10 ms
 Using PCA whitening (with 80 components)
– To reduce the dimensionality (see the preprocessing sketch below)
 Research Contents
 Phonemes
 Speaker gender
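A minimal sketch of this preprocessing (20 ms windows, 10 ms overlap, PCA whitening to 80 components) using NumPy/SciPy. The waveform is a random placeholder rather than TIMIT audio, and the log scaling and the small whitening constant are assumptions, not details taken from the paper.

import numpy as np
from scipy.signal import stft

fs = 16000                                         # TIMIT sampling rate
rng = np.random.default_rng(0)
audio = rng.standard_normal(fs * 2)                # placeholder 2-second waveform

# Spectrogram: 20 ms windows with 10 ms overlap
nperseg, noverlap = int(0.020 * fs), int(0.010 * fs)
_, _, Z = stft(audio, fs=fs, nperseg=nperseg, noverlap=noverlap)
spec = np.log(np.abs(Z) + 1e-8).T                  # frames x frequency bins

# PCA whitening down to 80 components
spec = spec - spec.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(spec, rowvar=False))
top = np.argsort(eigvals)[::-1][:80]               # 80 leading principal components
whitened = (spec @ eigvecs[:, top]) / np.sqrt(eigvals[top] + 1e-8)   # frames x 80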
4. Unsupervised Feature Learning (2)
 Layer and Training Setting
 1st layer
 300 bases
 Filter length(nw) : 6
 Max-pooling ratio : 3
 2nd layer
 300 bases (output of 1st layer)
 Filter length : 6
 Max-pooling ratio : 3
4. Unsupervised Feature Learning (3)
 Phonemes and the CDBN features
 Analysis
 Vowels (“ah”, “oy”)
 Prominent horizontal bands
 Lower freq.
 “oy”
 Upward slanting pattern
4. Unsupervised Feature Learning (4)
 Phonemes and the CDBN features
 Analysis
 Fricatives (“s”)
 Energy in the high freq.
 “el”
 High intensity in low freq.
 Low intensity follows in high freq.
4. Unsupervised Feature Learning (5)
 Speaker gender information & CDBN features
 Female speakers show a finer horizontal banding pattern in the low frequencies
 The L1 and L2 rows correspond to the first- and second-layer bases
5. Speech Recognition(Speaker ID) (1)
 About bases data
 Number of speakers: 168
 Sentences per speaker: 10
 Total sentences: 1,680
 1. Speaker Identification Test
 10 Random trials
 Training : TIMIT data
 All data expressed as Spectrogram
 RAW, MFCC, CDBN L1, CDBN L2, CDBN L1+L2
 Simple summary statistics for each channel
 Evaluate features using standard supervised classifiers
 SVM (Support Vector Machine), GDA (Gaussian Discriminant Analysis), KNN
(K-Nearest Neighbor classification); see the evaluation sketch below
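A minimal sketch of this evaluation protocol: summarize each utterance's feature channels with simple statistics (mean and standard deviation over time are used here as an assumption about what 'simple summary statistics' means), then train standard classifiers. The feature arrays and labels are random placeholders, and scikit-learn's LinearDiscriminantAnalysis stands in for GDA.

import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Placeholder utterances: variable-length sequences of 300-channel (e.g. pooled CDBN) features
utterances = [rng.standard_normal((int(rng.integers(50, 150)), 300)) for _ in range(200)]
labels = rng.integers(0, 10, size=200)                      # placeholder speaker IDs

# Simple summary statistics per channel: mean and standard deviation over time
X = np.array([np.concatenate([u.mean(axis=0), u.std(axis=0)]) for u in utterances])

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
for clf in (SVC(), KNeighborsClassifier(), LinearDiscriminantAnalysis()):
    print(type(clf).__name__, clf.fit(X_tr, y_tr).score(X_te, y_te))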
5. Speech Recognition(Speaker ID) (2)
 Speaker Identification
5. Speech Recognition(Speaker ID) (3)
 2. Speaker Gender classification
 Randomly sampled training examples
 200 testing examples
 20 trials
5. Speech Recognition(Speaker ID) (4)
 3. Phone Classification
 39-way phone classification accuracy
 Over 5 random trials
6. Music Classification (1)
 1. Genre classification
 1st and 2nd layer
 Music data from: ISMIR
 Bases : 300
 Filter length : 10
 Max-pooling ratio : 3
 Randomly sampled 3-second segments (training and testing samples)
 Genres: 5-way (classical, electric, jazz, pop, and rock)
 20 random trials for each number of training examples
6. Music Classification (2)
 2. Artist classification
 1st and 2nd layer (same as genre classification)
 Music data from: ISMIR
 Bases : 300
 Filter length : 10
 Max-pooling ratio : 3
 Randomly sampled 3-second segments (training and testing samples)
 Classical music only
 4-way artist classification
 Averaged over 20 random trials
6. Music Classification (2)
 2. Artist classification
7. Discussion
 Not directly suited to modern speech corpora
 Which are much larger than the TIMIT data set
 This research’s target
 A restricted amount of labeled data
 It remains an interesting problem to apply
 Deep learning to larger datasets
 More challenging tasks
8. Conclusion
 Applied CDBNs to audio data
 Evaluated on various audio classification tasks
 Without using a large amount of labeled data
 The learned features often equaled or surpassed MFCC
 (MFCC is hand-tailored to audio data)
 Combining both achieves even higher classification accuracy
 L1 CDBN features give high performance on multiple audio recognition tasks
 Hope to inspire more work on automatically learning deep features
 From audio data
Thank you