Published on http://www.facebook.com/DeepLearning and https://sites.google.com/site/deeplearning2013/


- 1. Deep Learning via Semi-Supervised Embedding. Summarized by Shohei Ohsawa, The University of Tokyo (ohsawa@weblab.t.u-tokyo.ac.jp)
- 2. Summary • Title: Deep Learning via Semi-Supervised Embedding • Authors: Jason Weston (NEC Labs America, USA), Frédéric Ratle (IGAR, University of Lausanne, Switzerland), Ronan Collobert (NEC Labs America, USA) • Venue: Proceedings of the 25th International Conference on Machine Learning (ICML 2008) • Contents: extending semi-supervised learning to support deep architectures • Information: 88 citations
- 3. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 4. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 5. 1. Introduction Background • Embedding data into a lower dimensional space is an unsupervised dimensionality reduction technique that has been intensively studied. • Most algorithms are developed with the motivation of producing a useful analysis and visualization tool. [Figure: unlabelled data embedded onto a manifold]
- 6. 1. Introduction Semi-supervised Learning • Recently the field of semi-supervised learning [Chapelle 2006], which has the goal of improving generalization on supervised tasks using unlabeled data, has made use of many of these techniques. • E.g., researchers have used nonlinear embedding or cluster representations (unsupervised) as features for a supervised classifier, with improved results. [Figure: labelled and unlabelled data]
- 7. 1. Introduction Issue of Existing Architectures • Most of these architectures are disjoint and shallow: the unsupervised dimensionality reduction algorithm is trained on unlabeled data separately as a first step, and its output is then fed to a supervised classifier with a shallow architecture, such as a (kernelized) linear model. • E.g., [Chapelle 2003][Chapelle 2005] learn a clustering or a distance measure based on a nonlinear manifold embedding as a first step. • Joint methods exist, e.g. Transductive Support Vector Machines (TSVMs) [Vapnik 1998] and LapSVM [Belkin 2006], but their architecture is still shallow.
- 8. 1. Introduction Semi-supervised Learning • Deep architectures seem a natural choice for hard AI tasks that involve several sub-tasks, which can be coded into the layers of the architecture. • As argued by several researchers [Hinton 2006][Bengio 2007], semi-supervised learning is also natural in such a setting, as otherwise one is unlikely to ever have enough labeled data to perform well.
- 9. 1. Introduction Deep Architecture for Semi-supervised Learning • Several authors have recently proposed methods for using unlabeled data in deep neural network-based architectures. • These methods either perform a greedy layer-wise pre-training of weights using unlabeled data alone, followed by supervised fine-tuning (comparable to the disjoint shallow techniques for semi-supervised learning described before), • or learn unsupervised encodings at multiple levels of the architecture jointly with a supervised signal. • The basic setup: 1. Choose an unsupervised learning algorithm. 2. Choose a model with a deep architecture. 3. Plug the unsupervised learning into any (or all) layers of the architecture. 4. Train the supervised and unsupervised tasks on the same architecture simultaneously.
- 10. 1. Introduction Semi-supervised Learning • The aim is that the unsupervised method will improve accuracy on the task at hand. • However, the unsupervised methods so far proposed for deep architectures are, in our opinion, somewhat complicated and restricted. They include: • generative models (restricted Boltzmann machines) [Hinton 2006] • autoencoders [Bengio 2007] • sparse encoding [Ranzato 2007] • Moreover, in all cases these methods are not compared with, and appear on the surface to be completely different from, algorithms developed by researchers in the field of semi-supervised learning.
- 11. 1. Introduction Objective • In this presentation, we advocate simpler ways of performing deep learning by leveraging existing ideas from semi-supervised algorithms developed for shallow architectures. • In particular, we focus on combining an embedding-based regularizer with a supervised learner to perform semi-supervised learning, as in Laplacian SVMs [Belkin et al., 2006]. • We show that this method can be: (i) generalized to multi-layer networks and trained by stochastic gradient descent; (ii) validated in the deep learning framework given above.
- 12. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 13. 2. Semi-supervised Embedding Structure Assumption • Structure assumption: points within the same structure (such as a cluster or a manifold) are likely to have the same label. • Algorithms: cluster kernels [Chapelle et al., 2003], LDS [Chapelle & Zien, 2005], label propagation [Zhu & Ghahramani, 2002], LapSVM [Belkin et al., 2006] • To understand these methods we will first review some relevant approaches to linear and nonlinear embedding. [Figure: once points are known to lie on the same manifold, labels for the whole manifold can be estimated from a few labelled points]
- 14. 2. Semi-supervised Embedding Embedding Algorithms • Many embedding algorithms fit a common template: minimize the pairwise loss $\sum_{i,j=1}^{U} L(f_i, f_j, W_{ij})$ over the embeddings $f_i = f(x_i)$, subject to a balancing constraint (with a balancing constant) that rules out trivial solutions such as mapping all points to a single point. A sketch of one such pairwise loss follows below.
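As an illustration, here is a minimal sketch of the margin-based pairwise loss used later in the paper (neighbors are pulled together, non-neighbors pushed at least a margin apart); the function name and default margin value are illustrative assumptions:

```python
import numpy as np

def pairwise_embedding_loss(f_i, f_j, w_ij, margin=1.0):
    """Margin-based pairwise embedding loss L(f_i, f_j, W_ij).

    Neighbors (w_ij = 1) are attracted; non-neighbors (w_ij = 0)
    are repelled until they are at least `margin` apart.
    """
    dist = np.linalg.norm(f_i - f_j)
    if w_ij == 1:
        return dist ** 2                     # pull neighbors together
    return max(0.0, margin - dist) ** 2      # push non-neighbors apart
```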
- 15. 2. Semi-supervised Embedding Embedding Algorithms: Multidimensional Scaling (MDS) • A classical algorithm that attempts to preserve the distances between pairs of points while embedding them in a lower dimensional space. • MDS is equivalent to PCA if the metric is Euclidean [Williams, 2001]. [Figure: points on a manifold and their low-dimensional embedding]
- 16. 2. Semi-supervised Embedding Embedding Algorithms: ISOMAP [Tenenbaum 2000] • ISOMAP approximates geodesic distances along the manifold by shortest-path distances in a k-nearest-neighbor graph, then applies MDS to those distances. [Figure: neighborhood graph with uniform edge weights W = 1/8]
- 17. 2. Semi-supervised Embedding Embedding Algorithms: Laplacian Eigenmaps [Belkin & Niyogi 2003] • Minimizes $\sum_{i,j} W_{ij} \|f_i - f_j\|^2$, where W is a similarity matrix (e.g. a Gaussian kernel) on a neighborhood graph; the solution is given by eigenvectors of the graph Laplacian $L = D - W$. A compact sketch follows below. [Figure: Laplacian kernel]
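A compact sketch under these definitions; the binary k-NN weights, the dense generalized eigensolver, and the function signature are simplifying assumptions:

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(X, k=8, dim=2):
    """Embed the rows of X into `dim` dimensions via Laplacian eigenmaps."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    W = np.zeros((n, n))                                  # binary k-NN weights
    for i in range(n):
        for j in np.argsort(d2[i])[1:k + 1]:              # index 0 is the point itself
            W[i, j] = W[j, i] = 1.0
    D = np.diag(W.sum(axis=1))
    L = D - W                                             # graph Laplacian
    # Generalized eigenproblem L f = lambda D f; the balancing
    # constraint f^T D f = I is enforced by the eigensolver.
    _, vecs = eigh(L, D)
    return vecs[:, 1:dim + 1]                             # drop the trivial constant eigenvector
```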
- 18. 2. Semi-supervised Embedding Semi-supervised Algorithms • L: labelled data, U: unlabelled data • The common recipe: combine a supervised loss on the L labelled examples with an embedding regularizer over all L + U examples, i.e. minimize $\sum_{i=1}^{L} \ell(f(x_i), y_i) + \lambda \sum_{i,j=1}^{L+U} L(f_i, f_j, W_{ij})$.
- 19. 2. Semi-supervised Embedding Semi-supervised Algorithms: Label Propagation [Zhu & Ghahramani, 2002] • Minimizes a loss on the labelled points plus an embedding regularizer: $\sum_{i=1}^{L} \|f_i - y_i\|^2 + \lambda \sum_{i,j} W_{ij} \|f_i - f_j\|^2$ (loss term + regularizer term). A minimal iterative sketch follows below.
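A minimal iterative sketch of this idea (clamping the labelled points each sweep is the standard scheme; the fixed iteration count and one-hot label encoding are assumptions):

```python
import numpy as np

def label_propagation(W, y_labeled, n_labeled, n_iter=1000):
    """Propagate labels over a graph with weight matrix W.

    The first `n_labeled` nodes carry one-hot labels `y_labeled`
    (shape: n_labeled x n_classes); every node is assumed to have
    at least one neighbor.
    """
    n = W.shape[0]
    P = W / W.sum(axis=1, keepdims=True)   # row-normalized transition matrix
    f = np.zeros((n, y_labeled.shape[1]))
    f[:n_labeled] = y_labeled
    for _ in range(n_iter):
        f = P @ f                          # diffuse labels to neighbors
        f[:n_labeled] = y_labeled          # clamp the labelled points
    return f.argmax(axis=1)                # predicted class per node
```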
- 20. 2. Semi-supervised Embedding Semi-supervised Algorithms: LapSVM [Belkin et al., 2006] • Adds a graph-Laplacian regularizer to the SVM objective: $\sum_{i=1}^{L} H(y_i f(x_i)) + \gamma \|f\|_K^2 + \gamma_I \sum_{i,j} W_{ij} (f(x_i) - f(x_j))^2$, where H is the hinge loss and $\|f\|_K$ the RKHS norm.
- 21. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 22. 3. Semi-supervised Embedding for Deep Learning Overview • We would like to use the ideas developed in semi-supervised learning for deep learning. Deep learning consists of learning a model with several layers of non-linear mapping. • We consider multi-layer networks with M layers of hidden units that give a C-dimensional output vector $f(x) = w^O h^M(x)$, where $w^O$ are the weights of the output layer. • Typically the k-th layer is defined as $h_i^k(x) = S\big(\sum_j w_{ij}^k h_j^{k-1}(x) + b_i^k\big)$ with $h^0(x) = x$, where S is a non-linear squashing function such as tanh. A forward-pass sketch follows below.
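A minimal forward-pass sketch of such a network (layer sizes, initialization scale, and function names are illustrative assumptions):

```python
import numpy as np

def init_mlp(sizes, rng=np.random.default_rng(0)):
    """Random (weights, bias) pairs for layer sizes, e.g. [784, 50, 50, 10]."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return all activations h^0 .. h^M plus the linear C-dimensional output."""
    hs = [x]
    for W, b in params[:-1]:
        hs.append(np.tanh(hs[-1] @ W + b))   # h^k = S(w^k h^{k-1} + b^k)
    W_out, b_out = params[-1]
    hs.append(hs[-1] @ W_out + b_out)        # output layer f(x) = w^O h^M(x)
    return hs
```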
- 23. 3. Semi-supervised Embedding for Deep Learning Overview • Here we describe a standard fully connected multi-layer network, but prior knowledge about a particular problem could lead one to other network designs. • For example, in sequence and image recognition, time-delay and convolutional networks (TDNNs and CNNs) [LeCun et al., 1998] have been very successful. • In those approaches one introduces layers that apply convolutions to their input, which take into account locality information in the data, i.e. they learn features from image patches or windows within a sequence.
- 24. 3. Semi-supervised Embedding for Deep Learning Three Modes of Embedding in Deep Architectures • The general method we propose for semi-supervised deep learning is to add a semi-supervised regularizer to a deep architecture in one of three different modes: (a) Output, (b) Internal, (c) Auxiliary.
- 25. 3. Semi-supervised Embedding for Deep Learning Three Modes of Embedding in Deep Architectures: (a) Output • Add a semi-supervised loss (regularizer) to the supervised loss on the entire network's output: $\sum_{i=1}^{L} \ell(f(x_i), y_i) + \lambda \sum_{i,j} L(f(x_i), f(x_j), W_{ij})$ • This is most similar to the shallow techniques described before.
- 26. 3. Semi-supervised Embedding for Deep Learning Three Modes of Embedding in Deep Architectures: (b) Internal • Regularize the k-th hidden layer (defined above) directly: $\sum_{i=1}^{L} \ell(f(x_i), y_i) + \lambda \sum_{i,j} L(h^k(x_i), h^k(x_j), W_{ij})$, where $h^k(x)$ is the output of the network up to the k-th hidden layer.
- 27. 3. Semi-supervised Embedding for Deep Learning Three Modes of Embedding in Deep Architectures: (c) Auxiliary • Create an auxiliary network which shares the first k layers of the original network but has a new final set of weights, and train it to embed unlabeled data while simultaneously training the original network on labeled data. A sketch of all three modes follows below.
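A sketch of the three attachment points, reusing `forward` from the earlier sketch and any pairwise loss of the form above (all names and the extra auxiliary weight matrix `W_E` are illustrative assumptions):

```python
import numpy as np

# `forward(params, x)` returns [h^0, h^1, ..., h^M, f(x)] as sketched earlier;
# `pair_loss(fi, fj, wij)` is a pairwise embedding loss such as the
# margin-based one above.

def embed_output(params, xi, xj, wij, pair_loss):
    fi, fj = forward(params, xi)[-1], forward(params, xj)[-1]
    return pair_loss(fi, fj, wij)                  # (a) regularize the output

def embed_internal(params, xi, xj, wij, pair_loss, k):
    hi, hj = forward(params, xi)[k], forward(params, xj)[k]
    return pair_loss(hi, hj, wij)                  # (b) regularize layer k

def embed_auxiliary(params, xi, xj, wij, pair_loss, k, W_E):
    hi, hj = forward(params, xi)[k], forward(params, xj)[k]
    return pair_loss(hi @ W_E, hj @ W_E, wij)      # (c) extra embedding head
```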
- 28. 3. Semi-supervised Embedding for Deep Learning Algorithm • Training is by stochastic gradient descent, alternating between the two objectives: repeatedly pick a random labeled example and take a gradient step on the supervised loss, then pick a random pair of (non-)neighbors and take a gradient step on the embedding loss. A training-loop sketch follows below.
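A training-loop sketch of this alternation; the `sup_step`/`emb_step` gradient-step callbacks and the step count are assumptions (they could be implemented with an autodiff library):

```python
import numpy as np

def train(params, labeled, pairs, sup_step, emb_step, n_steps=10000,
          rng=np.random.default_rng(0)):
    """Alternate SGD steps on the supervised and embedding losses.

    `labeled` is a list of (x, y); `pairs` a list of (x_i, x_j, w_ij)
    neighbor (w_ij = 1) and non-neighbor (w_ij = 0) pairs.
    """
    for _ in range(n_steps):
        x, y = labeled[rng.integers(len(labeled))]
        sup_step(params, x, y)              # gradient step on supervised loss
        xi, xj, wij = pairs[rng.integers(len(pairs))]
        emb_step(params, xi, xj, wij)       # gradient step on embedding loss
    return params
```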
- 29. 3. Semi-supervised Embedding for Deep Learning Labeling Unlabeled Data as Neighbors • The embedding regularizer needs a neighborhood matrix W; when no graph is given, one can set W_ij = 1 for the k nearest neighbors of each point and W_ij = 0 otherwise. A sketch follows below.
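A small sketch of that construction (the symmetrization and the default k are assumptions):

```python
import numpy as np

def knn_graph(X, k=10):
    """Binary neighborhood matrix: W[i, j] = 1 iff j is among the k
    nearest neighbors of i (symmetrized), 0 otherwise."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros_like(d2)
    for i in range(len(X)):
        for j in np.argsort(d2[i])[1:k + 1]:   # index 0 is the point itself
            W[i, j] = W[j, i] = 1.0
    return W
```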
- 30. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 31. Deep Boltzmann Machine • A variant of the Boltzmann machine. • Structured like several stacked RBMs. [Figure: input layer, hidden layer I, hidden layer II, with node values, biases, and an energy function; a standard form of the energy follows below]
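The energy function itself did not survive the transcript; a standard form for a two-hidden-layer DBM (with bias vectors a, b^1, b^2 as assumed notation) is:

```latex
E(v, h^1, h^2) = -a^\top v - (b^1)^\top h^1 - (b^2)^\top h^2
                 - v^\top W^1 h^1 - (h^1)^\top W^2 h^2
```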
- 32. Auto-encoder • The auto-encoder framework [LeCun 1987][Bourlard 1988][Hinton 1994] is an unsupervised feature construction method. • "auto-" means "self", so an auto-encoder is literally a self-encoder. • It consists of three elements: an encoder, a decoder, and a reconstruction error. • It is trained so that the composition of encoder and decoder reproduces the input. • Training minimizes the reconstruction error between input and output. • This yields an auto-encoder that maps inputs to a more suitable representation. A sketch follows below. [Figure: t-th input vector → (auto-)encoder → representation vector → decoder → output vector, with the reconstruction error measured between input and output]
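A minimal one-layer sketch of these three elements (the tanh encoder and linear decoder are common but assumed choices):

```python
import numpy as np

def autoencoder_loss(x, W_enc, b_enc, W_dec, b_dec):
    """Squared reconstruction error of a one-layer auto-encoder."""
    z = np.tanh(x @ W_enc + b_enc)     # encoder: representation vector
    x_hat = z @ W_dec + b_dec          # decoder: reconstruction of the input
    return ((x - x_hat) ** 2).sum()    # reconstruction error to minimize
```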
- 33. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 34. 5. Experimental Result
- 35. AGENDA 1. Introduction 2. Semi-supervised Embedding 3. Semi-supervised Embedding for Deep Learning 4. Existing Approaches to Deep Learning 5. Experimental Result 6. Conclusion
- 36. 6. Conclusion • In this work, we showed how one can improve supervised learning for deep architectures by jointly learning an embedding task using unlabeled data. • Our results both confirm previous findings and generalize them. • Researchers using shallow architectures had already shown two ways of using embedding to improve generalization: (i) embedding unlabeled data as a separate pre-processing step (i.e., first-layer training), and (ii) using embedding as a regularizer (i.e., at the output layer). • More importantly, we generalized these approaches to training a semi-supervised embedding jointly with a supervised deep multi-layer architecture on any (or all) layers of the network.
- 37. Research Topic: Like Prediction
- 38. Research Topic: Like Prediction
- 39. Hierarchical Clustering of Graphs Using a Deep Architecture [Figure: graph → auto-encoder → graph clusters (communities)]
- 40. DISCUSSION
