Gaussian Process Latent Variable Model (GPLVM)
James McMurray
PhD Student
Department of Empirical Inference
11/02/2014
Outline of talk
Introduction
Why are latent variable models useful?
Definition of a latent variable model
Graphical Model representation
PCA recap
Principal Components Analysis (PCA)
Probabilistic PCA based methods
Probabilistic PCA (PPCA)
Dual PPCA
GPLVM
Examples
Practical points
Variants
Conclusion
References
Why are latent variable models useful?
- Data has structure.
- Observed high-dimensional data often lies on a lower-dimensional manifold.
  Example: the "Swiss Roll" dataset.
- The structure in the data means that we don't need such high dimensionality to describe it.
- The lower-dimensional space is often easier to work with.
- Allows for interpolation between observed data points.
Definition of a latent variable model
- Assumptions:
  - Assume that the observed variables actually result from a smaller set of latent variables.
  - Assume that the observed variables are independent given the latent variables.
- Differs slightly from the dimensionality-reduction paradigm, which seeks to find a lower-dimensional embedding within the high-dimensional space.
- With the latent variable model we specify the functional form of the mapping:
  $y = g(x) + \epsilon$
  where $x$ are the latent variables, $y$ are the observed variables and $\epsilon$ is noise.
- We obtain different latent variable models for different assumptions on $g(x)$ and $\epsilon$.
Graphical model representation
Graphical Model example of a Latent Variable Model.
Taken from Neil Lawrence: http://ml.dcs.shef.ac.uk/gpss/gpws14/gp_gpws14_session3.pdf
Principal Components Analysis (PCA)
- Returns orthogonal dimensions of maximum variance.
- Works well if the data lies on a plane in the higher-dimensional space.
- A linear method (although variants allow non-linear application, e.g. kernel PCA); see the example below.
Example application of PCA. Taken from http://www.nlpca.org/pca_principal_component_analysis.html
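As a concrete illustration (not from the original slides), here is a minimal PCA sketch in Python using scikit-learn; the Swiss Roll generator, sample size and seed are arbitrary choices:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA

# The "Swiss Roll" data from earlier: 3-D points lying on a rolled-up 2-D sheet.
Y, colour = make_swiss_roll(n_samples=1000, random_state=0)

# Project onto the two orthogonal directions of maximum variance.
pca = PCA(n_components=2)
X = pca.fit_transform(Y)

print(pca.explained_variance_ratio_)
```

Because the roll is a curved manifold, the linear projection squashes it rather than unrolling it, which is exactly the motivation for the non-linear methods that follow.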
Probabilistic PCA (PPCA)
- A probabilistic version of PCA.
- The probabilistic formulation is useful for many reasons:
  - Allows comparison with other techniques via the likelihood measure.
  - Facilitates statistical testing.
  - Allows the application of Bayesian methods.
  - Provides a principled way of handling missing values, via Expectation Maximization.
PPCA Definition
- Consider a set of centred data of $N$ observations and $D$ dimensions: $Y = [y_1, \ldots, y_N]^T$.
- We assume this data has a linear relationship with some embedded latent-space data $x_n$, where $Y \in \mathbb{R}^{N \times D}$ and $X \in \mathbb{R}^{N \times q}$.
- $y_n = W x_n + \epsilon_n$, where $x_n$ is the $q$-dimensional latent variable associated with each observation, and $W \in \mathbb{R}^{D \times q}$ is the transformation matrix relating the observed and latent spaces.
- We assume a spherical Gaussian distribution for the noise, with mean zero and covariance $\beta^{-1} I$.
- The likelihood for an observation $y_n$ is (evaluated numerically in the sketch below):
  $p(y_n \mid x_n, W, \beta) = \mathcal{N}(y_n \mid W x_n, \beta^{-1} I)$
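A small numerical sketch of this likelihood, not from the slides; the dimensions $D = 5$, $q = 2$ and precision $\beta = 10$ are made-up values:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
D, q, beta = 5, 2, 10.0

W = rng.standard_normal((D, q))                       # D x q mapping matrix
x_n = rng.standard_normal(q)                          # latent point
y_n = W @ x_n + rng.normal(scale=beta**-0.5, size=D)  # noisy observation

# p(y_n | x_n, W, beta) = N(y_n | W x_n, beta^{-1} I)
lik = multivariate_normal.pdf(y_n, mean=W @ x_n, cov=np.eye(D) / beta)
print(lik)
```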
PPCA Derivation
- Marginalise the latent variables $x_n$ (which requires a Gaussian prior on them) and solve for $W$ using maximum likelihood.
- The prior used for $x_n$ in the integration is a zero-mean, unit-covariance Gaussian distribution:
  $p(x_n) = \mathcal{N}(x_n \mid 0, I)$
  $p(y_n \mid W, \beta) = \int p(y_n \mid x_n, W, \beta)\, p(x_n)\, dx_n$
  $p(y_n \mid W, \beta) = \int \mathcal{N}(y_n \mid W x_n, \beta^{-1} I)\, \mathcal{N}(x_n \mid 0, I)\, dx_n$
  $p(y_n \mid W, \beta) = \mathcal{N}(y_n \mid 0, W W^T + \beta^{-1} I)$
- Assuming i.i.d. data, the likelihood of the full set is the product of the individual probabilities (computed numerically in the sketch below):
  $p(Y \mid W, \beta) = \prod_{n=1}^{N} p(y_n \mid W, \beta)$
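A sketch of evaluating this marginal likelihood on made-up data (all sizes and values are assumptions for illustration):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
N, D, q, beta = 100, 5, 2, 10.0

W = rng.standard_normal((D, q))
X = rng.standard_normal((N, q))                          # latents (marginalised out below)
Y = X @ W.T + rng.normal(scale=beta**-0.5, size=(N, D))  # observations

# log p(Y | W, beta) = sum_n log N(y_n | 0, W W^T + beta^{-1} I)
C = W @ W.T + np.eye(D) / beta
log_lik = multivariate_normal.logpdf(Y, mean=np.zeros(D), cov=C).sum()
print(log_lik)
```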
PPCA Derivation
- To calculate that marginalisation step we use the summation and scaling properties of Gaussians.
- The sum of Gaussian variables is Gaussian:
  $\sum_{i=1}^{n} \mathcal{N}(\mu_i, \sigma_i^2) \sim \mathcal{N}\!\left(\sum_{i=1}^{n} \mu_i,\; \sum_{i=1}^{n} \sigma_i^2\right)$
- Scaling a Gaussian leads to a Gaussian:
  $w\,\mathcal{N}(\mu, \sigma^2) \sim \mathcal{N}(w\mu, w^2\sigma^2)$
- So, writing $\sigma^2 = \beta^{-1}$ (verified numerically in the sketch below):
  $y = Wx + \epsilon, \quad x \sim \mathcal{N}(0, I), \quad \epsilon \sim \mathcal{N}(0, \sigma^2 I)$
  $Wx \sim \mathcal{N}(0, W W^T)$
  $Wx + \epsilon \sim \mathcal{N}(0, W W^T + \sigma^2 I)$
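The last line can be sanity-checked by simulation; this quick Monte Carlo sketch (sample size and tolerance are arbitrary) is not from the slides:

```python
import numpy as np

rng = np.random.default_rng(2)
D, q, sigma2 = 4, 2, 0.1
W = rng.standard_normal((D, q))

# Draw x ~ N(0, I) and eps ~ N(0, sigma^2 I), then form y = W x + eps.
x = rng.standard_normal((200_000, q))
eps = rng.normal(scale=np.sqrt(sigma2), size=(200_000, D))
y = x @ W.T + eps

# The empirical covariance of y should approach W W^T + sigma^2 I
# (loose tolerance to allow for Monte Carlo error).
print(np.allclose(np.cov(y, rowvar=False), W @ W.T + sigma2 * np.eye(D), atol=0.1))
```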
PPCA Derivation
- We find a solution for $W$ by maximising the likelihood.
- This results in an eigenvalue problem (a sketch of the closed-form solution follows below).
- It turns out that the closed-form solution for $W$ is achieved when $W$ spans the principal sub-space of the data¹.
- Same solution as PCA: Probabilistic PCA.
- Can it be extended to capture non-linear features?

¹Michael E. Tipping and Christopher M. Bishop. Probabilistic principal component analysis. (1997).
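For reference, a toy-data sketch of that closed-form solution, following Tipping and Bishop's result $W_{ML} = U_q(\Lambda_q - \sigma^2 I)^{1/2}R$ with $\sigma^2_{ML}$ the mean of the discarded eigenvalues (the arbitrary rotation $R$ is taken as $I$; the data here are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
N, D, q = 500, 5, 2

# Toy data: a linear map of 2-D latents plus noise, then centred.
Y = rng.standard_normal((N, q)) @ rng.standard_normal((q, D))
Y += 0.1 * rng.standard_normal((N, D))
Y -= Y.mean(axis=0)

S = Y.T @ Y / N                                 # D x D sample covariance
eigval, eigvec = np.linalg.eigh(S)
eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # sort eigenpairs descending

sigma2 = eigval[q:].mean()                      # ML noise variance
W_ml = eigvec[:, :q] @ np.diag(np.sqrt(eigval[:q] - sigma2))
print(sigma2, W_ml.shape)
```

The columns of the recovered $W_{ML}$ span the same principal sub-space that standard PCA returns.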
Dual PPCA
- Similar to the previous derivation of PPCA.
- But marginalise $W$ and optimise $x_n$.
- Same linear-Gaussian relationship between latent variables and data, written per output dimension (with $w_d$ the $d$th row of $W$):
  $p(Y \mid X, W, \beta) = \prod_{d=1}^{D} \mathcal{N}(y_{:,d} \mid X w_d, \beta^{-1} I)$
- Place a conjugate prior on $W$:
  $p(W) = \prod_{d=1}^{D} \mathcal{N}(w_d \mid 0, I)$
- The resulting marginal likelihood is:
  $p(Y \mid X, \beta) = \prod_{d=1}^{D} \mathcal{N}(y_{:,d} \mid 0, X X^T + \beta^{-1} I)$
Dual PPCA
- Results in an eigenvalue problem equivalent to PPCA's.
- So what is the benefit?
- The eigendecomposition is now done for an $N \times q$ instead of a $d \times q$ matrix.
- Recall the marginal likelihood:
  $p(Y \mid X, \beta) = \prod_{d=1}^{D} \mathcal{N}(y_{:,d} \mid 0, X X^T + \beta^{-1} I)$
- The covariance matrix is a covariance function:
  $K = X X^T + \beta^{-1} I$
- This linear kernel can be replaced by other covariance functions for non-linearity (a sketch follows below).
- This is the GPLVM.
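To make the kernel swap concrete, a small sketch with made-up latent points (the lengthscale and variance values are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
N, q, beta = 50, 2, 10.0
X = rng.standard_normal((N, q))        # latent points

# Dual PPCA covariance: linear kernel plus noise.
K_linear = X @ X.T + np.eye(N) / beta

def rbf(X, lengthscale=1.0, variance=1.0):
    """Exponentiated quadratic (RBF) covariance between the rows of X."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

# GPLVM covariance: the linear kernel replaced by a non-linear one.
K_rbf = rbf(X) + np.eye(N) / beta
print(K_linear.shape, K_rbf.shape)     # both (N, N)
```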
GPLVM
- Each dimension of the marginal distribution can be interpreted as an independent Gaussian Process².
- Dual PPCA is the special case where the output dimensions are assumed to be linear, independent and identically distributed.
- The GPLVM removes the assumption of linearity (a usage sketch follows below).
- Gaussian prior over the function space.
- The choice of covariance function changes the family of functions considered.
- Popular kernels:
  - Exponentiated Quadratic (RBF) kernel
  - Matérn kernels
  - Periodic kernels
  - Many more...

²Neil Lawrence. Probabilistic non-linear principal component analysis with Gaussian process latent variable models. JMLR (2005).
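A minimal usage sketch with the GPy library (named on the Conclusion slide); the random stand-in data, latent dimensionality and kernel choice here are assumptions, not from the talk:

```python
import numpy as np
import GPy

Y = np.random.randn(100, 5)             # stand-in for real observations

kernel = GPy.kern.RBF(input_dim=2)      # exponentiated quadratic kernel
model = GPy.models.GPLVM(Y, input_dim=2, kernel=kernel)
model.optimize(messages=False)          # gradient-based optimisation

print(model.X.shape)                    # learned latent positions, (100, 2)
```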
Example: Frey Face data
Example in GPMat.

Example: Motion Capture data
Taken from Keith Grochow et al. Style-based inverse kinematics. ACM Transactions on Graphics (TOG), Vol. 23, No. 3. ACM, 2004.
Practical points
- Need to optimise a non-convex objective function.
- Achieved using gradient-based methods (scaled conjugate gradients).
- Several restarts to attempt to avoid local optima (sketched below).
- Cannot guarantee a global optimum.
- High computational cost for large datasets.
- May need to optimise over the most informative subset of the data, the active set, for sparsification.
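One way to follow the restart advice, sketched with GPy on stand-in data; the number of restarts and the random initialisation scheme are illustrative assumptions:

```python
import numpy as np
import GPy

Y = np.random.randn(100, 5)             # stand-in data

best = None
for seed in range(3):                   # a few random restarts
    np.random.seed(seed)
    m = GPy.models.GPLVM(Y, input_dim=2, init='random')
    m.optimize()
    # Keep the run that reached the highest log-likelihood.
    if best is None or m.log_likelihood() > best.log_likelihood():
        best = m

print(best.log_likelihood())
```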
Practical points
- Initialisation can have a large effect on the final results.

Effect of poor initialisation on the Swiss Roll dataset (PCA left, Isomap right). Taken from Probabilistic non-linear principal component analysis with Gaussian process latent variable models, Neil Lawrence, JMLR (2005).
Variants
- There are a number of variants of the GPLVM.
- For example, the standard GPLVM uses the same covariance function for each output dimension. This can be changed: the Scaled GPLVM introduces a scaling parameter for each output dimension³.
- The Gaussian Process Dynamical Model (GPDM) adds another Gaussian process for dynamical mappings⁴.
- The Bayesian GPLVM approximately integrates over both the latent variables and the mapping function⁵ (a sketch follows below).

³Keith Grochow et al. Style-based inverse kinematics. ACM Transactions on Graphics (TOG), Vol. 23, No. 3. ACM, 2004.
⁴Jack M. Wang, David J. Fleet, and Aaron Hertzmann. Gaussian process dynamical models. Advances in Neural Information Processing Systems. 2005.
⁵Michalis Titsias and Neil Lawrence. Bayesian Gaussian process latent variable model. (2010).
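A sketch of the Bayesian GPLVM as exposed in GPy (stand-in data; the number of inducing points is an arbitrary choice):

```python
import numpy as np
import GPy

Y = np.random.randn(100, 5)             # stand-in data

# Variationally integrates over the latent X, using inducing points for sparsity.
m = GPy.models.BayesianGPLVM(Y, input_dim=2, num_inducing=10)
m.optimize()

print(m.X.mean.shape)                   # posterior means of the latents, (100, 2)
```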
Variants
- The Shared GPLVM learns mappings from a shared latent space to two separate observational spaces.
- Used by Disney Research in their paper Animating Non-Humanoid Characters with Human Motion Data, to generate animations for non-human characters from human motion-capture data.
Shared GPLVM mappings as used by Disney Research.
- Video
Variants
- Can also put a Gaussian Process prior on $X$ to produce Deep Gaussian Processes⁶.
- Zhang et al. developed the Invariant GPLVM⁷, which permits interpretation of causal relations between observed variables by allowing arbitrary noise correlations between the latent variables.
- Currently attempting to implement the IGPLVM in GPy.

⁶Andreas C. Damianou and Neil D. Lawrence. Deep Gaussian Processes. arXiv preprint arXiv:1211.0358 (2012).
⁷K. Zhang, B. Schölkopf, and D. Janzing. Invariant Gaussian Process Latent Variable Models and Application in Causal Discovery. UAI 2010.
Conclusion
- Implemented in GPy (Python) and GPMat (MATLAB).
- Many practical applications: pose modelling, tweening.
  - Especially if smooth interpolation is desirable.
- Modelling of confounders.

Thanks for your time. Questions?
References
- Neil Lawrence. Gaussian process latent variable models for visualisation of high dimensional data. Advances in Neural Information Processing Systems 16 (2004), pp. 329-336.
- Neil Lawrence. Probabilistic non-linear principal component analysis with Gaussian process latent variable models. JMLR (2005).
- Gaussian Process Winter School, Sheffield 2013: http://ml.dcs.shef.ac.uk/gpss/gpws14/
- WikiCourseNote: http://wikicoursenote.com/wiki/Probabilistic_PCA_with_GPLVM