This document discusses conditional mixture models, including mixtures of linear regression models, mixtures of logistic models, and mixtures of experts models. It provides details on learning the parameters of these models using the EM algorithm. Mixtures of experts models use a gating network to determine which expert network is responsible for different regions of the input space. Hierarchical mixtures of experts extend this idea by incorporating multiple levels of gating networks.
5. Conditional Mixture Models
Positive
soft, probabilistic splits of the input space
splits are functions of all of the input variables
Negative
reduced interpretability
Fig. 14.9: two candidate models (Model A and Model B) fit to the data.
6. Mixture of Experts Model (MoE)
$p(\mathbf{t} \mid \mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{x})\, p_k(\mathbf{t} \mid \mathbf{x})$
Hierarchical Mixture of Experts (HME): a fully probabilistic tree-based model.
$p(\mathbf{t} \mid \mathbf{x}) = \sum_{i} g_i(\mathbf{x}) \sum_{j} g_{j \mid i}(\mathbf{x})\, p_{ij}(\mathbf{t} \mid \mathbf{x})$
8. Gaussian Mixture Models
Clustering (unsupervised learning)
Fig. 9.5: observed data drawn from a Gaussian mixture with K = 3 components.

$p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)$
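As a quick illustration, a minimal NumPy sketch that evaluates this K = 3 mixture density at a point; the parameter values here are arbitrary, not taken from the figure:

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density N(x | mean, var)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def gmm_density(x, pis, means, variances):
    """Evaluate p(x) = sum_k pi_k N(x | mu_k, sigma_k^2)."""
    return sum(pi * gaussian_pdf(x, m, v)
               for pi, m, v in zip(pis, means, variances))

# A K = 3 mixture with arbitrary illustrative parameters.
pis = [0.5, 0.3, 0.2]          # mixing coefficients, sum to 1
means = [-2.0, 0.0, 3.0]
variances = [1.0, 0.5, 2.0]

print(gmm_density(0.0, pis, means, variances))
```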
9. Mixtures of linear regression models
$p(t \mid \theta) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(t \mid \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}, \beta^{-1})$
A generalization of switching regression: each mixture component is a linear regression model.
10. Learning the parameters
EM algorithm
The log likelihood:

$\ln p(\mathbf{t} \mid \theta) = \sum_{n=1}^{N} \ln \left( \sum_{k=1}^{K} \pi_k\, \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n, \beta^{-1}) \right)$

Parameters: $\theta = \{\mathbf{W}, \boldsymbol{\pi}, \beta\}$
Data set: $\{\boldsymbol{\phi}_n, t_n\}$, $n = 1, \dots, N$
11. E step
Responsibilities, computed with the parameters initialized to $\theta^{\text{old}}$:

$\gamma_{nk} = \mathbb{E}[z_{nk}] = \dfrac{\pi_k\, \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n, \beta^{-1})}{\sum_{j} \pi_j\, \mathcal{N}(t_n \mid \mathbf{w}_j^{\mathsf T} \boldsymbol{\phi}_n, \beta^{-1})}$

The expectation of the complete-data log likelihood:

$Q(\theta, \theta^{\text{old}}) = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left\{ \ln \pi_k + \ln \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n, \beta^{-1}) \right\}$
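A minimal NumPy sketch of this E step, assuming a design matrix Phi, targets t, and current parameters pi, W, beta; all variable names here are my own, not the text's:

```python
import numpy as np

def e_step(Phi, t, pi, W, beta):
    """Responsibilities gamma[n, k] for a mixture of linear regressions.

    Phi : (N, M) design matrix, t : (N,) targets,
    pi : (K,) mixing coefficients, W : (K, M) regression weights,
    beta : scalar noise precision.
    """
    means = Phi @ W.T                                   # (N, K) predicted means
    # N(t_n | w_k^T phi_n, beta^{-1}) for every n, k
    dens = np.sqrt(beta / (2 * np.pi)) * np.exp(
        -0.5 * beta * (t[:, None] - means) ** 2)
    unnorm = pi[None, :] * dens                         # pi_k * N(...)
    return unnorm / unnorm.sum(axis=1, keepdims=True)   # normalize over k

# Tiny synthetic example: N = 5 points, M = 2 features, K = 2 components.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(5, 2))
t = rng.normal(size=5)
gamma = e_step(Phi, t, pi=np.array([0.5, 0.5]),
               W=rng.normal(size=(2, 2)), beta=1.0)
print(gamma.sum(axis=1))  # each row sums to 1
```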
12. M step
With the responsibilities $\gamma_{nk}$ held fixed, maximize $Q(\theta, \theta^{\text{old}})$ with respect to $\theta$.

1) The mixing coefficients $\pi_k$, subject to the constraint $\sum_k \pi_k = 1$. Using a Lagrange multiplier:

$\pi_k = \frac{1}{N} \sum_{n=1}^{N} \gamma_{nk}$
13. M step
2) The parameters $\mathbf{w}_k$ of the k-th linear regression model. Keeping only the terms of $Q$ that depend on $\mathbf{w}_k$:

$Q = \sum_{n=1}^{N} \gamma_{nk} \left\{ -\tfrac{\beta}{2} \left( t_n - \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n \right)^2 \right\} + \text{const}$

This is a weighted least squares problem: the linear regression model of (3.12), with each data point $n = 1, \dots, N$ weighted by its responsibility $\gamma_{nk}$ (for K = 3, each point carries three weights).
14. M step
Setting the derivative of $Q$ with respect to $\mathbf{w}_k$, the parameters of the k-th linear regression model, to zero:

$0 = \sum_{n=1}^{N} \gamma_{nk} \left( t_n - \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n \right) \boldsymbol{\phi}_n$

In matrix notation:

$\mathbf{w}_k = \left( \boldsymbol{\Phi}^{\mathsf T} \mathbf{R}_k \boldsymbol{\Phi} \right)^{-1} \boldsymbol{\Phi}^{\mathsf T} \mathbf{R}_k \mathbf{t}$, where $\mathbf{R}_k = \mathrm{diag}(\gamma_{nk})$ is a diagonal matrix of responsibilities.

Each $\mathbf{w}_k$ is learned mainly from the data points that assign it high responsibility.
15. M step
3) The precision parameter $\beta$:

$\frac{1}{\beta} = \frac{1}{N} \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left( t_n - \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n \right)^2$
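The three M-step updates combined into one NumPy sketch, continuing the hypothetical names from the E-step sketch (gamma comes from the E step):

```python
import numpy as np

def m_step(Phi, t, gamma):
    """M-step updates for a mixture of linear regressions.

    gamma : (N, K) responsibilities from the E step.
    Returns updated (pi, W, beta).
    """
    N, K = gamma.shape
    pi = gamma.mean(axis=0)                      # pi_k = (1/N) sum_n gamma_nk
    W = np.empty((K, Phi.shape[1]))
    for k in range(K):
        R_k = np.diag(gamma[:, k])               # R_k = diag(gamma_nk)
        # Weighted least squares: w_k = (Phi^T R_k Phi)^{-1} Phi^T R_k t
        W[k] = np.linalg.solve(Phi.T @ R_k @ Phi, Phi.T @ R_k @ t)
    resid_sq = (t[:, None] - Phi @ W.T) ** 2     # (t_n - w_k^T phi_n)^2
    beta = 1.0 / ((gamma * resid_sq).sum() / N)  # 1/beta = weighted mean residual
    return pi, W, beta

# Stand-in data and responsibilities, just to exercise the updates.
rng = np.random.default_rng(1)
Phi = rng.normal(size=(50, 2)); t = rng.normal(size=50)
gamma = rng.dirichlet(np.ones(2), size=50)
pi, W, beta = m_step(Phi, t, gamma)
print(pi, beta)
```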
17. The predictive conditional density
$p(t \mid \boldsymbol{\phi}, \theta) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(t \mid \mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}, \beta^{-1})$

Limitation: the mixing coefficients $\pi_k$ are independent of the input.
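A short sketch evaluating this predictive density, and its mean, at a new input; the parameter values are illustrative only:

```python
import numpy as np

def predictive_density(t_grid, phi, pi, W, beta):
    """p(t | phi) = sum_k pi_k N(t | w_k^T phi, beta^{-1}) on a grid of t values."""
    means = W @ phi                                   # (K,) component means
    dens = np.sqrt(beta / (2 * np.pi)) * np.exp(
        -0.5 * beta * (t_grid[:, None] - means[None, :]) ** 2)
    return dens @ pi                                  # mix with input-independent pi_k

def predictive_mean(phi, pi, W):
    """E[t | phi] = sum_k pi_k w_k^T phi."""
    return pi @ (W @ phi)

pi = np.array([0.6, 0.4]); W = np.array([[1.0, 0.5], [-1.0, 0.2]]); beta = 4.0
phi = np.array([1.0, 2.0])
print(predictive_density(np.linspace(-3, 3, 5), phi, pi, W, beta))
print(predictive_mean(phi, pi, W))
```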
22. M step
With the responsibilities $\gamma_{nk}$ held fixed, maximize the expected complete-data log likelihood

$Q(\theta, \theta^{\text{old}}) = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left\{ \ln \pi_k + t_n \ln y_{nk} + (1 - t_n) \ln (1 - y_{nk}) \right\}$, where $y_{nk} = \sigma(\mathbf{w}_k^{\mathsf T} \boldsymbol{\phi}_n)$.

1) The mixing coefficients: as before, $\pi_k = \frac{1}{N} \sum_{n=1}^{N} \gamma_{nk}$.
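A NumPy sketch of this objective, assuming responsibilities gamma from the E step; names are hypothetical:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def q_function(Phi, t, gamma, pi, W):
    """Expected complete-data log likelihood for a mixture of logistic regressions."""
    Y = sigmoid(Phi @ W.T)                             # y_nk = sigmoid(w_k^T phi_n)
    log_lik = (t[:, None] * np.log(Y)
               + (1 - t[:, None]) * np.log(1 - Y))     # Bernoulli log likelihood
    return np.sum(gamma * (np.log(pi)[None, :] + log_lik))

# M-step update of the mixing coefficients, same form as before:
# pi_new = gamma.mean(axis=0)

rng = np.random.default_rng(4)
Phi = rng.normal(size=(6, 2)); t = rng.integers(0, 2, size=6).astype(float)
gamma = rng.dirichlet(np.ones(2), size=6)
print(q_function(Phi, t, gamma, np.array([0.5, 0.5]), rng.normal(size=(2, 2))))
```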
23. M step
2) The parameters $\mathbf{w}_k$ of the k-th logistic regression model. Maximizing $Q$ with respect to $\mathbf{w}_k$ does not have a closed-form solution, so we use the iterative reweighted least squares (IRLS) algorithm, which needs the gradient and Hessian:

$\nabla_{\mathbf{w}_k} Q = \sum_{n=1}^{N} \gamma_{nk} \left( t_n - y_{nk} \right) \boldsymbol{\phi}_n$

$\mathbf{H}_k = -\sum_{n=1}^{N} \gamma_{nk}\, y_{nk} (1 - y_{nk})\, \boldsymbol{\phi}_n \boldsymbol{\phi}_n^{\mathsf T}$
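A sketch of a single Newton/IRLS update for one component's weights, built from the gradient and Hessian above; a real implementation would iterate to convergence and guard against a singular Hessian:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls_step(Phi, t, gamma_k, w_k):
    """One Newton/IRLS update of w_k for the k-th logistic component.

    gamma_k : (N,) responsibilities of component k.
    """
    y = sigmoid(Phi @ w_k)                           # y_nk
    grad = Phi.T @ (gamma_k * (t - y))               # sum_n gamma_nk (t_n - y_nk) phi_n
    # Hessian: H = -sum_n gamma_nk y_nk (1 - y_nk) phi_n phi_n^T
    H = -(Phi * (gamma_k * y * (1 - y))[:, None]).T @ Phi
    return w_k - np.linalg.solve(H, grad)            # Newton step: w - H^{-1} grad

rng = np.random.default_rng(5)
Phi = rng.normal(size=(20, 3)); t = (rng.random(20) < 0.5).astype(float)
print(irls_step(Phi, t, rng.random(20), np.zeros(3)))
```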
25. A mixture of logistic regression models
Figure: the true probability of the class label (left), a fit with a single logistic regression (center), and a mixture of two logistic regressions (right; components Model A and Model B).
27. Mixtures of experts
A mixture of experts model:

$p(\mathbf{t} \mid \mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{x})\, p_k(\mathbf{t} \mid \mathbf{x})$

Gating functions $\pi_k(\mathbf{x})$:
determine which expert is dominant in which region of the input space
can be represented by a linear softmax, e.g. $\pi_k(\mathbf{x}) = \dfrac{\exp(\mathbf{v}_k^{\mathsf T} \mathbf{x})}{\sum_j \exp(\mathbf{v}_j^{\mathsf T} \mathbf{x})}$ (or a sigmoid for two experts)

Experts $p_k(\mathbf{t} \mid \mathbf{x})$:
model the target in different regions of the input space
predict in their own region
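A minimal sketch of this architecture with a linear-softmax gate and linear experts; the weights and shapes here are arbitrary illustrations:

```python
import numpy as np

def softmax(a):
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def moe_predictive_mean(x, V, W):
    """Mixture-of-experts mean: sum_k pi_k(x) * (w_k^T x).

    V : (K, D) gating weights -> pi(x) = softmax(V x)
    W : (K, D) expert weights -> expert k predicts w_k^T x
    """
    pi = softmax(V @ x)          # input-dependent gating
    experts = W @ x              # one linear prediction per expert
    return pi @ experts

rng = np.random.default_rng(2)
x = rng.normal(size=3)
print(moe_predictive_mean(x, V=rng.normal(size=(4, 3)),
                          W=rng.normal(size=(4, 3))))
```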
30. Hierarchical Mixture of Experts
A two-level hierarchical mixture:

$p(\mathbf{t} \mid \mathbf{x}) = \sum_{i} g_i(\mathbf{x}) \sum_{j} g_{j \mid i}(\mathbf{x})\, p_{ij}(\mathbf{t} \mid \mathbf{x})$

where $g_i(\mathbf{x})$ are the top-level gating functions, $g_{j \mid i}(\mathbf{x})$ the second-level gates within branch $i$, and $p_{ij}(\mathbf{t} \mid \mathbf{x})$ the experts.
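A sketch of this two-level computation with softmax gates at both levels; shapes and names are illustrative:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def hme_predictive_mean(x, V_top, V_sub, W):
    """Two-level HME mean: sum_i g_i(x) sum_j g_{j|i}(x) * (w_ij^T x).

    V_top : (I, D) top-level gate, V_sub : (I, J, D) per-branch gates,
    W : (I, J, D) expert weights.
    """
    g_top = softmax(V_top @ x)                 # g_i(x)
    mean = 0.0
    for i, g_i in enumerate(g_top):
        g_sub = softmax(V_sub[i] @ x)          # g_{j|i}(x)
        mean += g_i * (g_sub @ (W[i] @ x))     # mix experts within branch i
    return mean

rng = np.random.default_rng(6)
x = rng.normal(size=2)
print(hme_predictive_mean(x, rng.normal(size=(2, 2)),
                          rng.normal(size=(2, 3, 2)), rng.normal(size=(2, 3, 2))))
```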
HME Negative:
a large number of parameters => Bayesian HME (2003)
31. 5.6 Mixture density networks
A neural network whose outputs parameterize a mixture model:

$p(\mathbf{t} \mid \mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{x})\, \mathcal{N}\left(\mathbf{t} \mid \boldsymbol{\mu}_k(\mathbf{x}), \sigma_k^2(\mathbf{x})\right)$

Outputs: the mixing coefficients $\pi_k(\mathbf{x})$ (via a softmax), the means $\boldsymbol{\mu}_k(\mathbf{x})$, and the variances $\sigma_k^2(\mathbf{x})$ (kept positive, e.g. via exponentials).

The gating and component functions share the hidden units of the neural network.
The splits of the input space are relaxed and can be nonlinear!
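A compact sketch of an MDN forward pass: one shared tanh hidden layer, a softmax head for $\pi(\mathbf{x})$, a linear head for the means, and an exponential head for the variances; all weights here are random and purely illustrative:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

def mdn_forward(x, params):
    """Map input x to mixture parameters (pi, mu, sigma2) with shared hidden units."""
    W_h, b_h, W_pi, W_mu, W_s = params
    h = np.tanh(W_h @ x + b_h)          # shared hidden layer
    pi = softmax(W_pi @ h)              # mixing coefficients pi_k(x)
    mu = W_mu @ h                       # component means mu_k(x)
    sigma2 = np.exp(W_s @ h)            # variances, kept positive via exp
    return pi, mu, sigma2

# Illustrative shapes: D = 2 inputs, H = 8 hidden units, K = 3 components.
rng = np.random.default_rng(3)
D, H, K = 2, 8, 3
params = (rng.normal(size=(H, D)), np.zeros(H),
          rng.normal(size=(K, H)), rng.normal(size=(K, H)),
          rng.normal(size=(K, H)))
pi, mu, sigma2 = mdn_forward(rng.normal(size=D), params)
print(pi.sum(), mu.shape, sigma2.min() > 0)
```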
32. Bayesian Hierarchical Mixtures of Experts
Bishop, C. M. and M. Svensén (2003). Bayesian hierarchical mixtures of experts. In U. Kjaerulff and C. Meek (Eds.), Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, pp. 57–64. Morgan Kaufmann.
Application: the kinematics of robot arms.
An inverse problem with two solution branches.
Input: the parameters and angles of the robot arm.
Output: the position of the end effector of the robot arm.