A principled way to principal components analysis

Teaching activity objectives
• Visualize large data sets.
• Transform the data to aid in this
visualization.
• Cluster the data.
• Implement basic linear algebra operations.
• Connect these operations to neuronal
models and brain function.

Context for the activity
• Homework assignment in 9.40 Intro to
Neural Computation (sophomore/junior level).
• In-class activity in 9.014 Quantitative
Methods and Computational Models in
Neuroscience (1st-year PhD).

Data visualization and
performing PCA

MNIST data set
28 by 28 pixels
8-bit grayscale images
These images live in
a 784-dimensional space
http://yann.lecun.com/exdb/mnist/

Can we cluster images in the
pixel space?

One possible visualization
There are more than 300,000 possible pairwise pixel plots!
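
The count on this slide can be checked directly. A small sketch in Python (a language chosen here for illustration; the variable names are not from the slides), counting unordered pairs of distinct pixels:

```python
from math import comb

n_pixels = 28 * 28           # each image has 784 pixel dimensions
n_pairs = comb(n_pixels, 2)  # unordered pairs of distinct pixels, 784 * 783 / 2

print(n_pixels, n_pairs)     # prints: 784 306936
```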

Is there a more principled way?
• Represent the data in a new basis set.
• Aids in visualization and potentially in
clustering and dimensionality reduction.
• PCA provides such a basis set by finding
the directions that capture the most variance.
• The directions are ranked by decreasing
variance.
• PCA diagonalizes the covariance matrix.
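
The diagonalization claim can be written out explicitly. This is a sketch in notation introduced here, not taken from the slides: X is assumed to be the mean-subtracted data matrix with n rows (images) and d columns (pixels), C is its covariance matrix, the columns of V are the eigenvectors, and Λ holds the eigenvalues.

```latex
C = \frac{1}{n-1} X^{\top} X, \qquad
C = V \Lambda V^{\top}, \qquad
V^{\top} V = I, \qquad
\Lambda = \operatorname{diag}(\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d \ge 0)

Y = X V \;\Longrightarrow\;
\frac{1}{n-1} Y^{\top} Y = V^{\top} C V = \Lambda
```

In the new basis the coordinates Y are uncorrelated, and the variance along the k-th direction is λ_k, which is why the directions can be ranked by variance.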

Pedagogical approach
• Guide students step by step to implement PCA.
• Emphasize visualizations and geometrical
approach/intuition.
• We don’t use the MATLAB canned function
for PCA.
• We want students to get their hands “dirty”.
This helps build confidence and deep
understanding.

PCA Mantra
• Reshape the data into the proper format for PCA.
• Center the data by subtracting the mean.
• Construct the data covariance matrix.
• Perform SVD to obtain the eigenvalues and
eigenvectors of the covariance matrix.
• Compute the variance explained per component
and plot it.
• Reshape the eigenvectors and visualize their
images.
• Project the mean-subtracted data onto the
eigenvector basis.
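
The steps of the mantra can be sketched end to end. The activity itself uses MATLAB; the version below is a minimal NumPy sketch under stated assumptions: random arrays stand in for the MNIST images (no real digits are loaded), all variable names are illustrative, and plotting is left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): 500 random 28x28 "images", NOT real MNIST digits.
n_images, height, width = 500, 28, 28
images = rng.random((n_images, height, width))

# 1. Reshape: one image per row, one pixel per column.
X = images.reshape(n_images, height * width)

# 2. Center: subtract the mean image from every row.
mean_image = X.mean(axis=0)
Xc = X - mean_image

# 3. Covariance matrix of the pixels (784 x 784).
C = Xc.T @ Xc / (n_images - 1)

# 4. SVD of the symmetric, positive semidefinite covariance matrix:
#    the columns of U are eigenvectors and s holds the eigenvalues,
#    already sorted in decreasing order.
U, s, _ = np.linalg.svd(C)

# 5. Fraction of the total variance explained by each component.
explained = s / s.sum()

# 6. Reshape each eigenvector back into a 28x28 image for display.
eigen_images = U.T.reshape(-1, height, width)

# 7. Project the mean-subtracted data onto the eigenvector basis.
scores = Xc @ U

print(explained[:2].sum())  # variance fraction captured by the first two PCs
```

With real digit images in place of the random stand-in, `scores[:, :2]` would give the coordinates for a scatter plot on the first two axes.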

Projections onto the first 2 axes
• The first two PCs capture ~37% of the variance.
• The data form clear clusters that are almost linearly separable.

Building models: Synapses and
PCA

Hebbian learning
• Donald Hebb's 1949 book "The Organization
of Behavior" proposed a theory of the
neural bases of learning.
• Learning takes place at
synapses.
• Synapses are modified: they
get stronger when the pre- and
postsynaptic cells fire
together.
• "Cells that fire together, wire
together."

Building Hebbian synapses
• The plain Hebbian rule is unstable: the weights grow without bound.
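
The instability can be seen in a small simulation. The slide's equation did not survive extraction, so this sketch assumes the standard textbook form: a linear neuron y = w·x with the update Δw = η y x. The toy input distribution, learning rate, and variable names are illustrative choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy input: zero-mean, correlated 2-D Gaussian samples.
cov = np.array([[3.0, 1.0],
                [1.0, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=2000)

eta = 0.01                 # learning rate (illustrative value)
w = np.array([0.1, 0.1])   # small initial weights
norms = []

for x in X:
    y = w @ x              # output of the linear neuron
    w = w + eta * y * x    # plain Hebbian update: dw = eta * y * x
    norms.append(np.linalg.norm(w))

print(norms[0], norms[-1])  # the weight norm keeps growing
```

Each update adds a non-negative amount to the squared weight norm, so nothing in the rule can ever shrink the weights.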

Oja’s rule
Erkki Oja, "A simplified neuron model as a principal component analyzer," Journal of Mathematical Biology,
15:267-273 (1982).
Feedback (forgetting) term, or regularizer:
• Stabilizes the Hebbian rule.
• Leads to a covariance learning rule: the weights
converge to the first eigenvector of the covariance
matrix.
• Similar to the power iteration method.
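
A minimal sketch of these points, assuming the standard form of Oja's rule from the cited 1982 paper, Δw = η y (x − y w), with a linear neuron y = w·x. The toy data, learning rate, and variable names are illustrative assumptions. The learned weights are compared with the leading eigenvector of the sample covariance matrix and with the result of power iteration on that matrix.

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed toy input: zero-mean, correlated 2-D Gaussian samples.
cov = np.array([[3.0, 1.0],
                [1.0, 1.0]])
X = rng.multivariate_normal(np.zeros(2), cov, size=5000)

# Oja's rule: Hebbian term (y * x) minus a forgetting term (y^2 * w).
eta = 0.005
w = np.array([0.1, 0.1])
for x in X:
    y = w @ x
    w = w + eta * y * (x - y * w)

# Reference: leading eigenvector of the sample covariance matrix.
C = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
v1 = eigvecs[:, -1]

# Power iteration: repeatedly multiply by C and renormalize.
v = np.array([0.1, 0.1])
for _ in range(100):
    v = C @ v
    v = v / np.linalg.norm(v)

# |cosine| of the angle to the leading eigenvector (the sign is arbitrary).
alignment_oja = abs(w @ v1) / np.linalg.norm(w)
alignment_power = abs(v @ v1)

print(np.linalg.norm(w), alignment_oja, alignment_power)
```

The forgetting term keeps the weight norm close to 1, while the Hebbian term plays the role of the multiplication by the covariance matrix in power iteration, applied one sample at a time.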

Learning outcomes
• Visualize and manipulate a relatively large and
complex data set.
• Perform PCA by building it step by step.
• Gain an intuition of the geometry involved in a
change of basis and projections.
• Start thinking about basic clustering
algorithms.
• Discuss dimensionality reduction and other
PCA applications.

Learning outcomes (cont.)
• Discuss the assumptions, limitations and
shortcomings of applying PCA in different
contexts.
• Build a model of how PCA might actually
take place in neural circuits.
• Follow-up: eigenfaces. Is the brain doing
PCA to recognize faces?
