Jonathan Ronen - Variational Autoencoders tutorial

175 views

Presentation at Deep Learning club, BIMSB, MDC, Berlin, March 2018.
Jonathan Ronen, Akalin lab.

Published in: Data & Analytics
Autoencoders
NN club, March 21 2018
Jonathan Ronen
Agenda
● PCA and linear autoencoders
● Deep and nonlinear autoencoders
● Variational autoencoders
PCA for dimensionality reduction
PCA for dimensionality reduction
● The projection U that maximizes the variance of PC1
● also minimizes the reconstruction error
○ Note: this is not the same as OLS, which minimizes the vertical residuals rather than the orthogonal reconstruction error
There are efficient solvers for this, but we could also use backpropagation
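This equivalence is easy to check numerically. Below is a minimal NumPy sketch (not from the talk's notebook): PCA is computed via SVD, and the mean squared reconstruction error shrinks as more principal components are kept.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X = X - X.mean(axis=0)              # PCA assumes centered data

# PCA via SVD: the rows of Vt are the principal directions
_, _, Vt = np.linalg.svd(X, full_matrices=False)

def reconstruction_error(X, Vt, k):
    """Mean squared error after projecting onto the top-k PCs and back."""
    U_k = Vt[:k].T                  # (features, k)
    X_hat = X @ U_k @ U_k.T         # encode, then decode
    return np.mean((X - X_hat) ** 2)

errors = [reconstruction_error(X, Vt, k) for k in (1, 5, 10)]
```

With all 10 components kept, the reconstruction is exact up to floating-point error.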
PCA through backpropagation
● This is an autoencoder
● If the neurons are linear, it is similar to PCA
○ Caveat: PCs are orthogonal, autoencoded components are not - but they will span the same space
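As a sketch of the idea (assuming plain gradient descent rather than whatever optimizer the talk's notebook uses), a linear autoencoder can be trained with hand-written backpropagation, and its reconstruction loss drops toward the PCA optimum:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 8)) @ rng.normal(size=(8, 8))  # correlated features
X = X - X.mean(axis=0)

d, k, lr = X.shape[1], 2, 1e-3
W_enc = rng.normal(scale=0.1, size=(d, k))   # encoder weights (bottleneck of 2)
W_dec = rng.normal(scale=0.1, size=(k, d))   # decoder weights

def loss(X, W_enc, W_dec):
    return np.mean((X - X @ W_enc @ W_dec) ** 2)

n = X.shape[0]
initial = loss(X, W_enc, W_dec)
for _ in range(2000):
    Z = X @ W_enc                    # linear encoding
    R = Z @ W_dec - X                # reconstruction residual
    grad_dec = 2 * Z.T @ R / (n * d)
    grad_enc = 2 * X.T @ (R @ W_dec.T) / (n * d)
    W_enc -= lr * grad_enc
    W_dec -= lr * grad_dec
final = loss(X, W_enc, W_dec)
```

The learned components are not orthogonal, but they span the same subspace as the top-2 principal components.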
PCA vs linear autoencoders for MNIST
Autoencoders can be nonlinear
Nonlinear autoencoder with 32 hidden neurons
Autoencoders can be deep
[Figure: deep autoencoder diagram; hidden layers use ReLU activations, output layers use sigmoid]
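To make the nonlinearity concrete, here is a toy single-hidden-layer autoencoder in NumPy with a ReLU bottleneck of 2, trained by manual backpropagation. This is a standalone sketch on synthetic data, not the talk's MNIST code:

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.abs(rng.normal(size=(300, 6)))        # toy positive-valued data
d, h, lr = 6, 2, 1e-2                        # bottleneck of 2

W1 = rng.normal(scale=0.1, size=(d, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.1, size=(h, d)); b2 = np.zeros(d)

def forward(X):
    H = np.maximum(0, X @ W1 + b1)           # ReLU hidden code
    return H, H @ W2 + b2                    # linear reconstruction

def mse(X, X_hat):
    return np.mean((X - X_hat) ** 2)

n = X.shape[0]
_, X_hat = forward(X)
initial = mse(X, X_hat)
for _ in range(500):
    H, X_hat = forward(X)
    dOut = 2 * (X_hat - X) / (n * d)         # gradient of MSE w.r.t. output
    gW2 = H.T @ dOut; gb2 = dOut.sum(axis=0)
    dH = (dOut @ W2.T) * (H > 0)             # backprop through ReLU
    gW1 = X.T @ dH; gb1 = dH.sum(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2
_, X_hat = forward(X)
final = mse(X, X_hat)
```

Stacking more such layers gives the deep autoencoder shown on the slide.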
Deep autoencoder (bottleneck of 2)
Guess which one is deep (has intermediate layers)?
Many variations of autoencoders
● Sparse autoencoders
● Denoising autoencoders
● Convolutional autoencoders
○ UNet is a sort of autoencoder
● And more…
● I’d like to introduce Variational Autoencoders
Variational autoencoders
= Variational (Bayesian) Inference + autoencoders
z: latent variable, x: observation
Variational Inference (quick overview)
z: latent variable, x: observation
● The posterior p(z|x) is problematic to compute directly
● Variational Inference solution: approximate it with q(z), chosen to be a distribution we can work with
Side note on information theory
● Information
○ “How many bits do we need to represent event x if we optimized for p(x)?”
● Entropy
○ “What is the expected amount of information in each event drawn from p(x)?” (how many bits?)
● Cross-entropy
○ “What is the expected amount of information in p(x) if we optimized for q(x)?” (how many bits?)
● Kullback-Leibler divergence: “cross-entropy - entropy”
○ “How many more bits will we need to represent events from p(x) if we optimized for q(x)?”
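These definitions can be checked numerically on a small discrete distribution; the identity KL = cross-entropy − entropy falls right out:

```python
import numpy as np

p = np.array([0.5, 0.25, 0.25])     # true distribution
q = np.array([0.4, 0.4, 0.2])       # approximating distribution

entropy = -np.sum(p * np.log2(p))        # expected bits with the optimal code for p
cross_entropy = -np.sum(p * np.log2(q))  # expected bits with a code optimized for q
kl = np.sum(p * np.log2(p / q))          # extra bits paid for using q instead of p
```

Here the entropy is exactly 1.5 bits, and the KL divergence is positive because q ≠ p.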
Variational Inference (quick overview)
Skipping the math: maximizing the Evidence Lower Bound (ELBO)
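The skipped math is the standard ELBO decomposition (for latent z, observation x, approximate posterior q(z|x), and prior p(z)):

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q(z|x)}\!\left[\log p(x|z)\right]
      - \mathrm{KL}\!\left(q(z|x)\,\|\,p(z)\right)}_{\text{ELBO}}
    + \mathrm{KL}\!\left(q(z|x)\,\|\,p(z|x)\right)
  \;\ge\; \text{ELBO}
```

Since the last KL term is nonnegative, the ELBO lower-bounds the log evidence, and maximizing it pushes q(z|x) toward the true posterior.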
Variational inference is a family of methods that maximize the ELBO
How does it fit in with autoencoders?
What if autoencoders were probabilistic?
Regular autoencoder vs. variational autoencoder
Variational Autoencoder loss - the negative ELBO
● reconstruction error
● divergence from prior
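For a Gaussian encoder q(z|x) = N(mu, sigma²) and a standard normal prior, the divergence term has a closed form. The sketch below (hypothetical encoder/decoder outputs, squared-error reconstruction) assembles the negative ELBO:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical encoder outputs for a batch of 4 samples, latent dim 2
mu = rng.normal(size=(4, 2))
log_var = rng.normal(scale=0.1, size=(4, 2))

x = rng.normal(size=(4, 5))
x_hat = x + rng.normal(scale=0.1, size=x.shape)   # stand-in decoder output

# Reconstruction error term (here: squared error per sample)
recon = np.sum((x - x_hat) ** 2, axis=1)

# Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims
kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)

neg_elbo = np.mean(recon + kl)   # the VAE loss = negative ELBO
```

The KL term vanishes exactly when the encoder outputs the prior (mu = 0, log_var = 0), so it acts as a regularizer pulling the latent codes toward N(0, I).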
Backpropagation through VAEs
● Problem: the sampling step is not differentiable
Backpropagation through VAEs - reparameterizing
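The trick: instead of sampling z ~ N(mu, sigma²) directly, sample eps ~ N(0, I) and set z = mu + sigma·eps, so mu and sigma enter deterministically and gradients can flow through them. A quick NumPy check that the reparameterized samples have the intended moments:

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, -2.0])
log_var = np.array([0.0, np.log(0.25)])     # sigma = [1.0, 0.5]
sigma = np.exp(0.5 * log_var)

# Reparameterization: all the randomness lives in eps
eps = rng.normal(size=(100_000, 2))
z = mu + sigma * eps

empirical_mean = z.mean(axis=0)
empirical_std = z.std(axis=0)
```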
VAE 2d embedding
VAEs are a generative model
Regular autoencoder as a generative model?
Jupyter Notebook with all analysis in this talk
https://nbviewer.jupyter.org/gist/jonathanronen/69902c1a97149ab4aae42e099d1d1367
Further reading
● https://arxiv.org/abs/1312.6114
● https://www.youtube.com/watch?v=uaaqyVS9-rM
● https://www.jeremyjordan.me/variational-autoencoders/
● https://blog.keras.io/building-autoencoders-in-keras.html
