Automatic variational inference with latent categorical variables
Tomasz Kuśmierczyk
2020-04-15
Resources
Mixture of discrete normalizing flows for variational inference:
● ArXiv text: https://arxiv.org/abs/2006.15568
● GitHub code: https://github.com/tkusmierczyk/mixture_of_discrete_normalizing_flows
Discrete normalizing flows:
● ArXiv text: https://arxiv.org/abs/1905.10347
● GitHub implementation: https://github.com/google/edward2/blob/master/edward2/tensorflow/layers/discrete_flows.py
Agile AI with Probabilistic Programming
● easy specification & model development
● scalability thanks to variational inference
● handling latent discrete variables?
Latent categorical variables
● allocation models with hand-crafted algorithms
○ mixture models
○ hidden Markov models
○ topic models
● encoding of expert knowledge
● … ?
1D categorical distribution specification
(unordered) set of categories x∈ {a,b,c,d} → probability p
p(x=a) = ½
p(x=b) = 0
p(x=c) = ½
p(x=d) = 0
2D categorical distribution specification
joint distribution p(x, z):
         z = e    f      g      h
x = a    0.25     0.20   0.05   0
x = b    0        0      0      0
x = c    0.20     0.05   0.05   0.20
x = d    0        0      0      0
or
marginal distribution p(x):
p(x=a) = ½
p(x=b) = 0
p(x=c) = ½
p(x=d) = 0
with conditional distribution p(z | x):
         z = e    f      g      h
x = a    0.50     0.40   0.10   0
x = b    0        0      0      0
x = c    0.40     0.10   0.10   0.40
x = d    0        0      0      0
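The two specifications carry the same information, since p(x, z) = p(x) p(z | x). A minimal Python sketch of that product, using the numbers from this slide (the dictionary layout and the function name joint are choices made for this illustration only):

```python
# Marginal p(x) and conditional p(z | x) from the slide, as plain dictionaries.
p_x = {"a": 0.5, "b": 0.0, "c": 0.5, "d": 0.0}
p_z_given_x = {
    "a": {"e": 0.50, "f": 0.40, "g": 0.10, "h": 0.00},
    "c": {"e": 0.40, "f": 0.10, "g": 0.10, "h": 0.40},
}

def joint(p_x, p_z_given_x):
    """Return p(x, z) = p(x) * p(z | x) as a nested dictionary."""
    table = {}
    for x, px in p_x.items():
        # Rows with p(x) = 0 contribute zero joint mass whatever the conditional is.
        row = p_z_given_x.get(x, {})
        table[x] = {z: px * row.get(z, 0.0) for z in "efgh"}
    return table

p_xz = joint(p_x, p_z_given_x)
total_mass = sum(sum(row.values()) for row in p_xz.values())
```

Here p_xz["a"] reproduces the first row of the joint table, and total_mass equals 1 up to floating-point rounding.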
Toy model
https://en.wikipedia.org/wiki/Bayesian_network#/media/File:SimpleBayesNet.svg
Automatic Variational Inference
Why: efficient inference of approximate posteriors q without tedious math
qx(x) ≈ p(x|D)
Reparametrized ELBO:
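In its standard form, and assuming that samples from q are generated as x = fλ(u) with u ~ pu, and that p(D, x) denotes the joint density of data and latent variables (notation chosen for this sketch), the reparametrized ELBO can be written as:

```latex
\mathcal{L}(\lambda)
  = \mathbb{E}_{u \sim p_u}\!\left[
      \log p\big(D, f_\lambda(u)\big) - \log q_\lambda\big(f_\lambda(u)\big)
    \right]
```

Both terms inside the expectation are evaluated at the sample fλ(u): the first needs a sampler that is differentiable in λ, the second needs the log-probability of the sample under q.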
Requirements for q to perform automatic VI with reparametrization gradients:
● generating samples in a way that is differentiable w.r.t. the distribution parameters
● evaluating the log-probability of samples
Reparametrization vs normalizing flows
Reparametrization:
x = fλ(u) , u ~ pu
For example: normal distribution with parameters λ = (μ, σ):
fλ(u) = μ + σ u , u ~ N(0, 1)
Normalizing flows:
● parameters λ do not have this interpretation as distribution parameters
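To illustrate why the form x = μ + σ u is useful, here is a minimal sketch of reparametrization gradients for a normal distribution. The test function h(x) = x² and the name reparam_grads are assumptions made for this example. Since E[x²] = μ² + σ², the exact gradients are 2μ and 2σ, which the Monte Carlo estimates should approach.

```python
import random

def reparam_grads(mu, sigma, n_samples=100_000, seed=0):
    """Monte Carlo gradients of E[x^2] w.r.t. (mu, sigma), with x = mu + sigma * u."""
    rng = random.Random(seed)
    g_mu = 0.0
    g_sigma = 0.0
    for _ in range(n_samples):
        u = rng.gauss(0.0, 1.0)   # noise that does not depend on (mu, sigma)
        x = mu + sigma * u        # f_lambda(u)
        dh_dx = 2.0 * x           # derivative of the test function h(x) = x^2
        g_mu += dh_dx * 1.0       # chain rule: dx/dmu = 1
        g_sigma += dh_dx * u      # chain rule: dx/dsigma = u
    return g_mu / n_samples, g_sigma / n_samples

g_mu, g_sigma = reparam_grads(mu=1.5, sigma=0.5)
```

The randomness enters only through u, so the gradient passes through the deterministic map from u to x.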
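Discrete flows
x = fλ(u) , u ~ pu
sampling: draw u ~ pu and return x = fλ(u)
probability evaluation: q(x) = pu(fλ⁻¹(x)), with no Jacobian term because fλ is a bijection on a discrete set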
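A minimal sketch of both operations for a single categorical variable. It assumes the flow is a fixed bijection (a permutation) on the categories {a, b, c, d}; the base probabilities and the particular permutation are hypothetical values chosen for illustration.

```python
import random

CATEGORIES = ["a", "b", "c", "d"]
p_u = {"a": 0.7, "b": 0.2, "c": 0.1, "d": 0.0}  # hypothetical base distribution

# A bijection on the categories plays the role of the flow f.
f = {"a": "c", "b": "a", "c": "d", "d": "b"}
f_inv = {x: u for u, x in f.items()}

def sample(rng):
    """Sampling: draw u from the base distribution and push it through f."""
    u = rng.choices(CATEGORIES, weights=[p_u[c] for c in CATEGORIES])[0]
    return f[u]

def prob(x):
    """Probability evaluation: q(x) = p_u(f^{-1}(x)); no Jacobian term is needed."""
    return p_u[f_inv[x]]

rng = random.Random(0)
draws = [sample(rng) for _ in range(10_000)]
```

The flow only relabels the categories, so q is the base distribution with its probabilities moved to new categories.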
Individual flow vs. mixture of flows
q(x) = Σb ρb qb(x)
where Σb is the sum over flows, ρb is the mixing weight and qb is the b-th flow
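A minimal sketch of evaluating a mixture of flows, q(x) = Σb ρb qb(x). It assumes each component is a shift flow fb(u) = (u + μb) mod K over categories encoded as 0..K-1; the weights, base distributions and shifts below are hypothetical.

```python
K = 4  # number of categories, encoded as 0..K-1

# Hypothetical components: (mixing weight rho_b, base pmf p_ub, shift mu_b).
components = [
    (0.5, [0.8, 0.2, 0.0, 0.0], 0),
    (0.3, [1.0, 0.0, 0.0, 0.0], 2),
    (0.2, [0.5, 0.0, 0.5, 0.0], 1),
]

def flow_prob(x, base, shift):
    """q_b(x) = p_ub(f_b^{-1}(x)) for the shift flow f_b(u) = (u + shift) mod K."""
    return base[(x - shift) % K]

def mixture_prob(x):
    """q(x) = sum over flows of rho_b * q_b(x)."""
    return sum(rho * flow_prob(x, base, shift) for rho, base, shift in components)

q = [mixture_prob(x) for x in range(K)]
```

With these values q is approximately [0.4, 0.2, 0.3, 0.1]. An individual shift flow can only rotate its base distribution, whereas the mixture combines several rotated copies.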
Accuracy of B-flows
Base distributions with probability mass concentrated at exactly one category: p(ub=c)=1
Equal mixing weights: ρb = 1/B
➔ each flow allocates probability 1/B
➔ | true probability for category k - approximation | ≤ 1/B
➔ works well for concentrated distributions
➔ fails for uniform distribution
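The 1/B bound can be checked numerically. The sketch below does not train anything: it assumes the flows are assigned to categories by largest-remainder rounding, a stand-in chosen for this illustration rather than the optimization used in the talk, and the target distributions are hypothetical.

```python
from math import floor

def allocate_flows(p, B):
    """Assign B point-mass flows to categories by largest-remainder rounding."""
    counts = [floor(B * pk) for pk in p]
    remainders = [B * pk - n for pk, n in zip(p, counts)]
    leftover = B - sum(counts)
    order = sorted(range(len(p)), key=lambda k: remainders[k], reverse=True)
    for k in order[:leftover]:
        counts[k] += 1
    return counts

def approximation(p, B):
    """Each flow contributes mass 1/B to the single category it points at."""
    return [n / B for n in allocate_flows(p, B)]

B = 10
concentrated = [0.62, 0.35, 0.02, 0.01]
q_concentrated = approximation(concentrated, B)
max_err = max(abs(pk - qk) for pk, qk in zip(concentrated, q_concentrated))

uniform = [1 / 20] * 20  # more categories than flows
q_uniform = approximation(uniform, B)
empty_categories = sum(1 for qk in q_uniform if qk == 0.0)
```

For the concentrated target the approximation is [0.6, 0.4, 0.0, 0.0], with a maximum error of 0.05, below 1/B = 0.1. For the uniform target over 20 categories, 10 categories receive no mass at all. The per-category error still respects the bound, but half of the support is lost.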
Multivariate categorical distributions
p(x) = p(x1) p(x2 | x1) p(x3 | x2, x1) … p(xd | xd-1, xd-2, …, x1) ...
fd(u) = (μd + σd u) mod Kd
where
μd := μd(xd-1, xd-2, …, x1, * )
σd := σd(xd-1, xd-2, …, x1, * )
are neural networks trained with the straight-through estimator of argmax
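A minimal sketch of the modular location-scale map f(u) = (μ + σ u) mod K and its inverse, applied autoregressively to two variables. The lookup tables MU_2 and SIGMA_2 are hypothetical stand-ins for the neural networks μ2(x1) and σ2(x1), and the straight-through training is not shown. The map is a bijection only when σ and K are coprime, and the modular inverse via pow(sigma, -1, K) requires Python 3.8 or newer.

```python
from math import gcd

def affine_mod(u, mu, sigma, K):
    """Forward map f(u) = (mu + sigma * u) mod K."""
    return (mu + sigma * u) % K

def affine_mod_inverse(x, mu, sigma, K):
    """Inverse map; it exists only when sigma and K are coprime."""
    if gcd(sigma, K) != 1:
        raise ValueError("sigma must be coprime with K for the map to be a bijection")
    return (pow(sigma, -1, K) * (x - mu)) % K

K = 5
# Hypothetical stand-ins for the networks mu_2(x1) and sigma_2(x1).
MU_2 = [0, 3, 1, 4, 2]
SIGMA_2 = [1, 2, 3, 4, 1]

def flow_2d(u1, u2):
    """x1 is transformed first; the parameters of the x2 flow depend on x1."""
    x1 = affine_mod(u1, mu=2, sigma=1, K=K)
    x2 = affine_mod(u2, MU_2[x1], SIGMA_2[x1], K)
    return x1, x2

def flow_2d_inverse(x1, x2):
    """Invert in the same order: x1 is known, so the x2 parameters can be recomputed."""
    u1 = affine_mod_inverse(x1, mu=2, sigma=1, K=K)
    u2 = affine_mod_inverse(x2, MU_2[x1], SIGMA_2[x1], K)
    return u1, u2

images = {flow_2d(u1, u2) for u1 in range(K) for u2 in range(K)}
```

Because the conditioning runs only on earlier variables, the inverse can be computed one dimension at a time, which is what makes probability evaluation tractable.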
Practical probability evaluation in entropy term
assuming independence:
full covariance:
Toy example
true posterior:
p( … | grass wet = yes )    sprinkler = no    sprinkler = yes
rain = no                   0.000             0.642
rain = yes                  0.353             0.004
approximate posterior (10 flows, found in 15 iterations):
p( … | grass wet = yes )    sprinkler = no    sprinkler = yes
rain = no                   0.000             0.600
rain = yes                  0.400             0.000
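The true posterior can be obtained by exact enumeration. The sketch below assumes the conditional probability tables of the sprinkler network shown in the Wikipedia figure linked under "Toy model". These tables are not listed on the slides, so treat the numbers as an assumption of this example.

```python
# Assumed tables from the Wikipedia sprinkler network (not given on the slides).
P_RAIN = {True: 0.2, False: 0.8}
# P(sprinkler | rain), keyed by rain and then by sprinkler.
P_SPRINKLER = {False: {True: 0.4, False: 0.6}, True: {True: 0.01, False: 0.99}}
# P(grass wet = yes | sprinkler, rain), keyed by (sprinkler, rain).
P_WET = {(False, False): 0.0, (False, True): 0.8, (True, False): 0.9, (True, True): 0.99}

def posterior_given_wet():
    """Return p(rain, sprinkler | grass wet = yes) by enumerating all four cases."""
    joint = {}
    for rain in (False, True):
        for sprinkler in (False, True):
            joint[(rain, sprinkler)] = (
                P_RAIN[rain] * P_SPRINKLER[rain][sprinkler] * P_WET[(sprinkler, rain)]
            )
    evidence = sum(joint.values())  # p(grass wet = yes)
    return {key: value / evidence for key, value in joint.items()}

post = posterior_given_wet()
for (rain, sprinkler), p in post.items():
    print(f"rain={rain!s:5} sprinkler={sprinkler!s:5} p={p:.3f}")
```

Under these assumed tables the posterior rounds to 0.000, 0.642, 0.353 and 0.004, which agrees with the true posterior values listed for the toy example.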
Experiments
● Gaussian mixture model
● (large) Bayesian networks
● (higher order) hidden Markov model
● variational autoencoder
