Markov Chain Monte Carlo Methods
Applications in Machine Learning
Andres Mendez-Vazquez
June 1, 2017
Outline
1 Introduction
The Main Reason
Examples of Application
Basically
2 The Monte Carlo Method
FERMIAC and ENIAC Computers
Immediate Applications
3 Markov Chains
Introduction
Enters the Perron-Frobenius Theorem
Enter Google’s Page Rank
4 Markov Chain Monte Carlo Methods
Combining the Power of the two Methods
5 Metropolis Hastings
Introduction
A General Idea
Applications in Machine Learning
6 The Gibbs Sampler
Introduction
The Simplest Algorithm
Applications in Machine Learning
Chance
There are many phenomena that introduce chance into their models
Therefore
Why not use SAMPLING to understand those phenomena?
Thus
Markov Chain Monte Carlo (MCMC) Methods
Algorithms that use Markov Chains to obtain samples from a target
phenomenon!!!
Thus
They are computer-based simulations able to obtain samples from π (x).
The Reason
There are several high-dimensional problems
For Example, computing the volume of a convex body in d dimensions.
It is the only known general approach for providing a solution within a
reasonable time, O(d^k).
Therefore
MCMC plays a significant role in statistics, econometrics, physics and
computing science.
Examples of Application
Bayesian Inference and Learning
Given some unknown variables X1, X2, ..., XK and data Y , we want to
know properties of the unknowns given the data Y
Graphically
Deep Learning
Restricted Boltzmann Machines use MCMC to optimize their weights
E (v, h) = − Σ_i a_i v_i − Σ_j b_j h_j − Σ_{i,j} v_i w_{i,j} h_j
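As a rough illustration (not part of the original slides), the energy function above can be written directly with NumPy; the names a, b and W below are simply the visible biases, hidden biases and weight matrix of the formula, and the sample values are arbitrary.

```python
import numpy as np

def rbm_energy(v, h, a, b, W):
    """Energy E(v, h) = -a.v - b.h - v^T W h of a Restricted Boltzmann Machine.

    v: visible units (length m), h: hidden units (length n),
    a: visible biases, b: hidden biases, W: m x n weight matrix.
    """
    return -np.dot(a, v) - np.dot(b, h) - v @ W @ h

# Example with random binary units (purely illustrative values)
rng = np.random.default_rng(0)
m, n = 6, 4
v = rng.integers(0, 2, size=m)
h = rng.integers(0, 2, size=n)
a, b, W = rng.normal(size=m), rng.normal(size=n), rng.normal(size=(m, n))
print(rbm_energy(v, h, a, b, W))
```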
What do we want?
Given a probability distribution of interest
π (x) , x ∈ R^N
Which has the following structure
π (x) = (1/Z) h (x)
where h (x) is a PDF and Z is an unknown normalization constant.
Thus,
We want to understand such a distribution!!!
The beginning of Monte Carlo Methods
In 1945, two events changed the world forever
The successful nuclear test at Alamogordo.
The building of the first electronic computer, ENIAC.
Pushed for the creation of the Monte Carlo Methods
The original idea came from Stan Ulam... He loved relaxing by playing
poker and solitaire!!!
Stan had an uncle who borrowed money from relatives because he
“just had to go to Monte Carlo.”
Together with Von Neumann
The Guy Behind the Minimax Algorithm
They started to develop an idea to trace the path of neutrons in a
spherical reactor.
Thus
We have then
At each stage a sequence of decisions has to be made based on statistical
probabilities appropriate to the physical and geometric factors.
For this, we only need a source of uniform random numbers!!!
Because it is possible to use the inverse of the cumulative distribution of the
target function to obtain the necessary samples!!!
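A minimal sketch of this inverse-CDF idea, assuming a target whose cumulative distribution can be inverted in closed form (here an exponential distribution, chosen only for illustration); note that only uniform random numbers are consumed.

```python
import numpy as np

def sample_exponential(lam, size, rng=np.random.default_rng(0)):
    """Inverse-transform sampling: if U ~ Uniform(0,1) and F(x) = 1 - exp(-lam*x),
    then X = F^{-1}(U) = -log(1 - U)/lam follows the exponential distribution."""
    u = rng.uniform(size=size)          # source of uniform random numbers
    return -np.log(1.0 - u) / lam       # inverse of the cumulative distribution

samples = sample_exponential(lam=2.0, size=100_000)
print(samples.mean())  # should be close to 1/lam = 0.5
```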
Thus, the FERMIAC was born!!!
When the ENIAC was moved and it was necessary to keep generating
target statistics
However
Once the ENIAC went on-line again
It took two months to have the basic controls for the Monte Carlo
One fortnight for the last phases of the implementation.
Then, the tests were run
And Monte Carlo was born!!!
Look At This, Von Neumann’s Programs
Design First and Programming After
Monte Carlo Integration
We can get integral of complex functions
I = ∫∫_Ω sin (ln (x + y + 1)) dx dy
Where Ω is the disk
{(x, y) | (x − 1/2)² + (y − 1/2)² ≤ 1/4}
We only need a source of uniformly random points on that area
I ≈ Volume of Ω × Average Value of f in Ω
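A sketch of this estimate for the disk above, assuming we obtain uniform points on Ω by rejection from the bounding square [0, 1] × [0, 1]; the sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Uniform points in the bounding square [0, 1] x [0, 1]
x = rng.uniform(0.0, 1.0, N)
y = rng.uniform(0.0, 1.0, N)

# Keep only the points inside the disk (x - 1/2)^2 + (y - 1/2)^2 <= 1/4
inside = (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.25
f = np.sin(np.log(x[inside] + y[inside] + 1.0))

area = np.pi * 0.5 ** 2          # area of the disk of radius 1/2
I_hat = area * f.mean()          # I ≈ Area(Ω) × average value of f on Ω
print(I_hat)
```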
Getting the First Moment!!!
The goal is to compute the following expectation
E [f] = ∫ f (z) p (z) dz
Solution
Obtain a set of samples z(i) where i = 1, ..., N drawn independently from
p(z)
Approximate the expectation as
E [f] ≈ Ê [f] = (1/N) Σ_{i=1}^{N} f (z^(i))
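The same estimator in a few lines, assuming for concreteness that p(z) is a standard normal and f(z) = z², whose true expectation is 1.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal(1_000_000)   # z^(i) drawn independently from p(z) = N(0, 1)
E_hat = np.mean(z ** 2)              # E[f] ≈ (1/N) Σ f(z^(i)), here f(z) = z²
print(E_hat)                         # close to the true value E[z²] = 1
```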
Clustering Using Stochastic Process
We use the following process
Imagine a Chinese restaurant with an infinite number of circular tables, each with
infinite capacity!!!
Customer ONE sits at the first table
The next customer either sits at the same table as customer ONE
Or at a new table
Something like this
Thus, it is possible to build an entire Random Process
Simply Asking
p (customer i assigned to customer j|D, α) ∝ f (dij) if j ≠ i, and α if i = j
Where
D is the matrix of distances between customers,
with dij = d (ci, cj)
Let us see the code (a sketch follows below)
There we have a series of nice ideas.
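The code itself is not reproduced on the slide; the following is a minimal sketch of the distance-dependent assignment rule above, assuming f(d) = exp(−d) as the decay function and Euclidean distances between customers (both are illustrative choices, not the author's).

```python
import numpy as np

def ddcrp_assignments(points, alpha=1.0, rng=np.random.default_rng(0)):
    """Distance-dependent CRP: customer i links to customer j with probability
    proportional to f(d_ij) = exp(-d_ij) for j != i, and to itself with weight alpha.
    Clusters are the connected components of the resulting links."""
    n = len(points)
    links = np.empty(n, dtype=int)
    for i in range(n):
        d = np.linalg.norm(points - points[i], axis=1)
        w = np.exp(-d)          # f(d_ij) for every j
        w[i] = alpha            # weight of sitting "alone" (self link)
        links[i] = rng.choice(n, p=w / w.sum())
    return links

points = np.random.default_rng(1).normal(size=(10, 2))
print(ddcrp_assignments(points))
```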
Markov Chains
The random process Xt ∈ S for t = 1, 2, ..., T has the Markov property
If and only if
p (XT |XT−1, XT−2, ..., X1) = p (XT |XT−1)
Finite-State Discrete Time Markov Chains
It can be completely specified by the transition matrix.
P = [pij] with pij = P [Xt = j|Xt−1 = i]
Example
We have the following transition matrix
⎡ 0.2 0.1 0.1 0.1 0.5 ⎤
⎢ 0.4 0.1 0.1 0.2 0.1 ⎥
⎢ 0.1 0.3 0.2 0.1 0.1 ⎥
⎢ 0.2 0.4 0.2 0.3 0.2 ⎥
⎣ 0.1 0.1 0.4 0.3 0.1 ⎦
Graphically: the corresponding transition diagram over the five states 1, 2, 3, 4, 5 (figure omitted)
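To make the idea concrete, a small sketch that simulates a chain and estimates its long-run behaviour, assuming the row convention pij = P[Xt = j|Xt−1 = i] defined earlier and a 3-state matrix chosen only for illustration.

```python
import numpy as np

# Row-stochastic transition matrix: P[i, j] = P(X_t = j | X_{t-1} = i)
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])

rng = np.random.default_rng(0)
T = 200_000
x = 0
counts = np.zeros(3)
for _ in range(T):
    x = rng.choice(3, p=P[x])   # move according to the current row
    counts[x] += 1

print(counts / T)                          # empirical visit frequencies ≈ stationary distribution
print(np.linalg.matrix_power(P, 50)[0])    # rows of P^n converge to the same distribution
```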
What kind of Markov Chain do we like to study?
Ergodic
A Markov chain is called an ergodic chain if it is possible to go from every
state to every state (not necessarily in one move).
Aperiodic
A state i has period k if any return to state i must occur in multiples of k
time steps.
k = gcd {n > 0|P (Xn = i|X0 = i) > 0}
Therefore
Thus
If k = 1, then the state is said to be aperiodic.
Otherwise (k > 1), the state is said to be periodic with period k.
A Markov chain is aperiodic if every state is aperiodic
The Theorem
Perron–Frobenius Theorem
Let A be a positive square matrix. Then:
a. ρ(A) is an eigenvalue, and it has a positive eigenvector.
b. ρ(A) is the only eigenvalue on the circle |λ| = ρ(A).
This and other theorems allow us to calculate something
quite interesting
Using the Power method
The method is described by the recurrence relation
w^(i+1) = T w^(i) / ‖T w^(i)‖, where ‖T w^(i)‖ = √(w^(i)ᵗ Tᵗ T w^(i))
Then
The sub-sequence {w^(k_i)}, i = 1, 2, ..., converges to an eigenvector associated with
the dominant eigenvalue.
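A sketch of the power iteration, assuming T is any square matrix with a dominant eigenvalue and using the Euclidean normalization ‖Tw‖ from the slide.

```python
import numpy as np

def power_method(T, num_iters=1000, rng=np.random.default_rng(0)):
    """Power iteration: w <- T w / ||T w||, converging (for a dominant eigenvalue)
    to an eigenvector associated with the eigenvalue of largest magnitude."""
    w = rng.normal(size=T.shape[0])
    for _ in range(num_iters):
        Tw = T @ w
        w = Tw / np.sqrt(Tw @ Tw)      # ||Tw|| = sqrt(w^t T^t T w)
    return w

T = np.array([[2.0, 1.0], [1.0, 3.0]])
w = power_method(T)
print(w, (T @ w) / w)   # the ratio approximates the dominant eigenvalue
```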
Long Ago in a Long Forgotten Land
Dozens of Companies fought for the Search Landscape
America Online (AOL)
Netscape
Yahoo
Infoseek
Lycos
Altavista
This is Old
For Example
Enters Larry Page and Sergey Brin (Circa 1996)
They Invented the Google Matrix (the name Google is a misspelling of Googol = 10^100)
G = αS + (1 − α) 1v
Where
S is a modified version of the adjacency matrix, obtained by converting the
number of links into probabilities
In addition
Also
1 is the Column Vector of ones.
v is a row vector of probabilities
v = (1/n, 1/n, ..., 1/n)
(At the initial experiments)
The Matrix n × n, (1 − α) 1v, has every entry equal to (1 − α)/n:
⎡ (1 − α)/n  (1 − α)/n  · · ·  (1 − α)/n ⎤
⎢ (1 − α)/n  (1 − α)/n  · · ·  (1 − α)/n ⎥
⎢     ...        ...     ...       ...   ⎥
⎣ (1 − α)/n  (1 − α)/n  · · ·  (1 − α)/n ⎦
The Damping Factor
Finally α
In the Google matrix, it means that a Random Web surfer moves to a
different web-page by some means other than selecting a link with
probability 1 − α
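A compact sketch of the construction above, assuming a tiny hand-made link structure, α = 0.85 and a column-stochastic S, so the PageRank vector is obtained by power iteration on G.

```python
import numpy as np

# Adjacency of a tiny web: A[i, j] = 1 if page j links to page i (illustrative only)
A = np.array([[0, 0, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 1, 0, 0]], dtype=float)

n = A.shape[0]
S = A / A.sum(axis=0)                 # convert link counts into probabilities (column-stochastic)
alpha = 0.85
G = alpha * S + (1 - alpha) * np.ones((n, n)) / n   # Google matrix G = αS + (1 − α) 1v, v uniform

r = np.ones(n) / n                    # start from the uniform distribution
for _ in range(100):                  # power method on G
    r = G @ r
print(r / r.sum())                    # PageRank scores
```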
The Algorithm was the Edge!!!
First Google Server
After the introduction of the algorithm,
THE END for everybody else!!!
Now Imagine the following
You have a target distribution π that you want to sample
You can use a generative distribution q that you know how to sample, to try to generate
the necessary samples.
Then, you have a process like this
1 Sample x ∼ q (x).
2 Use the functional form of the target distribution π to
Accept or Reject the sample x as being generated by π (x)
Basically
Markov Condition
The generation of x depends only on the current state, not on earlier states!!!
Further!!!
Monte Carlo Method
The algebraic and geometric properties of the target distribution help
to accept or reject the sample!!!
History - The not so great remarks...
Metropolis
Generalized
−→ Metropolis-Hastings
Special Case
−→ Gibbs Sampling
All developments were done in Computational Physics.
The Landmark 1953 Paper N. Metropolis, A. Rosenbluth, M.
Rosenbluth, A. Teller, and E. Teller:
“Equation of State Calculations by Fast Computing Machines,” Journal of
Chemical Physics.
There is a quote by A. Rosenbluth
“Metropolis played no role in its development other than providing
computer time!”
After all, Metropolis was the supervisor in Los Alamos National Lab.
Steps of the Metropolis-Hastings
An M-H step involves the following
An M-H step uses
1 The Target/Invariant Distribution l (x)
2 The Proposal/Sampling Distribution q (x′|x)
Then
It involves sampling a candidate value x′ given the current value x
according to q (x′|x)
The Markov chain then moves towards x′
With acceptance probability
A (x, x′) = min{1, [l (x′) q (x|x′)] / [l (x) q (x′|x)]}
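A generic sketch of the M-H step, assuming the target l only needs to be known up to a constant and using a Gaussian random walk as q(x′|x) (symmetric, so the q-ratio cancels); the target below is illustrative.

```python
import numpy as np

def metropolis_hastings(log_l, x0, n_samples, step=0.5, rng=np.random.default_rng(0)):
    """Random-walk Metropolis-Hastings: propose x' ~ N(x, step^2) and accept with
    probability A(x, x') = min(1, l(x')/l(x)) (the symmetric q terms cancel)."""
    x = x0
    samples = []
    for _ in range(n_samples):
        x_prop = x + step * rng.normal(size=np.shape(x))
        if np.log(rng.uniform()) < log_l(x_prop) - log_l(x):
            x = x_prop                     # accept the candidate
        samples.append(x)                  # otherwise keep the current state
    return np.array(samples)

# Illustrative target: an unnormalized mixture of two Gaussians
log_l = lambda x: np.logaddexp(-0.5 * (x - 2) ** 2, -0.5 * (x + 2) ** 2)
chain = metropolis_hastings(log_l, x0=0.0, n_samples=20_000)
print(chain.mean(), chain.std())
```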
Logistic Regression
We know
l (w|x, y) = ∏_{i=1}^{n} [exp{wᵀxi} / (1 + exp{wᵀxi})]^{yi} [1 / (1 + exp{wᵀxi})]^{1−yi}   (1)
In our case, we use as q a Multi-variate Gaussian
q (w) ∼ N (µ, Σ)
Logistic Regression
The Markov chain then moves towards x′
With acceptance probability (for a symmetric proposal the q terms cancel)
A (x, x′) = min{1, l (x′) / l (x)}
Thus, we trigger the process of Sampling
Then, we look at the modes in the distribution
Let’s Take a Look at the Program
It is not so complex
A little bit of work!!!
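The program on the slide is not reproduced here; the following is a plausible sketch under the setup above: a random-walk Metropolis sampler over w whose target is the likelihood of Equation (1); the synthetic data, flat prior and step size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for illustration: y ~ Bernoulli(sigmoid(w_true . x))
n, d = 200, 2
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def log_l(w):
    """Log of Equation (1): sum_i [y_i log σ(w.x_i) + (1 - y_i) log(1 - σ(w.x_i))]."""
    z = X @ w
    return np.sum(y * z - np.logaddexp(0.0, z))

w = np.zeros(d)
samples = []
for _ in range(20_000):
    w_prop = w + 0.2 * rng.normal(size=d)          # multivariate Gaussian proposal
    if np.log(rng.uniform()) < log_l(w_prop) - log_l(w):
        w = w_prop                                  # A(w, w') = min(1, l(w')/l(w))
    samples.append(w)
samples = np.array(samples)
print(samples[5000:].mean(axis=0))                  # posterior mean after burn-in, near w_true
```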
The Assumptions
Suppose
We have an n-dimensional vector x.
The expressions for the full conditionals
p (xj|x1, ..., xj−1, xj+1, ..., xn)
Here, we have the following proposal distribution
q (x′|x^(i)) = p (x′_j | x^(i)_{−j}) if x′_{−j} = x^(i)_{−j}, and 0 otherwise
The Gibbs Sampler
The Algorithm
1 Init x^(0)_{1:n}
2 For i = 0 to N − 1
Sample x^(i+1)_1 ∼ p (x_1 | x^(i)_2, ..., x^(i)_n)
Sample x^(i+1)_2 ∼ p (x_2 | x^(i+1)_1, x^(i)_3, ..., x^(i)_n)
· · ·
Sample x^(i+1)_j ∼ p (x_j | x^(i+1)_1, ..., x^(i+1)_{j−1}, x^(i)_{j+1}, ..., x^(i)_n)
· · ·
Sample x^(i+1)_n ∼ p (x_n | x^(i+1)_1, ..., x^(i+1)_{n−1})
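A sketch of the algorithm for the simplest non-trivial case, a bivariate Gaussian with correlation ρ, where both full conditionals are available in closed form; ρ = 0.8 is only for illustration.

```python
import numpy as np

def gibbs_bivariate_normal(n_samples, rho=0.8, rng=np.random.default_rng(0)):
    """Gibbs sampling of (x1, x2) ~ N(0, [[1, rho], [rho, 1]]).
    Full conditionals: x1 | x2 ~ N(rho*x2, 1-rho^2) and x2 | x1 ~ N(rho*x1, 1-rho^2)."""
    x1, x2 = 0.0, 0.0
    out = np.empty((n_samples, 2))
    s = np.sqrt(1.0 - rho ** 2)
    for i in range(n_samples):
        x1 = rng.normal(rho * x2, s)   # sample x1 from p(x1 | x2)
        x2 = rng.normal(rho * x1, s)   # sample x2 from p(x2 | x1), using the new x1
        out[i] = x1, x2
    return out

samples = gibbs_bivariate_normal(50_000)
print(np.corrcoef(samples.T)[0, 1])    # empirical correlation ≈ rho
```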
Latent Dirichlet Allocation
It is an algorithm for finding topics composed of sets of words
You need documents to run it!!!
Data consists
Of documents di consisting of a set of words wi
In a universe of W = {w1, ..., wn} words.
Then
We want to find the mixture of words to topics
Thus, we can easily do this by counting:
Counts of topic k in document d
The distribution of topics in a document
Counts of word v assigned to topic k
The distribution of words in a topic
Thus, we can compute the probability of topics Zi (Gibbs Term)
p (Zi|Z−i, W)
For this
We need to introduce some extra terms
Ωd,k - count of topic k in document d.
Ψk,v - count of word v assigned to topic k.
Thus the Gibbs Term
p (Zi = k|Z−i, W) = [(Ψ^{−i}_{k,v} + β) / (Σ_v Ψ^{−i}_{k,v} + Nv · β)] × [(Ω^{−i}_{d,k} + α) / (Σ_k Ω^{−i}_{d,k} + K · α)]
With
Nv = number of different words.
β = Dirichlet smoothing parameter for words
K = Number of topics
α = Dirichlet smoothing parameter for topics
So, we have the following code
We have
The Following!!!
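The listing on the slide is not reproduced here; the following is a minimal sketch of collapsed Gibbs sampling for LDA built directly from the Gibbs term above, assuming the corpus is given as lists of word ids and that α and β are scalars.

```python
import numpy as np

def lda_gibbs(docs, K, V, alpha=0.1, beta=0.01, n_iters=200, rng=np.random.default_rng(0)):
    """Collapsed Gibbs sampling for LDA.
    docs: list of documents, each a list of word ids in {0, ..., V-1}."""
    # Counts: Omega[d, k] = topics per document, Psi[k, v] = words per topic
    Omega = np.zeros((len(docs), K))
    Psi = np.zeros((K, V))
    Psi_k = np.zeros(K)                       # row sums of Psi
    z = [rng.integers(0, K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):            # initialize counts from the random assignment
        for i, v in enumerate(doc):
            Omega[d, z[d][i]] += 1
            Psi[z[d][i], v] += 1
            Psi_k[z[d][i]] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, v in enumerate(doc):
                k = z[d][i]                   # remove word i from the counts ("-i")
                Omega[d, k] -= 1; Psi[k, v] -= 1; Psi_k[k] -= 1
                # Gibbs term: p(Z_i = k | Z_{-i}, W), document-side denominator is constant in k
                p = (Psi[:, v] + beta) / (Psi_k + V * beta) * (Omega[d] + alpha)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k                   # add the word back with its new topic
                Omega[d, k] += 1; Psi[k, v] += 1; Psi_k[k] += 1
    return z, Omega, Psi

# Tiny illustrative corpus of word ids
docs = [[0, 1, 2, 0, 1], [3, 4, 5, 3, 4], [0, 1, 5, 3, 2]]
z, Omega, Psi = lda_gibbs(docs, K=2, V=6)
print(Omega)
```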