The document discusses the multivariate Gaussian probability distribution. It defines the distribution and provides its probability density function. It then discusses various properties including: functions of Gaussian variables such as linear transformations and addition; the characteristic function and how to calculate moments; marginalization and conditional distributions. It also provides some tips and tricks for working with Gaussian distributions including how to calculate products.
The Multivariate Gaussian Probability Distribution

Peter Ahrendt
IMM, Technical University of Denmark
mail: pa@imm.dtu.dk, web: www.imm.dtu.dk/~pa
January 7, 2005
Chapter 1
Definition

The definition of a multivariate gaussian probability distribution can be stated in several equivalent ways. A random vector $X = [X_1, X_2, \ldots, X_N]$ is said to belong to a multivariate gaussian distribution if one of the following statements is true.

• Any linear combination $Y = a_1 X_1 + a_2 X_2 + \ldots + a_N X_N$, $a_i \in \mathbb{R}$, is a (univariate) gaussian distribution.

• There exists a random vector $Z = [Z_1, \ldots, Z_M]$ with independent, standard normally distributed components, a vector $\mu = [\mu_1, \ldots, \mu_N]$ and an $N \times M$ matrix $A$ such that $X = AZ + \mu$.

• There exists a vector $\mu$ and a symmetric, positive semi-definite matrix $\Gamma$ such that the characteristic function of $X$ can be written as $\phi_X(t) \equiv \langle e^{i t^T X} \rangle = e^{i \mu^T t - \frac{1}{2} t^T \Gamma t}$.
Under the assumption that the covariance matrix $\Sigma$ is non-singular, the probability density function (pdf) can be written as:

$$\mathcal{N}_x(\mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^d |\Sigma|}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) = |2\pi\Sigma|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) \quad (1.1)$$

Here $\mu$ is the mean value, $\Sigma$ is the covariance matrix and $|\cdot|$ denotes the determinant. Note that it is possible to have multivariate gaussian distributions with a singular covariance matrix, in which case the above expression cannot be used for the pdf. In the following, however, non-singular covariance matrices will be assumed.

In the limit of one dimension, the familiar expression for the univariate gaussian pdf is recovered:
$$\mathcal{N}_x(\mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{1}{2} (x - \mu)\, \sigma^{-2} (x - \mu) \right) \quad (1.2)$$

Neither of these has a closed-form expression for the cumulative distribution function.
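As a minimal numerical sketch of equation (1.1), the pdf can be evaluated directly and cross-checked against scipy.stats.multivariate_normal. The parameters $\mu$ and $\Sigma$ below are arbitrary illustrations, not taken from the text.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Arbitrarily chosen example parameters (d = 2).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.3],
                  [0.3, 0.5]])

def gaussian_pdf(x, mu, Sigma):
    """Evaluate equation (1.1) directly."""
    d = len(mu)
    diff = x - mu
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return norm * np.exp(-0.5 * diff @ np.linalg.solve(Sigma, diff))

x = np.array([0.5, -1.5])
print(gaussian_pdf(x, mu, Sigma))             # direct evaluation of (1.1)
print(multivariate_normal(mu, Sigma).pdf(x))  # scipy reference value
```

Using np.linalg.solve instead of explicitly inverting $\Sigma$ is the numerically preferable way to apply $\Sigma^{-1}$ to a vector.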
Symmetries

It is noted that in the one-dimensional case the pdf $\mathcal{N}_x(\mu, \sigma^2)$ is symmetric about its centre $\mu$. This can be seen by looking at "contour lines", i.e. setting the exponent $-\frac{(x - \mu)^2}{2\sigma^2} = c$. It is seen that $\sigma$ determines the width of the distribution.

In the multivariate case, it is similarly useful to look at $-\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) = c$. This is a quadratic form, and geometrically the contour curves (for fixed $c$) are hyperellipsoids. In 2D, these are ordinary ellipses of the form $\left( \frac{x - x_0}{a} \right)^2 + \left( \frac{y - y_0}{b} \right)^2 = r^2$, which gives symmetries along the principal axes. Similarly, the hyperellipsoids show symmetries along their principal axes.
Notation: If a random variable $X$ has a gaussian distribution, it is written as $X \sim N(\mu, \Sigma)$. The probability density function of this variable is then given by $\mathcal{N}_x(\mu, \Sigma)$.
Chapter 2
Functions of Gaussian Variables
Linear transformation and addition of variables

Let $A, B \in \mathcal{M}_{c \cdot d}$ and $c \in \mathbb{R}^c$. Let $X \sim N(\mu_x, \Sigma_x)$ and $Y \sim N(\mu_y, \Sigma_y)$ be independent variables. Then

$$Z = AX + BY + c \sim N(A\mu_x + B\mu_y + c,\; A\Sigma_x A^T + B\Sigma_y B^T) \quad (2.1)$$
Transform to standard normal variables

Let $X \sim N(\mu, \Sigma)$. Then

$$Z = \Sigma^{-\frac{1}{2}} (X - \mu) \sim N(0, I) \quad (2.2)$$

Note that $\Sigma^{-\frac{1}{2}}$ here denotes a particular, unique matrix, although in general matrices with fractional exponents are not unique. The matrix that is meant can be found from the diagonalisation $\Sigma = U \Lambda U^T = (U \Lambda^{\frac{1}{2}})(U \Lambda^{\frac{1}{2}})^T$, where $\Lambda$ is the diagonal matrix of the eigenvalues of $\Sigma$ and $U$ is the matrix of the eigenvectors. Then $\Sigma^{-\frac{1}{2}} = (U \Lambda^{\frac{1}{2}})^{-1} = \Lambda^{-\frac{1}{2}} U^{-1}$.

In the one-dimensional case, this corresponds to the transformation of $X \sim N(\mu, \sigma^2)$ into $Y = \sigma^{-1} (X - \mu) \sim N(0, 1)$.
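The whitening transform of (2.2) is straightforward to realise with an eigendecomposition. The following sketch (assuming numpy, with arbitrary example parameters) builds $\Sigma^{-1/2} = \Lambda^{-1/2} U^{-1}$ exactly as above and checks that the transformed samples have approximately zero mean and identity covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary example mean and covariance.
mu = np.array([3.0, -1.0])
Sigma = np.array([[4.0, 1.2],
                  [1.2, 1.0]])

# Sigma^{-1/2} from the diagonalisation Sigma = U Lambda U^T,
# using U^{-1} = U^T for the orthonormal eigenvector matrix.
lam, U = np.linalg.eigh(Sigma)
Sigma_inv_sqrt = np.diag(lam ** -0.5) @ U.T

# Whitened samples should be approximately N(0, I).
X = rng.multivariate_normal(mu, Sigma, size=100_000)
Z = (X - mu) @ Sigma_inv_sqrt.T
print(Z.mean(axis=0))  # ~ [0, 0]
print(np.cov(Z.T))     # ~ identity
```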
Addition

Let $X_i \sim N(\mu_i, \Sigma_i)$, $i \in 1, \ldots, N$, be independent variables. Then

$$\sum_{i}^{N} X_i \sim N\left( \sum_{i}^{N} \mu_i,\; \sum_{i}^{N} \Sigma_i \right) \quad (2.3)$$

Note: This is a direct implication of equation (2.1).
Quadratic

Let $X_i \sim N(0, 1)$, $i \in 1, \ldots, N$, be independent variables. Then

$$\sum_{i}^{N} X_i^2 \sim \chi_n^2 \quad (2.4)$$

Alternatively, let $X \sim N(\mu_x, \Sigma_x)$. Then

$$Z = (X - \mu)^T \Sigma^{-1} (X - \mu) \sim \chi_n^2 \quad (2.5)$$

This is, however, the same thing, since $Z = \tilde{X}^T \tilde{X} = \sum_{i}^{N} \tilde{X}_i^2$, where the $\tilde{X}_i$ are the decorrelated components (see eqn. (2.2)).
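The quadratic form in (2.5) can be checked empirically. The sketch below (assuming numpy and scipy, with arbitrary example parameters) compares the Mahalanobis distances of gaussian samples against a $\chi^2$ distribution with $d = 3$ degrees of freedom.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Arbitrary example in d = 3 dimensions.
mu = np.array([1.0, 2.0, 3.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Quadratic form (2.5): Z = (X - mu)^T Sigma^{-1} (X - mu) for each sample.
X = rng.multivariate_normal(mu, Sigma, size=100_000)
diff = X - mu
Z = np.einsum('ij,ij->i', diff @ np.linalg.inv(Sigma), diff)

# Z should follow a chi-squared distribution with d = 3 degrees of freedom.
print(Z.mean(), 3.0)                          # the chi2 mean equals d
print(stats.kstest(Z, stats.chi2(df=3).cdf))  # Kolmogorov-Smirnov sanity check
```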
Chapter 3
Characteristic Function and Moments

The characteristic function of the univariate gaussian distribution is given by $\phi_X(t) \equiv \langle e^{itX} \rangle = e^{it\mu - \sigma^2 t^2 / 2}$. The generalization to multivariate gaussian distributions is

$$\phi_X(t) \equiv \langle e^{i t^T X} \rangle = e^{i \mu^T t - \frac{1}{2} t^T \Sigma t} \quad (3.1)$$

The pdf $p(x)$ is related to the characteristic function by

$$p(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \phi_X(t)\, e^{-i t^T x}\, dt \quad (3.2)$$

It is seen that the characteristic function is the inverse Fourier transform of the pdf.
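Equation (3.1) can be verified numerically at a single point $t$. The sketch below (assuming numpy; $\mu$, $\Sigma$ and $t$ are arbitrary choices) compares a Monte Carlo average of $e^{i t^T X}$ with the closed form.

```python
import numpy as np

rng = np.random.default_rng(3)

# Arbitrary example parameters.
mu = np.array([0.5, -1.0])
Sigma = np.array([[1.0, 0.4],
                  [0.4, 2.0]])
t = np.array([0.3, -0.7])  # an arbitrary evaluation point

# Monte Carlo estimate of <exp(i t^T X)> against the closed form (3.1).
X = rng.multivariate_normal(mu, Sigma, size=200_000)
phi_mc = np.mean(np.exp(1j * X @ t))
phi_exact = np.exp(1j * mu @ t - 0.5 * t @ Sigma @ t)
print(phi_mc, phi_exact)
```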
Moments of a pdf are generally defined as:

$$\langle X_1^{k_1} X_2^{k_2} \cdots X_N^{k_N} \rangle \equiv \int_{\mathbb{R}^d} x_1^{k_1} x_2^{k_2} \cdots x_N^{k_N}\, p(x)\, dx \quad (3.3)$$

where $\langle X_1^{k_1} X_2^{k_2} \cdots X_N^{k_N} \rangle$ is the $k$'th order moment, with $\mathbf{k} = [k_1, k_2, \ldots, k_N]$ ($k_i \in \mathbb{N}$) and $k = k_1 + k_2 + \ldots + k_N$. A well-known example is the first order moment, called the mean value $\mu_i$ (of variable $X_i$), or the mean $\mu \equiv [\mu_1, \mu_2, \ldots, \mu_N]$ of the whole random vector $X$.

The $k$'th order central moment is defined as above, but with $X_i$ replaced by $X_i - \mu_i$ in equation (3.3). An example is the second order central moment, called the variance, which is given by $\langle (X_i - \mu_i)^2 \rangle$.

Any moment (that exists) can be found from the characteristic function [8]:
$$\langle X_1^{k_1} X_2^{k_2} \cdots X_N^{k_N} \rangle = (-i)^k \left. \frac{\partial^k \phi_X(t)}{\partial t_1^{k_1} \cdots \partial t_N^{k_N}} \right|_{t=0} \quad (3.4)$$

where $k = k_1 + k_2 + \ldots + k_N$.
1st Order Moments

$$\text{Mean:} \quad \mu \equiv \langle X \rangle \quad (3.5)$$

2nd Order Moments

$$\text{Variance:} \quad c_{ii} \equiv \langle (X_i - \mu_i)^2 \rangle = \langle X_i^2 \rangle - \mu_i^2 \quad (3.6)$$

$$\text{Covariance:} \quad c_{ij} \equiv \langle (X_i - \mu_i)(X_j - \mu_j) \rangle \quad (3.7)$$

$$\text{Covariance matrix:} \quad \Sigma \equiv \langle (X - \mu)(X - \mu)^T \rangle \equiv [c_{ij}] \quad (3.8)$$
3rd Order Moments

Often the skewness is used:

$$\text{Skew}(X) \equiv \frac{\langle (X_i - \langle X_i \rangle)^3 \rangle}{\langle (X_i - \langle X_i \rangle)^2 \rangle^{\frac{3}{2}}} = \frac{\langle (X_i - \mu_i)^3 \rangle}{\langle (X_i - \mu_i)^2 \rangle^{\frac{3}{2}}} \quad (3.9)$$

All 3rd order central moments are zero for gaussian distributions, and thus so is the skewness.
4th Order Moments

The kurtosis is (in newer literature) given as

$$\text{Kurt}(X) \equiv \frac{\langle (X_i - \mu_i)^4 \rangle}{\langle (X_i - \mu_i)^2 \rangle^2} - 3 \quad (3.10)$$

Let $X \sim N(\mu, \Sigma)$. Then

$$\langle (X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k)(X_l - \mu_l) \rangle = c_{ij} c_{kl} + c_{il} c_{jk} + c_{ik} c_{lj} \quad (3.11)$$

and

$$\text{Kurt}(X) = 0 \quad (3.12)$$
N'th Order Moments

Any central moment of a gaussian distribution can (fairly easily) be calculated with the following method [3] (sometimes known as Wick's theorem).

Let $X \sim N(\mu, \Sigma)$. Then:

• Assume $k$ is odd. Then the central $k$'th order moments are all zero.

• Assume $k$ is even. Then the central $k$'th order moments are equal to $\sum (c_{ij} c_{kl} \cdots c_{xz})$, where the sum is taken over all distinct pairings of the $k$ indices, noting that $c_{ij} = c_{ji}$. This gives $(k-1)! / \left( 2^{k/2-1} (k/2 - 1)! \right)$ terms, each of which is a product of $k/2$ covariances.
An example is illustrative. The different 4th order central moments of $X$ are found with the above method to be

$$\langle (X_i - \mu_i)^4 \rangle = 3 c_{ii}^2$$
$$\langle (X_i - \mu_i)^3 (X_j - \mu_j) \rangle = 3 c_{ii} c_{ij}$$
$$\langle (X_i - \mu_i)^2 (X_j - \mu_j)^2 \rangle = c_{ii} c_{jj} + 2 c_{ij}^2$$
$$\langle (X_i - \mu_i)^2 (X_j - \mu_j)(X_k - \mu_k) \rangle = c_{ii} c_{jk} + 2 c_{ij} c_{ik}$$
$$\langle (X_i - \mu_i)(X_j - \mu_j)(X_k - \mu_k)(X_l - \mu_l) \rangle = c_{ij} c_{kl} + c_{il} c_{jk} + c_{ik} c_{lj} \quad (3.13)$$

The above results were found by noting that the distinct pairings of the $k = 4$ indices are $(12)(34)$, $(13)(24)$ and $(14)(23)$. Other permutations are equivalent, for instance $(32)(14)$ is equivalent to $(14)(23)$. When calculating e.g. $\langle (X_i - \mu_i)^2 (X_j - \mu_j)(X_k - \mu_k) \rangle$, the assignment $(1 \to i, 2 \to i, 3 \to j, 4 \to k)$ gives the terms $c_{ii} c_{jk}$, $c_{ij} c_{ik}$ and $c_{ij} c_{ik}$ in the sum; the last two coincide, which is where the factor of 2 comes from.
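The fourth of the relations in (3.13), equivalently equation (3.11), can be checked by sampling. A minimal sketch, assuming numpy, with a zero mean so that raw and central moments coincide; the covariance matrix is an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(4)

# Arbitrary zero-mean example; c_ij denotes Sigma[i, j] as in the text.
Sigma = np.array([[1.0, 0.6, 0.2],
                  [0.6, 2.0, 0.5],
                  [0.2, 0.5, 1.5]])
X = rng.multivariate_normal(np.zeros(3), Sigma, size=1_000_000)

# Check <X_i X_j X_k X_l> = c_ij c_kl + c_il c_jk + c_ik c_lj
# for the index choice (i, j, k, l) = (0, 1, 2, 0).
i, j, k, l = 0, 1, 2, 0
mc = np.mean(X[:, i] * X[:, j] * X[:, k] * X[:, l])
wick = (Sigma[i, j] * Sigma[k, l] + Sigma[i, l] * Sigma[j, k]
        + Sigma[i, k] * Sigma[l, j])
print(mc, wick)
```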
Calculations with moments

Let $b \in \mathbb{R}^c$ and $A, B \in \mathcal{M}_{c \cdot d}$. Let $X$ and $Y$ be random vectors and $f$ and $g$ vector functions. Then

$$\langle A f(X) + B g(X) + b \rangle = A \langle f(X) \rangle + B \langle g(X) \rangle + b \quad (3.14)$$

$$\langle A X + b \rangle = A \langle X \rangle + b \quad (3.15)$$

$$\langle \langle Y | X \rangle \rangle \equiv E(E(Y|X)) = \langle Y \rangle \quad (3.16)$$

If $X_i$ and $X_j$ are independent, then

$$\langle X_i X_j \rangle = \langle X_i \rangle \langle X_j \rangle \quad (3.17)$$
Chapter 4
Marginalization and Conditional Distribution

4.1 Marginalization

Marginalization is the operation of integrating out variables of the pdf of a random vector $X$. Assume that $X$ is split into two parts (since the ordering of the $X_i$ is arbitrary, this corresponds to any division of the variables), $X = [X_{1:c}^T\, X_{c+1:N}^T]^T = [X_1 X_2 \ldots X_c X_{c+1} \ldots X_N]^T$. Let the pdf of $X$ be $p(x) = p(x_1, \ldots, x_N)$. Then:

$$p(x_1, \ldots, x_c) = \int \cdots \int_{\mathbb{R}^{N-c}} p(x_1, \ldots, x_N)\, dx_{c+1} \cdots dx_N \quad (4.1)$$
The nice part about gaussian distributions is that every marginal distribution of a gaussian distribution is itself gaussian. More specifically, let $X$ be split into two parts as above and let $X \sim N(\mu, \Sigma)$. Then:

$$p(x_1, \ldots, x_c) = p(x_{1:c}) = \mathcal{N}_{1:c}(\mu_{1:c}, \Sigma_{1:c}) \quad (4.2)$$

where $\mu_{1:c} = [\mu_1, \mu_2, \ldots, \mu_c]$ and

$$\Sigma_{1:c} = \begin{pmatrix} c_{11} & c_{21} & \cdots & c_{c1} \\ c_{12} & c_{22} & & \vdots \\ \vdots & & \ddots & \vdots \\ c_{c1} & \cdots & \cdots & c_{cc} \end{pmatrix}$$

In words, the mean and covariance matrix of the marginal distribution are the same as the corresponding elements of the joint distribution.
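In code, (4.2) amounts to simple slicing. A minimal sketch (assuming numpy, arbitrary example parameters) that also verifies the claim by sampling the joint and discarding the marginalized coordinate:

```python
import numpy as np

rng = np.random.default_rng(6)

# Joint distribution over X = [X_1, X_2, X_3].
mu = np.array([1.0, -1.0, 2.0])
Sigma = np.array([[2.0, 0.5, 0.1],
                  [0.5, 1.0, 0.3],
                  [0.1, 0.3, 1.5]])

# Per (4.2), the marginal over (X_1, X_2) just picks out the matching entries.
keep = [0, 1]
mu_marg = mu[keep]
Sigma_marg = Sigma[np.ix_(keep, keep)]

# Check by sampling the joint and discarding X_3.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
print(X[:, keep].mean(axis=0), mu_marg)
print(np.cov(X[:, keep].T), Sigma_marg)
```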
4.2 Conditional Distribution

As before, let $X = [X_{1:c}^T\, X_{c+1:N}^T]^T = [X_1 X_2 \ldots X_c X_{c+1} \ldots X_N]^T$ be a division of the variables into two parts. Let $X \sim N(\mu, \Sigma)$, and use the notation $X = [X_{1:c}^T\, X_{c+1:N}^T]^T = [X_{(1)}^T\, X_{(2)}^T]^T$ with

$$\mu = \begin{pmatrix} \mu_{(1)} \\ \mu_{(2)} \end{pmatrix} \qquad \text{and} \qquad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$

It is found that the conditional distribution $p(x_{(1)} | x_{(2)})$ is in fact again a gaussian distribution, and

$$X_{(1)} | X_{(2)} \sim N\left( \mu_{(1)} + \Sigma_{12} \Sigma_{22}^{-1} (x_{(2)} - \mu_{(2)}),\; \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{12}^T \right) \quad (4.3)$$
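A minimal sketch of (4.3), assuming numpy; the block partition takes $X_{(1)}$ as the first two variables and $X_{(2)}$ as the last one, and all parameter values are arbitrary examples.

```python
import numpy as np

# Joint over [X_(1), X_(2)]: X_(1) is the first two variables, X_(2) the last.
mu = np.array([0.0, 1.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.8],
                  [0.5, 1.0, 0.3],
                  [0.8, 0.3, 1.5]])

S11, S12 = Sigma[:2, :2], Sigma[:2, 2:]
S22 = Sigma[2:, 2:]

x2 = np.array([0.5])  # observed value of X_(2)

# Equation (4.3): conditional mean and covariance of X_(1) given X_(2) = x2.
mu_cond = mu[:2] + S12 @ np.linalg.solve(S22, x2 - mu[2:])
Sigma_cond = S11 - S12 @ np.linalg.solve(S22, S12.T)
print(mu_cond)
print(Sigma_cond)
```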
Chapter 5
Tips and Tricks

5.1 Products

Consider the product $\mathcal{N}_x(\mu_a, \Sigma_a) \cdot \mathcal{N}_x(\mu_b, \Sigma_b)$ and note that both factors have $x$ as their "random variable". Then

$$\mathcal{N}_x(\mu_a, \Sigma_a) \cdot \mathcal{N}_x(\mu_b, \Sigma_b) = z_c\, \mathcal{N}_x(\mu_c, \Sigma_c) \quad (5.1)$$

where $\Sigma_c = (\Sigma_a^{-1} + \Sigma_b^{-1})^{-1}$, $\mu_c = \Sigma_c (\Sigma_a^{-1} \mu_a + \Sigma_b^{-1} \mu_b)$ and

$$z_c = |2\pi \Sigma_a \Sigma_b \Sigma_c^{-1}|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} (\mu_a - \mu_b)^T \Sigma_a^{-1} \Sigma_c \Sigma_b^{-1} (\mu_a - \mu_b) \right) = |2\pi (\Sigma_a + \Sigma_b)|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} (\mu_a - \mu_b)^T (\Sigma_a + \Sigma_b)^{-1} (\mu_a - \mu_b) \right) \quad (5.2)$$
In words, the product of two gaussians is another (unnormalized) gaussian. This can be generalised to a product of $K$ gaussians with distributions $X_k \sim N(\mu_k, \Sigma_k)$:

$$\prod_{k=1}^{K} \mathcal{N}_x(\mu_k, \Sigma_k) = \tilde{z} \cdot \mathcal{N}_x(\tilde{\mu}, \tilde{\Sigma}) \quad (5.3)$$

where $\tilde{\Sigma} = \left( \sum_{k=1}^{K} \Sigma_k^{-1} \right)^{-1}$ and $\tilde{\mu} = \tilde{\Sigma} \left( \sum_{k=1}^{K} \Sigma_k^{-1} \mu_k \right) = \left( \sum_{k=1}^{K} \Sigma_k^{-1} \right)^{-1} \left( \sum_{k=1}^{K} \Sigma_k^{-1} \mu_k \right)$, and

$$\tilde{z} = \frac{|2\pi \tilde{\Sigma}|^{\frac{1}{2}}}{\prod_{k=1}^{K} |2\pi \Sigma_k|^{\frac{1}{2}}} \prod_{i<j} \exp\left( -\frac{1}{2} (\mu_i - \mu_j)^T B_{ij} (\mu_i - \mu_j) \right) \quad (5.4)$$

where

$$B_{ij} = \Sigma_i^{-1} \left( \sum_{k=1}^{K} \Sigma_k^{-1} \right)^{-1} \Sigma_j^{-1} \quad (5.5)$$
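The two-gaussian case (5.1)-(5.2) is easy to check pointwise. A minimal sketch assuming numpy and scipy, with arbitrary example parameters; it uses the second, simpler form of $z_c$ from (5.2).

```python
import numpy as np
from scipy.stats import multivariate_normal

# Two gaussian densities in the same variable x (arbitrary example parameters).
mu_a, Sigma_a = np.array([0.0, 0.0]), np.array([[1.0, 0.2], [0.2, 1.0]])
mu_b, Sigma_b = np.array([1.0, -1.0]), np.array([[0.5, 0.0], [0.0, 2.0]])

# Equation (5.1): parameters of the (unnormalized) product gaussian.
Sigma_c = np.linalg.inv(np.linalg.inv(Sigma_a) + np.linalg.inv(Sigma_b))
mu_c = Sigma_c @ (np.linalg.solve(Sigma_a, mu_a) + np.linalg.solve(Sigma_b, mu_b))

# Normalization constant z_c from the second form of (5.2).
diff = mu_a - mu_b
S = Sigma_a + Sigma_b
z_c = (np.linalg.det(2 * np.pi * S) ** -0.5
       * np.exp(-0.5 * diff @ np.linalg.solve(S, diff)))

# Pointwise check: N_a(x) * N_b(x) == z_c * N_c(x).
x = np.array([0.3, -0.4])
lhs = (multivariate_normal(mu_a, Sigma_a).pdf(x)
       * multivariate_normal(mu_b, Sigma_b).pdf(x))
rhs = z_c * multivariate_normal(mu_c, Sigma_c).pdf(x)
print(lhs, rhs)
```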
5.2 Gaussian Integrals

A nice consequence of the fact that products of gaussian functions are again gaussian functions is that gaussian integrals become easier to calculate, since $\int \mathcal{N}_x(\tilde{\mu}, \tilde{\Sigma})\, dx = 1$. Using this with equations (5.1) and (5.3) of the previous section gives the following:

$$\int_{\mathbb{R}^d} \mathcal{N}_x(\mu_a, \Sigma_a) \cdot \mathcal{N}_x(\mu_b, \Sigma_b)\, dx = \int_{\mathbb{R}^d} z_c\, \mathcal{N}_x(\mu_c, \Sigma_c)\, dx = z_c \quad (5.6)$$

Similarly,

$$\int_{\mathbb{R}^d} \prod_{k=1}^{K} \mathcal{N}_x(\mu_k, \Sigma_k)\, dx = \tilde{z} \quad (5.7)$$

Equation (5.3) can also be used to calculate integrals such as $\int |x|^q \left( \prod_k \mathcal{N}_x(\mu_k, \Sigma_k) \right) dx$ and the like, using the same technique as above.
5.3 Useful Integrals

Let $X \sim N(\mu, \Sigma)$ and let $a \in \mathbb{R}^d$ be an arbitrary vector. Then

$$\langle e^{a^T x} \rangle \equiv \int \mathcal{N}_x(\mu, \Sigma)\, e^{a^T x}\, dx = e^{a^T \mu + \frac{1}{2} a^T \Sigma a} \quad (5.8)$$

From this expression, it is possible to find integrals such as $\int \exp(x^T A x + a^T x)\, dx$. Another useful integral is

$$\langle e^{x^T A x} \rangle \equiv \int \mathcal{N}_x(\mu, \Sigma)\, e^{x^T A x}\, dx = |I - 2\Sigma A|^{-\frac{1}{2}}\, e^{-\frac{1}{2} \mu^T \left( \Sigma - (2A)^{-1} \right)^{-1} \mu} \quad (5.9)$$

where $A \in \mathcal{M}_{d \cdot d}$ is a non-singular matrix.
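The moment-generating identity (5.8) lends itself to a quick Monte Carlo check. A minimal sketch assuming numpy; $\mu$, $\Sigma$ and $a$ are arbitrary example values.

```python
import numpy as np

rng = np.random.default_rng(5)

# Arbitrary example parameters.
mu = np.array([0.2, -0.3])
Sigma = np.array([[0.5, 0.1],
                  [0.1, 0.8]])
a = np.array([0.4, 0.6])

# Monte Carlo estimate of <exp(a^T x)> against the closed form (5.8).
X = rng.multivariate_normal(mu, Sigma, size=500_000)
mc = np.mean(np.exp(X @ a))
exact = np.exp(a @ mu + 0.5 * a @ Sigma @ a)
print(mc, exact)
```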
Bibliography

[1] T. W. Anderson, An Introduction to Multivariate Statistical Analysis. Wiley, 1984.
[2] C. M. Bishop, Neural Networks for Pattern Recognition. Oxford University Press, 1995.
[3] K. Triantafyllopoulos, "On the central moments of the multidimensional Gaussian distribution," The Mathematical Scientist, vol. 28, pp. 125–128, 2003.
[4] S. Roweis, "Gaussian Identities," http://www.cs.toronto.edu/~roweis/notes.html
[5] J. Larsen, "Gaussian Integrals," Tech. Rep., http://isp.imm.dtu.dk/staff/jlarsen/pubs/frame.htm
[6] www.wikipedia.org
[7] mathworld.wolfram.com
[8] P. Kidmose, "Blind Separation of Heavy Tail Signals," Ph.D. thesis, http://isp.imm.dtu.dk/staff/kidmose/pk_publications.html