3. A random variable is completely described by the cumulative distribution
function (cdf) or the probability distribution/density function
(pdf).
Some of its properties can be summarized by measures such as the mean,
variance, mode, ...
Andreas Scheidegger Univariate Random Variables 1
4. Probability Distribution/Density Function (pdf)
Figure: probability distribution $P_A$ of a discrete RV over the outcomes $z_1, z_2, \dots, z_n$, and probability density $f_B$ of a continuous RV.
Discrete RV: the probability of obtaining a certain output.
Continuous RV: proportional to the probability of obtaining an output
close to a certain value.
5. Cumulative Distribution Function (cdf)
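Figure: cumulative distribution functions $F_A$ (discrete RV, over the outcomes $z_1, z_2, \dots, z_n$) and $F_B$ (continuous RV), each with a vertical axis from 0 to 1.
Discrete and continuous RV: the probability of obtaining an output equal to
or smaller than a certain value.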
6. cdf and pdf
Discrete RVs
Distribution function:
$F_A(z) = P(A \le z)$
Probability distribution:
$P_A(z_i)$ for $z_i \in \Omega_A$
Continuous RVs
Distribution function:
$F_B(z) = P(B \le z)$
Probability density:
$f_B(z) = \frac{d}{dz} F_B(z)$
$P(B \in [z_1, z_2]) = \int_{z_1}^{z_2} f_B(z) \, dz$
$P(B \in [z, z + \Delta]) \approx \Delta \cdot f_B(z)$ for small $\Delta$
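The relation between cdf, pdf and interval probabilities can be checked numerically. The following is a minimal sketch in R, assuming a standard normal RV B; the evaluation point `z` and the interval width `delta` are arbitrary illustrative choices, not values from the slides.

```r
# Standard normal B: compare the exact interval probability with Delta * f_B(z).
z     <- 0.5     # illustrative evaluation point (assumption)
delta <- 0.01    # small interval width (assumption)

exact  <- pnorm(z + delta) - pnorm(z)   # P(B in [z, z + delta]) from the cdf
approx <- delta * dnorm(z)              # Delta * f_B(z) from the pdf

# The same probability via numerical integration of the density
integral <- integrate(dnorm, lower = z, upper = z + delta)$value

print(c(exact = exact, approx = approx, integral = integral))
```

The approximation improves as `delta` shrinks, which is why the density is only "proportional to" a probability.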
7. Characteristics of Random Variables
Measures of Location
Expected value:
$E[A] = \sum_{z \in \Omega_A} z \, P_A(z)$ , $E[B] = \int_{\Omega_B} z \, f_B(z) \, dz$
Median:
$\mathrm{Med}[Z]$ : $P(Z \le \mathrm{Med}[Z]) = P(Z > \mathrm{Med}[Z])$ , i.e. $\mathrm{Med}[Z] = Q_{0.5}[Z]$
Quantiles:
$Q_p[Z]$ : $P(Z \le Q_p[Z]) = p$ and $P(Z > Q_p[Z]) = 1 - p$
Mode:
$\mathrm{Mode}[A] = \arg\max_{z_i \in \Omega_A} P_A(z_i)$ , $\mathrm{Mode}[B] = \arg\max_{z \in \Omega_B} f_B(z)$
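As a rough illustration of these location measures, the sketch below evaluates them in R for a hypothetical discrete RV A (the values `z` and probabilities `P` are invented for the example) and for an exponentially distributed continuous RV B (also an assumption, chosen because R provides its density and quantile function).

```r
# Hypothetical discrete RV A (values and probabilities are illustrative assumptions)
z <- c(1, 2, 3, 4)
P <- c(0.1, 0.4, 0.3, 0.2)

E_A    <- sum(z * P)              # expected value: sum of z * P_A(z)
mode_A <- z[which.max(P)]         # mode: outcome with the highest probability

# Continuous RV B ~ Exp(rate = 2), using R's built-in density and quantile functions
E_B   <- integrate(function(x) x * dexp(x, rate = 2), lower = 0, upper = Inf)$value
med_B <- qexp(0.5, rate = 2)      # median = 0.5-quantile
q90_B <- qexp(0.9, rate = 2)      # 0.9-quantile

print(c(E_A = E_A, mode_A = mode_A, E_B = E_B, med_B = med_B, q90_B = q90_B))
```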
9. Characteristics of Random Variables
Measures of Location
Expected value of a function of a RV:
$E[g(A)] = \sum_{z \in \Omega_A} g(z) \, P_A(z)$
$E[g(B)] = \int_{\Omega_B} g(z) \, f_B(z) \, dz$
Attention! In general,
$E[g(X)] \neq g(E[X])$
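That the expected value of g(X) generally differs from g applied to the expected value can be seen in a small example. The sketch below assumes a fair six-sided die and g(x) = x^2; both are illustrative choices, not taken from the slides.

```r
# Fair six-sided die X and g(x) = x^2 (both chosen purely for illustration)
z <- 1:6
P <- rep(1 / 6, 6)
g <- function(x) x^2

E_X  <- sum(z * P)       # E[X] = 3.5
E_gX <- sum(g(z) * P)    # E[g(X)] = 91/6
g_EX <- g(E_X)           # g(E[X]) = 12.25

print(c(E_gX = E_gX, g_EX = g_EX))
```

The two numbers differ by exactly Var[X] here, because g is the square function.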
10. Characteristics of Random Variables
Measures of Extension
Variance:
$\mathrm{Var}[Z] = E\left[ (Z - E[Z])^2 \right]$
Standard Deviation:
$\mathrm{SD}[Z] = \sqrt{\mathrm{Var}[Z]}$
Inter-Quantile Range:
$\mathrm{QR}_p[Z] = Q_{(1+p)/2}[Z] - Q_{(1-p)/2}[Z]$
11. Characteristics of Random Variables
$E[aZ + b] = a \, E[Z] + b$
$E[Z_1 \pm Z_2] = E[Z_1] \pm E[Z_2]$
$\mathrm{Var}[Z] = E[Z^2] - E[Z]^2$
$\mathrm{Var}[aZ + b] = a^2 \, \mathrm{Var}[Z]$
Only if $Z_1$ and $Z_2$ are independent:
$\mathrm{Var}[Z_1 \pm Z_2] = \mathrm{Var}[Z_1] + \mathrm{Var}[Z_2]$
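These rules can be verified exactly for small discrete distributions. The sketch below assumes two independent fair dice and arbitrary constants `a` and `b`; the helper functions `E` and `Var` are hypothetical names introduced only for this example.

```r
# Two independent fair dice Z1, Z2 (an illustrative assumption)
z <- 1:6
P <- rep(1 / 6, 6)

E   <- function(vals, probs) sum(vals * probs)
Var <- function(vals, probs) E(vals^2, probs) - E(vals, probs)^2  # Var[Z] = E[Z^2] - E[Z]^2

a <- 3; b <- 2                      # arbitrary constants
var_lin <- Var(a * z + b, P)        # Var[aZ + b]

# Joint distribution of the independent dice: product of the marginals
grid   <- expand.grid(z1 = z, z2 = z)
Pjoint <- rep(1 / 36, 36)
var_diff <- Var(grid$z1 - grid$z2, Pjoint)   # Var[Z1 - Z2]

print(c(var_lin = var_lin, a2_var = a^2 * Var(z, P),
        var_diff = var_diff, sum_var = 2 * Var(z, P)))
```

Note that the variance of the difference is the sum, not the difference, of the individual variances.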
15. Joint distribution
Discrete RV:
$P_{A,B}(a, b) = P_{A|B}(a|b) \cdot P_B(b) = P_{B|A}(b|a) \cdot P_A(a)$
E.g.: $P_{A,B}(3, 1)$ is the probability of obtaining $a = 3$ and $b = 1$.
Continuous RV:
$f_{A,B}(a, b) = f_{A|B}(a|b) \cdot f_B(b) = f_{B|A}(b|a) \cdot f_A(a)$
E.g.: $f_{A,B}(3, 1)$ is proportional to the probability of obtaining a
realization close to $a = 3$ and $b = 1$.
18. Marginal distribution
Discrete random variables:
$P_A(a) = \sum_{b \in \Omega_B} P_{A,B}(a, b)$
Continuous random variables:
$f_A(a) = \int_{\Omega_B} f_{A,B}(a, b) \, db$
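For discrete RVs, marginalization is just a row or column sum of the joint probability table. The sketch below uses a hypothetical joint distribution `P_AB` (all numbers are invented for illustration) and also recovers the conditional distribution, so that "joint = conditional x marginal" can be checked.

```r
# Hypothetical joint distribution P_{A,B}(a, b); rows index a, columns index b
P_AB <- matrix(c(0.10, 0.20,
                 0.30, 0.15,
                 0.05, 0.20),
               nrow = 3, byrow = TRUE,
               dimnames = list(a = c("1", "2", "3"), b = c("1", "2")))

P_A <- rowSums(P_AB)   # marginal of A: sum over b
P_B <- colSums(P_AB)   # marginal of B: sum over a

# Conditional P_{A|B}(a | b): divide each column by the marginal P_B(b)
P_A_given_B <- sweep(P_AB, 2, P_B, "/")

# joint = conditional x marginal
P_AB_rebuilt <- sweep(P_A_given_B, 2, P_B, "*")

print(P_A)
print(P_B)
print(P_A_given_B)
```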
19. Independence
Definition:
$F_{A,B}(a, b) = F_A(a) \cdot F_B(b)$
Discrete random variables:
$P_{A,B}(a, b) = P_A(a) \cdot P_B(b)$
Continuous random variables:
$f_{A,B}(a, b) = f_A(a) \cdot f_B(b)$
20. Bayes’ Theorem [1]
Discrete random variables
Because
$P_{A|B}(a|b) \, P_B(b) = P_{B|A}(b|a) \, P_A(a)$
we can write
$P_{A|B}(a|b) = \frac{P_{B|A}(b|a) \, P_A(a)}{P_B(b)} = \frac{P_{B|A}(b|a) \, P_A(a)}{\sum_{a' \in \Omega_A} P_{B|A}(b|a') \, P_A(a')}$
[1] Bayes’ Theorem as we know it today was actually formulated by P. Laplace
in 1774 and not by T. Bayes.
21. Bayes’ Theorem
Continuous random variables
$f_{A|B}(a|b) = \frac{f_{B|A}(b|a) \, f_A(a)}{f_B(b)} = \frac{f_{B|A}(b|a) \, f_A(a)}{\int f_{B|A}(b|a') \, f_A(a') \, da'}$
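A small discrete example may help to see how the denominator of Bayes' theorem is built from the numerator terms. The sketch below is hypothetical: A is which of two coins was picked and B is the result of one toss; the prior and the likelihood values are assumptions made up for illustration.

```r
# Hypothetical example: A = which of two coins was picked, B = result of one toss
prior    <- c(fair = 0.5, biased = 0.5)     # P_A(a), assumed
lik_head <- c(fair = 0.5, biased = 0.9)     # P_{B|A}("head" | a), assumed

evidence  <- sum(lik_head * prior)          # P_B("head") = sum over a'
posterior <- lik_head * prior / evidence    # P_{A|B}(a | "head")

print(posterior)
```

The posterior sums to one because the evidence term is exactly the sum of the numerators over all values of A.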
22. Characteristics of Random Variables
Dependencies
Variance-Covariance Matrix:
$\mathrm{Var}[Z] = E\left[ (Z - E[Z]) (Z - E[Z])^T \right]$
Individual Covariances:
$\mathrm{Cov}[Z_i, Z_j] = E\left[ (Z_i - E[Z_i]) (Z_j - E[Z_j]) \right] = \mathrm{Var}[Z]_{i,j}$
Correlation Matrix:
$\mathrm{Cor}[Z]_{i,j} = \frac{\mathrm{Cov}[Z_i, Z_j]}{\sqrt{\mathrm{Var}[Z_i] \cdot \mathrm{Var}[Z_j]}}$
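The step from a variance-covariance matrix to a correlation matrix is a scaling by the standard deviations. The sketch below assumes a hypothetical 3 x 3 matrix `Sigma` (values invented for illustration) and compares the manual scaling with `cov2cor` from R's stats package.

```r
# Hypothetical variance-covariance matrix of a 3-dimensional random vector Z
Sigma <- matrix(c(4.0, 1.2, 0.0,
                  1.2, 1.0, 0.3,
                  0.0, 0.3, 0.25),
                nrow = 3, byrow = TRUE)

sds <- sqrt(diag(Sigma))                  # standard deviations sqrt(Var[Z_i])
Cor_manual  <- Sigma / outer(sds, sds)    # Cov[Z_i, Z_j] / sqrt(Var[Z_i] Var[Z_j])
Cor_builtin <- cov2cor(Sigma)             # stats helper performing the same scaling

print(Cor_manual)
```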
23. Correlation
Correlation measures only linear dependencies!
Figure: Several sets of (x, y) points, with the correlation coefficient of x
and y for each set. Source: Wikipedia.
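That correlation captures only linear dependence can be reproduced with two deterministic relationships. In the sketch below the grid `x` and the functions used for `y_lin` and `y_quad` are illustrative assumptions; in both cases y is completely determined by x.

```r
# x symmetric around 0, y fully determined by x (illustrative construction)
x <- seq(-1, 1, length.out = 201)
y_lin  <- 2 * x + 1    # linear dependence
y_quad <- x^2          # purely nonlinear dependence

print(c(linear = cor(x, y_lin), quadratic = cor(x, y_quad)))
```

The quadratic case gives a correlation of (numerically) zero although y depends perfectly on x.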
25. Short Notation
The function argument identifies the RV:
$P_A(a), \; P_{B|A}(b|a) \longleftrightarrow P(a), \; P(b|a)$
$f_B(b), \; f_{A|B}(a|b) \longleftrightarrow f(b), \; f(a|b)$ or $p(b), \; p(a|b)$
Example:
$f_{X_1|X_2,X_3}(x_1|x_2, x_3) = \frac{f_{X_2|X_1}(x_2|x_1) \, f_{X_1|X_3}(x_1|x_3)}{f_{X_2}(x_2)}$
$p(x_1|x_2, x_3) = \frac{p(x_2|x_1) \, p(x_1|x_3)}{p(x_2)}$
27. Directed Acyclic Graphs
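Visualize the independence structure of RVs.
Figure: directed acyclic graph with the nodes A, B, C, D. Each node is annotated with its conditional distribution: p(A), p(B | A), p(C | A, B), p(D | B).
E.g., A and D are conditionally independent given B.
Joint distribution:
$p(A, B, C, D) = p(A) \, p(B \mid A) \, p(C \mid A, B) \, p(D \mid B)$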
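The factorization implied by the graph can be made concrete with binary nodes. The sketch below is only an illustration under assumed conditional probability tables (all numbers are hypothetical); it builds the joint distribution from the four factors and checks that A and D are independent once B is fixed.

```r
# Hypothetical conditional probability tables for binary nodes (values 0/1)
pA  <- c(0.6, 0.4)                                      # p(A)
pB  <- matrix(c(0.7, 0.3, 0.2, 0.8), 2, byrow = TRUE)   # p(B | A): rows A, cols B
pD  <- matrix(c(0.9, 0.1, 0.5, 0.5), 2, byrow = TRUE)   # p(D | B): rows B, cols D
pC1 <- matrix(c(0.1, 0.4, 0.6, 0.9), 2, byrow = TRUE)   # p(C = 1 | A, B): rows A, cols B

# Joint distribution p(A, B, C, D) as the product of the DAG factors
joint <- array(0, dim = c(2, 2, 2, 2))
for (a in 1:2) for (b in 1:2) for (k in 1:2) for (d in 1:2) {
  pC <- if (k == 2) pC1[a, b] else 1 - pC1[a, b]
  joint[a, b, k, d] <- pA[a] * pB[a, b] * pC * pD[b, d]
}

# Check conditional independence of A and D given B = 1 (index 2)
pABD  <- apply(joint, c(1, 2, 4), sum)           # marginalize out C
pAD_b <- pABD[, 2, ] / sum(pABD[, 2, ])          # p(A, D | B = 1)
indep <- outer(rowSums(pAD_b), colSums(pAD_b))   # p(A | B = 1) p(D | B = 1)

print(sum(joint))                # total probability
print(max(abs(pAD_b - indep)))   # close to 0 if A and D are independent given B
```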
29. Central Limit Theorem
Let $X_1, X_2, \dots$ be independent and identically distributed RVs
with mean $\mu$ and finite variance $\sigma^2$. Further, define
$S_n = X_1 + X_2 + \dots + X_n$, which has mean $n\mu$ and variance $n\sigma^2$.
Then the standardized RV
$Z_n = \frac{S_n - n\mu}{\sqrt{n} \, \sigma}$
converges in distribution to a standard normal RV for $n \to \infty$.
30. Central Limit Theorem Example
Figure: twelve panels showing the density for n = 1, 2, ..., 12, each on a horizontal axis from −2 to 2.
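A simulation along these lines can be sketched as follows. It is not the code behind the figure: the choice of uniform RVs on [0, 1], the seed, and the number of repetitions `reps` are assumptions made for this example.

```r
set.seed(1)                       # fixed seed so the sketch is reproducible

n    <- 12                        # number of summed RVs
reps <- 10000                     # number of simulated sums (arbitrary choice)
mu    <- 0.5                      # mean of U(0, 1)
sigma <- sqrt(1 / 12)             # standard deviation of U(0, 1)

# Each row holds one realization of X_1, ..., X_n
X  <- matrix(runif(n * reps), nrow = reps, ncol = n)
Sn <- rowSums(X)
Zn <- (Sn - n * mu) / (sqrt(n) * sigma)   # standardized sum

print(c(mean = mean(Zn), sd = sd(Zn)))
# Fraction of Z_n below 1.96 versus the standard normal value pnorm(1.96)
print(c(empirical = mean(Zn <= 1.96), normal = pnorm(1.96)))
```

Calling `hist(Zn, freq = FALSE)` and overlaying `curve(dnorm(x), add = TRUE)` would give a visual comparison for a chosen n.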
31. Relationships of Univariate Distributions
Figure 1. Univariate distribution relationships.
From: Leemis, L. M. and McQueston, J. T. (2008) Univariate distribution
relationships. The American Statistician, 62(1), 45–53.
32. Multivariate Normal Distribution
Density of a multivariate normal distribution of dimension $n$ with
mean vector $\mu$ and variance-covariance matrix $\Sigma$:
$Z \sim N(\mu, \Sigma)$
$f_{N(\mu,\Sigma)}(z) = \frac{1}{(2\pi)^{n/2}} \, \frac{1}{|\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (z - \mu)^T \Sigma^{-1} (z - \mu) \right)$
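The density formula translates almost literally into base R. The function name `dmvn_sketch` and all numerical values below are hypothetical; the sketch evaluates the density at a single point and, for a diagonal covariance matrix, compares it with the product of the univariate normal densities.

```r
# Density of N(mu, Sigma) evaluated at a single point z (base R only)
dmvn_sketch <- function(z, mu, Sigma) {
  n <- length(mu)
  d <- z - mu
  quad <- drop(t(d) %*% solve(Sigma, d))      # (z - mu)^T Sigma^{-1} (z - mu)
  exp(-0.5 * quad) / ((2 * pi)^(n / 2) * sqrt(det(Sigma)))
}

mu    <- c(0, 1)                              # illustrative values
Sigma <- diag(c(4, 0.25))                     # independent components
z     <- c(1, 0.5)

dens_mvn  <- dmvn_sketch(z, mu, Sigma)
dens_prod <- dnorm(z[1], 0, 2) * dnorm(z[2], 1, 0.5)  # product of the marginals
print(c(dens_mvn = dens_mvn, dens_prod = dens_prod))
```

Using `solve(Sigma, d)` avoids forming the inverse matrix explicitly, which is usually the more stable choice.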
35. Multivariate Normal Distribution
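Properties
All marginals are normally distributed:
$Z \sim N(\mu, \Sigma) \Rightarrow Z_i \sim N(\mu_i, \Sigma_{i,i})$
Linear transformation:
$Z \sim N(\mu, \Sigma) \Rightarrow AZ + b \sim N(A\mu + b, \; A \Sigma A^T)$
Conditional distribution:
$Z = \begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_X \\ \mu_Y \end{pmatrix}, \begin{pmatrix} \Sigma_{X,X} & \Sigma_{X,Y} \\ \Sigma_{X,Y}^T & \Sigma_{Y,Y} \end{pmatrix} \right)$
$\Rightarrow X \mid (Y = y) \sim N\left( \mu_X + \Sigma_{X,Y} \Sigma_{Y,Y}^{-1} (y - \mu_Y), \; \Sigma_{X,X} - \Sigma_{X,Y} \Sigma_{Y,Y}^{-1} \Sigma_{X,Y}^T \right)$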
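For the simplest case of scalar X and Y, the conditional mean and variance reduce to a few arithmetic operations. The numbers in the sketch below are assumed for illustration only.

```r
# Illustrative bivariate normal: X and Y are both one-dimensional here
mu_X <- 1;  mu_Y <- 2
S_XX <- 4;  S_YY <- 1;  S_XY <- 1.2     # variances and covariance (assumed values)
y    <- 3                               # observed value of Y

cond_mean <- mu_X + S_XY / S_YY * (y - mu_Y)    # mu_X + S_XY S_YY^{-1} (y - mu_Y)
cond_var  <- S_XX - S_XY / S_YY * S_XY          # S_XX - S_XY S_YY^{-1} S_XY^T

print(c(cond_mean = cond_mean, cond_var = cond_var))
```

Observing Y shifts the mean of X in the direction of the covariance and never increases its variance.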
39. Discrete random process
“Random vectors with an infinitely large number of elements”
(0.11, 10.78, -10.24, -3.90, 5.91, ...)
(-1.11, -4.06, -8.64, -0.92, -2.27, ...)
(0.76, -8.54, 0.81, 2.03, 12.9, ...)
43. What is a Probability?
Interpretation of probabilities
1. The probability for “head” is 1/2.
2. The probability that it rains tomorrow is 30%.
Frequentist
1. The frequency with which “head” occurs if the random experiment is repeated.
2. “Rain tomorrow” is not a repeatable experiment.
Subjective
1. Somebody’s belief that a coin toss results in “head”, given his/her experience.
2. Somebody’s belief that it rains tomorrow, given his/her experience.
Other probability interpretations:
→ http://www.webcitation.org/6YupVo9zG
44. Summary
joint = conditional × marginal
$f(a, b) = f(a|b) \, f(b) = f(b|a) \, f(a)$
Marginals:
$f(a) = \int f(a, b) \, db = \int f(a|b) \, f(b) \, db$
More information in Appendix A.2 – A.5.
46. Implemented distribution in R
For all distributions four functions are implemented (replace __ or * by the distribution suffix):
d__(x, ...)   pdf evaluated at x
p__(x, ...)   cdf evaluated at x
q__(p, ...)   p-th quantile
r__(n, ...)   sample n random numbers

Distribution      Suffix       Distribution        Suffix
beta              *beta        binomial            *binom
Cauchy            *cauchy      chi-squared         *chisq
exponential       *exp         F                   *f
gamma             *gamma       geometric           *geom
hypergeometric    *hyper       log-normal          *lnorm
multinomial       *multinom    negative binomial   *nbinom
normal            *norm        Poisson             *pois
Student’s t       *t           uniform             *unif
Weibull           *weibull

Note: for the multinomial distribution R provides only dmultinom and rmultinom.
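The naming scheme can be tried out directly. The sketch below uses the normal and the Poisson distribution; the evaluation point, the seed and the parameter values are arbitrary choices for illustration.

```r
x <- 1.5
dens  <- dnorm(x, mean = 0, sd = 1)     # pdf evaluated at x
prob  <- pnorm(x, mean = 0, sd = 1)     # cdf evaluated at x
quant <- qnorm(prob, mean = 0, sd = 1)  # quantile function inverts the cdf

set.seed(42)                            # arbitrary seed for reproducibility
smp <- rnorm(5, mean = 0, sd = 1)       # five random numbers

# The same pattern with another suffix, e.g. the Poisson distribution
pois_prob <- dpois(2, lambda = 3)       # P(Z = 2) for a Poisson RV with rate 3

print(c(dens = dens, prob = prob, quant = quant, pois_prob = pois_prob))
print(smp)
```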
47. Normal Distribution
Density
$Z \sim N(\mu, \sigma)$ , $f_{N(\mu,\sigma)}(z) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(z - \mu)^2}{2\sigma^2} \right)$
Figure: pdf f and cdf F of normal distributions with mean = 0 and sd = 0.1, 0.25, 0.5, 1, 2, 4, plotted for z from −3 to 3.
48. Normal Distribution
Properties
$E[N(\mu, \sigma)] = \mathrm{Mode}[N(\mu, \sigma)] = \mathrm{Med}[N(\mu, \sigma)] = \mu$
$\mathrm{SD}[N(\mu, \sigma)] = \sigma$
Central limit theorem:
Let $X_1, X_2, \dots$ be independent and identically distributed RVs
with mean $\mu$ and finite variance $\sigma^2$. Further, define
$S_n = X_1 + X_2 + \dots + X_n$, which has mean $n\mu$ and variance $n\sigma^2$.
Then the standardized RV
$Z_n = \frac{S_n - n\mu}{\sqrt{n} \, \sigma}$
converges in distribution to a standard normal RV for $n \to \infty$.
49. Lognormal Distribution
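Definition:
$Z = \exp(X)$ , $X \sim N(m, s)$
Density:
$Z \sim LN(\mu, \sigma)$
$f_{LN(\mu,\sigma)}(z) = \begin{cases} \frac{1}{\sqrt{2\pi}} \, \frac{1}{s z} \exp\left( -\frac{1}{2} \frac{\left( \log\frac{z}{\mu} + \frac{s^2}{2} \right)^2}{s^2} \right) & \text{for } z > 0 \\ 0 & \text{for } z \le 0 \end{cases}$
with
$s = \sqrt{\log\left( 1 + \frac{\sigma^2}{\mu^2} \right)}$ and $m = \log(\mu) - \frac{s^2}{2}$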
50. Lognormal Distribution
Figure: pdf f and cdf F of lognormal distributions with mean = 1 and sd = 0.1, 0.25, 0.5, 1, 2, 4, plotted for z from 0 to 3.
52. Lognormal Distribution
R implementation
Attention: The lognormal distribution in R is parameterized with m and s
(the mean and standard deviation of X = log(Z)), not with µ and σ!
The code below computes these arguments if the mean µ and standard
deviation σ are given:
## example values for 'mu' and 'sigma' (assumed here so that the snippet runs)
mu <- 1
sigma <- 0.5
## conversion, 'mu' and 'sigma' given
meanlog <- log(mu) - 0.5 * log(1 + (sigma / mu)^2)
sdlog <- sqrt(log(1 + sigma^2 / mu^2))
## generate 1000 random samples
rlnorm(1000, meanlog = meanlog, sdlog = sdlog)
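The conversion can be checked without sampling. The sketch below assumes example values for `mu` and `sigma`, applies the conversion, and then recomputes the mean and standard deviation of Z from the well-known closed-form moments of the lognormal distribution; a numerical integration with `dlnorm` serves as an additional cross-check.

```r
mu    <- 2.0     # desired mean of Z (illustrative value)
sigma <- 1.5     # desired standard deviation of Z (illustrative value)

# Conversion to the arguments that R expects
meanlog <- log(mu) - 0.5 * log(1 + (sigma / mu)^2)
sdlog   <- sqrt(log(1 + sigma^2 / mu^2))

# Mean and standard deviation of Z = exp(X), X ~ N(meanlog, sdlog), in closed form
mean_Z <- exp(meanlog + sdlog^2 / 2)
sd_Z   <- sqrt((exp(sdlog^2) - 1) * exp(2 * meanlog + sdlog^2))

# Numerical cross-check of the mean using R's lognormal density
mean_num <- integrate(function(z) z * dlnorm(z, meanlog, sdlog), 0, Inf)$value

print(c(mean_Z = mean_Z, sd_Z = sd_Z, mean_num = mean_num))
```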
56. F Distribution
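Definition:
$Z = \frac{X / n}{Y / m}$ , $X \sim \chi^2_n$ , $Y \sim \chi^2_m$ , with $X$ and $Y$ independent
Density:
$Z \sim F_{n,m}$ , $f_{F_{n,m}}(z) = \frac{\Gamma\left( (n + m)/2 \right) (n/m)^{n/2} \, z^{(n-2)/2}}{\Gamma(n/2) \, \Gamma(m/2) \, (1 + n z / m)^{(n+m)/2}}$ for $z > 0$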
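As a plausibility check, the standard F density, including the factor $(1 + nz/m)^{(n+m)/2}$ in the denominator, can be coded by hand and compared with R's built-in `df`. The function name `f_F`, the evaluation points and the degrees of freedom are illustrative assumptions.

```r
# Density of F_{n,m} written out from the standard formula, for z > 0
f_F <- function(z, n, m) {
  gamma((n + m) / 2) * (n / m)^(n / 2) * z^((n - 2) / 2) /
    (gamma(n / 2) * gamma(m / 2) * (1 + n * z / m)^((n + m) / 2))
}

z <- c(0.5, 1, 2.5)        # illustrative evaluation points
n <- 5; m <- 10            # degrees of freedom (df1 = n, df2 = m)

print(rbind(formula = f_F(z, n, m), builtin = df(z, df1 = n, df2 = m)))
```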
57. F Distribution
Figure: pdf f and cdf F of F distributions with (df1, df2) = (2, 10), (3, 10), (5, 10), (5, 100), plotted for z from 0 to 4.
58. F Distribution
Properties
$E[F_{n,m}] = \frac{m}{m - 2}$ for $m > 2$
$\mathrm{Mode}[F_{n,m}] = \frac{m (n - 2)}{n (m + 2)}$ for $n > 2$
$\mathrm{SD}[F_{n,m}] = \sqrt{\frac{2 m^2 (n + m - 2)}{n (m - 2)^2 (m - 4)}}$ for $m > 4$
59. t Distribution
Definition:
$Z = \frac{X}{\sqrt{Y / n}}$ , $X \sim N(0, 1)$ , $Y \sim \chi^2_n$
Density:
$Z \sim t_n$ , $f_{t_n}(z) = \frac{\Gamma\left( (n + 1)/2 \right)}{\sqrt{\pi n} \, \Gamma(n/2) \, (1 + z^2/n)^{(n+1)/2}}$
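The t density can be checked in the same spirit against R's built-in `dt`. The function name `f_t` and the chosen values are assumptions for illustration; note that this literal translation relies on `gamma()`, which overflows for large n, so `lgamma()` would be the more robust choice in practice.

```r
# Density of t_n written out from the formula
f_t <- function(z, n) {
  gamma((n + 1) / 2) /
    (sqrt(pi * n) * gamma(n / 2) * (1 + z^2 / n)^((n + 1) / 2))
}

z <- c(-2, 0, 1.5)   # illustrative evaluation points
n <- 4               # degrees of freedom

print(rbind(formula = f_t(z, n), builtin = dt(z, df = n)))

# With many degrees of freedom the t density approaches the standard normal density
print(c(t_100 = f_t(0, 100), normal = dnorm(0)))
```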
60. t Distribution
Figure: pdf f and cdf F of t distributions with df = 1, 2, 4, 10, 100, plotted for z from −6 to 6.
62. Uniform Distribution
Density
$Z \sim U(z_{min}, z_{max})$ , $f_{U(z_{min},z_{max})}(z) = \frac{1}{z_{max} - z_{min}}$ for $z_{min} \le z \le z_{max}$ (and 0 otherwise)
Figure: pdf f and cdf F of uniform distributions with mean = 0 and max = 1, 2, plotted for z from −3 to 3.
63. Uniform Distribution
Properties
$E[U(z_{min}, z_{max})] = \frac{z_{min} + z_{max}}{2}$
$\mathrm{Med}[U(z_{min}, z_{max})] = \frac{z_{min} + z_{max}}{2}$
$\mathrm{SD}[U(z_{min}, z_{max})] = \frac{z_{max} - z_{min}}{2\sqrt{3}}$
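These properties can be confirmed numerically. The sketch below assumes illustrative bounds `z_min` and `z_max`, integrates R's uniform density to obtain the first two moments, and compares the results with the closed-form expressions.

```r
z_min <- -1; z_max <- 3      # illustrative bounds

# Moments by numerical integration of R's uniform density
E_num  <- integrate(function(z) z * dunif(z, z_min, z_max), z_min, z_max)$value
E2_num <- integrate(function(z) z^2 * dunif(z, z_min, z_max), z_min, z_max)$value
SD_num <- sqrt(E2_num - E_num^2)          # SD from Var[Z] = E[Z^2] - E[Z]^2

print(c(E_num = E_num, E_formula = (z_min + z_max) / 2,
        Med = qunif(0.5, z_min, z_max),
        SD_num = SD_num, SD_formula = (z_max - z_min) / (2 * sqrt(3))))
```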