Advanced Computational Methods in Statistics: Lecture 1
Monte Carlo Simulation & Introduction to Parallel Computing

                       Axel Gandy

                Department of Mathematics
                  Imperial College London
           http://www2.imperial.ac.uk/~agandy


               London Taught Course Centre
       for PhD Students in the Mathematical Sciences
                       Spring 2011
Today’s Lecture
Part I Monte Carlo Simulation
Part II Introduction to Parallel Computing




                                               Part I

                           Monte Carlo Simulation

    Random Number Generation

    Computation of Integrals

    Variance Reduction Techniques




Uniform Random Number Generation
          Basic building block of simulation:
          a stream of independent random variables U_1, U_2, . . . ∼ U(0, 1)
          “True” random number generators:
                  based on physical phenomena
                  Example http://www.random.org/; R-package random: “The
                  randomness comes from atmospheric noise”
                  Disadvantages of physical systems:
                           cumbersome to install and run
                           costly
                           slow
                           cannot reproduce the exact same sequence twice [verification,
                           debugging, comparing algorithms with the same stream]
           Pseudo Random Number Generators: Deterministic algorithms
           Example: linear congruential generators:
                          u_n = s_n / M,     s_{n+1} = (a s_n + c) mod M
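           A minimal R sketch of such a generator (the constants are the classic
           Park-Miller choice a = 16807, c = 0, M = 2^31 - 1, used for illustration;
           any full-period parameters would do):

               lcg <- function(n, seed, a = 16807, c = 0, M = 2^31 - 1) {
                 u <- numeric(n)
                 s <- seed
                 for (i in 1:n) {
                   s <- (a * s + c) %% M   # transition: s_{n+1} = (a s_n + c) mod M
                   u[i] <- s / M           # output: u_n = s_n / M
                 }
                 u
               }
               lcg(5, seed = 1)   # same seed => same stream (reproducible)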
General framework for Uniform RNG
    (L’Ecuyer, 1994)
           [Diagram: s → T → s_1 → T → s_2 → T → s_3 → · · · , with G mapping
           each state s_i to the output u_i]

           s initial state (’seed’)
           S finite set of states
           T : S → S is the transition function
           U finite set of output symbols
           (often {0, . . . , m − 1} or a finite subset of [0, 1])
           G : S → U output function
           s_i := T(s_{i−1}) and u_i := G(s_i).
           output: u_1, u_2, . . .
Some Notes for Uniform RNG
           S finite =⇒ the output sequence (u_i) is eventually periodic
           In practice: seed s often chosen by clock time as default.
           Good practice to be able to reproduce simulations:
                                             Save the seed!
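
           In R, for example, reproducing a simulation is then a one-liner:

               set.seed(2011)    # fix (and implicitly record) the generator state
               x <- runif(3)
               set.seed(2011)    # restore the same state ...
               y <- runif(3)
               identical(x, y)   # ... and get the same stream: TRUE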




Quality of Random Number Generators
          “Random numbers should not be generated with a method
          chosen at random” (Knuth, 1981, p.5)
          Some old implementations were unreliable!
          Desirable properties of random number generators:
                  Statistical uniformity and unpredictability
                  Period Length
                  Efficiency
                  Theoretical Support
                  Repeatability, portability, jumping ahead, ease of implementation
           (for more on this see e.g. Gentle (2003), L’Ecuyer (2004), L’Ecuyer
           (2006), Knuth (1998))
           Usually you will do well with generators in modern software (e.g.
           the default generators in R).
           Don’t try to implement your own generator!
           (unless you have very good reasons)
Nonuniform Random Number Generation
           How to generate nonuniform random variables?
           Basic idea:
           Apply transformations to a stream of iid U[0,1] random variables




Inversion Method
           Let F be a cdf.
           Quantile function (essentially the inverse of the cdf):

                                F^{-1}(u) = inf{x : F(x) ≥ u}

           If U is uniform then F^{-1}(U) ∼ F. Indeed, for X := F^{-1}(U),

                   P(X ≤ x) = P(F^{-1}(U) ≤ x) = P(U ≤ F(x)) = F(x)

           Only works if F^{-1} (or a good approximation of it) is available.
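
           A concrete sketch: for the Exp(rate) distribution,
           F(x) = 1 − exp(−rate · x), so F^{-1}(u) = −log(1 − u)/rate:

               # Inversion method for exponential draws
               rexp_inv <- function(n, rate = 1) {
                 u <- runif(n)
                 -log(1 - u) / rate   # equivalently -log(u)/rate, as 1 - U ~ U(0,1)
               }
               x <- rexp_inv(1e5, rate = 2)
               mean(x)                # should be close to 1/rate = 0.5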




Acceptance-Rejection Method
           [Figure: a target density f(x) below an envelope Cg(x), for x in [0, 20]]

           target density f
           Proposal density g (easy to generate from) such that for some
           C < ∞:
                                f(x) ≤ C g(x) for all x
           Algorithm:
              1. Generate X from g.
              2. With probability f(X)/(C g(X)) return X; otherwise go to 1.
           1/C = probability of acceptance; want it to be as close to 1 as
           possible.
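
           A sketch with target f = Beta(2, 5) density and proposal g = U(0, 1);
           the envelope constant C = 2.46 is just above max f ≈ 2.458 (target and
           proposal are illustration choices):

               # Acceptance-rejection: target f = dbeta(., 2, 5), proposal U(0, 1)
               rbeta_ar <- function(n, C = 2.46) {
                 out <- numeric(n)
                 for (i in 1:n) {
                   repeat {
                     x <- runif(1)                         # 1. generate X from g
                     if (runif(1) <= dbeta(x, 2, 5) / C)   # 2. accept w.p. f(X)/(C g(X))
                       break
                   }
                   out[i] <- x
                 }
                 out
               }
               x <- rbeta_ar(1e4)   # acceptance probability 1/C ≈ 0.41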
Further Algorithms
           Ratio-of-Uniforms
           Use of the characteristic function
           MCMC
    For many of those techniques and techniques to simulate specific
    distributions see e.g. Gentle (2003).




Evaluation of an Integral
           Want to evaluate

                               I := ∫_{[0,1]^d} g(x) dx

           Importance for statistics: computation of expected values
           (posterior means), probabilities (p-values), variances, normalising
           constants, ....
           Often, d is large; for an iid sample, d is often the sample size.
           How to solve it?
                  Symbolical
                  Numerical Integration
                  Quasi Monte Carlo
                  Monte Carlo Integration



Numerical Integration/Quadrature
           Main idea: approximate the function locally by simple
           functions/polynomials
           Advantage: good convergence rate
           Not useful in high dimensions: curse of dimensionality




Midpoint Formula
           Basic:
                               ∫_0^1 f(x) dx ≈ f(1/2)
           Composite: apply the rule in n subintervals
           [Figure: composite midpoint rule, piecewise-constant approximation of a
           smooth function]

           Error: O(1/n).
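
           A sketch of the composite rule on [0, 1]:

               # Composite midpoint rule on [0, 1] with n subintervals
               midpoint <- function(f, n) {
                 m <- (seq_len(n) - 0.5) / n   # midpoints of the subintervals
                 mean(f(m))                    # = (1/n) * sum of f at the midpoints
               }
               midpoint(function(x) x^2, 1000)  # ≈ 1/3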

Trapezoidal Formula
           Basic:
                               ∫_0^1 f(x) dx ≈ (1/2) (f(0) + f(1))
           Composite:
           [Figure: composite trapezoidal rule, piecewise-linear approximation of a
           smooth function]


           Error: O(1/n^2).
Simpson’s rule
           Approximate the integrand by a quadratic function:

                     ∫_0^1 f(x) dx ≈ (1/6) [f(0) + 4 f(1/2) + f(1)]
           Composite Simpson:
           [Figure: composite Simpson rule, piecewise-quadratic approximation of a
           smooth function]


           Error: O(1/n^4).
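
           A sketch comparing the three composite rules on ∫_0^1 sin(x) dx = 1 − cos(1);
           f is assumed to be vectorized:

               # Composite midpoint, trapezoidal and Simpson rules on [0, 1]
               quad_rules <- function(f, n) {
                 x <- (0:n) / n                # grid points
                 m <- (seq_len(n) - 0.5) / n   # midpoints
                 c(midpoint  = mean(f(m)),
                   trapezoid = (sum(f(x)) - (f(0) + f(1)) / 2) / n,
                   simpson   = (sum(f(x)) - (f(0) + f(1)) / 2 + 2 * sum(f(m))) / (3 * n))
               }
               abs(quad_rules(sin, 10) - (1 - cos(1)))  # double n to see the rates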
Advanced Numerical Integration Methods
           Newton-Cotes formulas
           Adaptive methods
           Unbounded integration interval: transformations




Curse of dimensionality - Numerical Integration in Higher
Dimensions

                               I := ∫_{[0,1]^d} g(x) dx

           Naive approach:
                  write as iterated integral

                        I := ∫_0^1 · · · ∫_0^1 g(x) dx_d . . . dx_1

                  use a 1D scheme for each integral with, say, g points per
                  dimension
                  n = g^d function evaluations needed
                  for d = 100 (a moderate sample size) and g = 10 (which is not a
                  lot):
                  n = 10^100 > estimated number of atoms in the universe
                  (roughly 10^80)!
                  Suppose we use the trapezoidal rule; then the error is O(1/n^{2/d})
           More advanced schemes are not doing much better!
Monte Carlo Integration
                     ∫_{[0,1]^d} g(x) dx ≈ (1/n) Σ_{i=1}^n g(X_i),

     where X_1, X_2, · · · ∼ U([0,1]^d) iid.
           SLLN:

                (1/n) Σ_{i=1}^n g(X_i) → ∫_{[0,1]^d} g(x) dx     (n → ∞)

           CLT: the error is O_P(1/√n).
           independent of d
           Can easily compute asymptotic confidence intervals.
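
           A sketch with a CLT-based 95% confidence interval (the integrand below
           is an illustration; its true integral is (1/2)^5):

               # Monte Carlo integration over [0,1]^d with a 95% confidence interval
               mc_integrate <- function(g, d, n = 1e5) {
                 x <- matrix(runif(n * d), nrow = n)   # n iid points from U([0,1]^d)
                 y <- apply(x, 1, g)
                 est <- mean(y)
                 se  <- sd(y) / sqrt(n)                # CLT standard error, O(1/sqrt(n))
                 c(estimate = est, lower = est - 1.96 * se, upper = est + 1.96 * se)
               }
               mc_integrate(prod, d = 5)               # true value (1/2)^5 = 0.03125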




Quasi-Monte-Carlo
           Similar to MC, but instead of random Xi : Use deterministic xi
           that fill [0, 1]d evenly.
           so-called “low-discrepancy sequences”.
    R-package randtoolbox
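
     For instance, halton() from randtoolbox generates such a sequence:

         # Quasi-Monte-Carlo with a Halton sequence (randtoolbox)
         library(randtoolbox)
         x <- halton(1e4, dim = 5)   # low-discrepancy points in [0,1]^5
         mean(apply(x, 1, prod))     # QMC estimate of the integral, truth 1/32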




Comparison between Quasi-Monte-Carlo and Monte Carlo - 2D
     1000 points in [0, 1]^2 generated by a quasi-RNG and a pseudo-RNG

     [Figure, two panels of 1000 points each: "Quasi-Random-Number-Generator"
     (left) and "Standard R Random Number Generator" (right); the quasi-random
     points cover [0, 1]^2 noticeably more evenly]

Comparison between Quasi-Monte-Carlo and Monte Carlo

                 ∫_{[0,1]^4} (x_1 + x_2)(x_2 + x_3)^2 (x_3 + x_4)^3 dx

    Using Monte-Carlo and Quasi-MC
     [Figure: MC and quasi-MC estimates of the integral plotted against n,
     for n up to 10^6]
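
     A sketch reproducing this comparison for one value of n (halton() from
     randtoolbox as the quasi-RNG):

         # MC vs quasi-MC estimates of the integral above
         library(randtoolbox)
         g <- function(x) (x[1] + x[2]) * (x[2] + x[3])^2 * (x[3] + x[4])^3
         n <- 1e5
         x_mc  <- matrix(runif(4 * n), ncol = 4)  # pseudo-random points
         x_qmc <- halton(n, dim = 4)              # low-discrepancy points
         c(MC = mean(apply(x_mc, 1, g)), QMC = mean(apply(x_qmc, 1, g)))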




Bounds on the Quasi-MC error
           Koksma-Hlawka inequality (Niederreiter, 1992, Theorem 2.11)
              | (1/n) Σ_{i=1}^n g(x_i) − ∫_{[0,1]^d} g(x) dx | ≤ V_d(g) D_n^*,

             where
                  V_d(g) is the so-called Hardy and Krause variation of g
                         and
                  D_n^* is the (star) discrepancy of the points x_1, . . . , x_n
                         in [0, 1]^d, given by

                    D_n^* = sup_{A ∈ A} | #{x_i ∈ A : i = 1, . . . , n}/n − λ(A) |

                        where
                            λ is Lebesgue measure
                            A is the set of all subrectangles of [0, 1]^d of the
                            form ∏_{i=1}^d [0, a_i]
           Many sequences have been suggested, e.g. the Halton sequence
           (other sequences: Faure, Sobol, . . . ).
Comparison
    The consensus in the literature seems to be:
           use numerical integration for small d
           Quasi-MC useful for medium d
           use Monte Carlo integration for large d




Importance Sampling
           Main idea: Change the density we are sampling from.
           Interested in E(φ(X)) = ∫ φ(x) f(x) dx
           For any density g with g(x) > 0 wherever φ(x)f(x) ≠ 0,

                          E(φ(X)) = ∫ φ(x) (f(x)/g(x)) g(x) dx
           Thus an unbiased estimator of E(φ(X)) is

                          Î = (1/n) Σ_{i=1}^n φ(X_i) f(X_i)/g(X_i),

           where X_1, . . . , X_n ∼ g iid.
           How to choose g?
                  Suppose φ ≥ 0 and g ∝ φf ; then Var(Î) = 0.
                  However, the corresponding normalizing constant is E(φ(X)), the
                  quantity we want to estimate!
                  A lot of theoretical work is based on large deviation theory.
Importance Sampling and Rare Events
           Importance sampling can greatly reduce the variance when
           estimating the probability of a rare event, i.e. φ(x) = I(x ∈ A)
           and E(φ(X)) = P(X ∈ A) small.
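
           A sketch for p = P(X > 4) with X ∼ N(0, 1), so p ≈ 3.2 · 10^{-5}
           (the proposal N(4, 1) is a common illustration, not a uniquely
           optimal choice):

               # Importance sampling for the rare event {X > 4}, X ~ N(0,1)
               n <- 1e5
               y <- rnorm(n, mean = 4)              # proposal g: N(4,1), centred on the event
               w <- dnorm(y) / dnorm(y, mean = 4)   # weights f(Y)/g(Y)
               c(IS = mean((y > 4) * w),
                 truth = pnorm(4, lower.tail = FALSE))
               # plain MC with n = 1e5 would see only about 3 such events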




Control Variates
           Interested in I = E X
           Suppose we can also observe Y and know E Y .
           Consider T = X + a(Y − E(Y ))
           Then E T = I and

                     Var T = Var X + 2a Cov(X, Y) + a^2 Var Y

           Minimized for a = −Cov(X, Y)/Var Y.

           usually, a is not known → estimate it
           For Monte Carlo sampling:
                  generate an iid sample (X_1, Y_1), . . . , (X_n, Y_n)
                  estimate Cov(X, Y) and Var Y based on this sample → â
                  Î = (1/n) Σ_{i=1}^n [X_i + â(Y_i − E(Y))]
           Î can be computed via standard regression analysis.
           Hence the term “regression-adjusted control variates”.
           Can be easily generalised to several control variates.
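
           A minimal sketch for I = E[exp(U)], U ∼ U(0, 1), with control variate
           Y = U (so E Y = 1/2); the integrand is an illustration:

               # Regression-adjusted control variate estimate of E[exp(U)]
               n <- 1e4
               u <- runif(n)
               x <- exp(u)                # X_i
               a <- -cov(x, u) / var(u)   # estimated optimal coefficient a-hat
               c(plain = mean(x),
                 cv    = mean(x + a * (u - 0.5)),
                 truth = exp(1) - 1)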
Further Techniques
           Antithetic Sampling
           Use X and −X (see the sketch after this list)
           Conditional Monte Carlo
           Evaluate parts explicitly
           Common Random Numbers
           For comparing two procedures: use the same sequence of
           random numbers.
           Stratification
                  Divide sample space Ω into strata Ω_1, . . . , Ω_s
                  In each stratum, generate R_i replicates conditional on Ω_i and
                  obtain an estimate Î_i
                  Combine using the law of total probability:

                                 Î = p_1 Î_1 + · · · + p_s Î_s

                  Need to know p_i = P(Ω_i) for all i
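
           A sketch of antithetic sampling for I = E[exp(U)]: for U ∼ U(0, 1)
           the antithetic pair is (U, 1 − U):

               # Antithetic variates for E[exp(U)], pairing U with 1 - U
               n <- 1e4
               u <- runif(n / 2)
               c(antithetic = mean((exp(u) + exp(1 - u)) / 2),  # negatively correlated pairs
                 plain      = mean(exp(runif(n))),              # same budget of n draws
                 truth      = exp(1) - 1)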

                                        Part II

                            Parallel Computing

     Introduction

     Parallel RNG

     Practical use of parallel computing (R)




Moore’s Law




                      [Figure: transistor counts over time, doubling roughly every two years]
                      (Source: Wikipedia, Creative Commons Attribution ShareAlike 3.0 License)

Growth of Data Storage
                [Figure: "Growth PC Harddrive Capacity"; capacity [GB] on a log scale
                against year, 1980-2010, growing exponentially]

                            Not only the computer speed but also the data size is increasing
                            exponentially!
                            The increase in the available storage is at least as fast as the
                            increase in computing power.


Introduction
               Recently: less increase in CPU clock speed
               → multi-core CPUs are appearing (quad core available, 80 cores
               in labs)
               → software needs to be adapted to exploit this

               Traditional computing:
               Problem is broken into small steps that are executed sequentially
               Parallel computing:
               Steps are being executed in parallel




von Neumann Architecture
               CPU executes a stored program that specifies a sequence of read
               and write operations on the memory.
               Memory is used to store both program instructions and data
               Program instructions are coded data which tell the computer to
               do something
               Data is simply information to be used by the program
               A central processing unit (CPU) gets instructions and/or data
               from memory, decodes the instructions and then sequentially
               performs them.




Different Architectures
               Multicore computing
               Symmetric multiprocessing
               Distributed Computing
                   Cluster computing
                   Massively parallel processors
                   Grid Computing
     List of top 500 supercomputers at http://www.top500.org/




                               Axel Gandy         Parallel Computing                             38
Introduction                    Parallel RNG                Practical use of parallel computing (R)



Flynn’s taxonomy
                             Single Instruction      Multiple Instruction
      Single Data            SISD                    MISD
      Multiple Data          SIMD                    MIMD
     Examples:
               SIMD: GPUs




                            Axel Gandy            Parallel Computing                             39
Introduction                      Parallel RNG                     Practical use of parallel computing (R)



Memory Architectures of Parallel Computers
               Traditional system: a single CPU connected to one memory.
               Shared memory system: several CPUs access one common memory.
               Distributed memory system: each CPU has its own private memory;
               the CPUs communicate over a network.
               Distributed shared memory system: several shared-memory nodes
               (groups of CPUs around a common memory) connected to one another.


                              Axel Gandy                 Parallel Computing                             40
Introduction                      Parallel RNG             Practical use of parallel computing (R)



Embarrassingly Parallel Computations
     Examples:
               Monte Carlo Integration (see the sketch below)
               Bootstrap
               Cross-Validation
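
     A minimal sketch in plain R (no parallel backend yet) of why Monte Carlo
     integration is embarrassingly parallel: the estimate is split into chunks
     that need no data from one another, so the sapply below could be swapped
     for a parallel apply unchanged. The integrand g is a made-up example.

     g <- function(x) exp(-x^2)                 # example integrand on [0,1]
     run_chunk <- function(n) mean(g(runif(n))) # one independent task
     chunks <- rep(1e5, 10)                     # 10 tasks of 1e5 draws each
     estimates <- sapply(chunks, run_chunk)     # could be a parallel apply
     weighted.mean(estimates, w = chunks)       # combined Monte Carlo estimate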




                              Axel Gandy         Parallel Computing                             41
Introduction                       Parallel RNG                  Practical use of parallel computing (R)



Speedup
               Ideally: computational time reduced linearly in the number of
               CPUs
               Suppose only a fraction p of the total tasks can be parallelized.
               Supposing we have n parallel CPUs, the speedup is

                          1 / ((1 − p) + p/n)          (Amdahl’s Law)

               → no infinite speedup possible: as n → ∞ the speedup tends to
               1/(1 − p).
               Example: p = 90% gives a maximum speedup by a factor of 10.
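
     A small sketch of Amdahl’s Law as stated above, with the speedup written
     as an R function:

     speedup <- function(n, p) 1 / ((1 - p) + p/n)
     speedup(8, 0.9)    # 8 CPUs, p = 90%: about 4.7
     speedup(Inf, 0.9)  # limit n -> infinity: 1/(1 - p) = 10
     curve(speedup(x, 0.9), from = 1, to = 128, xlab = "n", ylab = "speedup")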




                               Axel Gandy              Parallel Computing                             42
Introduction                     Parallel RNG              Practical use of parallel computing (R)



Communication between processes
               Forking
               Threading
               OpenMP (shared memory multiprocessing)
               PVM (Parallel Virtual Machine)
               MPI (Message Passing Interface)
     How to divide the tasks? E.g. the master/slave concept, sketched below.
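
     One way to realise the master/slave idea in R is snow’s load-balanced
     apply (the package is discussed later in this lecture): the master session
     hands the next task to whichever slave finishes first. A hedged sketch,
     with a dummy task standing in for real work:

     library(snow)
     cl <- makeCluster(4, type = "SOCK")   # 4 slave processes
     res <- clusterApplyLB(cl, 1:100, function(i) {
         Sys.sleep(runif(1, 0, 0.01))      # tasks of uneven duration
         i^2                               # stand-in for real work
     })
     stopCluster(cl)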




                             Axel Gandy          Parallel Computing                             43
Introduction                        Parallel RNG                  Practical use of parallel computing (R)



Parallel Random Number Generation
     Problems with RNG on parallel computers
               Cannot use identical streams
               Sharing a single stream: a lot of overhead.
               Starting from different seeds: danger of overlapping streams
               (in particular if seeding is not sophisticated or simulation is large)
               Need independent streams on each processor...




                                Axel Gandy              Parallel Computing                             45
Introduction                           Parallel RNG                      Practical use of parallel computing (R)



Parallel Random Number Generation - sketch of general
approach
               A single seed s is split by functions f1, f2, f3, . . . into
               per-stream seeds s1^1, s1^2, s1^3, . . . . Each stream j then
               evolves independently as in the general framework:
               si+1^j = T(si^j) and ui^j = G(si^j).
               Processor j consumes its own output stream u1^j, u2^j, u3^j, . . .

                                   Axel Gandy                  Parallel Computing                             46
Introduction                      Parallel RNG                 Practical use of parallel computing (R)



Packages in R for Parallel Random Number Generation
                rsprng Interface to the scalable parallel random number
                       generators library (SPRNG)
                       http://sprng.cs.fsu.edu/
               rlecuyer Essentially starts with one random stream and
                        partitions it into long substreams by jumping ahead.
                        L’Ecuyer et al. (2002)




                              Axel Gandy             Parallel Computing                             47
Introduction                      Parallel RNG                 Practical use of parallel computing (R)



Profile
               Determine which part of the programme uses the most time with a
               profiler.
               Improve the important parts (usually the innermost loop).
               R has a built-in profiler (see Rprof, summaryRprof and the
               package profr).
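
     A minimal profiling sketch with R’s built-in profiler; prof.out is just
     an arbitrary file name:

     Rprof("prof.out")                  # start the sampling profiler
     x <- 0
     for (i in 1:1e6) x <- x + sin(i)   # deliberately slow loop to profile
     Rprof(NULL)                        # stop profiling
     summaryRprof("prof.out")$by.self   # time spent per function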




                              Axel Gandy             Parallel Computing                             49
Introduction             Parallel RNG             Practical use of parallel computing (R)



Use Vectorization instead of Loops
     > a <- rnorm(1e+07)
     > system.time({
     +     x <- 0
     +     for (i in 1:length(a)) x <- x + a[i]
     + })[3]
     elapsed
       21.17
     > system.time(sum(a))[3]
     elapsed
        0.07




                     Axel Gandy         Parallel Computing                             50
Introduction                       Parallel RNG                 Practical use of parallel computing (R)



Just-In-Time Compilation - Ra
               From the developer’s website
               (http://www.milbo.users.sonic.net/ra/): “Ra is
               functionally identical to R but provides just-in-time compilation
               of loops and arithmetic expressions in loops. This usually makes
               arithmetic in Ra much faster. Ra will also typically run a little
               faster than standard R even when just-in-time compilation is not
               enabled.”
               Not just a package - central parts are reimplemented.
               Bill Venables (on R help archive):
               “if you really want to write R code as you might C code, then jit
               can help make it practical in terms of time. On the other hand, if
               you want to write R code using as much of the inbuilt operators
               as you have, then you can possibly still do things better.”


                               Axel Gandy             Parallel Computing                             51
Introduction                      Parallel RNG                 Practical use of parallel computing (R)



Use Compiled Code
               R is an interpreted language.
               Can include C, C++ and Fortran code.
               Can dramatically speed up computationally intensive parts
               (a factor of 100 is possible).
               No speedup if the computationally intensive part is already a
               vector/matrix operation.
               Downside: decreased portability
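
     A hedged sketch of the .C interface, assuming a C compiler and the R
     build tools are installed: the C source is written to a file, compiled
     with R CMD SHLIB, loaded, and called. (For this particular task sum(a)
     is of course already vectorized and at least as fast — compiled code
     pays off for loops that cannot be vectorized.)

     writeLines(c(
         "void vecsum(double *x, int *n, double *out) {",
         "    double s = 0;",
         "    for (int i = 0; i < *n; i++) s += x[i];",
         "    *out = s;",
         "}"), "vecsum.c")
     system("R CMD SHLIB vecsum.c")                    # compile to a shared library
     dyn.load(paste0("vecsum", .Platform$dynlib.ext))  # .so under Unix, .dll under Windows
     a <- rnorm(1e6)
     .C("vecsum", as.double(a), as.integer(length(a)), out = double(1))$out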




                              Axel Gandy             Parallel Computing                             52
Introduction                       Parallel RNG                   Practical use of parallel computing (R)



R-Package: snow
               Mostly for “embarrassingly parallel” computations
               Extends the “apply”-style function to a cluster of machines:
               library(snow)                 # provides makeCluster and parSapply
               params <- 1:10000
               cl <- makeCluster(8, "SOCK")  # start 8 socket workers
               res <- parSapply(cl, params, function(x) foo(x))
               stopCluster(cl)               # shut the workers down again
                   applies the function to each of the parameters using the cluster.
                   will run 8 copies at once.




                               Axel Gandy               Parallel Computing                             53
Introduction                 Parallel RNG             Practical use of parallel computing (R)



snow - Hello World
     > library(snow)
     > cl <- makeCluster(2, type = "SOCK")
     > str(clusterCall(cl, function() Sys.info()[c("nodename", "machine")]))

     List of 2
      $ : Named chr [1:2] "AG" "x86"
       ..- attr(*, "names")= chr [1:2] "nodename" "machine"
      $ : Named chr [1:2] "AG" "x86"
       ..- attr(*, "names")= chr [1:2] "nodename" "machine"

     > str(clusterApply(cl, 1:2, function(x) x + 3))

     List of 2
      $ : num 4
      $ : num 5

     > stopCluster(cl)

                         Axel Gandy         Parallel Computing                             54
Introduction                            Parallel RNG                         Practical use of parallel computing (R)



snow - set up random number generator
     Without setting up the RNG:
     > cl <- makeCluster(2, type = "SOCK")
     > clusterApply(cl, 1:2, function(i) rnorm(5))
     [[1]]
     [1] 0.1540537 -0.4584974        1.1320638 -1.4979826      1.1992120

     [[2]]
     [1] 0.1540537 -0.4584974        1.1320638 -1.4979826      1.1992120
     > stopCluster(cl)
     Now with proper setup of the RNG
     > cl <- makeCluster(2, type = "SOCK")
     > clusterSetupRNG(cl)
     [1] "RNGstream"
     > clusterApply(cl, 1:2, function(i) rnorm(5))
     [[1]]
     [1] -1.14063404 -0.49815892 -0.76670013 -0.04821059 -1.09852152

     [[2]]
     [1] 0.7049582     0.4821092 -1.2848088        0.7198440   0.7386390
     > stopCluster(cl)

                                   Axel Gandy                      Parallel Computing                             55
Introduction                                Parallel RNG                            Practical use of parallel computing (R)



snow - Another Simple Example
     5 × 4 processors - 4 processes on each of 5 servers

     > cl <- makeCluster(rep(c("localhost","euler","dirichlet","leibniz","riemann"),
     each=4),type="SOCK")

     You may need to enter a password if public/private ssh keys have not been set up.

     > system.time(sapply(1:1000,function(i) mean(rnorm(1e3))))[3]
       0.156
     > system.time(clusterApply(cl,1:1000,function(i) mean(rnorm(1e3))))[3]
       0.161
     > system.time(clusterApplyLB(cl,1:1000,function(i) mean(rnorm(1e3))))[3]
       0.401

     → too much overhead - parallelisation does not lead to gains

     > system.time(sapply(1:1000,function(i) mean(rnorm(1e5))))[3]
      12.096
     > system.time(clusterApply(cl,1:1000,function(i) mean(rnorm(1e5))))[3]
       0.815
     > system.time(clusterApplyLB(cl,1:1000,function(i) mean(rnorm(1e5))))[3]
       0.648

     > stopCluster(cl)

     → parallelisation leads to substantial gain in speed.
                                        Axel Gandy                        Parallel Computing                             56
Introduction                Parallel RNG                  Practical use of parallel computing (R)



Extensions of snow
snowfall offers additional support for implicit sequential execution (e.g.
         for distributing packages using optional parallel support),
         additional calculation functions, extended error handling, and
         many functions for more comfortable programming (see the sketch
         after this list).
snowFT Extension of the snow package supporting fault tolerant and
       reproducible applications. It is written for the PVM
       communication layer.
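
     A hedged sketch of the snowfall interface mentioned above: sfInit() and
     sfStop() manage the cluster, and sfLapply() mirrors lapply across the
     workers.

     library(snowfall)
     sfInit(parallel = TRUE, cpus = 2, type = "SOCK")
     res <- sfLapply(1:100, function(i) mean(rnorm(1e4)))
     sfStop()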




                         Axel Gandy              Parallel Computing                             57
Introduction                                     Parallel RNG                        Practical use of parallel computing (R)



Rmpi
     For more complicated parallel algorithms that are not embarrassingly
     parallel.
     Tutorial under http://math.acadiau.ca/ACMMaC/Rmpi/
     Hello world from this tutorial

     # Load the R MPI package if it is not already loaded.
     if (!is.loaded("mpi_initialize")) {
         library("Rmpi")
     }

     # Spawn as many slaves as possible
     mpi.spawn.Rslaves()

     # In case R exits unexpectedly, have it automatically clean up
     # resources taken up by Rmpi (slaves, memory, etc...)
     .Last <- function() {
         if (is.loaded("mpi_initialize")) {
             if (mpi.comm.size(1) > 0) {
                 print("Please use mpi.close.Rslaves() to close slaves.")
                 mpi.close.Rslaves()
             }
             print("Please use mpi.quit() to quit R")
             .Call("mpi_finalize")
         }
     }

     # Tell all slaves to return a message identifying themselves
     mpi.remote.exec(paste("I am", mpi.comm.rank(), "of", mpi.comm.size()))

     # Tell all slaves to close down, and exit the program
     mpi.close.Rslaves()
     mpi.quit()

     (Not installable under Windows from CRAN - install from
     http://www.stats.uwo.ca/faculty/yu/Rmpi/ instead.)

                                            Axel Gandy                     Parallel Computing                             58
Introduction                    Parallel RNG                   Practical use of parallel computing (R)



Some other Packages
               nws Network of Workstations
                     http://nws-r.sourceforge.net/
           multicore Use of parallel computing on a single machine via fork
                      (Unix, MacOS) - very fast and easy to use (see the
                      sketch after this list).
              GridR http://cran.r-project.org/web/packages/GridR/
                      Wegener et al. (2009, Future Generation Computer
                      Systems)
             papply on CRAN: “Similar to apply and lapply, applies a
                      function to all items of a list, and returns a list with
                      the results. Uses Rmpi to distribute the processing
                      evenly across a cluster.”
            multiR http://e-science.lancs.ac.uk/multiR/
           rparallel http://www.rparallel.org/
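
     A hedged sketch of the multicore package from the list above (Unix and
     MacOS only): mclapply() forks the current R session, so the workers see
     the parent’s data without any explicit setup.

     library(multicore)
     res <- mclapply(1:100, function(i) mean(rnorm(1e4)), mc.cores = 4)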
                            Axel Gandy               Parallel Computing                             59
Introduction                       Parallel RNG                 Practical use of parallel computing (R)



GPUs
               Graphics processing units - found in graphics cards.
               Very good at parallel processing.
               Code needs to be tailored to the specific GPU.
               Packages in R:
                  gputools several basic routines (see the sketch below).
               cudaBayesreg Bayesian multilevel modeling for fMRI.
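
     A heavily hedged sketch of GPU use via gputools, assuming a CUDA-capable
     card and that gpuMatMult() is available in the installed version:

     library(gputools)
     a <- matrix(rnorm(1000^2), 1000)
     b <- matrix(rnorm(1000^2), 1000)
     system.time(gpuMatMult(a, b))   # matrix product on the GPU
     system.time(a %*% b)            # same product on the CPU, for comparison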




                               Axel Gandy             Parallel Computing                             60
Introduction                     Parallel RNG                Practical use of parallel computing (R)



Further Reading
               A tutorial on Parallel Computing:
               https://computing.llnl.gov/tutorials/parallel_comp/
               High Performance Computing task view on CRAN
               http://cran.r-project.org/web/views/HighPerformanceComputing.html
               An up-to-date talk on high performance computing with R:
               http://dirk.eddelbuettel.com/papers/useR2010hpcTutorial.pdf




                             Axel Gandy            Parallel Computing                             61
References



                          Part III

                      Appendix




             Axel Gandy              Appendix   62
References



Topics in the coming lectures:
             Optimisation
             MCMC methods
             Bootstrap
             Particle Filtering




                                  Axel Gandy   Appendix   63
References



References I
     Gentle, J. (2003). Random Number Generation and Monte Carlo Methods.
       Springer.
     Knuth, D. (1981). The Art of Computer Programming, Vol. 2: Seminumerical
       Algorithms. Addison-Wesley.
     Knuth, D. (1998). The Art of Computer Programming, Vol. 2: Seminumerical
       Algorithms. 3rd ed. Addison-Wesley.
     L’Ecuyer, P. (1994). Uniform random number generation. Annals of Operations
       Research 53, 77–120.
     L’Ecuyer, P. (2004). Random number generation. In Handbook of Computational
       Statistics: Concepts and Methods. Springer.
     L’Ecuyer, P. (2006). Uniform random number generation. In Handbooks in
       Operations Research and Management Science. Elsevier.
     L’Ecuyer, P., Simard, R., Chen, E. J. & Kelton, W. D. (2002). An
       object-oriented random-number package with many long streams and
       substreams. Operations Research 50, 1073–1075. The code in C, C++, Java,
       and FORTRAN is available.

                              Axel Gandy                 Appendix                   64
References



References II
     Niederreiter, H. (1992). Random number generation and quasi-Monte Carlo
        methods. SIAM.




                             Axel Gandy                  Appendix              65
