Stat405 Simulation

Hadley Wickham
Thursday, 23 September 2010

  1. Stat405 Simulation. Hadley Wickham. Thursday, 23 September 2010
  2. Outline
       1. Homework comments
       2. Mathematical approach
       3. More randomness
       4. Random number generators
  3. Homework
     Just graded your organisation and code, and focused my comments there. Biggest overall tip: use floating figures (with figure {...}) with captions. Use \ref{} to refer to the figure in the text. Captions should start with a brief description of the plot (including bin width if applicable) and finish with a brief description of what the plot reveals. Will grade captions more aggressively in the future.
  4. Code
     Gives explicit technical details. Your comments should remind you why you did what you did. Most readers will not look at it, but it’s very important to include it, because it means that others can check your work.
  5. Mathematical approach
     Why are we doing this simulation? We could work out the expected value and variance mathematically. So let’s do it! Simplifying assumption: the slots are independent and identically distributed (iid).
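Under the iid assumption, the calculation can be written down directly. The notation here is an assumption made for illustration, not taken from the slides: p(w) is the probability of symbol w in any one window, g is the prize function, and Y is the prize from one play.

```latex
\[
\mathrm{E}[Y] = \sum_{w_1, w_2, w_3} p(w_1)\, p(w_2)\, p(w_3)\, g(w_1, w_2, w_3)
\]
\[
\mathrm{E}[Y^2] = \sum_{w_1, w_2, w_3} p(w_1)\, p(w_2)\, p(w_3)\, g(w_1, w_2, w_3)^2,
\qquad
\mathrm{Var}(Y) = \mathrm{E}[Y^2] - \mathrm{E}[Y]^2
\]
```

Each sum runs over every possible combination of three symbols, which is why the later slides enumerate all combinations.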
  6. calculate_prize <- function(windows) {
       # Payoffs for three identical symbols
       payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
         "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

       same <- length(unique(windows)) == 1
       allbars <- all(windows %in% c("B", "BB", "BBB"))

       if (same) {
         prize <- payoffs[windows[1]]
       } else if (allbars) {
         # Any mix of bars
         prize <- 5
       } else {
         # Cherries pay 0, 2 or 5; each diamond doubles the prize
         cherries <- sum(windows == "C")
         diamonds <- sum(windows == "DD")
         prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1]
       }
       prize
     }
  7. slots <- read.csv("slots.csv", stringsAsFactors = FALSE)

     # Calculate empirical distribution
     dist <- table(c(slots$w1, slots$w2, slots$w3))
     dist <- dist / sum(dist)

     # From here on, slots holds the symbol names, not the data frame
     slots <- names(dist)
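A minimal sketch of the same idea on toy data, since slots.csv is not reproduced here. The vector `obs` is made up for illustration. It shows that a table can be indexed by symbol name, which is the character subsetting that makes the later probability calculation a one-liner.

```r
# ASSUMPTION: a made-up vector of observed symbols, standing in for
# c(slots$w1, slots$w2, slots$w3) from the real data.
obs <- c("B", "0", "0", "C", "B", "0", "7", "0")

# Empirical distribution: counts divided by the total
dist <- table(obs) / length(obs)

# Character subsetting: look up probabilities by symbol name
combo <- c("0", "B", "0")
dist[combo]

# Probability of the combination if the windows are independent
combo_prob <- prod(dist[combo])
```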
  8. poss <- expand.grid(
       w1 = slots, w2 = slots, w3 = slots,
       stringsAsFactors = FALSE
     )

     poss$prize <- NA
     for (i in seq_len(nrow(poss))) {
       window <- as.character(poss[i, 1:3])
       poss$prize[i] <- calculate_prize(window)
     }
  9. Your turn
     How can you calculate the probability of each combination? (Hint: think about subsetting. Another hint: think about the table and character subsetting. Final hint: you can do this in one line of code.) Then work out the expected value (the payoff).
  10. poss$prob <- with(poss,
        dist[w1] * dist[w2] * dist[w3])

      (poss_mean <- with(poss, sum(prob * prize)))

      # How do we determine the variance of this
      # estimator?
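The whole calculation can be sketched end to end as below. The payoff rule is copied from slide 6, but the symbol probabilities in `dist` are made up (the real ones come from slots.csv), so the numbers it produces are illustrative only. It also computes the variance of the prize from a single play, which is a different quantity from the variance of the estimator that the slide asks about.

```r
# Payoff rule from slide 6 (using [[ ]] so the prize is an unnamed number)
calculate_prize <- function(windows) {
  payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
    "BB" = 25, "B" = 10, "C" = 10, "0" = 0)
  same <- length(unique(windows)) == 1
  allbars <- all(windows %in% c("B", "BB", "BBB"))
  if (same) {
    prize <- payoffs[[windows[1]]]
  } else if (allbars) {
    prize <- 5
  } else {
    cherries <- sum(windows == "C")
    diamonds <- sum(windows == "DD")
    prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1]
  }
  prize
}

# ASSUMPTION: hypothetical symbol probabilities that sum to 1
dist <- c("DD" = 0.05, "7" = 0.05, "BBB" = 0.10, "BB" = 0.10,
  "B" = 0.20, "C" = 0.05, "0" = 0.45)

# Every combination of three symbols
symbols <- names(dist)
poss <- expand.grid(w1 = symbols, w2 = symbols, w3 = symbols,
  stringsAsFactors = FALSE)

# Prize and probability of each combination
poss$prize <- apply(poss[, 1:3], 1, calculate_prize)
poss$prob <- with(poss, unname(dist[w1] * dist[w2] * dist[w3]))

# Expected value and variance of the prize from one play
poss_mean <- with(poss, sum(prob * prize))
poss_var <- with(poss, sum(prob * prize^2) - poss_mean^2)
```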
  11. More randomness
  12. Sample
      Very useful for selecting from a discrete set (vector) of possibilities. Four arguments: x, size, replace, prob.
  13. How can you?
        • Choose 1 from vector
        • Choose n from vector, with replacement
        • Choose n from vector, without replacement
        • Perform a weighted sample
        • Put a vector in random order
        • Put a data frame in random order
  14. # Choose 1 from vector
      sample(letters, 1)

      # Choose n from vector, without replacement
      sample(letters, 10)
      sample(letters, 40)  # error: only 26 letters to choose from

      # Choose n from vector, with replacement
      sample(letters, 40, replace = TRUE)

      # Perform a weighted sample
      sample(names(dist), prob = dist)
  15. # Put a vector in random order
      sample(letters)

      # Put a data frame in random order
      slots[sample(1:nrow(slots)), ]
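The sample() idioms above can be tried on small made-up inputs. In this sketch the weights in `probs` and the data frame `df` are hypothetical, chosen only so the results are easy to check by eye.

```r
set.seed(405)  # arbitrary seed so the sketch is reproducible

# Weighted sample with replacement.
# ASSUMPTION: made-up weights for three outcomes.
probs <- c(a = 0.7, b = 0.2, c = 0.1)
draws <- sample(names(probs), 10000, replace = TRUE, prob = probs)
freq <- table(draws) / length(draws)  # should be close to probs

# Random order of a vector: same elements, shuffled
shuffled <- sample(letters)

# Random order of a data frame: shuffle the row indices
# ASSUMPTION: a small made-up data frame.
df <- data.frame(id = 1:5, value = c(2.5, 1.0, 4.2, 3.3, 0.7))
df_shuffled <- df[sample(nrow(df)), ]
```

Calling sample() with a single number n is the same as sampling from 1:n, so sample(nrow(df)) and sample(1:nrow(df)) shuffle the same set of indices.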
  16. Your turn
      The source of randomness in random_prize is sample. Other options are: runif, rbinom, rnbinom, rpois, rnorm, rt, rcauchy. What sort of random variables do they generate and what are their parameters? Practice generating numbers from them.
  17. Function   Distribution        Parameters
      runif      Uniform             min, max
      rbinom     Binomial            size, prob
      rnbinom    Negative binomial   size, prob
      rpois      Poisson             lambda
      rnorm      Normal              mean, sd
      rt         t                   df
      rcauchy    Cauchy              location, scale
  18. Distributions
      Other functions, named by prefix:
        • r to generate random numbers
        • d to compute the density f(x)
        • p to compute the distribution function F(x)
        • q to compute the inverse distribution function F^{-1}(x)
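As a quick illustration of the four prefixes, here they are for the normal distribution. The specific arguments are arbitrary choices for the example.

```r
# d: density f(x), the height of the standard normal curve at 0
dnorm(0, mean = 0, sd = 1)

# p: distribution function F(x) = P(X <= x)
pnorm(1.96)

# q: inverse distribution function, the x for which F(x) = 0.975
qnorm(0.975)

# q undoes p
x <- 1.3
round_trip <- qnorm(pnorm(x))

# r: random draws; the sample mean should be near 5
set.seed(405)  # arbitrary seed, only for reproducibility
draws <- rnorm(1000, mean = 5, sd = 2)
mean(draws)
```

The same pattern holds for the other distributions in the table, for example dpois, ppois, qpois and rpois.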
  19. library(ggplot2)  # provides qplot()

      # Easy to combine random variables
      n <- rpois(10000, lambda = 10)
      x <- rbinom(10000, size = n, prob = 0.3)
      qplot(x, binwidth = 1)

      p <- runif(10000)
      x <- rbinom(10000, size = 10, prob = p)
      qplot(x, binwidth = 0.1)
      # cf.
      qplot(runif(10000), binwidth = 0.1)
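The two mixtures above have known closed forms, which gives a numeric check that does not need a plot. A binomial with a Poisson(10) number of trials and success probability 0.3 is Poisson with mean 3, and a binomial with 10 trials and a uniform success probability is uniform on 0, 1, ..., 10. The sketch below compares the simulation against those facts; the seed and variable names are arbitrary.

```r
set.seed(405)  # arbitrary seed, only for reproducibility
N <- 10000

# Poisson number of trials, then binomial successes
n <- rpois(N, lambda = 10)
x <- rbinom(N, size = n, prob = 0.3)
c(mean = mean(x), var = var(x))  # both should be near 10 * 0.3 = 3

# Uniform success probability, then binomial with 10 trials
p <- runif(N)
y <- rbinom(N, size = 10, prob = p)
freq <- table(factor(y, levels = 0:10)) / N
freq  # each of the 11 values should be near 1/11
```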
  20. # Simulation is a powerful tool for exploring
      # distributions. Easy to do computationally; hard
      # to do analytically
      qplot(1 / rpois(10000, lambda = 20))
      qplot(1 / runif(10000, min = 0.5, max = 2))
      qplot(rnorm(10000) ^ 2)
      qplot(rnorm(10000) / rnorm(10000))

      # http://www.johndcook.com/distribution_chart.html
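Two of these transformations have well-known answers: the square of a standard normal is chi-squared with 1 degree of freedom, and the ratio of two independent standard normals is standard Cauchy. A sketch that checks the simulation against those distributions by comparing quantiles (the seed, sample size and chosen quantiles are arbitrary):

```r
set.seed(405)  # arbitrary seed, only for reproducibility
N <- 100000

sq <- rnorm(N)^2            # square of a standard normal
ratio <- rnorm(N) / rnorm(N)  # ratio of two independent standard normals

probs <- c(0.25, 0.5, 0.75)

# Simulated quantiles next to the theoretical ones
rbind(simulated = quantile(sq, probs), chisq_1 = qchisq(probs, df = 1))
rbind(simulated = quantile(ratio, probs), cauchy = qcauchy(probs))
```

Quantiles are used instead of means because the Cauchy distribution has no mean, so a sample average of the ratio would not settle down.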
  21. Your turn
  22. RNG
      Computers are deterministic, so how do they produce randomness?
  24. How do computers generate random numbers?
      They don’t! Actually produce pseudo-random sequences. Common approach: $X_{n+1} = (a X_n + c) \bmod m$ (http://en.wikipedia.org/wiki/Linear_congruential_generator)
  25. next_val <- function(x, a, c, m) {
        (a * x + c) %% m
      }

      x <- 1001
      (x <- next_val(x, 1664525, 1013904223, 2^32))

      # http://en.wikipedia.org/wiki/List_of_pseudorandom_number_generators

      # R uses
      # http://en.wikipedia.org/wiki/Mersenne_twister
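To see how one update rule becomes a stream of "random" numbers, the recurrence can be iterated and each state divided by m to land in [0, 1). The helper `lcg_unif` below is hypothetical, written for this sketch; it reuses next_val and the constants from the slide.

```r
next_val <- function(x, a, c, m) {
  (a * x + c) %% m
}

# Hypothetical helper: n pseudo-random values in [0, 1) from a seed,
# using the constants shown on the slide as defaults.
lcg_unif <- function(n, seed, a = 1664525, c = 1013904223, m = 2^32) {
  out <- numeric(n)
  x <- seed
  for (i in seq_len(n)) {
    x <- next_val(x, a, c, m)  # advance the state
    out[i] <- x / m            # rescale the state to [0, 1)
  }
  out
}

u <- lcg_unif(1000, seed = 1001)
head(u)

# Same seed, same sequence: the generator is fully deterministic
identical(u, lcg_unif(1000, seed = 1001))
```

With these constants a * x + c stays below 2^53, so the arithmetic is exact in R's double-precision numbers.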
  26. # Random numbers are reproducible!

      set.seed(1)
      runif(10)

      set.seed(1)
      runif(10)

      # Very useful when required to make a reproducible
      # example that involves randomness
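The claim can be checked programmatically rather than by eye. In this sketch the variable names are arbitrary; the third draw shows what happens when the seed is not reset.

```r
set.seed(1)
first <- runif(10)

set.seed(1)
second <- runif(10)

# Continue without resetting the seed
third <- runif(10)

identical(first, second)  # same seed, same numbers
identical(first, third)   # the stream has moved on, so these differ
```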
  27. True randomness
      Atmospheric radio noise: http://www.random.org. Use from R with the random package. Not really important unless you’re running a lottery. (With a pseudo-random generator, someone who observes a long enough sequence can predict the next value.)
