Simulation in Excel:
Tricks, Trials & Trends
           Presented to the
     American College of Radiology
           12 January 2012


     Dennis Sweitzer, Ph.D.
       www.Dennis-Sweitzer.com
Abstract
Simulation in Excel: Tricks, Trials & Trends

Excel is a general purpose spreadsheet which is widely used & understood, but rarely used by itself for
simulations. However, the Data Table function in MS Excel can be used to execute substantial
simulations, without requiring cumbersome programming "tricks" or VBA coding. The result is an
arbitrarily large results table in which each row is one iteration of the simulation, and each column is a
random variable generated in the simulation.

A small number of additional probability functions are easily programmed using VBA to make Excel a
general purpose simulation package. Because VBA is interpreted, use of VBA functions can greatly limit
the speed of a simulation. However, for simulations of small size and complexity, the ease and familiarity
of working in Excel outweigh the disadvantage in speed. Examples from clinical trials will be used.

Finally, I discuss new methods to move simulations out of the black boxes and into the enterprise, based
on work by Sam Savage. Simulation results (a “SIP”, or “Stochastic Information Packet”) from multiple
platforms can be stored as XML strings (using the DIST standard) in a “SLURP” (“Stochastic Library Unit
with Relationships Preserved”), and from there used for reports, planning, etc., or incorporated into other
simulations.
Outline
•  How to do Simulation in Excel
•  Notes on using Inverse Probability Functions
•  Some Macros and VBA functions
•  Clinical Trial Examples
•  Probability Management in SIPS, SLURPS, & DIST
Background

•  Occasional need for simulations
•  Excel is convenient, but
  –  does not explicitly support simulations
  –  Simulation usually requires VBA programming
     (so why not use R or SAS instead?)
  –  Or Add-in commercial programs (eg, @Risk)
  –  Or some academic add-ins
•  Does have iterative calculations, Solver
•  Why not simulation?
Simulate what?
•  Stochastic Models
  –  Unknown parameters? → Guesstimate a distribution
  –  Optimizing choices? → Test each with simulations
•  Sensitivity Analysis
  –  Variations in Inputs → Variations in Outputs
  –  2 parameters: use a table
  –  >2 parameters: simulate & compare variation
Excel: Pros
Common Language / Common Tools
•  Most people understand Excel          MEGO
•  Many tools available in Excel
Transparency: Modeling assumptions can be:
         Specified -- Graphed -- Debated
           What you see is what you get!
More hands on deck, more eyes on the prize….:
   Statistician: Initial Model
   Team Member: Explores & breaks model
   Statistician: Repair & enhance
   …Repeat until satisfied
Excel Cons
Slower than in SAS, S+, R, etc
Lacks some statistical/probability functions
•  Latest versions are a little better
•  Still need to add some VBA code
•  Known bugs in statistical routines (often fixed)
Tradeoffs:
•  Quicker modifications
   vs slower execution
Simple Solution: Data Tables
Excel Data Tables
•  Creates a table of values of a function
   Each column is a Random Variable

•  Leftmost column is used as an argument
    –  (unneeded for simulation)
•  Data Table repeats calculations for each row
   Each row is a simulation iteration
1. Create Simulation




Create Random Variables using Inverse Probability Method:
For Random Variable X with distribution function F(x),
          F(x): ℜ → [0,1]
If U is Random Uniform on [0,1],
              X = F⁻¹(U)      (Excel: U=Rand() )
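
The inverse probability method can be sketched outside Excel as well. The Python below is a minimal illustration, not the template's code: the exponential distribution, the rate of 2.0, and the helper name inv_exponential are assumptions chosen because this inverse has a closed form.

```python
import math
import random

def inv_exponential(u, rate):
    """Inverse CDF of an exponential distribution: F^-1(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate

random.seed(2012)  # fixed seed so the same uniforms can be regenerated

# random.random() plays the role of Excel's Rand(); it returns values in [0, 1).
u_values = [random.random() for _ in range(10_000)]

# Push every uniform through the inverse CDF to get exponential variates.
x_values = [inv_exponential(u, rate=2.0) for u in u_values]

# The sample mean should land near the theoretical mean 1 / rate = 0.5.
sample_mean = sum(x_values) / len(x_values)
```

Because each x is a deterministic function of its u, keeping u_values is enough to rebuild x_values exactly.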
2. Align Random Variables
               •  Calculations can be
                  anywhere in
                  Spreadsheet
               •  Reference the
                  Variables in a row
               •  Is best to label
                  variables in same way
3. Select Data Table
       •  Select table region
         –  1st row is Rand Vars
         –  1st column is not used
            (can label iterations)
       •  From toolbar:
         –  Data>Data Table
4. Create Simulation Table
          •  Column input cell =
             Upper left hand corner
             of table
          •  Row input cell = ignore
          •  OK → Populates the
             table
          •  (may have to manually
             recalculate)
5. Execute Simulation
      Iterative development
      •  Simulation can be changed
      •  Add reporting variables
      •  Recalculate to rerun
          –  (no need to use Data Table
             again, unless expanding)
      •  Hint: debug with short table,
         expand for final run
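
For readers who want to see the shape of the results table without opening Excel, here is a rough Python analogue. It is a sketch under assumptions: the two random variables, the column names x, y and total, and the helper names are all hypothetical, and a list of dictionaries stands in for the Data Table.

```python
import math
import random
from statistics import NormalDist

def uniform_open(rng):
    """Uniform draw on the open interval (0, 1), safe for any inverse CDF."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0 but never 1.0
        u = rng.random()
    return u

def one_iteration(rng):
    """One row of the table: each key is one random variable (one column)."""
    x = NormalDist(0.0, 1.0).inv_cdf(uniform_open(rng))  # like NormSInv(Rand())
    y = -math.log(uniform_open(rng))                     # exponential, mean 1
    return {"x": x, "y": y, "total": x + y}

def run_table(n_rows, seed=1):
    """Each row is one simulation iteration, as in the Data Table."""
    rng = random.Random(seed)
    return [one_iteration(rng) for _ in range(n_rows)]

debug_table = run_table(10)      # short table while debugging
final_table = run_table(5_000)   # expanded table for the final run

mean_total = sum(row["total"] for row in final_table) / len(final_table)
```

With the same seed, the short table is simply the first rows of the long one, which mirrors the hint about debugging with a short table and expanding later.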
The End
(of the key concepts)
But still more….
•  Why use inverse probability distributions
   (instead of random variables)?
•  When not to use a spreadsheet for simulation?
•  Tools:
  –  Macros to set up a simulation
  –  VBA functions for common simulation distributions
•  Trends: Probability Management
  –  SIPs, SLURPS, DIST
Inverse Probability Function
•  Most systems directly generate random
   variables with the desired distribution
•  Why use Inverse Probability Functions?
  –  Which are (probably) slower?
Personal opinion
•  Testing & Debugging
•  Verification ← Calculates correctly
•  Validation ← Calculations answer Problem
•  Sensitivity ← Input vs Output variability
Why use Inverse Probability Distributions?
•  Testing & Debugging
•  Validation & Verification
•  Sensitivity

ç Save the Rand() values
è Recreate unexpected results
è Reasonableness: small changes in Rand() à small
  changes in output?
è Explore impact of small changes in Rand() values
  on simulation output
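
A small Python sketch of the same idea: save the uniforms, replay them to recreate a result, then nudge one to see how the output moves. The function model and its constants are hypothetical stand-ins for a spreadsheet calculation.

```python
import math

def model(u1, u2):
    """Hypothetical deterministic model driven by two uniforms in [0, 1)."""
    waiting_time = -math.log(1.0 - u1) / 0.5   # exponential with mean 2
    multiplier = 1.0 + 0.2 * u2                # uniform on [1.0, 1.2)
    return waiting_time * multiplier

saved_u = (0.983, 0.40)        # uniforms saved from a surprising iteration
original = model(*saved_u)
replayed = model(*saved_u)     # replaying the saved uniforms recreates the output

# Reasonableness check: a small change in U1 should give a small change in output.
nudged = model(saved_u[0] + 0.001, saved_u[1])
change = nudged - original
```

If change were large or had the wrong sign, that would point to a bug or a discontinuity worth investigating.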
As Mapping function

               U ⟼ F⁻¹(U)

Probability Distribution (CDF):   F(x): ℜ → [0,1]
Random Uniform:                   U ∈ (0,1]
Inverse CDF:                      X = F⁻¹(U)

For Continuous (or monotone) F⁻¹
  Small changes in u → small changes in F⁻¹(u)
Mapping



2 random uniform variables
as input to a deterministic function
Mapping




Random numbers in
        (should)
Map to outputs in
Example #1
Simple model, function of 2 RV

A Max value looks high.
Is it a bug? If not, how often?

Saving {Ui}:
•  Verify
•  Replicate
•  Quantify

Saved random U[0,1] for each iteration
Check the u ~ U[0,1] that generated the high value
u=0.983… → random high
→ Rarely happens
Example #1 (Sensitivity)
                   Sort by U1, U2

                   ← Sensitive to U1

                   ← Insensitive to U2
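
Sorting by each saved uniform is a visual check. A numeric counterpart is the correlation between each saved uniform and the output. The sketch below assumes a hypothetical model that depends strongly on U1 and only weakly on U2; the pearson helper is written out to keep the block self-contained.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(7)
u1 = [rng.random() for _ in range(2_000)]   # saved uniforms, one per iteration
u2 = [rng.random() for _ in range(2_000)]

# Hypothetical model: U1 drives an exponential, U2 only adds up to 5%.
output = [-math.log(1.0 - a) * (1.0 + 0.05 * b) for a, b in zip(u1, u2)]

r1 = pearson(u1, output)   # strong: output is sensitive to U1
r2 = pearson(u2, output)   # near zero: output is insensitive to U2
```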
Spreadsheet limitations
•  Only simple data structures are available
  –  Rows & columns, no lists & trees
  –  Discrete event simulations
•  Complex algorithms: difficult
  –  Eg, While or for loops
  –  Can improvise (cumbersome, slow, buggy)
•  Speed: slow
•  Data Storage: what-you-see-is-all-you-get
Tools: Excel Simulation Template
•  Adds some missing random functions
•  Adds some set-up macros



Excel template & examples at:
         www.Dennis-Sweitzer.com
Macro SimulationSampler
To start a new simulation when you don't
remember the names & parameters of
common random variables used in simulation:
•  Run the Macro SimulationSampler
•  Copy, delete, and edit as needed.
•  Make sure all random values are referenced
   in the first row of the data table at the
   bottom.
Macro SimulationSampler
          •  Creates a simulation with
             each of common
             simulation functions
Macro SimulationSampler
               ………
               •  Sets up header
                  row for data
                  table
               •  Sets up a place
                  for statistics
Macro Simulate
•  Highlight the row of random variables
  –  (1st row of simulation table)
•  Run macro “Simulate”
  –  Prompts for the number of simulation iterations
  –  The default number of iterations is 100
  –  Debug & develop (manually recalculate)
  –  Final run with >1000 iterations
  –  Visual Basic code is computationally intensive
Macro Simulate
Excel Random Variables




Rand() -- Random Uniform [0,1)
NormSInv() -- Inverse Standard Normal Distribution
CritBinom() -- Inverse Binomial Distribution (BINOM.INV in Excel 2010 and later)
LogInv() -- Inverse Log Normal Distribution (LOGNORM.INV in Excel 2010 and later)
    Caveat: parameters are the mean and SD after the log transformation
Erlang Distribution



How long do you wait until you get a
predetermined number of arrivals?
•  Interarrival times are distributed IID
   exponential
•  Erlang is Gamma with integer parameter
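
The waiting-time description translates directly into code: add up k independent exponential interarrival times. This Python sketch assumes a hypothetical target of 100 arrivals at 20 per month; neither number comes from the slides.

```python
import math
import random

def erlang_from_uniforms(uniforms, rate):
    """Sum of len(uniforms) exponential interarrival times, via inverse CDF."""
    return sum(-math.log(1.0 - u) / rate for u in uniforms)

rng = random.Random(42)
k, rate = 100, 20.0   # hypothetical: wait for 100 arrivals at 20 per month

waits = [
    erlang_from_uniforms([rng.random() for _ in range(k)], rate)
    for _ in range(2_000)
]

mean_wait = sum(waits) / len(waits)        # theory: k / rate = 5 months
sorted_waits = sorted(waits)
tenth_percentile = sorted_waits[len(sorted_waits) // 10]  # 90% of runs take longer
```

Summing k uniforms' worth of exponentials uses k random numbers per iteration; an inverse Gamma function, where available, needs only one.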
Beta Distribution




Can use as
•  Distribution of a Binomial probability
    •  Range = [0,1]
•  Generic bounded hump (vs Normal as generic unbounded hump)
•  Better behaved than a triangular distribution
Example#2, Problem




Client: “Here’s our plan….”
•  Simple spreadsheet calculation
  –  Gives only the expected value,
  –  not the variability
Example #2, Simulation
                      •  Time to 100th
                         patient
                      •  Patients arrive
                         IID Exponential

     Summary Statistics of Simulated values
     (below)
     Interpretation: under the assumptions,
     90% of simulations required more than 4.4
     months
Added VBA Functions
Inverse Functions Needed for Simulation
•  Poisson, Negative Binomial
Interpolation from Table
•  Interpolate: 1 or 2 dimensional interpolation
Convenience
•  Beta with Mean, SD as parameters
•  Beta with Hi, Low, and Mode used for
   parameters
•  Log Normal with mean, SD as parameters
Missing Statistical Functions
         Inverse Distributions
•  InvPoisson :: Poisson
•  InvPascal :: Negative Binomial
   – (how many failures before k successes)
•  Negative Binomial allows a continuous-valued (non-integer) k;
•  The version with integer k is often called the Pascal distribution
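
One common way to invert a discrete distribution is to accumulate the probability mass until it reaches u. The Python below sketches that approach; it is not the VBA behind InvPoisson or InvPascal, and the names inv_poisson and inv_pascal are hypothetical.

```python
import math

def inv_poisson(u, mean):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(mean)."""
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    # Stop when the CDF reaches u, or when the pmf underflows to zero.
    while cdf < u and pmf > 0.0:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k

def inv_pascal(u, successes, p):
    """Smallest number of failures j with P(J <= j) >= u,
    where J counts failures before the given number of successes."""
    j = 0
    pmf = p ** successes
    cdf = pmf
    while cdf < u and pmf > 0.0:
        j += 1
        pmf *= (j + successes - 1) / j * (1.0 - p)
        cdf += pmf
    return j

median_arrivals = inv_poisson(0.5, 5.0)
median_failures = inv_pascal(0.5, 1, 0.5)
```

The pmf > 0.0 guard is a design choice to guarantee termination when u is so close to 1 that rounding keeps the accumulated CDF below it.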
Example#3,
Patients to Screen

Expected Enrollment rate
= 75% ± 5%
~ Beta Distribution

# Screen Failures
~ Negative Binomial (Pascal)
   –  Depends on Enrollment
      Rate
Beta Distribution (2)




For
Convenience
•  Beta distribution given Mean, SD
•  Beta distribution given Mean, SD, upper, lower bounds
•  Beta distribution given Mode, Upper, Lower bounds
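
Converting a mean and SD into Beta shape parameters is a method-of-moments calculation. The sketch below uses the 75% ± 5% enrollment rate from Example #3 as input; the function name is hypothetical, and random.betavariate generates the variates directly rather than through an inverse CDF.

```python
import random

def beta_params_from_mean_sd(mean, sd):
    """Method of moments: return (alpha, beta) for a Beta with this mean and SD."""
    if not 0.0 < mean < 1.0:
        raise ValueError("mean must lie strictly between 0 and 1")
    common = mean * (1.0 - mean) / sd ** 2 - 1.0
    if common <= 0.0:
        raise ValueError("sd is too large for a Beta with this mean")
    return mean * common, (1.0 - mean) * common

alpha, beta = beta_params_from_mean_sd(0.75, 0.05)   # about 55.5 and 18.5

rng = random.Random(3)
rates = [rng.betavariate(alpha, beta) for _ in range(5_000)]
mean_rate = sum(rates) / len(rates)
```

To rescale onto other bounds, one option is lower + (upper - lower) * rate, with the mean and SD first mapped onto the unit interval.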
Simulation from a Table


   Find the value in the 1st vector
→ Return interpolated value from 2nd
Simulate arbitrary distribution:
•  Top Row: values in [0,1]
•  Bottom Row: Quantiles
•  Result: interpolated value of U from table
Or a function: y=f(x)
•  X is found in top row, y is interpolated from bottom row
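
Table lookup with linear interpolation can be sketched in a few lines of Python. The table values below are hypothetical, and interpolate is an illustrative name rather than the template's Interpolate function.

```python
from bisect import bisect_right

def interpolate(x, xs, ys):
    """Linear interpolation of y at x; xs must be sorted in ascending order."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)   # now xs[i - 1] <= x < xs[i]
    frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])

# Top row: cumulative probabilities in [0, 1]. Bottom row: quantiles.
probs = [0.0, 0.25, 0.5, 0.75, 1.0]
quantiles = [0.0, 2.0, 3.0, 5.0, 10.0]   # hypothetical empirical distribution

median = interpolate(0.5, probs, quantiles)
between = interpolate(0.625, probs, quantiles)   # halfway between 3.0 and 5.0
```

Feeding a uniform random number in as x turns the table into a random number generator for the tabulated distribution.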
Table Simulation Uses
• Polygonal distributions (like Triangular)
• Survival curve (for time to event)
  – Est. K-M curve from data, simulate rest of trial
• Arbitrary empirical distributions
• Distribution from observations
• Table of power calculations
  – eg, assurance calculations:
     • If # patients is random, so is effective power of the study
     • If True effect size is random, so is Pr{success}
Simulation from a 2-dimensional table




Here:
•  Rows are quartiles of a random function
•  Left column is value of a parameter
•  A family of distributions which vary with the parameter


•  Parameter y=75% (can be random)
•  Generate random numbers from the interpolated distribution.
Example #4: Interim Review
•  After 2 months, review randomization rates
•  Continue to Randomize to 100 patients
•  How long?
Example#4: Interim Review (Simulation)
Y= # Patients at 2 mos
~ Poisson

Time to Randomize
(100-Y) additional pts
~ Erlang (Gamma)

80% CI: (2.5, 3.7)
months
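
The two-stage structure of this example can be sketched in Python. The randomization rate of 15 patients per month is a hypothetical assumption, so the interval printed by this sketch is not expected to match the one on the slide.

```python
import math
import random

def exp_draw(rng, rate):
    """One exponential interarrival time via the inverse CDF."""
    return -math.log(1.0 - rng.random()) / rate

def simulate_trial(rng, rate, review_month=2.0, target=100):
    """Return (patients at the review, extra months needed to reach the target)."""
    enrolled = 0
    clock = exp_draw(rng, rate)
    # Counting exponential arrivals before the review gives a Poisson count.
    while clock <= review_month and enrolled < target:
        enrolled += 1
        clock += exp_draw(rng, rate)
    # Time for the remaining patients is a sum of exponentials (Erlang).
    remaining = target - enrolled
    extra = sum(exp_draw(rng, rate) for _ in range(remaining))
    return enrolled, extra

rng = random.Random(4)
results = [simulate_trial(rng, rate=15.0) for _ in range(2_000)]
extras = sorted(extra for _, extra in results)

mean_extra = sum(extras) / len(extras)
lo10 = extras[int(0.10 * len(extras))]   # lower end of a central 80% interval
hi90 = extras[int(0.90 * len(extras))]   # upper end of a central 80% interval
```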
Clinical Trials Applications
•  Simulations for planning
•  Prototyping larger simulation
•  Checking assumptions/validation
Planning
Expected Trial Performance
•  Usually not of interest -- already done w/o simulation
•  But should be
Variability of Trial Performance
•  Important for Risk Management: What's the earliest,
   the latest, the most, the least, etc
•  80% CIs
Structural Problems
•  Interactions of parameters may doom the trial before it
   even starts! (eg, mean (max{ X, Y} ) vs max{ mean(X), mean(Y) } )

                 ¡The Flaw of Averages!
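
A quick numerical illustration of mean(max{X, Y}) versus max{mean(X), mean(Y)}, as a Python sketch. The two exponential durations with mean 1 are hypothetical, for example two tasks that must both finish before a trial can start.

```python
import math
import random

rng = random.Random(11)
n = 10_000

# Two independent task durations, each exponential with mean 1 (hypothetical).
x = [-math.log(1.0 - rng.random()) for _ in range(n)]
y = [-math.log(1.0 - rng.random()) for _ in range(n)]

# Plan built on averages: the later of the two average durations.
max_of_means = max(sum(x) / n, sum(y) / n)

# What actually happens: average of the later finish in each iteration.
mean_of_max = sum(max(a, b) for a, b in zip(x, y)) / n
```

For these distributions the theoretical values are 1.0 and 1.5, so a plan built on the averages understates the expected duration by a third.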
Prototyping
Prototyping:
•  Toy simulation with hands-on teamwork
•  Development model
•  Get team buy-in on assumptions
•  Processing speed not important
•  Rapid modifications are important
Ideal?
•  Develop a prototype in a 1-hour meeting
•  Check for errors later
•  Run large simulations later for precise estimates
Checking planning assumptions
•  H0 = Simulation assumptions
•  Observed: a value X
•  {xi} = corresponding values in simulation
•  Rank of X in {xi} ≈ p-value
Stored Values: Use Function Percent Rank
Descriptive Statistics: Use Frequency Count

Use to:
•  Test assumptions, validate model, +??
•  If an observed value of X is rare in the simulation,
   question assumptions!
Checking Assumptions
Example:
•  A trial is designed based on a non-trivial simulation.
•  The model predicts a completion rate of 65%
           with 95% C.I.= (55%, 75%)
•  4 months into the trial, a 50% completion rate is
   observed.
•  How significant is this discrepancy?
Resimulate:
•  {xi} = simulated completion rates (1/iteration)
•  Rank of observed 50% in simulated {xi} ≈ p-value
•  How likely is the observation, under the modeled
   assumptions?
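
The rank calculation is short enough to sketch in Python. The simulated completion rates here are a stand-in: a normal distribution with mean 0.65 and SD 0.051, chosen as an assumption so that its central 95% range roughly matches (55%, 75%). A real check would use the stored iterations of the actual simulation.

```python
import random

def percent_rank(observed, simulated):
    """Fraction of simulated values at or below the observed value."""
    return sum(1 for x in simulated if x <= observed) / len(simulated)

rng = random.Random(65)

# Stand-in for {xi}: one simulated completion rate per iteration.
simulated_rates = [rng.gauss(0.65, 0.051) for _ in range(10_000)]

# One-sided empirical p-value for an observed completion rate of 50%.
p_value = percent_rank(0.50, simulated_rates)
```

A very small p_value says the observation is rare under the modeled assumptions, which is the cue to question them.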
Sensitivity Analysis
•  What-ifs
•  Interactions between parameters
        è Identify Key Control points!
•  Vary parameters between simulations
•  Compare simulation results
  –  Eg, average, worst-case scenarios
•  Correlations between simulated parameters
   and outcomes
Weighted simulations
Advantage:
•  Large but unlikely events are more likely to
   be simulated
•  Common but dull events are simulated
   infrequently, but up-weighted
•  Rare, but exciting, events are simulated, and
   down-weighted
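
One standard form of weighted simulation is importance sampling, sketched below in Python as an illustration rather than the author's method. The target is a hypothetical exponential delay with mean 1, and the sampler deliberately draws from a longer-tailed exponential so that delays beyond 4 are simulated often and then down-weighted.

```python
import math
import random

rng = random.Random(5)
n = 20_000
threshold = 4.0
proposal_rate = 0.25   # sampling distribution: exponential with mean 4

weighted_hits = 0.0
for _ in range(n):
    x = -math.log(1.0 - rng.random()) / proposal_rate   # draw from the proposal
    if x > threshold:
        # Weight = target density / proposal density at x.
        target_density = math.exp(-x)
        proposal_density = proposal_rate * math.exp(-proposal_rate * x)
        weighted_hits += target_density / proposal_density

estimate = weighted_hits / n       # estimate of P(delay > 4) under the target
exact = math.exp(-threshold)       # closed-form answer, for comparison
```

About a third of the draws exceed the threshold here, compared with roughly 2% when sampling from the target directly.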
Macro Management
VBA Editor:
   Alt-F11 (or find the menu)
•  Copy Module between sheets
•  Copy code from .xls sheet &
   insert into VBA editor
•  Open & save as new sheet
Macro Management (newer)
In Visual Basic

From the
Tool Bar

•  File > Export File
  –  Export VBA code
     (module: “SweitzerSimulationCoreCode”)
•  File > Import File
  –  Imports VBA code (into a module)
Further resources
Commercial and Free software packages
Provide:
•  More rigorous algorithms
•  More functions
  –  Resampling, multivariate, etc
•  More support
Commercial Add-Ins
@RISK
     www.palisade.com
Crystal Ball
     www.decisioneering.com
Free Add-Ins
PopTools       (Windows only)
      www.cse.csiro.au/poptools
SimTools.xla (Macintosh & Windows)
http://home.uchicago.edu/~rmyerson/addins.htm
Caveat: Licensing
•  Free for non-commercial (eg, education)
•  Not clear for other uses
   (NB: vba code from my website is free for all use,
                   but not as useful)
Semi-Commercial
Low-cost Excel simulation add-in:
•  RiskSim by Michael Middleton
•  www.treeplan.com/
•  Also: Decision Trees, Sensitivity Analysis,
   on-line text-book:
      http://www.treeplan.com/chapters.htm
Additional Reading
INTRODUCTION TO MODELING AND GENERATING
PROBABILISTIC INPUT PROCESSES FOR SIMULATION

www.informs-sim.org/wsc07papers/008.pdf
Spreadsheet Simulation (Seila, 2006)
www.informs-sim.org/wsc06papers/002.pdf
Work Smarter, Not Harder: Guidelines for
Designing Simulation Experiments
www.informs-sim.org/wsc06papers/005.pdf
Tips for the Successful Practice of Simulation
www.informs-sim.org/wsc06papers/007.pdf
Probability Management

Built more elaborate models
Learned to
•  Display results in column
•  Copy values to save
•  Do math with the results

Why not?
•  Save columns of simulated iterations
•  Recombine as needed
Combining simulations results
4 simulations: { 2 studies} x {2 scenarios}
•  Study#1, Early Start
•  Study#1, Late Start
•  Study#2, Early Start
•  Study#2, Late Start
Why not?
•  Save columns of simulated iterations
•  Recombine as needed
Estimates of total:
•  Resources
•  Costs
•  Pr{success}
Pick optimal
•  Ie., portfolio optimization
Requires independence!
Combining simulation iterations
4 simulations: { 2 studies} x {2 scenarios}
Simulation of common factors feeds each of:
•  Study#1, Early Start
•  Study#1, Late Start
•  Study#2, Early Start
•  Study#2, Late Start
Why not?
•  Save columns of simulated iterations
•  Recombine as needed
Estimates of …
•  Preserves relationships
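
Why iteration-level recombination matters can be shown with a small Python sketch. Everything in it is hypothetical: one shared factor per iteration, and two study cost models that both depend on it.

```python
import random
from statistics import pvariance

rng = random.Random(33)
n = 5_000

# One value of the common factor per iteration (per "alternative universe").
common = [rng.random() for _ in range(n)]

# Saved columns: each study's cost depends on the same common factor.
study1 = [10 + 5 * c + rng.random() for c in common]
study2 = [20 + 8 * c + rng.random() for c in common]

# Coherent recombination: add row i to row i, preserving the relationship.
coherent_total = [a + b for a, b in zip(study1, study2)]

# Treating the columns as independent: shuffle one before adding.
shuffled2 = study2[:]
rng.shuffle(shuffled2)
independent_total = [a + b for a, b in zip(study1, shuffled2)]

var_coherent = pvariance(coherent_total)
var_independent = pvariance(independent_total)
```

Both totals have the same mean, but the shuffled version understates the spread because it discards the dependence created by the common factor.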
Probability Management
                        Other people already doing it
Further research:




 Primary source for rest of presentation:
 Savage, Scholtes and Zweidler, 2006, "Probability
 Management," OR/MS Today, Vol.33, No.1 (February 2006)
 •  http://www.orms-today.org/orms-2-06/frprobability.html
 (Part 2)
 •  http://www.orms-today.org/orms-4-06/frprobability.html
Basic idea
[Diagram: Simulations of common factors → Dependent Simulations → Reporting & Analysis Programs → Estimates of …]
Basic idea
Multiple simulations:
•  Different platforms
•  Different sources
•  Different uses
[Diagram: multiple Simulations feeding Reporting & Analysis Programs]
•  Database of Simulation Results
•  Results at the iteration level
•  Coherent
Basic Definitions
SIP: Stochastic Information Packet
•  Basic unit of information
•  Eg, “the price of oil”, but for 10,000 alternative universes

SLURP: Stochastic Library Unit with Relationships Preserved
•  SIPs are coherent with each other
    –  Eg, in each SIP, iteration #4567 is from the same alternative universe
•  Analogous to demographic “Representative Samples”
Basic Definitions
Benefits of coherent modeling
•  Statistical dependencies are modeled consistently across the organization
•  Models can be “rolled up” between levels of the organization
•  Auditability: Easier to audit individual simple models
Requires central control:
•  Common standards
•  Certification authority
    –  “Chief Probability Officer”
Coherence
Example: variables X&Y
•  Coherent
•  But not correlated
Requires central control:
•  Common standards
•  Certification authority
    –  “Chief Probability Officer”
DIST Standard
How to Store SIPs?
•  Massive amounts of data
How to Share SIPs?
•  Reduce precision and pack it!
XML
•  10,000 numbers in 1 XML string
•  Metadata + Base 64 encoding of values
Contents:
•  Name
•  Mean, Min, Max, Count of values
•  Data type (Binary, 1 or 2 Byte)
•  3 bytes (8 bits each) into 4 characters (6 bits each)
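
The "reduce precision and pack it" idea can be illustrated with Python's standard base64 module. This is only a sketch of the general technique, not an implementation of the DIST specification: the 1-byte scaling scheme, the function names, and the sample values are all assumptions.

```python
import base64

def pack_values(values):
    """Scale each value to one byte (0..255) and Base64-encode the bytes."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0   # avoid dividing by zero for constant data
    codes = bytes(round(255 * (v - lo) / span) for v in values)
    return lo, hi, base64.b64encode(codes).decode("ascii")

def unpack_values(lo, hi, text):
    """Decode the Base64 text and rescale the bytes to the original range."""
    codes = base64.b64decode(text)
    return [lo + (hi - lo) * c / 255 for c in codes]

values = [2.03, 3.0, 7.75]                # hypothetical simulated values
lo, hi, text = pack_values(values)        # 3 bytes become 4 characters
restored = unpack_values(lo, hi, text)

# Worst-case rounding error is half of one 1-byte step.
max_error = max(abs(a - b) for a, b in zip(values, restored))
```

The minimum and maximum must travel with the encoded string, which is one reason a packed format carries metadata alongside the values.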
DIST Standard
•  A SIP in DIST fits into 1 cell on a spreadsheet


                                   <dist name="User Interface, weeks"
                                   avg="3.3751" min="2.03" max="7.75" count="100"
                                   type="Double" origin="DistShaper3 at smpro.ca"
                                   ver="1.1" >G00Z9SIDCIEmC0nYFtMi6R0XKZ
                                   +KvSzBI85ui5tMZgoDlbGt dF1d/
                                   CqEMwUlmCfVMMg6oUByUXQyIATsaSw1QhgrhOwaaAI9D
                                   6oks9M+IDk0XQyIDlI2mhJZBkQXRnm7IR45ST3D///
                                   IDlgrHD I38VraK2kLownZf41jWw1tROxTsS/
                                   jGRAUJCbwHfwougAAEXR r3A83FQnpnhXukBxM
                                   +kswBykeb0gOQ5RByk83PxtV7mCrH1QQ
                                   jy6LPGstpgFYRrYKvqZ9Ez8AAAAA</dist>
 •  Each cell contains an array
 •  Operations apply functions
    to each element in array

                                   Source: Marc Thibault, Sam Savage. Probability
                                   Management for Projects: Managing Uncertainty in
                                   plan estimates and targets.. October 2011
Supporting Software
MS Excel Spreadsheet Add-ins
•  Risk Solver from Frontline Systems (www.Solver.com)
•  XLSim 3 (www.VectorEconomics.com)
   –  small (single sheet) interactive simulation with DISTs
   –  enables the users of Oracle Crystal Ball and @Risk from
      Palisade Corp. to read and write DISTs.
•  Analytica from Lumina Decision Systems, Inc
   (www.Lumina.com)
SAS?
R/S+ --Already is vector oriented
•  RExcel runs R from Excel. ??
R/S+
Ø  x1<-rnorm(10000)          # an array of 10,000 standard random normal
Ø  y1<-rpois(10000, 5)      # an array of 10,000 random poissons
Ø  (x1+y1)[1:10]            # element by element operations
•  Already handles vectors – very fast
•  Needs functions to encode & decode DIST


                ¿Accessing R from within a spreadsheet?
•  RExcel – Access R from within Excel (Addin)
•  ROOo – Access R from within OpenOffice spreadsheet
    •  Open Source (like Linux)
•  (Perhaps) use spreadsheet for upper level simulation
•  Use R at lower level – each cell contains 1000’s of simulated values
Probability Management



Savage, Scholtes and Zweidler, 2006, "Probability
Management," OR/MS Today, Vol.33, No.1 (February 2006)
•  http://www.orms-today.org/orms-2-06/frprobability.html
(Part 2)
•  http://www.orms-today.org/orms-4-06/frprobability.html
The End
(Actual – not simulated)

More Related Content

Similar to Sim Slides,Tricks,Trends,2012jan15

مدخل إلى تعلم الآلة
مدخل إلى تعلم الآلةمدخل إلى تعلم الآلة
مدخل إلى تعلم الآلة
Fares Al-Qunaieer
 
Mat lab workshop
Mat lab workshopMat lab workshop
Mat lab workshop
Vinay Kumar
 
Mbd dd
Mbd ddMbd dd
Learning stochastic neural networks with Chainer
Learning stochastic neural networks with ChainerLearning stochastic neural networks with Chainer
Learning stochastic neural networks with Chainer
Seiya Tokui
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
Omid Vahdaty
 
Machine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConfMachine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConf
Seth Juarez
 
MATLAB & Image Processing
MATLAB & Image ProcessingMATLAB & Image Processing
MATLAB & Image Processing
Techbuddy Consulting Pvt. Ltd.
 
Elementary Data Analysis with MS Excel_Day-4
Elementary Data Analysis with MS Excel_Day-4Elementary Data Analysis with MS Excel_Day-4
Elementary Data Analysis with MS Excel_Day-4
Redwan Ferdous
 
Problem-solving and design 1.pptx
Problem-solving and design 1.pptxProblem-solving and design 1.pptx
Problem-solving and design 1.pptx
TadiwaMawere
 
Lecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdfLecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdf
ssuserff72e4
 
General Tips for participating Kaggle Competitions
General Tips for participating Kaggle CompetitionsGeneral Tips for participating Kaggle Competitions
General Tips for participating Kaggle Competitions
Mark Peng
 
Matlab pt1
Matlab pt1Matlab pt1
Matlab pt1
Austin Baird
 
Deep learning from scratch
Deep learning from scratch Deep learning from scratch
Deep learning from scratch
Eran Shlomo
 
Machine Learning - Dataset Preparation
Machine Learning - Dataset PreparationMachine Learning - Dataset Preparation
Machine Learning - Dataset Preparation
Andrew Ferlitsch
 
Vba Class Level 1
Vba Class Level 1Vba Class Level 1
Vba Class Level 1
Ben Miu CIM® FCSI A+
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
Pranav Ainavolu
 
presentation Updated.pdf
presentation Updated.pdfpresentation Updated.pdf
presentation Updated.pdf
GovtSenSecNagkalan
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
Ivo Andreev
 
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Universitat Politècnica de Catalunya
 
Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...
Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...
Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...
Maninda Edirisooriya
 

Similar to Sim Slides,Tricks,Trends,2012jan15 (20)

مدخل إلى تعلم الآلة
مدخل إلى تعلم الآلةمدخل إلى تعلم الآلة
مدخل إلى تعلم الآلة
 
Mat lab workshop
Mat lab workshopMat lab workshop
Mat lab workshop
 
Mbd dd
Mbd ddMbd dd
Mbd dd
 
Learning stochastic neural networks with Chainer
Learning stochastic neural networks with ChainerLearning stochastic neural networks with Chainer
Learning stochastic neural networks with Chainer
 
Machine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data DemystifiedMachine Learning Essentials Demystified part2 | Big Data Demystified
Machine Learning Essentials Demystified part2 | Big Data Demystified
 
Machine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConfMachine Learning on Azure - AzureConf
Machine Learning on Azure - AzureConf
 
MATLAB & Image Processing
MATLAB & Image ProcessingMATLAB & Image Processing
MATLAB & Image Processing
 
Elementary Data Analysis with MS Excel_Day-4
Elementary Data Analysis with MS Excel_Day-4Elementary Data Analysis with MS Excel_Day-4
Elementary Data Analysis with MS Excel_Day-4
 
Problem-solving and design 1.pptx
Problem-solving and design 1.pptxProblem-solving and design 1.pptx
Problem-solving and design 1.pptx
 
Lecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdfLecture1_computer vision-2023.pdf
Lecture1_computer vision-2023.pdf
 
General Tips for participating Kaggle Competitions
General Tips for participating Kaggle CompetitionsGeneral Tips for participating Kaggle Competitions
General Tips for participating Kaggle Competitions
 
Matlab pt1
Matlab pt1Matlab pt1
Matlab pt1
 
Deep learning from scratch
Deep learning from scratch Deep learning from scratch
Deep learning from scratch
 
Machine Learning - Dataset Preparation
Machine Learning - Dataset PreparationMachine Learning - Dataset Preparation
Machine Learning - Dataset Preparation
 
Vba Class Level 1
Vba Class Level 1Vba Class Level 1
Vba Class Level 1
 
Understanding Basics of Machine Learning
Understanding Basics of Machine LearningUnderstanding Basics of Machine Learning
Understanding Basics of Machine Learning
 
presentation Updated.pdf
presentation Updated.pdfpresentation Updated.pdf
presentation Updated.pdf
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
Optimization for Neural Network Training - Veronica Vilaplana - UPC Barcelona...
 
Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...
Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...
Lecture 8 - Feature Engineering and Optimization, a lecture in subject module...
 

Sim Slides,Tricks,Trends,2012jan15

  • 1. Simulation in Excel: Tricks, Trials & Trends Presented to the American College of Radiology 12 January 2012 Dennis Sweitzer, Ph.D. www.Dennis-Sweitzer.com
  • 2. Abstract Simulation in Excel: Tricks, Trials & Trends Excel is a general purpose spreadsheet which is widely used & understood, but rarely used by itself for simulations. However, the Data Table function in MS Excel can be used to execute substantial simulations, without requiring cumbersome programming "tricks" or VBA coding. The result is an arbitrarily large results table in which each row is one iteration of the simulation, and each column is a random variable generated in the simulation. A small number of additional probability functions are easily programmed using VBA to make Excel a general purpose simulation package. Because VBA is interpreted, use of VBA functions can greatly limit the speed of a simulation. However, for simulations of small size and complexity, the ease and familiarity of working in Excel outweigh the disadvantage in speed. Examples from clinical trials will be used. Finally, I discuss new methods to move simulations out of the black boxes and into the enterprise, based on work by Sam Savage. Simulation results (a “SIP”, or “Stochastic Information Packet”) from multiple platforms can be stored as XML strings (using the DIST standard) in a “SLURP” (“Stochastic Library Unit with Relationships Preserved”), and from there used for reports, planning, etc., or incorporated into other simulations.
  • 3. Outline •  How to do Simulation in Excel •  Notes on using Inverse Probability Functions •  Some Macros and VBA functions •  Clinical Trial Examples •  Probability Management in SIPS, SLURPS, & DIST
  • 4. Background •  Occasional need for simulations •  Excel is convenient, but –  does not explicitly support simulations –  Simulation usually requires VBA programming (so why not use R or SAS instead?) –  Or Add-in commercial programs (e.g., @Risk) –  Or some academic add-ins •  Does have iterative calculations, Solver •  Why not simulation?
  • 5. Simulate what? •  Stochastic Models –  Unknown parameters? → Guesstimate a distribution –  Optimizing choices? → Test each with simulations •  Sensitivity Analysis –  Variations in Inputs → Variations in Outputs –  2 parameters: use a table –  >2 parameters: simulate & compare variation
  • 6. Excel: Pros Common Language / Common Tools •  Most people understand Excel MEGO •  Many tools available in Excel Transparency: Modeling assumptions can be: Specified -- Graphed -- Debated What you see is what you get! More hands on deck, more eyes on the prize: Statistician: initial model → Team member: explores & breaks model → Statistician: repair & enhance → …repeat until satisfied
  • 7. Excel Cons Slower than in SAS, S+, R, etc Lacks some statistical/probability functions •  Latest versions are a little better •  Still need to add some VBA code •  Known bugs in statistical routines (often fixed) Tradeoffs: •  Quicker modifications vs slower execution
  • 8. Simple Solution: Data Tables Excel Data Tables •  Creates a table of values of a function  Each column is a Random Variable •  Leftmost column is used as an argument –  (unneeded for simulation) •  Data Table repeats calculations for each row  Each row is a simulation iteration
  • 9. 1. Create Simulation Create Random Variables using the Inverse Probability Method: for a Random Variable X with distribution function $F(x): \mathbb{R} \to [0,1]$, if U is Random Uniform, $U \in [0,1]$, then $X = F^{-1}(U)$ (Excel: U=Rand() )
  • 10. 2. Align Random Variables •  Calculations can be anywhere in the Spreadsheet •  Reference the Variables in a row •  It is best to label the variables in the same way
  • 11. 3. Select Data Table •  Select table region –  1st row is Rand Vars –  1st column is not used (can label iterations) •  From toolbar: –  Data>Data Table
  • 12. 4. Create Simulation Table •  Column input cell = Upper left hand corner of table •  Row input cell = ignore •  OK → Populates the table •  (may have to manually recalculate)
  • 13. 5. Execute Simulation Iterative development •  Simulation can be changed •  Add reporting variables •  Recalculate to rerun –  (no need to use Data Table again, unless expanding) •  Hint: debug with short table, expand for final run
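For readers who want to see the shape of the results table outside Excel, steps 1 to 5 can be sketched in Python. This is only an illustration of the idea (each row is one iteration, each column is one random variable built with the inverse probability method); the column names, the exponential rate, and the uniform bounds are hypothetical and do not come from the slides.

```python
import math
import random


def inv_exponential(u, rate):
    """Inverse CDF of an exponential distribution: F^-1(u) = -ln(1 - u) / rate."""
    return -math.log(1.0 - u) / rate


def inv_uniform(u, low, high):
    """Inverse CDF of a uniform distribution on [low, high]."""
    return low + u * (high - low)


def run_simulation(n_iterations, seed=None):
    """Build a results table: one row per iteration, one column per variable."""
    rng = random.Random(seed)
    header = ["iteration", "wait_time", "cost", "total"]  # hypothetical variables
    rows = []
    for i in range(1, n_iterations + 1):
        # rng.random() plays the role of Excel's Rand(); it returns u in [0, 1)
        wait_time = inv_exponential(rng.random(), rate=0.5)
        cost = inv_uniform(rng.random(), 10.0, 20.0)
        rows.append([i, wait_time, cost, wait_time * cost])
    return header, rows


# Debug with a short table, then expand for the final run
header, rows = run_simulation(100, seed=1)
```

Fixing the seed stands in for "recalculate to rerun" with reproducible draws; leaving `seed=None` gives a fresh run each time.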
  • 14. The End (of the key concepts)
  • 15. But still more…. •  Why use inverse probability distributions (instead of random variables)? •  When not to use a spreadsheet for simulation? •  Tools: –  Macros to set up a simulation –  VBA functions for common simulation distributions •  Trends: Probability Management –  SIPs, SLURPS, DIST
  • 16. Inverse Probability Function •  Most systems directly generate random variables with the desired distribution •  Why use Inverse Probability Functions? –  Which are (probably) slower? Personal opinion •  Testing & Debugging •  Verification ← Calculates correctly •  Validation ← Calculations answer Problem •  Sensitivity ← Input vs Output variability
  • 17. Why use Inverse Probability Distributions? •  Testing & Debugging •  Validation & Verification •  Sensitivity ← Save the Rand() values → Recreate unexpected results → Reasonableness: small changes in Rand() → small changes in output? → Explore impact of small changes in Rand() values on simulation output
  • 18. As Mapping function $U \mapsto F^{-1}(U)$ Probability Distribution: $F(x): \mathbb{R} \to [0,1]$ Random Uniform: $U \in (0,1]$ Inverse distribution function: $X = F^{-1}(U)$ For continuous (or monotone) $F^{-1}$: small changes in $u \in U$ → small changes in $F^{-1}(u)$
  • 19. Mapping 2 Random Uniform Var As input to Deterministic Function
  • 20. Mapping Random numbers in (should) Map to outputs in
  • 21. Example #1: Saving {Ui} Simple model: function of 2 RV •  Verify •  Replicate •  Quantify A Max value looks high. Is it a bug? If not, how often? Saved random U[0,1] for each iteration. Check the u ∈ U[0,1] that generated the high value: u=0.983… → random high → Rarely happens
  • 22. Example #1 (Sensitivity) Sort by U1, U2 → Sensitive to U1 → Insensitive to U2
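The idea in Example #1 (save the uniforms, replay a surprising result, and check which input the output is sensitive to) can be sketched as follows. The slides do not give the actual model, so `model` below is a hypothetical function of two uniforms chosen to be sensitive to U1 and nearly insensitive to U2; correlation is used here as a stand-in for sorting by each column.

```python
import math
import random


def model(u1, u2):
    """Hypothetical deterministic model driven by two saved uniforms."""
    x = -10.0 * math.log(1.0 - u1)  # exponential with mean 10, from u1
    y = u2                          # small uniform [0, 1) contribution, from u2
    return x + y


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2012)
records = []  # one (u1, u2, output) triple per iteration
for _ in range(1000):
    u1, u2 = rng.random(), rng.random()
    records.append((u1, u2, model(u1, u2)))

# Replicate: feed the saved uniforms of the largest output back into the model
worst = max(records, key=lambda r: r[2])
replayed = model(worst[0], worst[1])

# Sensitivity: how strongly does the output follow each saved uniform?
outputs = [r[2] for r in records]
corr_u1 = pearson([r[0] for r in records], outputs)
corr_u2 = pearson([r[1] for r in records], outputs)
```

If the replayed value matches and the saved u1 for the maximum is simply close to 1, the high value is a rare random draw rather than a bug.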
  • 23. Spreadsheet limitations •  Only simple data structures are available –  Rows & columns, no lists & trees –  Discrete event simulations •  Complex algorithms: difficult –  Eg, While or for loops –  Can improvise (cumbersome, slow, buggy) •  Speed: slow •  Data Storage: what-you-see-is-all-you-get
  • 24. Tools: Excel Simulation Template •  Adds some missing random functions •  Adds some set-up macros Excel template & examples at: www.Dennis-Sweitzer.com
  • 25. Macro SimulationSampler To start a new simulation when you don't remember the names & parameters of common random variables used in simulation: •  Run the Macro SimulationSampler •  Copy, delete, and edit as needed. •  Make sure all random values are referenced in the first row of the data table at the bottom.
  • 26. Macro SimulationSampler •  Creates a simulation with each of common simulation functions
  • 27. Macro SimulationSampler ……… •  Sets up header row for data table •  Sets up a place for statistics
  • 28. Macro Simulate •  Highlight the row of random variables –  (1st row of simulation table) •  Run macro "Simulate" –  Prompts for the number of simulation iterations –  The default number of iterations is 100 –  Debug & develop (manually recalculate) –  Final run with >1000 iterations –  Visual Basic code is computationally intensive
  • 30. Excel Random Variables •  Rand(): Random Uniform [0,1] •  NormSInv(): Inverse Standard Normal Distribution •  CriticalBinomial(): Inverse Binomial Distribution •  LogNormInv(): Inverse Log Normal Distribution Caveat: parameters are the mean and SD after the Log transformation
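As a cross-check outside Excel, the same inverse functions can be written with Python's standard library. This is a sketch, not the author's template: `inv_std_normal` mirrors what NORMSINV computes, and `inv_lognormal` follows the caveat above by taking the mean and SD on the log scale (the helper names and the clamping constant are assumptions made for this illustration).

```python
import math
import random
from statistics import NormalDist

_EPS = 1e-12  # keeps u strictly inside (0, 1), where the inverse is defined


def inv_std_normal(u):
    """Inverse standard normal CDF (the role NORMSINV plays in Excel)."""
    u = min(max(u, _EPS), 1.0 - _EPS)
    return NormalDist().inv_cdf(u)


def inv_lognormal(u, mean_log, sd_log):
    """Inverse log-normal CDF; mean_log and sd_log are on the log scale."""
    return math.exp(mean_log + sd_log * inv_std_normal(u))


rng = random.Random(30)
z_values = [inv_std_normal(rng.random()) for _ in range(5)]
ln_values = [inv_lognormal(rng.random(), 0.0, 0.5) for _ in range(5)]
```

Because the parameters are on the log scale, the median of the generated values is `exp(mean_log)`, not `mean_log` itself.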
  • 31. Erlang Distribution How long do you wait until you get a predetermined number of arrivals? •  Interarrival times are distributed IID exponential •  Erlang is Gamma with integer parameter
  • 32. Beta Distribution Can use as •  Distribution of a Binomial probability •  Range = [0,1] •  Generic bounded hump (vs Normal as generic unbounded hump) •  Better behaved than a triangular distribution
  • 33. Example#2, Problem Client: “Here’s our plan….” •  Simple spreadsheet calculation –  Gives only the expected value, –  not the variability
  • 34. Example #2, Simulation •  Time to 100th patient •  Patients arrive IID Exponential Summary Statistics of Simulated values (below) Interpretation: under the assumptions, 90% of simulations required more than 4.4 months
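A minimal sketch of this kind of simulation, assuming a hypothetical enrollment rate of 20 patients per month (the slides do not state the rate used): the time to the 100th patient is the sum of 100 IID exponential inter-arrival times, which is the Erlang distribution described earlier.

```python
import random


def time_to_nth_patient(rng, n_patients, rate_per_month):
    """Sum of n IID exponential inter-arrival times (an Erlang variate)."""
    return sum(rng.expovariate(rate_per_month) for _ in range(n_patients))


rng = random.Random(34)
RATE = 20.0  # hypothetical: patients per month

times = sorted(time_to_nth_patient(rng, 100, RATE) for _ in range(2000))
mean_time = sum(times) / len(times)
p10 = times[int(0.10 * len(times))]  # 90% of simulated trials took longer than this

print(f"mean = {mean_time:.2f} months, 10th percentile = {p10:.2f} months")
```

The expected value alone (100 / 20 = 5 months) is what the simple spreadsheet calculation gives; the percentile is the extra information the simulation adds.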
  • 35. Added VBA Functions Inverse Functions Needed for Simulation •  Poisson, Negative Binomial Interpolation from Table •  Interpolate: 1 or 2 dimensional interpolation Convenience •  Beta with Mean, SD as parameters •  Beta with Hi, Low, and Mode used for parameters •  Log Normal with mean, SD as parameters
  • 36. Missing Statistical Functions Inverse Distributions •  InvPoisson :: Poisson •  InvPascal :: Negative Binomial – (how many failures before k successes) •  Negative Binomial allows a continuous-valued (non-integer) parameter k; •  The version with integer k is often called the Pascal distribution
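An inverse Poisson function of the kind listed above can be sketched by accumulating the probability mass until it reaches u. This is an illustrative Python version, not the VBA code from the author's template; the iteration cap is an assumption added to guard against u values at or extremely close to 1.

```python
import math


def inv_poisson(u, mean):
    """Smallest k such that P(X <= k) >= u for X ~ Poisson(mean).

    Intended for moderate means; exp(-mean) underflows for very large means.
    """
    max_k = int(mean + 50.0 * math.sqrt(mean) + 50.0)  # safety cap (assumption)
    k = 0
    pmf = math.exp(-mean)  # P(X = 0)
    cdf = pmf
    while cdf < u and k < max_k:
        k += 1
        pmf *= mean / k    # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k
```

Usage is the same as for any other inverse function: `inv_poisson(random.random(), 5.0)` yields one Poisson draw with mean 5.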
  • 37. Example#3, Patients to Screen Expected Enrollment rate = 75% ± 5% ~ Beta Distribution # Screen Failures ~ Negative Binomial (Pascal) –  Depends on Enrollment Rate
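Example #3 can be sketched as a two-stage draw: first an enrollment rate from a Beta distribution with mean 75% and SD 5%, then the number of screen failures before the target number of enrolled patients (a Pascal / negative binomial count). The target of 100 enrolled patients and the method-of-moments conversion are assumptions made for this illustration.

```python
import random


def beta_params_from_mean_sd(mean, sd):
    """Method-of-moments Beta(alpha, beta) parameters for a given mean and SD."""
    nu = mean * (1.0 - mean) / sd ** 2 - 1.0
    if nu <= 0:
        raise ValueError("SD is too large for a Beta distribution with this mean")
    return mean * nu, (1.0 - mean) * nu


def screen_failures(rng, n_enrolled, p_enroll):
    """Count failures before n_enrolled successes (negative binomial draw)."""
    failures = successes = 0
    while successes < n_enrolled:
        if rng.random() < p_enroll:
            successes += 1
        else:
            failures += 1
    return failures


rng = random.Random(37)
alpha, beta = beta_params_from_mean_sd(0.75, 0.05)
TARGET = 100  # hypothetical number of patients to enroll

screened = []
for _ in range(2000):
    p = rng.betavariate(alpha, beta)  # enrollment rate for this iteration
    screened.append(TARGET + screen_failures(rng, TARGET, p))

mean_screened = sum(screened) / len(screened)
```

Note that the average number screened comes out slightly above the naive 100 / 0.75, because the enrollment rate itself varies between iterations.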
  • 38. Beta Distribution (2) For Convenience •  Beta distribution given Mean, SD •  Beta distribution given Mean, SD, upper, lower bounds •  Beta distribution given Mode, Upper, Lower bounds
  • 39. Simulation from a Table Find the value in the 1st vector → Return interpolated value from 2nd Simulate arbitrary distribution: •  Top Row: values in [0,1] •  Bottom Row: Quantiles •  Result: interpolated value of U from table Or a function: y=f(x) •  X is found in top row, y is interpolated from bottom row
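Table-based simulation amounts to linear interpolation in a two-row table: look up U in the row of cumulative probabilities and interpolate the matching quantile. The sketch below assumes linear interpolation with clamping at the table ends; the table values are made up for illustration.

```python
import bisect
import random


def interpolate(x, xs, ys):
    """Linear interpolation of y at x; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical empirical distribution: cumulative probabilities and quantiles
probs = [0.0, 0.25, 0.5, 0.75, 1.0]
quantiles = [0.0, 2.0, 3.0, 5.0, 10.0]

rng = random.Random(39)
draws = [interpolate(rng.random(), probs, quantiles) for _ in range(1000)]
```

The same function serves the "y = f(x)" use: pass the x grid as `xs` and the tabulated function values as `ys`.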
  • 40. Table Simulation Uses • Polygonal distributions (like Triangular) • Survival curve (for time to event) – Est. K-M curve from data, simulate rest of trial • Arbitrary empirical distributions • Distribution from observations • Table of power calculations – eg, assurance calculations: • If # patients is random, so is effective power of the study • If True effect size is random, so is Pr{success}
  • 41. Simulation from a 2-dimensional table Here: •  Rows are quartiles of a random function •  Left column is value of a parameter •  A family of distributions which vary with the parameter •  Parameter y=75% (can be random) •  Generate random numbers from the interpolated distribution.
  • 42. Example #4: Interim Review •  After 2 months, review randomization rates •  Continue to Randomize to 100 patients •  How long?
  • 43. Example#4: Interim Review (Simulation) Y= # Patients at 2 mos ~ Poisson Time to Randomize (100-Y) additional pts ~ Erlang (Gamma) 80% CI: (2.5, 3.7) months
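A sketch of the interim-review simulation, again assuming a hypothetical rate of 20 patients per month since the slide does not give one. The count at 2 months is generated by counting exponential arrivals (equivalent to a Poisson draw), and the additional time for the remaining patients is an Erlang variate drawn with `random.gammavariate`.

```python
import random


def arrivals_by(rng, rate, horizon):
    """Number of arrivals of a Poisson process within `horizon` (a Poisson count)."""
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return n
        n += 1


rng = random.Random(43)
RATE, REVIEW_AT, TARGET = 20.0, 2.0, 100  # hypothetical rate; 2 months; 100 patients

additional = []
for _ in range(2000):
    y = arrivals_by(rng, RATE, REVIEW_AT)       # patients randomized at the review
    remaining = max(TARGET - y, 0)
    # Erlang(remaining, RATE) is Gamma with integer shape and scale 1 / RATE
    extra = rng.gammavariate(remaining, 1.0 / RATE) if remaining > 0 else 0.0
    additional.append(extra)

additional.sort()
lo = additional[int(0.10 * len(additional))]
hi = additional[int(0.90 * len(additional))]
mean_additional = sum(additional) / len(additional)
print(f"80% interval for additional time: ({lo:.1f}, {hi:.1f}) months")
```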
  • 44. Clinical Trials Applications •  Simulations for planning •  Prototyping larger simulation •  Checking assumptions/validation
  • 45. Planning Expected Trial Performance •  Usually not of interest -- already done w/o simulation •  But should be Variability of Trial Performance •  Important for Risk Management: What's the earliest, the latest, the most, the least, etc •  80% CIs Structural Problems •  Interactions of parameters may doom the trial before it even starts! (e.g., mean (max{ X, Y} ) vs max{ mean(X), mean(Y) } ) The Flaw of Averages!
  • 46. Prototyping Prototyping: •  Toy simulation with hands-on teamwork •  Development model •  Get team buy-in on assumptions •  Processing speed not important •  Rapid modifications are important Ideal? •  Develop a prototype in a 1-hour meeting •  Check for errors later •  Run large simulations later for precise estimates
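The mean(max{X, Y}) versus max{mean(X), mean(Y)} comparison is easy to check numerically. In this sketch X and Y are hypothetical durations of two parallel tasks, each uniform on 0 to 2 months; the distributions are chosen only to make the gap visible.

```python
import random

rng = random.Random(45)
N = 10000

# Hypothetical: two parallel tasks, each uniformly distributed on [0, 2] months
xs = [rng.uniform(0.0, 2.0) for _ in range(N)]
ys = [rng.uniform(0.0, 2.0) for _ in range(N)]

max_of_means = max(sum(xs) / N, sum(ys) / N)            # plan built from averages
mean_of_max = sum(max(x, y) for x, y in zip(xs, ys)) / N  # what actually happens

print(f"max of means = {max_of_means:.2f}, mean of max = {mean_of_max:.2f}")
```

Each task averages about 1 month, but the later of the two finishes at about 4/3 months on average, so a plan built from the averages is biased optimistic.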
  • 47. Checking planning assumptions •  H0 = Simulation assumptions •  Observed: a value X •  {xi} = corresponding values in simulation •  Rank of X in {xi} ≈ p-value Stored Values: Use Function Percent Rank Descriptive Statistics: Use Frequency Count Use to: •  Test assumptions, validate model, +?? •  If an observed value of X is rare in the simulation, question assumptions!
  • 48. Checking Assumptions Example: •  A trial is designed based on a non-trivial simulation. •  The model predicts a completion rate of 65% with 95% C.I.= (55%, 75%) •  4 months into the trial, a 50% completion rate is observed. •  How significant is this discrepancy? Resimulate: •  {xi} = simulated completion rates (1/iteration) •  Rank of observed 50% in simulated {xi} ≈ p-value •  How likely is the observation, under the modeled assumptions?
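The rank-based check can be sketched as a percent-rank calculation over the simulated values. The real trial model is not given in the slides, so the simulated completion rates below come from a stand-in Beta distribution with mean 65% and SD 5%, which roughly matches the stated 95% interval of (55%, 75%).

```python
import random


def percent_rank(value, sample):
    """Fraction of simulated values at or below the observed value."""
    return sum(1 for x in sample if x <= value) / len(sample)


# Stand-in for the trial simulation: Beta with mean 0.65 and SD 0.05 (assumption)
mean, sd = 0.65, 0.05
nu = mean * (1.0 - mean) / sd ** 2 - 1.0
alpha, beta = mean * nu, (1.0 - mean) * nu

rng = random.Random(48)
simulated = [rng.betavariate(alpha, beta) for _ in range(5000)]

observed = 0.50
p_value = percent_rank(observed, simulated)  # approximate one-sided p-value
print(f"Pr(simulated completion rate <= {observed:.0%}) is about {p_value:.4f}")
```

A very small rank says the observation is rare under the modeled assumptions, which is the cue to question those assumptions.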
  • 49. Sensitivity Analysis •  What-ifs •  Interactions between parameters → Identify Key Control points! •  Vary parameters between simulations •  Compare simulation results –  E.g., average, worst-case scenarios •  Correlations between simulated parameters and outcomes
  • 50. Weighted simulations Advantage: •  Large but unlikely events are more likely to be simulated •  Common but dull events are simulated infrequently, but up-weighted •  Rare, but exciting, events are simulated, and down-weighted
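One common form of weighted simulation is importance sampling, sketched below under the assumption that this is the technique the slide has in mind. Draws are taken from a proposal distribution that over-samples the rare region, and each draw is weighted by the ratio of target density to proposal density. The example estimates a small tail probability of a standard normal; the threshold and the shifted-normal proposal are illustrative choices.

```python
import random
from statistics import NormalDist


def tail_prob_weighted(threshold, n, seed=None):
    """Estimate P(Z > threshold), Z ~ N(0, 1), by sampling from N(threshold, 1)."""
    rng = random.Random(seed)
    target = NormalDist(0.0, 1.0)
    proposal = NormalDist(threshold, 1.0)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)  # rare region is now sampled about half the time
        if x > threshold:
            # down-weight: this event is far less likely under the real model
            total += target.pdf(x) / proposal.pdf(x)
    return total / n


estimate = tail_prob_weighted(4.0, 20000, seed=50)
exact = 1.0 - NormalDist().cdf(4.0)
print(f"weighted estimate = {estimate:.3e}, exact = {exact:.3e}")
```

An unweighted simulation of the same size would expect fewer than one draw beyond the threshold, so it could not estimate this probability at all.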
  • 51. Macro Management VBA Editor: Alt-F11 (or find the menu) •  Copy Module between sheets •  Copy code from .xls sheet & insert into VBA editor •  Open & save as new sheet
  • 52. Macro Management (newer) In Visual Basic From the Tool Bar •  File > Export File –  Export VBA code (module: “SweitzerSimulationCoreCode”) •  File > Import File –  Imports VBA code (into a module)
  • 53. Further resources Commercial and Free software packages Provide: •  More rigorous algorithms •  More functions –  Resampling, multivariate, etc •  More support
  • 54. Commercial Add-Ins @RISK www.palisade.com Crystal Ball www.decisioneering.com
  • 55. Free Add-Ins PopTools (Windows only) www.cse.csiro.au/poptools SimTools.xla (Macintosh & Windows) http://home.uchicago.edu/~rmyerson/addins.htm Caveat: Licensing •  Free for non-commercial (eg, education) •  Not clear for other uses (NB: vba code from my website is free for all use, but not as useful)
  • 56. Semi-Commercial Low-cost Excel simulation add-in: •  RiskSim by Michael Middleton •  www.treeplan.com/ •  Also: Decision Trees, Sensitivity Analysis, on-line text-book: http://www.treeplan.com/chapters.htm
  • 57. Additional Reading INTRODUCTION TO MODELING AND GENERATING PROBABILISTIC INPUT PROCESSES FOR SIMULATION www.informs-sim.org/wsc07papers/008.pdf Spreadsheet Simulation (Seila, 2006) www.informs-sim.org/wsc06papers/002.pdf Work Smarter, Not Harder: Guidelines for Designing Simulation Experiments www.informs-sim.org/wsc06papers/005.pdf Tips for the Successful Practice of Simulation www.informs-sim.org/wsc06papers/007.pdf
  • 58. Probability Management Built more elaborate models Learned to •  Display results in column •  Copy values to save •  Do math with the results Why not? •  Save columns of simulated iterations •  Recombine as needed
  • 59. Combining simulation results 4 simulations: { 2 studies} x {2 scenarios}: Study#1, Early Start; Study#1, Late Start; Study#2, Early Start; Study#2, Late Start Why not? •  Save columns of simulated iterations •  Recombine as needed Estimates of total: •  Resources •  Costs •  Pr{success} Pick optimal M Requires independence! •  I.e., portfolio optimization
  • 60. Combining simulation iterations 4 simulations: { 2 studies} x {2 scenarios}: Study#1, Early Start; Study#1, Late Start; Study#2, Early Start; Study#2, Late Start Why not? •  Save columns of simulated iterations •  Recombine as needed Simulation of common factors •  Preserves relationships Estimates of …
  • 61. Probability Management Other people already doing it Further research: Primary source for rest of presentation: Savage, Scholtes and Zweidler, 2006, "Probability Management," OR/MS Today, Vol.33, No.1 (February 2006) •  http://www.orms-today.org/orms-2-06/frprobability.html (Part 2) •  http://www.orms-today.org/orms-4-06/frprobability.html
  • 62. Basic idea (diagram) Simulations of common factors → Dependent Simulations → Reporting & Analysis Programs → Estimates of …
  • 63. Basic idea Multiple simulations: •  Different platforms •  Different sources •  Different uses •  Database of Simulation Results •  Results at the iteration level •  Coherent (diagram: Simulations → Reporting & Analysis Programs)
  • 64. Basic Definitions SIP: Stochastic Information Packet •  Basic unit of information •  E.g., “the price of oil”, but for 10,000 alternative universes SLURP: Stochastic Library Unit with Relationships Preserved •  SIPs are coherent with each other –  E.g., in each SIP, iteration #4567 is from the same alternative universe •  Analogous to demographic “Representative Samples”
  • 65. Basic Definitions Benefits of coherent modeling: •  Statistical dependencies are modeled consistently across the organization •  Models can be “rolled up” between levels of the organization •  Auditability: Easier to audit individual simple models Requires central control: •  Common standards •  Certification authority –  “Chief Probability Officer”
  • 66. Coherence Example: variables X & Y •  Coherent •  But not correlated Requires central control: •  Common standards •  Certification authority –  “Chief Probability Officer”
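Coherence can be illustrated with plain lists: each SIP is a column of simulated values, and a SLURP is a set of SIPs in which position i always refers to the same trial. The sketch below uses a made-up oil-price example (all coefficients are hypothetical) to show what is lost when SIPs are combined without keeping the iterations aligned.

```python
import random
import statistics


def build_slurp(n_trials, seed=None):
    """Return a dict of SIPs; index i in every SIP comes from the same trial."""
    rng = random.Random(seed)
    slurp = {"oil_price": [], "fuel_cost": [], "hedge_payoff": []}
    for _ in range(n_trials):
        oil = rng.lognormvariate(4.6, 0.2)  # common factor, roughly 100 per barrel
        slurp["oil_price"].append(oil)
        slurp["fuel_cost"].append(2.0 * oil + rng.gauss(0.0, 5.0))
        slurp["hedge_payoff"].append(1.5 * (oil - 100.0) + rng.gauss(0.0, 5.0))
    return slurp


slurp = build_slurp(10000, seed=64)

# Coherent combination: element by element, so each trial keeps its own oil price
net_coherent = [c - h for c, h in zip(slurp["fuel_cost"], slurp["hedge_payoff"])]

# Incoherent combination: same values, but the trial alignment is destroyed
shuffled = slurp["hedge_payoff"][:]
random.Random(0).shuffle(shuffled)
net_shuffled = [c - h for c, h in zip(slurp["fuel_cost"], shuffled)]

sd_coherent = statistics.pstdev(net_coherent)
sd_shuffled = statistics.pstdev(net_shuffled)
print(f"SD of net cost: coherent = {sd_coherent:.1f}, shuffled = {sd_shuffled:.1f}")
```

The hedge only reduces risk in the coherent combination, because only there do the cost and the payoff respond to the same simulated oil price.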
  • 67. DIST Standard How to Store SIPs? •  Massive amounts of data How to Share SIPs? Reduce precision and pack it! XML •  10,000 numbers → 1 XML string •  Metadata + Base 64 encoding of values Contents: •  Name •  Mean, Min, Max, Count of values •  Data type (Binary, 1 or 2 Byte) Base 64: 3 bytes (8 bits each) into 4 characters (6 bits each)
  • 68. DIST Standard •  A SIP in DIST fits into 1 cell on a spreadsheet <dist name="User Interface, weeks" avg="3.3751" min="2.03" max="7.75" count="100" type="Double" origin="DistShaper3 at smpro.ca" ver="1.1" >G00Z9SIDCIEmC0nYFtMi6R0XKZ +KvSzBI85ui5tMZgoDlbGt dF1d/ CqEMwUlmCfVMMg6oUByUXQyIATsaSw1QhgrhOwaaAI9D 6oks9M+IDk0XQyIDlI2mhJZBkQXRnm7IR45ST3D/// IDlgrHD I38VraK2kLownZf41jWw1tROxTsS/ jGRAUJCbwHfwougAAEXR r3A83FQnpnhXukBxM +kswBykeb0gOQ5RByk83PxtV7mCrH1QQ jy6LPGstpgFYRrYKvqZ9Ez8AAAAA</dist> •  Each cell contains an array •  Operations apply functions to each element in array Source: Marc Thibault, Sam Savage. Probability Management for Projects: Managing Uncertainty in plan estimates and targets. October 2011
  • 69. Supporting Software MS Excel Spreadsheet Add-ins •  Risk Solver from Frontline Systems (www.Solver.com) •  XLSim 3 (www.VectorEconomics.com) –  small (single sheet) interactive simulation with DISTs –  enables the users of Oracle Crystal Ball and @Risk from Palisade Corp. to read and write DISTs. •  Analytica from Lumina Decision Systems, Inc (www.Lumina.com) SAS? R/S+ -- already vector oriented •  RExcel runs R from Excel. ??
  • 70. R/S+
      x1 <- rnorm(10000)     # an array of 10,000 standard random normals
      y1 <- rpois(10000, 5)  # an array of 10,000 random Poissons (mean 5)
      (x1 + y1)[1:10]        # element-by-element operations; show the first 10
    •  Already handles vectors – very fast •  Needs functions to encode & decode DIST Accessing R from within a spreadsheet? •  RExcel – Access R from within Excel (Add-in) •  ROOo – Access R from within OpenOffice spreadsheet •  Open Source (like Linux) •  (Perhaps) use spreadsheet for upper level simulation •  Use R at lower level – each cell contains 1000’s of simulated values
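The "reduce precision and pack it" idea can be sketched with the Python standard library. The code below is a DIST-like illustration only, not an implementation of the DIST 1.1 specification: the 2-byte linear scaling between min and max, the big-endian byte order, the `type` label, and the attribute set are all assumptions made here, and strings it produces should not be expected to interoperate with DIST-aware tools.

```python
import base64
import random
import struct
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def encode_sip(name, values):
    """Pack values into a DIST-like XML string (2 bytes per value, Base64)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    codes = [round((v - lo) / span * 65535) for v in values]  # reduce precision
    packed = struct.pack(f">{len(codes)}H", *codes)           # 2 bytes per value
    payload = base64.b64encode(packed).decode("ascii")        # 3 bytes -> 4 chars
    avg = sum(values) / len(values)
    return (f"<dist name={quoteattr(name)} avg=\"{avg!r}\" min=\"{lo!r}\" "
            f"max=\"{hi!r}\" count=\"{len(values)}\" type=\"2Byte\">{payload}</dist>")


def decode_sip(xml_string):
    """Recover the (reduced-precision) values from a string made by encode_sip."""
    elem = ET.fromstring(xml_string)
    lo, hi = float(elem.get("min")), float(elem.get("max"))
    count = int(elem.get("count"))
    span = (hi - lo) or 1.0
    codes = struct.unpack(f">{count}H", base64.b64decode(elem.text))
    return [lo + c / 65535 * span for c in codes]


rng = random.Random(68)
sip = [rng.lognormvariate(1.2, 0.3) for _ in range(100)]  # hypothetical SIP
cell = encode_sip("User Interface, weeks", sip)           # fits in one cell
restored = decode_sip(cell)
```

The round trip is lossy by design: each restored value is within one quantization step, (max − min) / 65535, of the original.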
  • 71. Probability Management Savage, Scholtes and Zweidler, 2006, "Probability Management," OR/MS Today, Vol.33, No.1 (February 2006) •  http://www.orms-today.org/orms-2-06/frprobability.html (Part 2) •  http://www.orms-today.org/orms-4-06/frprobability.html
  • 72. The End (Actual – not simulated)