Introduction of Stan
@Teito Nakagawa
#TokyoBUGS 1st
29 September 2013
  1. Introduction of Stan, @Teito Nakagawa, #TokyoBUGS 1st, 29 September 2013
  2. INDEX
      • Motivation
      • What Is Stan?
      • How to Install It (Windows)
      • Grammar of Stan
      • Rats Data Model
      • Execution from Command Line
      • Execution from R: RStan
      • Reference
  4. Motivation: As an analyst, I work with SMALL DATA: census, report, and deficit data.
  5. Motivation: But the requirements are BIG. I must build a model. I must explain many things.
  6. Motivation: That is why I started to learn BUGS. BUT IT TAKES A LOT OF TIME.
  7. Motivation: So I started to learn Stan.
  9. What Is Stan?
      • What Is Stan?
      • Who Develops Stan?
      • Sample Code of Stan
      • Execution of Stan
  10. What Is Stan?
      • “Stan is a package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo.” (Official site: http://mc-stan.org/)
      – Similar to BUGS but more procedural
      – Still under active development
      – Fast: models are compiled to an executable file
      – Easy to use: it has an R interface
      – Fast convergence: Hamiltonian Monte Carlo and NUTS
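To give a feel for what "Hamiltonian Monte Carlo" means, here is a minimal sketch in base R. It is not Stan's implementation and it is not NUTS: it uses a fixed step size and a fixed number of leapfrog steps, whereas NUTS chooses the trajectory length adaptively. The target (a standard normal), the function names, and the tuning values `eps` and `n_leap` are all assumptions made for illustration.

```r
# Toy target: standard normal, log p(q) = -q^2 / 2 (illustrative choice)
log_p <- function(q) -0.5 * sum(q^2)
grad_log_p <- function(q) -q

# One HMC transition with a fixed number of leapfrog steps (not NUTS)
hmc_step <- function(q, eps = 0.1, n_leap = 20) {
  p <- rnorm(length(q))                           # resample the momentum
  h0 <- -log_p(q) + 0.5 * sum(p^2)                # Hamiltonian at the start
  q_new <- q
  p_new <- p + 0.5 * eps * grad_log_p(q_new)      # half step for momentum
  for (i in seq_len(n_leap)) {
    q_new <- q_new + eps * p_new                  # full step for position
    if (i < n_leap) {
      p_new <- p_new + eps * grad_log_p(q_new)    # full step for momentum
    }
  }
  p_new <- p_new + 0.5 * eps * grad_log_p(q_new)  # final half step
  h1 <- -log_p(q_new) + 0.5 * sum(p_new^2)        # Hamiltonian at the end
  if (log(runif(1)) < h0 - h1) q_new else q       # Metropolis accept/reject
}

set.seed(1)
draws <- numeric(2000)
q <- 0
for (s in seq_along(draws)) {
  q <- hmc_step(q)
  draws[s] <- q
}
```

The sample mean and standard deviation of `draws` should be close to 0 and 1. The gradient of the log density drives every update, which is why Stan compiles the model to C++ and differentiates it automatically.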
  11. Who Develops Stan?
      • Andrew Gelman and his team, Jiqiang Guo, and Marcus Brubaker
      [Photos of the developers]
  12. Sample Code of Stan
      – Similar to BUGS but more procedural
      # http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/Vol1.pdf
      # Page 3: Rats
      data {
        int<lower=0> N;
        int<lower=0> T;
        real x[T];
        real y[N,T];
        real xbar;
      }
      ...
      model {
        mu_alpha ~ normal(0, 100);
        mu_beta ~ normal(0, 100);
        sigmasq_y ~ inv_gamma(0.001, 0.001);
      From https://github.com/stan-dev/stan/tree/master/src/models/bugs_examples/vol1/rats
  13. Execution of Stan
      – Fast: models are compiled to an executable file
      1. stanc: translates the Stan program to C++
      2. make: compiles the resulting C++ to an executable
      3. exe: runs the Stan program
      The details are discussed later.
      >\bin\stanc --name=rats --o=rats.cpp .\rats.stan
      >make src/models/bugs_examples/vol1/rats/rats
      >.\rats --data=rats.data.R --init=rats.init.R
  15. How to Install It (Windows)
      1. Environment
      2. Install Rtools
      3. Install RStan
      4. Install Stan
      5. Build Stan
  16. 1. Environment
      • I tested the installation and the model runs below on my PC:
      • Windows 8 64-bit
      • Intel(R) Core(TM) i7-2600 CPU, 3.4 GHz
      • 4 cores
      • 8 threads
      • 12.0 GB memory
      • R 3.0.1
      • Rtools 3.1
      • Stan 1.3.0
      • RStan 1.3.0
  17. 2. Install Rtools
      • Rtools is a collection of resources for building packages for R under Microsoft Windows.
      • g++ is installed by Rtools.
      • Download the installer and run it.
        – http://cran.r-project.org/bin/windows/Rtools/
      • Check the installation notes on the official site; in most cases you can install it by just clicking “Next”.
      [Installation screenshot]
  18. 3. Install RStan
      • RStan is a library for using Stan from R.
      • It is not registered on CRAN (as of RStan 1.3.0).
      • You can install it by running the following script from R.
        – The script is a modified version of the one at https://code.google.com/p/stan/wiki/RStanGettingStarted#Install_Rstan
  19. 3. Install RStan
      # Install the additional packages
      install.packages('inline')
      install.packages('Rcpp')
      # Check that Rcpp works: if it does, "hello world" is printed
      library(inline)
      library(Rcpp)
      src <- '
        std::vector<std::string> s;
        s.push_back("hello");
        s.push_back("world");
        return Rcpp::wrap(s);'
      hellofun <- cxxfunction(body = src, includes = '', plugin = 'Rcpp', verbose = FALSE)
      cat(hellofun(), '\n')
      # Install rstan from source
      Sys.setenv(R_MAKEVARS_USER='')
      options(repos = c(getOption("repos"), rstan = "http://wiki.stan.googlecode.com/git/R"))
      install.packages('rstan', type = 'source')
      # Load rstan
      library(rstan)
  20. 4. Install Stan
      To use Stan from the command line, install Stan itself with the following steps.
      1. Download the tar file stan-src-1.m.p.tgz
        – Download site: https://code.google.com/p/stan/downloads/list
      2. Unpack the file in the Documents directory with the following command
        – tar is already available on Windows if Rtools has been installed.
      > tar --no-same-owner -xzf stan-src-1.m.p.tgz
  21. 5. Build Stan
      Build Stan once after installing it.
      1. Make the library
      >cd <stan-home>
      >make bin/libstan.a
      2. Make the model parser and code generator
      >cd <stan-home>
      >make bin/stanc
      * <stan-home> is the directory created by the previous tar command.
      The build results are placed in <stan-home>/bin.
  23. Grammar of Stan
      1. Grammar of Stan
      2. Blocks
      3. Data Types
      4. Scope of Variables
  24. Stan program …
      • A Stan program defines a statistical model through conditional probability.
      • A Stan program consists of variable type declarations and statements.
      • A Stan program has specific blocks.
      • A Stan program can deal with various variable types.
      • A Stan program is different from a BUGS program.
  25. A Stan program consists of variable type declarations and statements.
      rats_vec.stan:
      data {
        int<lower=0> N;
        int<lower=0> T;
        real x[T];
        real y[N,T];
        real xbar;
      }
      transformed data {
        real x_minus_xbar[T];
        real y_linear[N*T];
        for (t in 1:T)
          x_minus_xbar[t] <- x[t] - xbar;
        …
      • Block: data { ... } and transformed data { ... } are blocks.
      • Variable type declaration: defines a variable.
      • Statement: assignments, sampling, loops, and conditions.
  26. A Stan program has specific blocks.
      • Skeletal Stan program
      • The order must be kept.
      • Blocks are optional, except for the model block.
      data {
        ... declarations ...
      }
      transformed data {
        ... declarations ... statements ...
      }
      parameters {
        ... declarations ...
      }
      transformed parameters {
        ... declarations ... statements ...
      }
      model {
        ... declarations ... statements ...
      }
      generated quantities {
        ... declarations ... statements ...
      }
      [Diagram annotations: Order, Scope]
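The two rules on this slide (fixed block order, mandatory model block) can be checked mechanically. The following base R sketch scans a Stan program held in a string; `block_patterns` and `check_block_order` are hypothetical helper names, not part of Stan or RStan, and the regular expressions assume each block header starts on its own line.

```r
# Block headers in the order the slide requires (patterns are an assumption:
# each header is expected at the start of a line, followed by "{")
block_patterns <- c(
  "data"                   = "(?m)^\\s*data\\s*\\{",
  "transformed data"       = "(?m)^\\s*transformed\\s+data\\s*\\{",
  "parameters"             = "(?m)^\\s*parameters\\s*\\{",
  "transformed parameters" = "(?m)^\\s*transformed\\s+parameters\\s*\\{",
  "model"                  = "(?m)^\\s*model\\s*\\{",
  "generated quantities"   = "(?m)^\\s*generated\\s+quantities\\s*\\{"
)

# Hypothetical helper: reports whether a model block exists and whether
# the blocks that are present appear in the required order
check_block_order <- function(code) {
  pos <- sapply(block_patterns,
                function(p) as.integer(regexpr(p, code, perl = TRUE)))
  present <- pos[pos > 0]            # regexpr returns -1 when not found
  list(has_model = pos[["model"]] > 0,
       in_order  = !is.unsorted(present))
}

ok  <- "data {\n  int N;\n}\nparameters {\n  real mu;\n}\nmodel {\n  mu ~ normal(0, 1);\n}\n"
bad <- "model {\n  mu ~ normal(0, 1);\n}\nparameters {\n  real mu;\n}\n"

res_ok  <- check_block_order(ok)
res_bad <- check_block_order(bad)
```

Here `res_ok$in_order` is TRUE, while `res_bad$in_order` is FALSE because the model block comes before the parameters block. stanc performs the real check when it parses the program; this sketch only illustrates the rule.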
  27. A Stan program has specific blocks.
      • Data: the given input data. Executed first, when the data are loaded.
      • Transformed data: transforms data variables for convenience.
      • Parameters: the parameters that are output as results. Updated on every iteration.
  28. A Stan program has specific blocks.
      • Transformed parameters: transforms parameters for convenience.
      • Model: the model itself. Write it according to what you want to describe.
      • Generated quantities: generates derived quantities, for example for monitoring convergence.
  29. A Stan program can deal with various variable types.
      [Table of variable types, from http://stan.googlecode.com/files/stan-reference-1.3.0.pdf]
  30. A Stan program can deal with various variable types.
      • Scalar
        – int is a 32-bit integer. Upper and lower constraints are allowed.
          e.g. int N; int<lower=0,upper=1> cond;
        – real is a 64-bit numeric value.
          e.g. real<lower=0> sigma; real<lower=-1,upper=1> rho;
      • Vector Data Types
        – Only real values are allowed.
        – Vector: a general (column) vector.
          e.g. vector<lower=0>[3] u;
        – Unit simplex: for categorical or multinomial data; a vector of non-negative values that sum to 1.
          e.g. simplex[5] theta;
  31. A Stan program can deal with various variable types.
      • Vector Data Types
        – Unit vector: a vector with a norm of one.
          e.g. unit_vector[5] theta;
        – Ordered vector: most often employed as cut points in ordered logistic regression models.
          e.g. ordered[5] c;
        – Positive, ordered vector:
          e.g. positive_ordered[5] d;
        – Row vector: different from vector; Stan distinguishes between row and column vectors.
          e.g. row_vector<lower=-1,upper=1>[10] u;
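The constrained vector types above are easiest to remember as predicates. The sketch below writes them as base R checks; the function names and the tolerance `tol` are illustrative assumptions and are not Stan or RStan functions. Stan enforces these constraints itself when it reads data and when it samples parameters.

```r
# Hypothetical helpers mirroring the constrained vector types
is_simplex <- function(v, tol = 1e-8) {
  all(v >= 0) && abs(sum(v) - 1) < tol       # non-negative, sums to 1
}
is_unit_vector <- function(v, tol = 1e-8) {
  abs(sqrt(sum(v^2)) - 1) < tol              # Euclidean norm equals 1
}
is_ordered <- function(v) {
  all(diff(v) > 0)                           # strictly increasing
}
is_positive_ordered <- function(v) {
  all(v > 0) && is_ordered(v)                # increasing and all positive
}

theta <- c(0.2, 0.3, 0.5)     # valid for simplex[3]
u     <- c(0.6, 0.8)          # valid for unit_vector[2]
cuts  <- c(-1.5, 0, 2.3)      # valid for ordered[3], not positive_ordered[3]
```

For example, `is_simplex(theta)` and `is_ordered(cuts)` hold, while `is_positive_ordered(cuts)` fails because the first cut point is negative.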
  32. A Stan program can deal with various variable types.
      • Matrix Data Types
        – Matrix: a general matrix.
          e.g. matrix<upper=0>[3,4] B;
        – Correlation matrix: symmetric and positive definite with a unit diagonal; entries lie between -1 and 1.
          e.g. corr_matrix[3] Sigma;
        – Covariance matrix: symmetric and positive definite.
          e.g. cov_matrix[K] Omega;
      • Array Data Types
        – Arrays are declared by enclosing the dimensions in square brackets following the name of the variable.
        – An array’s elements may be any of the basic data types.
          e.g. cov_matrix[5] mu[2,3,4];
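The difference between cov_matrix and corr_matrix can be made concrete with a small base R sketch. The helper names and the example matrix are assumptions for illustration; `cov2cor()` is the standard function from R's stats package.

```r
# Hypothetical helper: symmetric and positive definite
is_cov_matrix <- function(m, tol = 1e-8) {
  is.matrix(m) && nrow(m) == ncol(m) &&
    isSymmetric(m, tol = tol) &&
    all(eigen(m, symmetric = TRUE, only.values = TRUE)$values > 0)
}

# Hypothetical helper: a covariance matrix whose diagonal is all ones
is_corr_matrix <- function(m, tol = 1e-8) {
  is_cov_matrix(m, tol) && all(abs(diag(m) - 1) < tol)
}

Omega <- matrix(c(4.0, 1.2,
                  1.2, 1.0), nrow = 2)   # variances 4 and 1, covariance 1.2
Sigma <- cov2cor(Omega)                  # rescale to unit diagonal
```

`Omega` passes `is_cov_matrix()` but not `is_corr_matrix()`. After rescaling, `Sigma` has off-diagonal entries 1.2 / (2 * 1) = 0.6 and passes both checks.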
  34. Rats Data Model
      1. Rats Data
      2. Rats Model
  35. Rats Data
      • The Rats data and its model are included in WinBUGS Examples Volume I. (http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/Vol1.pdf)
      • The original article is Gelfand et al. (1990).
      • Weights of young rats, measured weekly and used for a hierarchical model.
      • Rows: individual rats (N=30)
      • Columns: measurement days (M=5)
      From http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/Vol1.pdf
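  36. Rats Model
      • Hierarchical regression model that accounts for individual and time differences.
      $Y_{ij} \sim \mathrm{Normal}(\alpha_i + \beta_i (x_j - \bar{x}), \tau)$
      $Y_{ij}$: observed data
      $x_j$: days
      $\bar{x}$: median of days (22)
      $\alpha_i$: effect of individual $i$
      $\beta_i$: effect of day for individual $i$
      $\tau$: precision of the Normal distribution
      From http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/Vol1.pdf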
  37. Rats Model
      model {
        mu_alpha ~ normal(0, 100);
        mu_beta ~ normal(0, 100);
        sigmasq_y ~ inv_gamma(0.001, 0.001);
        sigmasq_alpha ~ inv_gamma(0.001, 0.001);
        sigmasq_beta ~ inv_gamma(0.001, 0.001);
        alpha ~ normal(mu_alpha, sigma_alpha); // vectorized
        beta ~ normal(mu_beta, sigma_beta); // vectorized
        for (n in 1:N)
          for (t in 1:T)
            y[n,t] ~ normal(alpha[n] + beta[n] * (x[t] - xbar), sigma_y);
      }
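One way to read the model block is as a recipe for generating data. The base R sketch below simulates from the same hierarchy and evaluates the likelihood line. All parameter values are invented for illustration and are not fitted results; the ages in `x` are the ones used in the BUGS Rats example and are an assumption here, since the slides only state the median of 22. Note that Stan's normal() takes a standard deviation, as does R's `rnorm()`, whereas BUGS parameterizes the normal by its precision.

```r
set.seed(42)
N      <- 30                      # rats
n_time <- 5                       # measurement days (T in the Stan code)
x      <- c(8, 15, 22, 29, 36)    # ages in days (assumed from the BUGS example)
xbar   <- 22

# Illustrative hyperparameter values (assumptions, not estimates)
mu_alpha <- 240; sigma_alpha <- 15
mu_beta  <- 6;   sigma_beta  <- 0.5
sigma_y  <- 6

# alpha ~ normal(mu_alpha, sigma_alpha); beta ~ normal(mu_beta, sigma_beta)
alpha <- rnorm(N, mu_alpha, sigma_alpha)
beta  <- rnorm(N, mu_beta, sigma_beta)

# Mean of y[n,t] is alpha[n] + beta[n] * (x[t] - xbar): an N x n_time matrix
mu <- outer(alpha, rep(1, n_time)) + outer(beta, x - xbar)

# y[n,t] ~ normal(mu[n,t], sigma_y)
y <- matrix(rnorm(N * n_time, mean = mu, sd = sigma_y), nrow = N)

# Log-likelihood contribution of the nested loops in the model block
loglik <- sum(dnorm(y, mean = mu, sd = sigma_y, log = TRUE))
```

Stan works in the opposite direction: given `y`, it samples `alpha`, `beta`, and the hyperparameters from their posterior distribution.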
  39. Execution from Command Line
      • Execution of Stan
      • stanc
      • make
      • execution
  40. Execution of Stan
      1. stanc: translates the Stan program to C++
      2. make: compiles the resulting C++ to an executable
      3. exe: runs the Stan program
      >\bin\stanc --name=rats --o=rats.cpp .\rats.stan
      >make src/models/bugs_examples/vol1/rats/rats
      >.\rats --data=rats.data.R --init=rats.init.R
  41. stanc
      • The model translation program stanc converts a .stan file to a .cpp file.
      USAGE: stanc [options] <model_file>
        --name=<string>   Model name (default = "$model_filename_model")
        --o=<file>        Output file for generated C++ code (default = "$name.cpp")
      >\bin\stanc --name=rats --o=rats.cpp .\rats.stan
  42. make
      • We can compile the generated .cpp file with the make command.
      >make src/models/bugs_examples/vol1/rats/rats
  43. execution
      • We can run the Stan sampler by executing the generated .exe file.
      USAGE: .\src\models\bugs_examples\vol1\rats\rats [options]
      OPTIONS:
        --data=<file>      Read data from specified dump-format file (required if model declares data)
        --init=<file>      Use initial values from specified file or zero values if <file>=0 (default is random initialization)
        --samples=<file>   File into which samples are written (default = samples.csv)
        --append_samples   Append samples to existing file if it exists (does not write header)
        --seed=<int>       Random number generation seed (default = randomly generated from time)
        --chain_id=<int>   Markov chain identifier (default = 1)
        --iter=<+int>      Total number of iterations, including warmup (default = 2000)
        --thin=<+int>      Period between saved samples after warm up (default = max(1, floor(iter - warmup) / 1000))
        --refresh=<int>    Period between samples updating progress report print (0 for no printing) (default = max(1,iter/200))
      >.\rats --data=rats.data.R --init=rats.init.R
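The default values of --thin and --refresh depend on --iter, so it can help to work them out for a given run length. The base R sketch below does the arithmetic; `stan_cli_defaults` is a hypothetical helper, the thinning formula is read as floor((iter - warmup) / 1000), and the warmup default of half the iterations is an assumption that the usage text above does not state.

```r
# Hypothetical helper reproducing the documented default formulas.
# Assumption: warmup defaults to half of iter (not stated in the usage text).
stan_cli_defaults <- function(iter = 2000, warmup = iter %/% 2) {
  list(
    thin    = max(1, floor((iter - warmup) / 1000)),  # period between saved samples
    refresh = max(1, iter / 200)                      # period between progress reports
  )
}

small <- stan_cli_defaults()               # the default run of 2000 iterations
large <- stan_cli_defaults(iter = 100000)  # a long run
```

Under these assumptions the default run keeps every post-warmup draw (thin = 1) and reports progress every 10 iterations, while a run of 100000 iterations keeps every 50th draw and reports every 500 iterations, so roughly 1000 draws are saved in both cases.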
  45. Execution from R
      • RStan
      • Execution from R
      • plot(stanfit)
      • traceplot(stanfit)
      • Fit using a previous model
      • Parallel execution from R
  46. RStan
      • RStan is an interface to Stan.
        – It translates Stan code, compiles the C++ code, and runs the model from R.
        – It provides visualization functions for Stan results (the stanfit class).
      Architecture of RStan:
        Stan code -> stanc() -> C++ code -> stan_model() -> exe -> sampling() -> S4: stanfit -> plot(), traceplot(), extract()
        stan() covers the whole chain from Stan code to stanfit in one call.
  47. Execution from R
      # Set this to the directory which contains the source files
      STAN_HOME <- "<STAN_HOME>"  # replace with your own path
      dirpath <- paste0(STAN_HOME, "/include/stansrc/models/bugs_examples/vol1/rats")
      # Load the data into a list: dat
      source(paste0(dirpath, "/rats.data.R"))
      dat <- list(y=y, x=x, xbar=xbar, N=N, T=T)
      # fit1: run the model as a one-liner
      fit1 <- stan(file = paste0(dirpath, "/rats.stan"), data = dat, iter = 1000, chains = 4)
      # fit2: run the model step by step
      # translate the Stan code to C++ code
      rt <- stanc(file = paste0(dirpath, "/rats.stan"), model_name="stan", verbose=TRUE)
      # compile the C++ code for the model
      sm <- stan_model(stanc_ret = rt, verbose = FALSE)
      # run the sampler
      fit2 <- sampling(sm, data = dat, chains = 4, iter=1000)
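The file rats.data.R read by `source()` above is plain R code in dump format, the same kind of file that the command-line --data option expects. The base R sketch below writes such a file for a toy data set and loads it back. The values are placeholders and not the Rats data, and whether Stan's own reader accepts every construct that base `dump()` emits is not checked here; RStan's `stan_rdump()` is intended for writing files that Stan reads.

```r
# Toy data with the same variable names as the Rats example (placeholder values)
toy <- list2env(list(
  N    = 2L,
  T    = 3L,
  x    = c(8, 15, 22),
  xbar = 15,
  y    = matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
))

# Write the variables as R assignments, one per variable
data_file <- tempfile(fileext = ".data.R")
dump(ls(toy), file = data_file, envir = toy)

# Load them back into a separate environment, then build the data list
loaded <- new.env()
source(data_file, local = loaded)
dat_toy <- mget(c("y", "x", "xbar", "N", "T"), envir = loaded)
```

Sourcing into a separate environment avoids overwriting `T`, which is R's shorthand for TRUE, in the global workspace. The slide's version sources the file directly, which does overwrite it.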
  48. plot(stanfit): We can check the estimated value and R-hat of each parameter.
  49. traceplot(stanfit): We can trace each chain.
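R-hat compares the variation between chains with the variation within chains; values near 1 suggest that the chains agree. The base R sketch below computes the classic Gelman-Rubin version from a matrix of draws. It is a simplified illustration and not necessarily the exact variant RStan reports; the function name `rhat` and the simulated chains are assumptions.

```r
# Classic potential scale reduction factor for one parameter.
# draws: matrix with one row per iteration and one column per chain.
rhat <- function(draws) {
  n <- nrow(draws)
  B <- n * var(colMeans(draws))          # between-chain variance
  W <- mean(apply(draws, 2, var))        # mean within-chain variance
  var_plus <- (n - 1) / n * W + B / n    # pooled variance estimate
  sqrt(var_plus / W)
}

set.seed(7)
mixed <- matrix(rnorm(4000), ncol = 4)             # four chains that agree
stuck <- mixed + rep(c(0, 0, 0, 5), each = 1000)   # fourth chain shifted by 5

rhat_mixed <- rhat(mixed)
rhat_stuck <- rhat(stuck)
```

`rhat_mixed` is close to 1, while `rhat_stuck` is well above 2 because one chain sits in a different region. This is the kind of disagreement that traceplot() makes visible.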
  50. Fit using a previous model
      Once a model is fitted, we can use the fitted result as an input to fit the model with other data or settings. This saves the time needed to compile the C++ code for the model.
      https://code.google.com/p/stan/wiki/RStanGettingStarted
      # fit again, reusing the previous fit result
      fit3 <- stan(fit=fit1, data = dat, iter = 400, chains = 4)
  51. Parallel Execution from R
      # parallel processing of chains
      library(doSNOW)
      library(foreach)
      cl <- makeCluster(4)  # change 4 to your number of CPU cores
      registerDoSNOW(cl)
      # run each chain of stan on a worker (10 chains in total)
      sflist1 <- foreach(i=1:10, .packages='rstan') %dopar% {
        stan(fit = fit1, data=dat, chains = 1, chain_id = i, refresh = -1)
      }
      # merge the chains
      f3 <- sflist2stanfit(sflist1)
      stopCluster(cl)  # release the worker processes
  52. Parallel Execution Performance
      # Timing comparison: single process vs. parallel processing
      library(doSNOW)
      library(foreach)
      timecalc <- matrix(0, nrow=4, ncol=7)
      iter <- c(1000, 3000, 5000, 10000, 30000, 50000, 100000)
      numproc <- c(1, 2, 4, 8)
      # Single process: 8 chains run one after another
      for (i in 1:7) {
        cat("p:", 1, ", iter:", iter[i], "\r\n")
        t <- proc.time()
        a <- stan(fit = fit1, data=dat, chains = 8, refresh = -1, iter=iter[i])
        timecalc[1,i] <- (proc.time()-t)["elapsed"]
      }
      # Parallel processing: 8 chains spread over 2, 4, and 8 workers
      for (p in 2:4) {
        for (i in 1:7) {
          cat("proc:", numproc[p], "iter:", iter[i], "\r\n")
          t <- proc.time()
          cl <- makeCluster(numproc[p])
          registerDoSNOW(cl)
          # run each chain of stan on a worker
          sflist1 <- foreach(k=1:8, .packages='rstan') %dopar% {
            stan(fit = fit1, data=dat, chains = 1, chain_id = k, refresh = -1, iter=iter[i])
          }
          # merge the chains
          f3 <- sflist2stanfit(sflist1)
          timecalc[p,i] <- (proc.time()-t)["elapsed"]
          stopCluster(cl)  # release the workers before the next configuration
        }
      }
  53. Performance result: a cluster of 4 workers was the fastest on my PC.
  55. Reference
      – User’s Guide and Reference Manual: grammar, differences from BUGS, and getting started (http://stan.googlecode.com/files/stan-reference-1.3.0.pdf)
      – Official site (http://mc-stan.org/)
  56. End of Slides. Stanislaw Marcin Ulam (13 April 1909 – 13 May 1984), http://en.wikipedia.org/wiki/Stanislaw_Ulam