# PyMC mcmc

An introduction to PyMC (slides originally in Japanese)

### PyMC mcmc

1. Probabilistic programming and MCMC with PyMC, and Theano. 2014/7/12, BUGS/Stan study group #3, @xiangze750
2. Agenda
   - Bayesian modeling in Python
   - How to use PyMC
   - "Probabilistic Programming and Bayesian Methods for Hackers"
   - A PyMC blog to consult: "While My MCMC Gently Samples"
   - Integration with Theano and GPUs
   - Appendix: Theano, HMC
3. Bayesian modeling in Python
   - PyStan
   - PyMC
4. Advantages of PyMC
   - Easy to install
   - Modeling, execution, and visualization can all be done in Python.
   - Speed-up through C++ (Theano)
     - HMC, NUTS
     - GPU support?
5. Install
   - PyMC 2.3: `pip install pymc`
   - PyMC 3 (under development): `pip install git+https://github.com/pymc-devs/pymc`
   - If that does not work: `git clone https://github.com/pymc-devs/pymc`, then `cd pymc` and `python setup.py install`
6. Documents
   - User's guide: http://pymc-devs.github.io/pymc/
   - Tutorial: https://github.com/fonnesbeck/pymc_tutorial
   - Probabilistic Programming and Bayesian Methods for Hackers: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Prologue/Prologue.ipynb
7. How to use PyMC
   - Basic syntax
   - Model building
     - Stochastic variables (distributions)
     - Deterministic variables: @pm.deterministic
     - Observed variables
   - Sampling
   - Traceplot, histogram
8. Probabilistic Programming and Bayesian Methods for Hackers, a textbook on Bayesian methods that uses IPython notebooks and pymc: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/tree/master/
   1. Introduction
   2. More PyMC
   3. MCMC
   4. The Greatest Theorem Never Told (The Law of Large Numbers)
   5. Loss Functions
   6. Priorities
   7. Bayesian Machine Learning
9. How to use PyMC: variables (stochastic variables). These correspond to Stan's parameters block. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb

       import pymc as pm

       # Prior distributions
       lambda_1 = pm.Exponential("lambda_1", 1)
       lambda_2 = pm.Exponential("lambda_2", 1)
       tau = pm.DiscreteUniform("tau", lower=0, upper=10)
10. How to use PyMC: variables (stochastic variables). A custom distribution can also be defined with @pm.stochastic. Source: http://pymc-devs.github.io/pymc/modelbuilding.html#the-stochastic-class

        import numpy as np
        import pymc as pm

        @pm.stochastic(dtype=int)
        def switchpoint(value=1900, t_l=1851, t_h=1962):
            if value > t_h or value < t_l:
                return -np.inf  # invalid values
            else:
                return -np.log(t_h - t_l + 1)  # uniform log-likelihood
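
To make the return values of the custom distribution concrete, here is a minimal sketch that re-implements the same discrete uniform rule in plain Python, without the PyMC decorator. The name `switchpoint_logp` is made up for this example; only the bounds 1851 and 1962 come from the slide.

```python
import math

def switchpoint_logp(value, t_l=1851, t_h=1962):
    """Log-probability of a discrete uniform distribution on [t_l, t_h]."""
    if value < t_l or value > t_h:
        return -math.inf  # zero probability outside the support
    return -math.log(t_h - t_l + 1)  # every allowed year is equally likely

# The probabilities over the whole support should add up to one.
total = sum(math.exp(switchpoint_logp(year)) for year in range(1851, 1963))
print(switchpoint_logp(1900), switchpoint_logp(1800), total)
```

Returning the log of the probability rather than the probability itself lets a sampler add terms together instead of multiplying very small numbers.
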
11. How to use PyMC: variables. List of probability distributions: http://pymc-devs.github.io/pymc/distributions.html#chap-distributions
12. How to use PyMC: variables (deterministic variables). Put @pm.deterministic before the function definition. This corresponds to Stan's transformed parameters block. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb

        n_data_points = 5  # in CH1 we had ~70 data points

        @pm.deterministic
        def lambda_(tau=tau, lambda_1=lambda_1, lambda_2=lambda_2):
            # Switch the value of lambda at tau (a procedural description)
            out = np.zeros(n_data_points)
            out[:tau] = lambda_1  # lambda before tau is lambda_1
            out[tau:] = lambda_2  # lambda after tau is lambda_2
            return out
13. How to use PyMC: variables (observed variables). Pass observed=True. This corresponds to Stan's data block. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb

        data = np.array([10, 25, 15, 20, 35])
        obs = pm.Poisson("obs", lambda_, value=data, observed=True)
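
The deterministic rate and the observed Poisson variable together define a likelihood. The sketch below spells that likelihood out in plain Python so the pieces can be inspected by hand. It is an illustration, not PyMC code: the helper names and the trial parameter values (tau, lambda_1, lambda_2) are assumptions chosen for the example, and only the data counts come from the slide.

```python
import math

def lambda_values(tau, lambda_1, lambda_2, n_data_points):
    """Rate per data point: lambda_1 before index tau, lambda_2 from tau on."""
    return [lambda_1 if i < tau else lambda_2 for i in range(n_data_points)]

def poisson_logpmf(k, rate):
    """log P(K = k) for a Poisson distribution with the given rate."""
    return k * math.log(rate) - rate - math.lgamma(k + 1)

def log_likelihood(data, tau, lambda_1, lambda_2):
    """Sum of Poisson log-probabilities under the switching rate."""
    rates = lambda_values(tau, lambda_1, lambda_2, len(data))
    return sum(poisson_logpmf(k, r) for k, r in zip(data, rates))

data = [10, 25, 15, 20, 35]  # same counts as on the slide

# Hypothetical parameter values, used only to compare two settings
ll_switch = log_likelihood(data, tau=3, lambda_1=16.0, lambda_2=28.0)
ll_flat = log_likelihood(data, tau=0, lambda_1=16.0, lambda_2=21.0)
print(ll_switch, ll_flat)
```

With tau=0 the first rate is never used, so the second call describes a model with a single constant rate; comparing the two numbers shows how a sampler would rank the two settings.
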
14. How to use PyMC: model building (model). Pass a list of the variables defined so far. This corresponds to Stan's model block. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb

        model = pm.Model([obs, lambda_, lambda_1, lambda_2, tau])
15. How to use PyMC: sampling. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter3_MCMC/IntroMCMC.ipynb

        # Estimate initial values for MCMC
        model = pm.Model([p, assignment, taus, centers])
        map_ = pm.MAP(model)
        map_.fit()  # stores the fitted variables' values in foo.value

        # MCMC: 100000 iterations, the first 50000 discarded as burn-in
        mcmc = pm.MCMC(model)
        mcmc.sample(100000, 50000)
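
What a call such as `mcmc.sample(iter, burn)` does can be sketched with a hand-written random-walk Metropolis sampler. The code below is a simplified illustration, not PyMC's implementation. It assumes a single Poisson rate with an Exponential(1) prior and the five counts from the earlier slide; the function names, the proposal width, and the seed are all choices made for this example.

```python
import math
import random

data = [10, 25, 15, 20, 35]

def log_posterior(lam):
    """Unnormalized log-posterior: Exponential(1) prior, Poisson likelihood."""
    if lam <= 0:
        return -math.inf
    log_prior = -lam
    log_like = sum(k * math.log(lam) - lam for k in data)
    return log_prior + log_like

def metropolis(n_iter, burn, start=1.0, proposal_sd=2.0, seed=0):
    """Random-walk Metropolis; the first `burn` draws are discarded."""
    rng = random.Random(seed)
    current, current_lp = start, log_posterior(start)
    trace = []
    for i in range(n_iter):
        proposal = current + rng.gauss(0.0, proposal_sd)
        proposal_lp = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn:
            trace.append(current)
    return trace

trace = metropolis(n_iter=20000, burn=5000)
posterior_mean = sum(trace) / len(trace)
print(len(trace), posterior_mean)
```

For this conjugate setup the exact posterior is a Gamma distribution with shape 106 and rate 6, so the printed mean can be compared against 106/6. The burn-in matters because the chain starts at 1.0, far from where most of the posterior mass lies.
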
16. How to use PyMC: histogram, random. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb

        samples = [lambda_1.random() for i in range(20000)]
        plt.hist(samples, bins=70, normed=True, histtype="stepfilled")
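
The same idea, drawing from a prior and looking at the histogram, can be tried without PyMC or Matplotlib. The sketch below uses the standard library's `random.expovariate` as a stand-in for `lambda_1.random()` and prints a crude text histogram. The `histogram` helper, the bin layout, and the seed are assumptions made for this example.

```python
import random

rng = random.Random(42)

# Exponential draws with rate 1, standing in for lambda_1.random()
samples = [rng.expovariate(1.0) for _ in range(20000)]

def histogram(values, bins, upper):
    """Count values in equal-width bins over [0, upper); larger values are dropped."""
    width = upper / bins
    counts = [0] * bins
    for v in values:
        idx = int(v / width)
        if idx < bins:
            counts[idx] += 1
    return counts

counts = histogram(samples, bins=10, upper=5.0)
sample_mean = sum(samples) / len(samples)

# One row per bin; each '#' stands for 200 samples
for i, c in enumerate(counts):
    print(f"{i * 0.5:.1f}-{(i + 1) * 0.5:.1f}: {'#' * (c // 200)}")
print("mean:", sample_mean)
```

The bars should fall off steadily from left to right, and the sample mean should be close to 1, the mean of an exponential distribution with rate 1.
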
17. How to use PyMC: traceplot (PyMC 3 syntax). Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb

        with pm.Model() as model:
            x = pm.Normal('x', mu=0., sd=1)
            # here, shape is telling us it's a vector rather than a scalar.
            y = pm.Normal('y', mu=pm.exp(x), sd=2., shape=(ndims, 1))
            # shape is inferred from zdata
            z = pm.Normal('z', mu=x + y, sd=.75, observed=zdata)

        with model:
            start = pm.find_MAP()
            step = pm.NUTS()
            trace = pm.sample(3000, step, start)

        pm.traceplot(trace)
18. How to use PyMC: traceplot. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb
19. Other examples. Source: http://nbviewer.ipython.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/MorePyMC.ipynb
    - Bayesian A/B testing
    - Cheating among students
    - Challenger Space Shuttle Disaster
20. Notable PyMC blog: While My MCMC Gently Samples, http://twiecki.github.io/
21. Notable PyMC blog: repos at https://github.com/twiecki
    - mpi4py_map: provides a simple map() interface to mpi4py that allows easy parallelization of function evaluations over sequential input.
    - CythonGSL: Cython interface for the GNU Scientific Library (GSL).
22. Generalized linear models (GLM). Code (excerpt) from http://twiecki.github.io/blog/2013/08/27/bayesian-glms-2/

        with pm.Model() as model_robust:
            family = pm.glm.families.T()
            pm.glm.glm('y ~ x', data, family=family)
            trace_robust = pm.sample(2000, pm.NUTS(), progressbar=False)

        plt.figure(figsize=(5, 5))
        plt.plot(x_out, y_out, 'x')
        pm.glm.plot_posterior_predictive(trace_robust, label='posterior predictive regression lines')
        plt.plot(x, true_regression_line, label='true regression line', lw=3., c='y')
        plt.legend()
23. Example: hierarchical linear model. Radon concentration in homes. Source: http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
    - 85 counties
    - 2 to 116 measurements
24. Example: hierarchical linear model. Source: http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
    - Pooling of measurements
      - A single parameter θ shared by all locations
25. Example: hierarchical linear model. Source: http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
    - Unpooled measurements
      - The parameter θ differs for each location c.
26. Example: hierarchical linear model. Source: http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
    - Partial pooling: Hierarchical Regression
      - Hyperparameters μ, σ
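
The difference between the three pooling schemes can be seen on a toy example. The sketch below is an illustration under strong simplifying assumptions: normal data with a known noise standard deviation, a normal population distribution with a known standard deviation, and the grand mean used in place of the hyperparameter μ. The data, the helper names, and the two sigma values are all hypothetical; a real hierarchical model would infer μ and σ instead of fixing them.

```python
# Hypothetical toy data: a few measurements per group (not the radon data)
groups = {
    "A": [1.2, 0.8, 1.0, 1.4, 1.1, 0.9],
    "B": [2.5],
    "C": [0.4, 0.6],
}

def mean(xs):
    return sum(xs) / len(xs)

# Complete pooling: one shared parameter for all groups
all_values = [x for xs in groups.values() for x in xs]
pooled = mean(all_values)

# No pooling: one independent parameter per group
unpooled = {g: mean(xs) for g, xs in groups.items()}

def partial_pool(xs, mu, sigma_y, sigma_group):
    """Precision-weighted average of the group mean and the population mean mu."""
    w_data = len(xs) / sigma_y ** 2      # weight of the group's own data
    w_prior = 1.0 / sigma_group ** 2     # weight of the population-level mean
    return (w_data * mean(xs) + w_prior * mu) / (w_data + w_prior)

# Assumed values for illustration; a hierarchical model would infer them
sigma_y, sigma_group = 0.5, 0.5
partial = {g: partial_pool(xs, pooled, sigma_y, sigma_group)
           for g, xs in groups.items()}

for g in groups:
    print(g, round(unpooled[g], 3), round(partial[g], 3), round(pooled, 3))
```

Group B has a single measurement, so its partially pooled estimate moves a long way toward the shared mean, while group A, with six measurements, barely moves.
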
27. Example: hierarchical linear model, code (excerpt) from http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/

        with pm.Model() as hierarchical_model:
            # Hyperparameters (mean and standard deviation)
            mu_a = pm.Normal('mu_alpha', mu=0., sd=100**2)
            sigma_a = pm.Uniform('sigma_alpha', lower=0, upper=100)
            mu_b = pm.Normal('mu_beta', mu=0., sd=100**2)
            sigma_b = pm.Uniform('sigma_beta', lower=0, upper=100)

            # Intercept and slope for each county, normally distributed
            a = pm.Normal('alpha', mu=mu_a, sd=sigma_a, shape=n_counties)
            b = pm.Normal('beta', mu=mu_b, sd=sigma_b, shape=n_counties)

            # Model error
            eps = pm.Uniform('eps', lower=0, upper=100)

            radon_est = a[county_idx] + b[county_idx] * data.floor.values

            # Likelihood
            radon_like = pm.Normal('radon_like', mu=radon_est, sd=eps, observed=data.log_radon)
28. Example: hierarchical linear model, code (excerpt) from http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/

        # Run the model
        with hierarchical_model:
            start = pm.find_MAP()
            step = pm.NUTS(scaling=start)
            hierarchical_trace = pm.sample(2000, step, start=start, progressbar=False)
29. Example: hierarchical linear model. Source: http://twiecki.github.io/blog/2014/03/17/bayesian-glms-3/
    - In CASS, measurements were taken at only one location.
30. Example: hierarchical linear model
    - Root Mean Square Deviation
      - Individual/non-hierarchical model: 0.13
      - Hierarchical model: 0.08
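
For reference, a root mean square deviation can be computed as in the sketch below. The slide does not say which predictions and observations were compared, so the numbers here are hypothetical and only show the mechanics; `rmsd` is a helper written for this example.

```python
import math

def rmsd(predicted, observed):
    """Root mean square deviation between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical values, not taken from the radon example
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.1, 1.9, 3.2, 3.8]
model_b = [1.5, 1.5, 3.5, 3.5]

print(rmsd(model_a, observed), rmsd(model_b, observed))
```

A lower value means the predictions sit closer to the observations on average, which is the sense in which the hierarchical model's 0.08 compares favorably with 0.13.
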
31. Integration with Theano and GPUs: PyMC3 (https://github.com/pymc-devs/pymc) is under development!
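32. Appendix: About Theano
    - A library for implementing deep-learning algorithms in Python (http://deeplearning.net/software/theano/)
      - Implementations of (stacked) denoising autoencoders, RBMs, and others are published
      - Expressions you define are transformed by symbolic differentiation and then compiled
      - Computation can be parallelized on GPUs (Nvidia CUDA)
      - Internally it calls gcc and nvcc (the CUDA compiler)
      - Sample HMC implementation (http://deeplearning.net/tutorial/hmc.html)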
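
The idea of transforming a defined expression by symbolic differentiation can be illustrated with a very small expression tree in plain Python. This is a toy sketch and does not use or imitate Theano's API: the tuple encoding and the `diff` and `evaluate` functions are inventions for this example, and the result is interpreted directly rather than compiled to C or CUDA.

```python
import math

# Expressions are nested tuples:
# ("var", name), ("const", value), ("add", a, b), ("mul", a, b), ("exp", a)

def diff(expr, wrt):
    """Return the symbolic derivative of expr with respect to variable wrt."""
    kind = expr[0]
    if kind == "const":
        return ("const", 0.0)
    if kind == "var":
        return ("const", 1.0 if expr[1] == wrt else 0.0)
    if kind == "add":
        return ("add", diff(expr[1], wrt), diff(expr[2], wrt))
    if kind == "mul":  # product rule
        return ("add",
                ("mul", diff(expr[1], wrt), expr[2]),
                ("mul", expr[1], diff(expr[2], wrt)))
    if kind == "exp":  # chain rule
        return ("mul", expr, diff(expr[1], wrt))
    raise ValueError(f"unknown node: {kind}")

def evaluate(expr, env):
    """Evaluate an expression tree given variable values in env."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if kind == "mul":
        return evaluate(expr[1], env) * evaluate(expr[2], env)
    if kind == "exp":
        return math.exp(evaluate(expr[1], env))
    raise ValueError(f"unknown node: {kind}")

x = ("var", "x")
f = ("mul", x, ("exp", x))   # f(x) = x * exp(x)
df = diff(f, "x")            # the derivative is itself an expression tree

print(evaluate(f, {"x": 1.0}), evaluate(df, {"x": 1.0}))
```

Because the derivative is built as another expression rather than approximated numerically, gradient-based samplers such as HMC and NUTS can obtain exact gradients of the log-density from the model definition alone.
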
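33. Appendix: HMC (Hamiltonian Monte Carlo)
    - Hamiltonian (Hybrid) Monte Carlo
      - Defines a "momentum" and a Hamiltonian H so that moves are larger where the density is small, which makes sampling efficient.
      - Because each transition requires numerical integration, the computational cost per transition is higher than in ordinary MCMC.
    - http://xiangze.hatenablog.com/entry/2014/06/21/234930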
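
A minimal one-dimensional sketch of the method described above follows. It assumes a standard normal target, a unit-mass momentum, and leapfrog integration; the step size, the number of leapfrog steps, the starting point, and the seed are arbitrary choices for this example, and the code is not taken from Theano, PyMC, or the linked article.

```python
import math
import random

def U(q):
    """Potential energy: negative log-density of a standard normal, up to a constant."""
    return 0.5 * q * q

def grad_U(q):
    return q

def hmc_step(q, rng, step_size=0.2, n_leapfrog=10):
    """One HMC transition: draw a momentum, integrate, then accept or reject."""
    p = rng.gauss(0.0, 1.0)
    q_new, p_new = q, p
    # Leapfrog integration of Hamilton's equations
    p_new -= 0.5 * step_size * grad_U(q_new)
    for i in range(n_leapfrog):
        q_new += step_size * p_new
        if i < n_leapfrog - 1:
            p_new -= step_size * grad_U(q_new)
    p_new -= 0.5 * step_size * grad_U(q_new)
    # Hamiltonian H(q, p) = U(q) + p^2 / 2; accept based on the change in H
    h_old = U(q) + 0.5 * p * p
    h_new = U(q_new) + 0.5 * p_new * p_new
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return q_new
    return q

rng = random.Random(0)
q = 3.0
samples = []
for i in range(6000):
    q = hmc_step(q, rng)
    if i >= 1000:  # discard the first 1000 draws as burn-in
        samples.append(q)

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
print(sample_mean, sample_var)
```

Each transition evaluates the gradient once per leapfrog step plus one extra time, which is the extra cost per transition that the slide mentions; in exchange, a single transition can travel much farther than a random-walk proposal.
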
34. Appendix: HMC (Hamiltonian Monte Carlo), references
    - Notes on BDA 12.4 "Hamiltonian Monte Carlo": http://ito-hi.blog.so-net.ne.jp/2014-06-12
    - My first MCMC (hybrid Monte Carlo): http://tatsyblog.wordpress.com/2014/03/22/%E3%81%AF%E3%81%98%E3%82%81%E3%81%A6%E3%81%AEmcmc-%E3%83%8F%E3%82%A4%E3%83%96%E3%83%AA%E3%83%83%E3%83%89%E3%83%BB%E3%83%A2%E3%83%B3%E3%83%86%E3%82%AB%E3%83%AB%E3%83%AD/
    - Walkthrough of the Theano implementation: http://nbviewer.ipython.org/gist/xiangze/c2719235434bee796288