# Kernel regression with Python

### Transcript

• 1. PRML復々習レーン (PRML re-review study group) #9 LT: Kernel regression with Python, by __youki__
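• 2. Before getting to the main topic
  • For those who find PRML's explanation of kernel methods hard to follow: “Nonparametric Econometrics: A Primer”, Jeffrey S. Racine, Foundations and Trends® in Econometrics, Vol. 3, No. 1 (2008), DOI: 10.1561/0800000009. It is in English, but very easy to understand!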
• 3. Outline
  • Kernel Regression with instant code
    – Kernel regression with local constant estimator
      • 1-D kernel regression
      • 2-D kernel regression
  • Kernel Regression with statsmodels
    – Kernel regression with local linear estimator
      • 1-D kernel regression
      • 2-D kernel regression
• 4. Kernel Regression with Instant Code Implementation of local constant estimator
• 5. Implementation of local constant estimator
  • y: Local Constant Estimator (see PRML, Eq. 6.45)

        import numpy as np

        def get_local_constant_estimator(h, X, Y, x):
            # Kernel-weighted average of the targets Y at each query point x[i]
            y = np.empty(x.shape[0])
            for i in range(x.shape[0]):
                K = get_gpke(h, X, x[i])
                y[i] = (Y * K).sum() / K.sum()
            return y

  • g: The Generalized Product Kernel Density Estimator

        def get_gpke(h, X, x):
            # X has shape (n, q), x has shape (q,); one kernel per dimension, then the product
            K = np.empty(X.shape)
            for j in range(len(x)):
                K[:, j] = get_gaussian_kernel(h, X[:, j], x[j])
            gpke = K.prod(axis=1) / h ** len(x)
            return gpke

  • k: Gaussian Kernel for continuous variables

        def get_gaussian_kernel(h, X, x):
            return (np.sqrt(2 * np.pi) ** -1) * np.exp(-.5 * ((X - x) / h) ** 2)
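Written out, the three functions on this slide correspond to the formulas below. The notation is an assumption made here, not taken from the slides: n training points X_i in R^q with targets Y_i, a query point x, and a single scalar bandwidth h shared by all dimensions, as in the code.

```latex
\hat{y}(x) = \frac{\sum_{i=1}^{n} Y_i \, K_h(X_i, x)}{\sum_{i=1}^{n} K_h(X_i, x)},
\qquad
K_h(X_i, x) = \frac{1}{h^{q}} \prod_{j=1}^{q} k\!\left(\frac{X_{ij} - x_j}{h}\right),
\qquad
k(z) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{z^{2}}{2}\right)
```

Because the factor 1/h^q and the Gaussian normalising constant appear in both numerator and denominator, they cancel in the estimate itself; they only matter when the product kernel is used as a density estimate.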
• 6. Kernel regression for 1-D and 2-D data: 1-D sine function; 2-D mixture of Gaussians
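The demo data is not included in the transcript, so the following is only a minimal sketch of what such an experiment might look like. The sample sizes, noise levels, bandwidths, and the helper names `local_constant` and `bumps` are all assumptions chosen for illustration, and the estimator is a vectorised variant of the local constant estimator rather than the slide's exact code.

```python
import numpy as np

def local_constant(h, X, Y, x):
    """Local constant estimate at queries x (m, q) from data X (n, q), Y (n,)."""
    # Scaled differences for every (query, training) pair: shape (m, n, q)
    z = (x[:, None, :] - X[None, :, :]) / h
    # Product Gaussian kernel over dimensions; constant factors cancel in the ratio
    K = np.exp(-0.5 * (z ** 2).sum(axis=2))
    return (K @ Y) / K.sum(axis=1)

rng = np.random.default_rng(0)

# 1-D: noisy sine function (assumed noise level and sample size)
X1 = rng.uniform(0.0, 2.0 * np.pi, size=(200, 1))
Y1 = np.sin(X1[:, 0]) + rng.normal(scale=0.1, size=200)
x1 = np.linspace(0.5, 2.0 * np.pi - 0.5, 50)[:, None]
y1 = local_constant(0.3, X1, Y1, x1)

# 2-D: noisy sum of two Gaussian bumps (an assumed stand-in for the mixture)
def bumps(P):
    a = np.exp(-0.5 * ((P - np.array([-1.0, -1.0])) ** 2).sum(axis=1))
    b = np.exp(-0.5 * ((P - np.array([1.0, 1.0])) ** 2).sum(axis=1))
    return a + b

X2 = rng.uniform(-3.0, 3.0, size=(500, 2))
Y2 = bumps(X2) + rng.normal(scale=0.05, size=500)
g = np.linspace(-2.0, 2.0, 15)
x2 = np.array([[u, v] for u in g for v in g])
y2 = local_constant(0.4, X2, Y2, x2)

print("1-D mean abs error:", np.abs(y1 - np.sin(x1[:, 0])).mean())
print("2-D mean abs error:", np.abs(y2 - bumps(x2)).mean())
```

The bandwidths here are picked by hand; a larger value smooths more and flattens the peaks, a smaller one follows the noise.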
• 7. DEMO with instant python code
• 8. Kernel Regression with statsmodels local linear estimator and bandwidth estimation
• 9. statsmodels?
  • Statistics in Python
    – http://statsmodels.sourceforge.net/devel/
  • A data analysis tool based on statistical models
    – Linear Regression
    – Generalized Linear Models
    – Robust Linear Models
    – Regression with Discrete Dependent Variable
    – Time Series analysis
    – Statistics
    – Nonparametric Methods (kernel regression included!)
    – Generalized Method of Moments
    – Empirical Likelihood
  • It is still far behind R, but simple tasks can be done with it!
  • For machine learning, the library to use is scikit-learn rather than statsmodels.
• 10. Kernel regression in statsmodels Additional functionalities
• 11. Estimator: how does the local linear estimator work? Edge bias occurs in LC (local constant)
• 12. Estimator: how does the local linear estimator work? Bad effect in LL (local linear)
• 13. Local Constant vs. Local Linear
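The figures for these slides are not in the transcript, so here is a small sketch of the edge-bias comparison under assumed settings. It is not the statsmodels implementation: `local_linear_1d` is a hypothetical helper that fits a kernel-weighted straight line at each query point with `np.linalg.lstsq`, and the data is a noise-free line so that any error is pure estimator bias.

```python
import numpy as np

def gaussian_weights(h, X, x0):
    """Gaussian kernel weights of 1-D training inputs X around the point x0."""
    return np.exp(-0.5 * ((X - x0) / h) ** 2)

def local_constant_1d(h, X, Y, x):
    """Kernel-weighted average of Y at each query point."""
    return np.array([np.average(Y, weights=gaussian_weights(h, X, x0)) for x0 in x])

def local_linear_1d(h, X, Y, x):
    """Kernel-weighted straight-line fit at each query point, evaluated there."""
    y = np.empty(len(x))
    for i, x0 in enumerate(x):
        sw = np.sqrt(gaussian_weights(h, X, x0))
        # Design matrix [1, X - x0]: the fitted intercept is the estimate at x0
        A = np.column_stack([np.ones_like(X), X - x0])
        beta, *_ = np.linalg.lstsq(A * sw[:, None], Y * sw, rcond=None)
        y[i] = beta[0]
    return y

# Noise-free straight line on [0, 1]
X = np.linspace(0.0, 1.0, 101)
Y = 2.0 * X + 1.0
x = np.array([0.0, 0.5, 1.0])  # left edge, interior, right edge

lc = local_constant_1d(0.1, X, Y, x)
ll = local_linear_1d(0.1, X, Y, x)
print("true          :", 2.0 * x + 1.0)
print("local constant:", lc)
print("local linear  :", ll)
```

At the edges the local constant estimator can only average over points on one side, which pulls the estimate toward the interior; the local linear fit extrapolates the local slope and so reproduces a straight line at the boundary as well.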
• 14. Bandwidth optimization: how do AIC & CV work?
  • Plug-In: Ruppert, D., S. J. Sheather, and M. P. Wand (1995), ‘An effective bandwidth selector for local least squares regression’. Journal of the American Statistical Association 90, 1257–1270.
  • AIC & CV: Hurvich, C. M., J. S. Simonoff, and C. L. Tsai (1998), ‘Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion’. Journal of the Royal Statistical Society Series B 60, 271–293.
  • Besides the above, there are also empirical methods for computing the bandwidth: “Nonparametric Econometrics: A Primer”, p. 43
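To make the CV part concrete, the sketch below selects a bandwidth by least-squares leave-one-out cross-validation for the 1-D local constant estimator. It is an illustration under assumptions: the data, the candidate grid, and the helper name `loo_cv_score` are made up here, a plain grid search stands in for a numerical optimiser, and the improved AIC criterion of Hurvich et al. is not shown.

```python
import numpy as np

def loo_cv_score(h, X, Y):
    """Mean squared leave-one-out prediction error of the local constant estimator."""
    # Pairwise Gaussian kernel weights between all 1-D training inputs
    K = np.exp(-0.5 * ((X[:, None] - X[None, :]) / h) ** 2)
    np.fill_diagonal(K, 0.0)  # leave each point out of its own prediction
    pred = (K @ Y) / K.sum(axis=1)
    return np.mean((Y - pred) ** 2)

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0.0, 2.0 * np.pi, size=200))
Y = np.sin(X) + rng.normal(scale=0.2, size=200)

# Hypothetical candidate grid; a numerical optimiser could replace the grid search
grid = np.linspace(0.05, 2.0, 40)
scores = np.array([loo_cv_score(h, X, Y) for h in grid])
h_cv = grid[scores.argmin()]
print("bandwidth chosen by leave-one-out CV:", h_cv)
```

Leaving each point out of its own prediction is what keeps the score from always favouring the smallest bandwidth: without it, the estimate at a training point would collapse onto that point's own target as h shrinks.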
• 15. Kernel regression for 1-D and 2-D data using statsmodels-0.5.0: 1-D sine function; 2-D mixture of Gaussians
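The demo script itself is not part of the transcript. As a rough sketch of the 1-D case, the following uses the `KernelReg` class from `statsmodels.nonparametric.kernel_regression`; the data, sample size, and noise level are assumptions, and the call is written against the interface as documented in current statsmodels releases rather than checked against 0.5.0 specifically.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, size=80)
Y = np.sin(X) + rng.normal(scale=0.1, size=80)

# var_type="c": one continuous regressor; reg_type="ll": local linear estimator;
# bw="cv_ls": bandwidth chosen by least-squares cross-validation
model = KernelReg(endog=Y, exog=X, var_type="c", reg_type="ll", bw="cv_ls")

x_new = np.linspace(0.0, 2.0 * np.pi, 25)
mean, mfx = model.fit(x_new)  # fitted values and marginal effects (local slopes)

print("selected bandwidth:", model.bw)
print("mean abs error vs sin:", np.abs(mean - np.sin(x_new)).mean())
```

For the 2-D case the same class should accept an `exog` array with two columns and `var_type="cc"`; `reg_type="lc"` switches to the local constant estimator and `bw="aic"` to the AIC-based bandwidth selection.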
• 16. DEMO with statsmodels