Outline   Why R?   How to find out about stuff?   What is Machine Learning?   Show me the money   Learn More   Questions



Machine Learning with R


  1. Machine Learning with R. Joshua Reich, josh@i2pi.com, April 2, 2009.
  2. Outline
     • Why R?
     • How to find out about stuff?
     • What is Machine Learning?
     • Show me the money
     • Learn More
     • Questions
  3. ML Alternatives
     • Matlab
     • Weka
     • Python
     • Stand-alone tools (e.g. Vowpal Wabbit)
  4. "The best thing about R is that it was developed by statisticians. The worst thing about R is that it was developed by statisticians." –Bo Cowgill, Google (at SF R Meetup)
  5. Why R?
     • Working with the CLI: iterative discovery
     • Integrated graphics
     • Community-supported packages (CRAN)
     • ODBC integration
     • You already use it
  6. How to find out about stuff?
     • ?function
     • help.search("search string")
     • RSiteSearch("search string")
     • http://rseek.org/
     • names(object) or attributes(object)
     • Type a function's name at the prompt to print its definition:
       > kmeans
       function (x, centers, iter.max = 10, nstart = 1, algorithm = c("Hartigan-Wong", "Lloyd", "Forgy", "MacQueen")) ...
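The lookup tools listed on this slide can be tried on a fitted object. The following is a minimal sketch, assuming the built-in iris data as a stand-in (it is not the data used in the talk); it uses only base R.

```r
# Fit a small k-means model on a built-in dataset.
# (iris is an assumption chosen for illustration.)
set.seed(1)
fit <- kmeans(iris[, 1:4], centers = 3)

# args() shows a function's signature without printing its whole body.
print(args(kmeans))

# names() lists the components stored in the fitted object.
fit_names <- names(fit)
print(fit_names)

# attributes() also reports the class, which tells you which methods apply.
fit_class <- attributes(fit)$class
print(fit_class)

# These open a help viewer or a browser, so they are left commented out:
# ?kmeans
# help.search("clustering")
# RSiteSearch("kmeans")
```

Once the component names are known, individual pieces such as fit$centers or fit$cluster can be inspected directly.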
  7. What is Machine Learning?
     Statistics           Machine Learning
     Probability model    Learning model
     Observations         Observations
     Estimation           Training
     MLE                  Optimization
  8. Semantics v. Pragmatics
     • For most statistical models there are either closed-form solutions or quick numerical approximations for model properties such as confidence intervals. If you believe that your model accurately captures the data-generating process, you can make direct statements about unseen events.
     • Machine learning is a close cousin of non-parametric techniques and relies on training/testing/validation cycles, bootstrapping and cross-validation to determine measures of reliability.
     "But invariably, simple models and a lot of data trump more elaborate models based on less data." –Halevy, Norvig & Pereira
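The contrast on this slide can be made concrete with a single statistic. The sketch below, on simulated data (the sample size, mean and standard deviation are illustrative assumptions), computes a closed-form confidence interval for a mean and then a bootstrap percentile interval that relies only on resampling.

```r
set.seed(42)

# Simulated observations; the distribution parameters are assumptions.
x <- rnorm(200, mean = 5, sd = 2)

# Model-based route: a closed-form t interval, valid if the model is believed.
ci_model <- t.test(x, conf.level = 0.95)$conf.int

# Resampling route: redraw the data with replacement many times and
# recompute the mean each time (a bootstrap percentile interval).
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))
ci_boot <- quantile(boot_means, c(0.025, 0.975))

print(ci_model)
print(ci_boot)
```

On well-behaved data like this the two intervals should be close; the bootstrap becomes more interesting for statistics that have no convenient closed form.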
  9. Inductive Bias
     In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.
     "What are you doing?", asked Minsky.
     "I am training a randomly wired neural net to play Tic-tac-toe", Sussman replied.
     "Why is the net wired randomly?", asked Minsky.
     "I do not want it to have any preconceptions of how to play", Sussman said.
     Minsky then shut his eyes.
  10. Inductive Bias
      "Why do you close your eyes?" Sussman asked his teacher.
      "So that the room will be empty."
  11. What is Machine Learning?
      • Regression vs. Classification: $Y \in \mathbb{R}^p$ vs. $Y \in \{Y_1, Y_2, \ldots, Y_N\}$
      • Supervised vs. Unsupervised Learning: $Y = f(X)$ vs. $X_1, X_2, \ldots, X_N$
  12. General Supervised Learning Framework (Ch. 7 of ElemStatLearn)
      • Training / Validation / Test
      • Bias-Variance Decomposition: Overfitting
      • Feature Selection / Regularization
      • Bootstrapping / Cross-Validation
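The framework above can be illustrated with k-fold cross-validation on a toy regression problem. This is a minimal sketch in base R, assuming a simulated data-generating process (a noisy sine curve) and polynomial regression as the model family; neither comes from the talk.

```r
set.seed(7)

# Simulated data; the true function and noise level are assumptions.
n <- 120
x <- runif(n, -2, 2)
y <- sin(2 * x) + rnorm(n, sd = 0.3)
dat <- data.frame(x = x, y = y)

# Assign every row to one of k folds at random.
k <- 5
fold <- sample(rep(1:k, length.out = n))

degrees <- 1:10
train_mse <- numeric(length(degrees))
cv_mse <- numeric(length(degrees))

for (d in degrees) {
  # Training error: fit on all rows and score on the same rows.
  full <- lm(y ~ poly(x, d), data = dat)
  train_mse[d] <- mean(residuals(full)^2)

  # Cross-validated error: fit on k - 1 folds, score on the held-out fold.
  errs <- numeric(k)
  for (i in 1:k) {
    fit <- lm(y ~ poly(x, d), data = dat[fold != i, ])
    pred <- predict(fit, newdata = dat[fold == i, ])
    errs[i] <- mean((dat$y[fold == i] - pred)^2)
  }
  cv_mse[d] <- mean(errs)
}

print(round(cbind(degree = degrees, train_mse, cv_mse), 3))
best_degree <- degrees[which.min(cv_mse)]
print(best_degree)
```

Training error can only fall as the degree grows, because the models are nested, so it cannot be used to choose the degree. The held-out error is the quantity that reveals overfitting.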
  13. What we will walk through
      • K-Means clustering: kmeans()
      • K-Nearest Neighbours: knn()
      • Regression Trees: rpart()
      • Improving trees with PCA: princomp()
      • Linear Discriminant Analysis: lda()
      • Support Vector Machines: svm()
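Three of the functions in this list can be called as follows. This is a sketch rather than the talk's own walkthrough: it assumes the built-in iris data, an arbitrary 100/50 train/test split, and k = 5 for the nearest-neighbour classifier. knn() comes from the class package and lda() from MASS, both of which ship with standard R distributions.

```r
library(class)  # provides knn()
library(MASS)   # provides lda()

set.seed(123)

# iris stands in for the talk's data set (an assumption for illustration).
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
test <- iris[-idx, ]

# Unsupervised: k-means sees only the four measurements, never Species.
km <- kmeans(iris[, 1:4], centers = 3, nstart = 20)
print(table(cluster = km$cluster, species = iris$Species))

# Supervised: k-nearest neighbours, with k = 5 chosen arbitrarily.
knn_pred <- knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 5)
knn_acc <- mean(knn_pred == test$Species)

# Supervised: linear discriminant analysis fitted on the training rows.
lda_fit <- lda(Species ~ ., data = train)
lda_pred <- predict(lda_fit, test)$class
lda_acc <- mean(lda_pred == test$Species)

print(c(knn = knn_acc, lda = lda_acc))
```

The cluster numbers returned by kmeans() are arbitrary labels, so the table is read by looking at which species dominates each cluster rather than by matching numbers to names.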
  14. The Problem
      [Figure: scatter plot of points labelled A, B and C in the x-y plane; x axis ticks from −5 to 10, y axis ticks from −10 to 15.]
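A three-class problem of this kind can be simulated to try the tree-based steps of the walkthrough. The sketch below is built on assumptions: the class centres and spreads are invented to loosely resemble an A/B/C scatter plot and are not the talk's data. It fits a classification tree with rpart(), refits it on principal component scores from princomp(), and, only if the e1071 package happens to be installed, fits a support vector machine with svm().

```r
library(rpart)  # provides rpart()

set.seed(2009)
n <- 300

# Hypothetical helper: one Gaussian blob per class (centres are assumptions).
make_class <- function(label, cx, cy) {
  data.frame(x = rnorm(n, cx, 2), y = rnorm(n, cy, 2), class = label)
}
dat <- rbind(make_class("A", 0, 0), make_class("B", 6, 2), make_class("C", 4, 9))
dat$class <- factor(dat$class)

idx <- sample(nrow(dat), 600)
train <- dat[idx, ]
test <- dat[-idx, ]

# Classification tree on the raw coordinates (axis-aligned splits).
tree <- rpart(class ~ x + y, data = train, method = "class")
tree_acc <- mean(predict(tree, test, type = "class") == test$class)

# Same tree on principal component scores; the rotation is estimated
# from the training rows only and then applied to the test rows.
pca <- princomp(train[, c("x", "y")])
train_pc <- data.frame(predict(pca, train[, c("x", "y")]), class = train$class)
test_pc <- data.frame(predict(pca, test[, c("x", "y")]), class = test$class)
tree_pc <- rpart(class ~ Comp.1 + Comp.2, data = train_pc, method = "class")
pc_acc <- mean(predict(tree_pc, test_pc, type = "class") == test_pc$class)

# Support vector machine, skipped (left as NA) if e1071 is not installed.
svm_acc <- NA
if (requireNamespace("e1071", quietly = TRUE)) {
  svm_fit <- e1071::svm(class ~ x + y, data = train)
  svm_acc <- mean(predict(svm_fit, test) == test$class)
}

print(c(tree = tree_acc, tree_pca = pc_acc, svm = svm_acc))
```

Whether the PCA rotation helps depends on the data: it tends to matter when class boundaries run obliquely to the original axes, which these simulated blobs do not guarantee.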
  15. Machine Learning More
      • MachineLearning CRAN view: http://cran.r-project.org/web/views/MachineLearning.html
      • The caret package is a good one.
      • Elements of Statistical Learning: http://www-stat.stanford.edu/~tibs/ElemStatLearn/
      • Machine Learning (Mitchell): http://www.cs.cmu.edu/~tom/mlbook.html
      • Video Lectures: http://videolectures.net/
  16. Questions?