01 Intro

Hadley Wickham
Hadley WickhamRice Unversity
plyr
                       Introduction


                       Hadley Wickham
Tuesday, 7 July 2009
1. About me
                2. Active learning
                3. Resources
                4. Outline of the course




Tuesday, 7 July 2009
About me
                       New assistant professor at Rice
                       University.
                       Interested in developing tools
                       that make analysis easier -
                       focussing more on data
                       cleaning, organisation and
                       exploration than traditional
                       statistics.
                       Have written 17 R packages at
                       last count.


Tuesday, 7 July 2009
Active learning
                       Hard to absorb much information from
                       pure lecture format (especially after
                       lunch!), so we’ll be breaking things up
                       with some short interactive activities.
                       Please take 30 seconds now to introduce
                       yourself to your neighbours. You’ll be
                       working with them in a bit.


Tuesday, 7 July 2009
Resources
                       All code, data & slides available at
                       http://had.co.nz/plyr/09-user
                       Tutorial covers most of the material in
                       http://had.co.nz/plyr/plyr-intro.pdf, but
                       with more examples, and at a higher level.
                       Problems or questions? Use the
                       manipulatr mailing list:
                       http://groups.google.com/group/manipulatr


Tuesday, 7 July 2009
Outline
                       Basic strategy: split-apply-combine.
                       Exploring US baby name trends
                       Modelling Texas house sales
                       Wrap up: how it all fits together


                       Can’t go into much detail in 3 hours, but hopefully
                       this tutorial will get you off to a good start.



Tuesday, 7 July 2009
Code
                       bnames-explore.r—what we’ll work through next
                       bnames-cluster.r—expanded example showing how
                       to find cluster of similar names
                       tx-explore-houston.r—introduction to housing data
                       tx-explore-all.r—process of model building for
                       large data
                       nnet.r—fitting a neural network with many random
                       starts and varying parameters



Tuesday, 7 July 2009
Split-apply-combine
                       Split up a big dataset
                       Apply a function to each piece
                       Combine all the pieces back together


                       Keep this theme in mind as we work
                       through the examples.


Tuesday, 7 July 2009
1 of 8

More Related Content

Similar to 01 Intro(12)

Clark ch 5 and 6Clark ch 5 and 6
Clark ch 5 and 6
Christian King326 views
Smith - Developing Campus Stakeholders' Collaborations - Sept 8Smith - Developing Campus Stakeholders' Collaborations - Sept 8
Smith - Developing Campus Stakeholders' Collaborations - Sept 8
National Information Standards Organization (NISO)594 views
All About Google ToolsAll About Google Tools
All About Google Tools
Lucy Gray1.7K views
What Genes Can Teach Us on How To Learn FasterWhat Genes Can Teach Us on How To Learn Faster
What Genes Can Teach Us on How To Learn Faster
Lorenz Duremdes, Polymath38 views
UseinfopUseinfop
Useinfop
rche211 views
Risky, edgy teaching 2013Risky, edgy teaching 2013
Risky, edgy teaching 2013
Paul Smalley494 views

More from Hadley Wickham

27 development27 development
27 developmentHadley Wickham
2.8K views36 slides
27 development27 development
27 developmentHadley Wickham
1.9K views36 slides
24 modelling24 modelling
24 modellingHadley Wickham
1.4K views38 slides
Graphical inferenceGraphical inference
Graphical inferenceHadley Wickham
1.1K views34 slides
R packagesR packages
R packagesHadley Wickham
3.6K views25 slides

More from Hadley Wickham(20)

27 development27 development
27 development
Hadley Wickham2.8K views
27 development27 development
27 development
Hadley Wickham1.9K views
24 modelling24 modelling
24 modelling
Hadley Wickham1.4K views
23 data-structures23 data-structures
23 data-structures
Hadley Wickham1K views
Graphical inferenceGraphical inference
Graphical inference
Hadley Wickham1.1K views
R packagesR packages
R packages
Hadley Wickham3.6K views
22 spam22 spam
22 spam
Hadley Wickham811 views
21 spam21 spam
21 spam
Hadley Wickham1.1K views
20 date-times20 date-times
20 date-times
Hadley Wickham1.3K views
19 tables19 tables
19 tables
Hadley Wickham686 views
18 cleaning18 cleaning
18 cleaning
Hadley Wickham875 views
17 polishing17 polishing
17 polishing
Hadley Wickham748 views
16 critique16 critique
16 critique
Hadley Wickham776 views
15 time-space15 time-space
15 time-space
Hadley Wickham986 views
14 case-study14 case-study
14 case-study
Hadley Wickham573 views
13 case-study13 case-study
13 case-study
Hadley Wickham906 views
12 adv-manip12 adv-manip
12 adv-manip
Hadley Wickham999 views
11 adv-manip11 adv-manip
11 adv-manip
Hadley Wickham34 views
11 adv-manip11 adv-manip
11 adv-manip
Hadley Wickham1.2K views
10 simulation10 simulation
10 simulation
Hadley Wickham676 views

01 Intro

  • 1. plyr Introduction Hadley Wickham Tuesday, 7 July 2009
  • 2. 1. About me 2. Active learning 3. Resources 4. Outline of the course Tuesday, 7 July 2009
  • 3. About me New assistant professor at Rice University. Interested in developing tools that make analysis easier - focussing more on data cleaning, organisation and exploration than traditional statistics. Have written 17 R packages at last count. Tuesday, 7 July 2009
  • 4. Active learning Hard to absorb much information from pure lecture format (especially after lunch!), so we’ll be breaking things up with some short interactive activities. Please take 30 seconds now to introduce yourself to your neighbours. You’ll be working with them in a bit. Tuesday, 7 July 2009
  • 5. Resources All code, data & slides available at http://had.co.nz/plyr/09-user Tutorial covers most of the material in http://had.co.nz/plyr/plyr-intro.pdf, but with more examples, and at a higher level. Problems or questions? Use the manipulatr mailing list: http://groups.google.com/group/manipulatr Tuesday, 7 July 2009
  • 6. Outline Basic strategy: split-apply-combine. Exploring US baby name trends Modelling Texas house sales Wrap up: how it all fits together Can’t go into much detail in 3 hours, but hopefully this tutorial will get you off to a good start. Tuesday, 7 July 2009
  • 7. Code bnames-explore.r—what we’ll work through next bnames-cluster.r—expanded example showing how to find cluster of similar names tx-explore-houston.r—introduction to housing data tx-explore-all.r—process of model building for large data nnet.r—fitting a neural network with many random starts and varying parameters Tuesday, 7 July 2009
  • 8. Split-apply-combine Split up a big dataset Apply a function to each piece Combine all the pieces back together Keep this theme in mind as we work through the examples. Tuesday, 7 July 2009