Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Efficient statistics in R

862 views

Published on

Summary of philosophy, tips, and ideology from 'Efficient R programming' by Colin Gillespie and Robin Lovelace.
https://bookdown.org/csgillespie/efficientR/

ModelR, dplyr, and tidy vignettes were also used used.

Philosophy proposed here from those ideas is that efficient learning, planning, workflows, and package selection lead to the most parsimonious and effective statistics. Efficient coding is similar to effective stats that apply more global models with fewer tests and therefore reduced likelihood of inflated table-wide errors. Minimizing assumptions is also an efficient approach to statistics.

Published in: Data & Analytics
  • Be the first to comment

  • Be the first to like this

Efficient statistics in R

  1. 1. efficient statistics @cjlortie
  2. 2. assumption: many domain scientists use R but are not programmers
  3. 3. R is the means to an end limitation: programming is a separate discipline
  4. 4. efficient learning to code important do show tell read watch help write up
  5. 5. vignettes books stackoverflow Journal of Statistical Software
  6. 6. efficient planning of statistics before code important
  7. 7. chunk your work
  8. 8. SMART workflows specific measurable attainable realistic time-bound
  9. 9. efficient planning directly in R plan plotrix DiagrammeR Gantt charts
  10. 10. use packages: innovative & sometimes efficient
  11. 11. package selection actively developed? well documented? well used?
  12. 12. efficient set up tips monitor resources use GitHub RStudio test code (microbenchmark) update R & packages (update.packages(ask = FALSE) no ask-to-save & no restore defaults directory management
  13. 13. efficient data importing rio::import readr .Rds tip url <-“” df=read.csv(url)
  14. 14. efficient data handling dplyr: data_grid dplyr: drop_na dplyr: tally check class %>% tibbles resample_bootstrap()
  15. 15. efficient coding access underlying routines as quickly as possible fewer functions is efficient vectorize code: functions that work with all-length vectors
  16. 16. efficient statistics modelr::model_matrix stats::glm() mgcv::gam() glmnet::glmnet() MASS::rlm()
  17. 17. profile models remember likelihood
  18. 18. data_grid and mine models
  19. 19. minimize assumptions pairwise.t.test() oneway.test()

×