The Lesser Known Stars of the Tidyverse

These are my slides for my RStudio::conf presentation on February 2, 2018. Recording of the talk will be available soon.

Published in: Data & Analytics

  1. The Lesser Known Stars of the Tidyverse Emily Robinson @robinson_es
  2. About Me - Data Analyst at Etsy - R User for ~6 years - Enjoy talking about: • A/B Testing • Building and finding Data Science community • R
  3. Disclaimers
  4. This talk represents my own views, not those of Etsy
  5. It’s not Base R vs. Tidyverse
  6. Talk Goals
  7. 1. Keep you hip to the lingo
  8. 2. Stop you from doing this ...
  9. … by sharing useful functions
  10. 3. Point you to resources
  11. The Tidyverse
  12. An opinionated collection of R packages designed for data science that share an underlying design philosophy, grammar, and data structures
  13. Tidyverse ?=
  14. Tidyverse !=
  15. Tibble
  16. Tidyverse != Hadleyverse
  17. Tidyverse != Hadleyverse Many other contributors
  18. Demo
  19. Step 1: print your dataset! Problem: it takes over the console
  20. Solution: as_tibble(). Prints only 10 rows and the columns that fit on the screen
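
As a rough illustration of Step 1, the sketch below converts a base data frame to a tibble. The built-in `mtcars` dataset and the `model` column name are stand-ins chosen for this example; they are not from the talk.

```r
library(tibble)

# mtcars is a built-in base data frame (32 rows, 11 columns); printing it
# as-is dumps every row into the console.
cars_df <- mtcars

# as_tibble() keeps the data but switches to the compact tibble print
# method. Row names are dropped unless `rownames` names a column for them.
cars_tbl <- as_tibble(cars_df, rownames = "model")

# Printing now shows only the first 10 rows and the columns that fit.
print(cars_tbl)
```
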
  21. Step 2: examine your NAs. Problem: your NAs aren’t actually NAs
  22. Solution: na_if() to replace certain values with NA
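
A minimal sketch of the `na_if()` idea, assuming a toy survey in which empty strings and the sentinel value -99 stand in for missing answers. The `survey` data and its column names are invented for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical survey data: "" and -99 are placeholders for "no answer".
survey <- tibble(
  respondent = 1:4,
  job_title  = c("Data Analyst", "", "Engineer", ""),
  age        = c(34, -99, 27, 41)
)

# na_if(x, y) returns x with every value equal to y replaced by NA.
cleaned <- survey %>%
  mutate(
    job_title = na_if(job_title, ""),
    age       = na_if(age, -99)
  )

print(cleaned)
```
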
  23. Step 3: examine your numeric columns. Problem: how can I do this quickly? Solution: dplyr::select_if() + skimr::skim()
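
One way Step 3 might look in practice, using the built-in `iris` dataset as a stand-in for the survey data shown in the talk:

```r
library(dplyr)
library(skimr)

# select_if() keeps only the columns for which the predicate returns TRUE.
# (Recent dplyr versions mark it as superseded by select(where(is.numeric)),
# but it still works.)
numeric_cols <- iris %>%
  select_if(is.numeric)

# skim() prints per-column summary statistics, including missing counts.
skim(numeric_cols)
```
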
  24. Step 4: examine a single column. Problem: it has multiple answers in each row
  25. Solution: stringr::str_split() …
  26. Solution: stringr::str_split() and tidyr::unnest()
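
The split-then-unnest pattern can be sketched as follows. The `answers` table and its comma-separated `tools` column are hypothetical, made up to mimic a multiple-answer survey question.

```r
library(dplyr)
library(stringr)
library(tidyr)
library(tibble)

# Hypothetical multi-select answers stored as one comma-separated string.
answers <- tibble(
  respondent = 1:3,
  tools      = c("R,Python", "SQL", "R,SQL,Excel")
)

one_per_row <- answers %>%
  # str_split() turns each string into a character vector (a list-column).
  mutate(tools = str_split(tools, ",")) %>%
  # unnest() expands the list-column so each answer gets its own row.
  unnest(tools)

print(one_per_row)
```

With one answer per row, the column can be counted or plotted like any other variable, for example with `count(one_per_row, tools)`.
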
  27. Step 5: make a scatterplot! Problem: it’s a mess
  28. Solution: fct_reorder() to order one axis by the other: ggplot(WorkChallenges, aes(x = fct_reorder(question, perc_problem), y = perc_problem)) + geom_point()
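
To make the `fct_reorder()` call above self-contained, here is a sketch with a made-up stand-in for `WorkChallenges`. The column names follow the slide, but the values are placeholders, not the talk's data.

```r
library(ggplot2)
library(forcats)
library(tibble)

# Placeholder data with the same column names as the slide's WorkChallenges.
work_challenges <- tibble(
  question     = c("Challenge A", "Challenge B", "Challenge C"),
  perc_problem = c(0.2, 0.5, 0.3)
)

# fct_reorder() sorts the levels of `question` by ascending perc_problem,
# so the points line up from smallest to largest instead of alphabetically.
reordered <- fct_reorder(work_challenges$question, work_challenges$perc_problem)
levels(reordered)

p <- ggplot(work_challenges,
            aes(x = fct_reorder(question, perc_problem), y = perc_problem)) +
  geom_point()
# print(p) draws the chart.
```
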
  29. Step 6: make a bar chart! Problem: your scale is mis-ordered
  30. Solution: fct_relevel() to manually order your factor: ggplot(aes(x = fct_relevel(response, "Rarely", "Sometimes", "Often", "Most of the time"))) + geom_bar()
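
A runnable sketch of the `fct_relevel()` step. The `responses` table is hypothetical, and the data is passed to `ggplot()` explicitly here, whereas the slide's call presumably received it through a pipe.

```r
library(ggplot2)
library(forcats)
library(tibble)

# Hypothetical survey responses on a frequency scale.
responses <- tibble(
  response = c("Often", "Rarely", "Most of the time",
               "Sometimes", "Often", "Rarely")
)

# By default the levels would be alphabetical; fct_relevel() moves the
# named levels to the front in the order given.
releveled <- fct_relevel(responses$response,
                         "Rarely", "Sometimes", "Often", "Most of the time")
levels(releveled)

p <- ggplot(responses,
            aes(x = fct_relevel(response, "Rarely", "Sometimes",
                                "Often", "Most of the time"))) +
  geom_bar()
# print(p) draws the bars in the order of the scale.
```
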
  31. Final step: do something cool and new! Problem:
  32. One solution: make a minimal reproducible example
  33. Part 0 (optional): use tribble() to make a toy dataset
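
For Part 0, a toy dataset built with `tribble()` might look like the sketch below. The column names and values are invented for illustration.

```r
library(tibble)

# tribble() lays the data out row by row: column names are given as
# ~formulas on the first line, then one line per row of values.
toy <- tribble(
  ~respondent, ~response,
  1,           "Rarely",
  2,           "Often",
  3,           "Sometimes"
)

print(toy)
```

Because the definition reads like the table it creates, it is easy to paste into a question or issue as the data for a reproducible example.
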
  34. Part 1: Use reprex() to find any problems Credit: Nick Tierney, https://www.njtierney.com/post/2017/01/11/magic-reprex/
  35. Part 2: Use reprex() to post your question or issue Credit: Nick Tierney, https://www.njtierney.com/post/2017/01/11/magic-reprex/
  36. Review: stringr::str_split, tidyr::unnest, forcats::fct_reorder, forcats::fct_relevel, reprex::reprex, tibble::as_tibble, tibble::tribble, dplyr::na_if, dplyr::select_if, skimr::skim
  37. Resources
  38. R4ds.had.co.nz
  39. #rstats Twitter
  40. #rstats Twitter
  41. Datacamp.com
  42. Base R to Tidyverse Translation www.significantdigits.org/2017/10/switching-from-base-r-to-tidyverse/
  43. Tidyverse.org; community.rstudio.com/c/tidyverse; https://www.rstudio.com/resources/cheatsheets/; https://medium.com/@kierisi/r4ds-the-next-iteration-d51e0a1b0b82; and much more!
  44. Come for the stickers and package names … Stay for the friendly community and happy workflow. The tidyverse
  45. Thank You! tiny.cc/rstudiotalk robinsones.github.io @robinson_es
