Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Successfully reported this slideshow.

Like this presentation? Why not share!

- The AI Rush by Jean-Baptiste Dumont 309993 views
- AI and Machine Learning Demystified... by Carol Smith 3346890 views
- 10 facts about jobs in the future by Pew Research Cent... 530882 views
- 2017 holiday survey: An annual anal... by Deloitte United S... 868794 views
- Harry Surden - Artificial Intellige... by Harry Surden 487912 views
- Inside Google's Numbers in 2017 by Rand Fishkin 1075808 views

52 views

Published on

Presented as part of ResBaz 2018 in Sydney.

Published in:
Data & Analytics

No Downloads

Total views

52

On SlideShare

0

From Embeds

0

Number of Embeds

1

Shares

0

Downloads

0

Comments

0

Likes

3

No embeds

No notes for slide

- 1. The Grammar of Graphics Darya Vanichkina
- 2. … a personal commentary/interpretation of … http://vita.had.co.nz/papers/layered-grammar.html
- 3. Grammar “the fundamental principles or rules of an art or science” (OED Online 1989)
- 4. Simple case
- 5. Simple case
- 6. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
- 7. Exploring the diamonds data
- 8. Relationship between table (width of top of diamond relative to widest point) and price? diamonds %>% ggplot(aes(x = table, y = price)) + geom_point()
- 9. Relationship between carat (weight) and price? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point()
- 10. Relationship between carat (weight) and price? [How many diamonds?] diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(alpha = 0.1)
- 11. Relationship between carat (weight) and price? [Lots of small diamonds!] diamonds %>% ggplot(aes(x = carat, y = price)) + geom_hex()
- 12. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
- 13. Relationship between clarity and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, y = price)) + geom_boxplot()
- 14. How many diamonds with each clarity? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, fill = clarity)) + geom_bar()
- 15. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 16. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 17. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_violin()
- 18. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
- 19. Relationship between price and carat grouped by cut and clarity? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity)
- 20. Relationship between price and carat grouped by cut and clarity (with trend)? diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity) + geom_smooth(method = "lm")
- 21. Relationship between table parameter (width of top of diamond relative to widest point, converted on the ﬂy into a factor) and price, faceted by cut? diamonds %>% mutate(TableIsSmall = factor(table < median(table)))%>% ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut)
- 22. Relationship between table parameter (width of top of diamond relative to widest point, converted on the ﬂy into a factor) and price, faceted by cut? diamonds %>% mutate(TableIsSmall = factor(table < median(table)))%>% ggplot(aes(x = carat, y = price, colour = TableIsSmall)) + geom_point() + facet_grid(.~cut) Most diamonds with a small table (TableIsBig == FALSE) have an “Ideal” cut…
- 23. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
- 24. Relationship between cut and price? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 25. Relationship between cut and price? [log-scale] Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() Worse ——————————————————> Better diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()
- 26. How many diamonds with each clarity? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, fill = clarity))+geom_bar()
- 27. How many diamonds with each clarity? Worse ——————————————————> Better diamonds %>% ggplot(aes(x = clarity, fill = clarity))+geom_bar() diamonds %>% ggplot(aes(x = clarity, fill = clarity)) +geom_bar(width=1) + coord_polar()
- 28. Components of the grammar Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact Adapted from DataCamp
- 29. Finishing up with the diamonds data diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = “lm") diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10()
- 30. Finishing up with the diamonds data diamonds %>% ggplot(aes(x = carat, y = price)) + geom_point() + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") + theme_bw() diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() + theme_minimal()
- 31. Just changing the themediamonds %>% ggplot(aes(x = carat, y = price)) + geom_point(col = "red") + facet_grid(cut~clarity, scales = "free_x") + geom_smooth(method = "lm") + theme_black() diamonds %>% ggplot(aes(x = cut, y = price)) + geom_boxplot()+ scale_y_log10() + theme_dark()
- 32. The Grammar of Graphics Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact
- 33. Limitations
- 34. Limitations https://www.andrewheiss.com/blog/2017/08/10/exploring-minards-1812-plot-with-ggplot2/
- 35. The Grammar of Graphics Data Variables of interest Aesthetics x-axis y-axis colour ﬁll size labels alpha shape line width line type Geometries point line histogram bar boxplot Facets columns rows Statistics binning smoothing descriptive inferential Coordinates cartesian ﬁxed polar limits Themes Not data, but important for overall impact https://github.com/dvanic/18ResBazPresentation

No public clipboards found for this slide

Be the first to comment