I survey three approaches for data visualization in R: (i) the built-in base graphics functions, (ii) the ggplot2 package, and (iii) the lattice package. I also discuss some methods for visualizing large data sets.

License: CC Attribution License


- 1. A Survey of R Graphics. June 18, 2009, R Users Group of LA. Michael E. Driscoll, Principal, Dataspora. [email_address] www.dataspora.com
- 2. “The sexy job in the next ten years will be statisticians…” - Hal Varian
- 4. Hypothesis (from Jessica Hagy’s thisisindexed.com)
- 5. Munge & Model
      gdp <- read.csv('gdp.csv')
      hours <- read.csv('hours.csv')
      gdp.hours <- merge(hours, gdp)                 # join on the shared column(s)
      gdp.hours$freetime <- 4380 - gdp.hours$hours
      attach(gdp.hours)
      plot(freetime ~ gdp)
      m <- lm(freetime ~ gdp, data=gdp.hours)
      abline(m, col=3, lwd=2)                        # line width is lwd, not lw
      pm <- loess(freetime ~ gdp)
      lines(spline(gdp, fitted(pm)))
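The files gdp.csv and hours.csv are not part of this transcript, so the following is only a sketch of the same munge-and-model steps on made-up data. The `country` key column and all values are assumptions for illustration; only the constant 4380 and the sequence of calls come from the slide.

```r
# Synthetic stand-ins for gdp.csv and hours.csv (values are made up;
# the shared key column 'country' is an assumption).
set.seed(1)
gdp   <- data.frame(country = LETTERS[1:20],
                    gdp = seq(5000, 62000, length.out = 20))
hours <- data.frame(country = LETTERS[1:20],
                    hours = 2300 - 0.012 * gdp$gdp + rnorm(20, sd = 60))

# merge() joins on the columns the two frames share (here 'country')
gdp.hours <- merge(hours, gdp)
gdp.hours$freetime <- 4380 - gdp.hours$hours   # same constant as the slide

# Pass data= instead of attach() so nothing leaks into the search path
plot(freetime ~ gdp, data = gdp.hours)
m <- lm(freetime ~ gdp, data = gdp.hours)
abline(m, col = 3, lwd = 2)                    # linear fit

pm  <- loess(freetime ~ gdp, data = gdp.hours)
ord <- order(gdp.hours$gdp)                    # sort by x before drawing a line
lines(gdp.hours$gdp[ord], fitted(pm)[ord], lty = 2)
```

Using `data=` throughout avoids the `attach()` call, which is a design choice of this sketch rather than something the slide prescribes.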
- 6. Visualization
      library(ggplot2)
      qplot(gdp, freetime, data=gdp.hours, geom=c("point", "smooth"), span=1)
- 7. basic graphics
- 8. R’s Two Graphics Systems
- 9. plot() graphs objects
      plot(freetime ~ gdp, data=gdp.hours)
      model <- lm(freetime ~ gdp, data=gdp.hours)
      abline(model)
- 10. plot() graphs objects
      abline(model, col="red", lwd=3)
- 11. par sets graphical parameters
      par(pch=20, cex=5, col="#5050a0BB")   # RGB hex plus alpha blending! (BB is the alpha channel)
      help(par)
      plot(freetime ~ gdp, data=gdp.hours)
- 12. par sets graphical parameters
  - parameters for par(): pch, col, adj, srt, pt.cex
  - graphing functions: points(), text(), xlab(), legend()
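A minimal sketch of setting and then restoring graphical parameters. The built-in `cars` data set stands in for gdp.hours, and the smaller `cex` value is chosen only for this example.

```r
# par() returns the previous values of the settings it changes,
# so they can be restored afterwards.
old <- par(pch = 20, cex = 2, col = "#5050a0BB")  # last two hex digits = alpha

plot(dist ~ speed, data = cars)   # 'cars' ships with R; used as a stand-in

par(old)                          # restore the earlier settings
```

Saving the return value of `par()` keeps later plots in the same session from silently inheriting these settings.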
- 13. Paneling Graphics
  - By setting one parameter in particular, mfrow, we can partition the graphics display to give us a multiple framework in which to panel our plots, row-wise.
  - par(mfrow = c(nrow, ncol)), where nrow is the number of rows and ncol is the number of columns.
- 14. Paneling Graphics
      par(mfrow=c(2,2))
      hist(D$wg, main='Histogram', xlab='Weight Gain', ylab='Frequency', col=heat.colors(14))
      boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg,
              main='Weight Gain', ylab='Weight Gain (lbs)', xlab='Shift',
              names=c('7am','8am','9am','10am','11am','12pm'))
      plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain', xlab='Mets (min)', ylab='Weight Gain (lbs)', pch=2)
      plot(t1, D2$Intel, type="l", main='Closing Stock Prices', xlab='Time', ylab='Price $')
      lines(t1, D2$DELL, lty=2)
- 15. Paneling Graphics
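The data objects on the paneling slide (D, D2, t1, wg.7a and so on) are not available here, so this sketch reproduces the same 2 x 2 layout with data sets that ship with R (`mtcars`, `Nile`). The choice of data sets and labels is an assumption for illustration.

```r
old <- par(mfrow = c(2, 2))   # 2 rows x 2 columns, filled row by row

hist(mtcars$mpg, main = "Histogram", xlab = "Miles per gallon")
boxplot(mpg ~ cyl, data = mtcars, main = "Box plot",
        xlab = "Cylinders", ylab = "Miles per gallon")
plot(mpg ~ wt, data = mtcars, main = "Scatter plot", pch = 2)
plot(as.numeric(time(Nile)), as.numeric(Nile), type = "l",
     main = "Line plot", xlab = "Year", ylab = "Annual flow")

par(old)                      # back to a single panel
```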
- 16. Working with Graphics Devices
  - Starting up a new graphic X11 window:
        x11()
  - To write graphics to a file, open a device, write to it, then close it:
        pdf("mygraphic.pdf", width=7, height=7)
        plot(x)
        dev.off()
  - On Linux, the package "Cairo" is recommended for a device that renders high-quality vector and raster images (alpha blending!). The command would read CairoPDF("mygraphic.pdf", …
  - Common gotcha: in non-interactive sessions (and inside functions or loops), lattice and ggplot2 plots are objects that must be printed explicitly to reach an open device; base plot() draws directly. For example:
        print(xyplot(y ~ x))
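The open, draw, close cycle can be sketched as follows. The temporary file names and the use of the built-in `cars` data are assumptions made for this example. The second half shows the printing step that lattice objects need when the code runs from a script.

```r
out <- file.path(tempdir(), "mygraphic.pdf")

pdf(out, width = 7, height = 7)   # open the device
plot(1:10)                        # base graphics draw straight onto it
dev.off()                         # close it so the file is flushed to disk

# lattice (and ggplot2) functions return plot objects instead of drawing;
# inside scripts, functions and loops they need an explicit print().
library(lattice)
out2 <- file.path(tempdir(), "mylattice.pdf")
pdf(out2, width = 7, height = 7)
p <- xyplot(dist ~ speed, data = cars)
print(p)
dev.off()
```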
- 17. library(ggplot2)
- 18. ggplot2 = grammar of graphics
- 19. ggplot2 = grammar of graphics
- 20. Visualizing 50,000 Diamonds with ggplot2
- 21. qplot(carat, price, data = diamonds)
- 22. qplot(log(carat), log(price), data = diamonds) OR qplot(carat, price, log="xy", data = diamonds)
- 23. qplot(log(carat), log(price), data = diamonds, alpha = I(1/20))
- 24. qplot(log(carat), log(price), data = diamonds, alpha = I(1/20), colour=color)
- 25. Achieving small multiples with “facets” qplot(log(carat), log(price), data = diamonds, alpha=I(1/20)) + facet_grid(. ~ color)
- 26. old vs. new
      qplot(color, price/carat, data = diamonds, alpha = I(1/20), geom="jitter")
      qplot(color, price/carat, data = diamonds, geom="boxplot")
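The slides use qplot(), which ggplot2 has deprecated since version 3.4.0. As a hedged sketch, the faceted scatter plot from slide 25 can be written with the full ggplot() interface as below. The output file name is an assumption for this example.

```r
library(ggplot2)

# Equivalent of qplot(log(carat), log(price), data = diamonds,
#                     alpha = I(1/20)) + facet_grid(. ~ color)
p <- ggplot(diamonds, aes(x = log(carat), y = log(price))) +
  geom_point(alpha = 1/20) +   # fixed transparency, not mapped to data
  facet_grid(. ~ color)        # one panel per diamond colour

out <- file.path(tempdir(), "diamonds_facets.pdf")
ggsave(out, p, width = 10, height = 4)
```

Setting `alpha` inside `geom_point()` plays the role of `I(1/20)` in qplot(): the value is a constant rather than an aesthetic mapped to a variable.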
- 28. library(lattice)
- 29. lattice = trellis (source: http://lmdvr.r-forge.r-project.org)
- 30. visualizing six dimensions of MLB pitches with lattice
- 31. xyplot(x ~ y, data=pitch)
- 32. xyplot(x ~ y, groups=type, data=pitch)
- 33. xyplot(x ~ y | type, data=pitch)
- 34. xyplot(x ~ y | type, data=pitch, fill.color = pitch$color,
             panel = function(x, y, fill.color, ..., subscripts) {
               fill <- fill.color[subscripts]
               panel.xyplot(x, y, fill = fill, ...)
             })
- 35. xyplot(x ~ y | type, data=pitch, fill.color = pitch$color,
             panel = function(x, y, fill.color, ..., subscripts) {
               fill <- fill.color[subscripts]
               panel.xyplot(x, y, fill = fill, ...)
             })
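The pitch data is not included in this transcript, so here is a sketch of the same custom-panel pattern on the built-in `iris` data. The `color` column and its two colour names are made up for illustration, and `pch = 21` is added because only fillable symbols show a fill colour.

```r
library(lattice)

# Stand-in data: iris ships with R; the per-row colour column is made up
# for illustration (the slides use pitch$color).
d <- iris
d$color <- ifelse(d$Petal.Width > 1.5, "tomato", "steelblue")

p <- xyplot(Sepal.Length ~ Sepal.Width | Species, data = d,
            fill.color = d$color,
            panel = function(x, y, fill.color, ..., subscripts) {
              # subscripts holds the row indices of the data in this panel
              fill <- fill.color[subscripts]
              panel.xyplot(x, y, pch = 21, fill = fill, col = "black", ...)
            })
print(p)
```

Indexing `fill.color` by `subscripts` is what keeps each point's colour aligned with its row after lattice splits the data across panels.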
- 36. A Story of Two Pitchers Hamels Webb
- 37. list of lattice functions densityplot(~ speed | type, data=pitch)
- 38. plotting big data
- 39. xyplot with 1m points = Bad Idea Jeans xyplot(log(price)~log(carat),data=diamonds)
- 40. efficient plotting with hexbinplot hexbinplot(log(price)~log(carat),data=diamonds,xbins=40)
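hexbinplot() comes from the hexbin package. To show the underlying idea without extra packages, this sketch swaps in rectangular binning with base R: count the points per cell, then draw one coloured cell per bin. It is not hexagonal binning, and the synthetic data is made up for illustration.

```r
set.seed(42)
n <- 1e5
x <- rnorm(n)                       # synthetic data, for illustration only
y <- x + rnorm(n, sd = 0.5)

nb <- 40                            # bins per axis (cf. xbins = 40 on the slide)
bx <- seq(min(x), max(x), length.out = nb + 1)   # bin edges
by <- seq(min(y), max(y), length.out = nb + 1)

# Assign every point to a cell and count the points per cell
counts <- table(cut(x, bx, include.lowest = TRUE),
                cut(y, by, include.lowest = TRUE))

# Draw nb x nb coloured cells instead of 100,000 overlapping points;
# log1p() compresses the range so sparse cells stay visible
image(bx, by, log1p(unclass(counts)),
      col = heat.colors(20), xlab = "x", ylab = "y")
```

The drawing cost now depends on the number of bins rather than the number of points, which is the same reason hexbinplot scales better than a raw scatter plot.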
- 41. 100 thousand gene measures
- 42. efficient plotting with geneplotter
- 43. beautiful colors with colorspace
      library("colorspace")          # the package name is lowercase
      red <- LAB(50, 64, 64)
      blue <- LAB(50, -48, -48)
      mixcolor(10, red, blue)
- 44. R --> web
- 45. Linux Apache MySQL R http://labs.dataspora.com/gameday
- 48. Configuring rapache
  - Hello world script
        setContentType("text/html")
        png("/var/www/hello.png")
        plot(sample(100,100), col=1:8, pch=19)
        dev.off()
        cat("<html>")
        cat("<body>")
        cat("<h1>hello world</h1>")
        cat('<img src="../hello.png">')
        cat("</body>")
        cat("</html>")
- 49. Data Visualization References
  - ggplot2: Elegant Graphics for Data Analysis, by Hadley Wickham. http://had.co.nz/ggplot2
  - Lattice: Multivariate Data Visualization with R, by Deepayan Sarkar. http://lmdvr.r-forge.r-project.org/
- 50. Contact Us: Michael E. Driscoll, Ph.D., Principal. [email_address] www.dataspora.com
