
LA R Users Group Survey of R Graphics


A survey of data visualization functions and packages in R. In particular, I discuss three approaches for data visualization in R: (i) the built-in base graphics functions, (ii) the ggplot2 package, and (iii) the lattice package. I also discuss some methods for visualizing large data sets.

Published in: Technology

  1. A Survey of R Graphics
     June 18, 2009
     R Users Group of LA
     Michael E. Driscoll
     Principal, Dataspora
     mike@dataspora.com
  2. “The sexy job in the next ten years will be statisticians…”
     - Hal Varian
  4. Hypothesis
     (from Jessica Hagy’s thisisindexed.com)
  5. Munge & Model
     gdp <- read.csv('gdp.csv')
     hours <- read.csv('hours.csv')
     gdp.hours <- merge(hours, gdp)
     gdp.hours$freetime <- 4380 - gdp.hours$hours
     attach(gdp.hours)
     plot(freetime ~ gdp)
     m <- lm(freetime ~ gdp, data=gdp.hours)
     abline(m, col=3, lwd=2)
     pm <- loess(freetime ~ gdp)
     lines(spline(gdp, fitted(pm)))
  6. Visualization
     library(ggplot2)
     qplot(gdp, freetime,
           data=gdp.hours,
           geom=c("point", "smooth"),
           span=1)
  7. basic graphics
  8. R’s Two Graphics Systems
  9. plot() graphs objects
     plot(freetime ~ gdp,
          data=gdp.hours)
     model <- lm(freetime ~ gdp,
                 data=gdp.hours)
     abline(model)
  10. plot() graphs objects
      abline(model,
             col="red",
             lwd=3)
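Slides 9 and 10 depend on the speaker's gdp.hours data, which is not part of this transcript. A minimal sketch of the same plot, fit, and overlay sequence, assuming R's built-in cars data set as a stand-in and a temporary PDF device so that it also runs non-interactively:

```r
# Stand-in data: the built-in 'cars' data set (speed vs. stopping distance).
# This is an assumption made for illustration; gdp.hours is not available here.
out <- tempfile(fileext = ".pdf")
pdf(out)                                  # temporary file; any open device would do

plot(dist ~ speed, data = cars)           # the formula method draws a scatterplot
model <- lm(dist ~ speed, data = cars)    # ordinary least-squares fit
abline(model, col = "red", lwd = 3)       # abline() takes intercept and slope from the lm object

dev.off()
slope <- coef(model)[["speed"]]           # fitted slope, kept for inspection
```

Because plot() and abline() dispatch on what they are given (a formula, an lm object), the same two calls work unchanged once gdp.hours is substituted for cars.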
  11. par sets graphical parameters
      par(pch=20,
          cex=5,
          col="#5050a0BB")   # RGB hex plus an alpha byte: alpha blending!
      plot(freetime ~ gdp, data=gdp.hours)
      help(par)
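The colour string on slide 11 packs four bytes: red, green, blue, and alpha. A small sketch of how that string can be built and inspected with base R, again assuming the built-in cars data as a stand-in for gdp.hours:

```r
# Build the slide's colour from its four components (red, green, blue, alpha).
translucent <- rgb(0x50, 0x50, 0xa0, 0xBB, maxColorValue = 255)

# Recover the alpha channel as an opacity between 0 and 1.
alpha_share <- col2rgb(translucent, alpha = TRUE)["alpha", 1] / 255

out <- tempfile(fileext = ".pdf")
pdf(out)                                          # the pdf device supports translucency
old <- par(pch = 20, cex = 5, col = translucent)  # par() returns the previous values
plot(dist ~ speed, data = cars)                   # stand-in data (assumption)
par(old)                                          # restore the earlier settings
dev.off()
```

Saving the return value of par() and restoring it afterwards keeps the oversized, translucent points from leaking into later plots on the same device.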
  12. par sets graphical parameters
      parameters for par(): pch, col, adj, srt, pt.cex
      graphing functions: points(), text(), xlab(), legend()
  13. Paneling Graphics
      Setting one parameter in particular, mfrow, partitions the graphics display into a grid of panels that successive plots fill row by row.
      par(mfrow = c(nrow, ncol))   # nrow = number of rows, ncol = number of columns
  14. Paneling Graphics
      par(mfrow=c(2,2))
      hist(D$wg, main='Histogram', xlab='Weight Gain', ylab='Frequency', col=heat.colors(14))
      boxplot(wg.7a$wg, wg.8a$wg, wg.9a$wg, wg.10a$wg, wg.11a$wg, wg.12p$wg,
              main='Weight Gain', ylab='Weight Gain (lbs)', xlab='Shift',
              names=c('7am','8am','9am','10am','11am','12pm'))
      plot(D$metmin, D$wg, main='Met Minutes vs. Weight Gain', xlab='Mets (min)', ylab='Weight Gain (lbs)', pch=2)
      plot(t1, D2$Intel, type="l", main='Closing Stock Prices', xlab='Time', ylab='Price $')
      lines(t1, D2$DELL, lty=2)
  15. Paneling Graphics
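The paneling code on slide 14 uses data objects (D, D2, wg.7a, and so on) that are not included in the transcript. A runnable sketch of the same 2 x 2 layout, assuming the built-in mtcars and pressure data sets as stand-ins:

```r
out <- tempfile(fileext = ".pdf")
pdf(out, width = 7, height = 7)

old <- par(mfrow = c(2, 2))   # 2 rows x 2 columns, filled row by row

# Stand-in data sets (assumptions): mtcars and pressure ship with R.
hist(mtcars$mpg, main = "Histogram", xlab = "Miles per gallon",
     col = heat.colors(14))
boxplot(mpg ~ cyl, data = mtcars, main = "MPG by cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")
plot(mtcars$wt, mtcars$mpg, main = "Weight vs. MPG",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon", pch = 2)
plot(pressure$temperature, pressure$pressure, type = "l",
     main = "Vapor pressure of mercury",
     xlab = "Temperature (deg C)", ylab = "Pressure (mm Hg)")

layout_now <- par("mfrow")    # the layout in force while the panels were drawn
par(old)                      # back to a single panel
dev.off()
```

Each high-level plotting call advances to the next panel, so the four plots land in the top-left, top-right, bottom-left, and bottom-right cells in that order.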
  16. Working with Graphics Devices
      To start up a new X11 graphics window:
      x11()
      To write graphics to a file, open a device, write to it, and close it:
      pdf("mygraphic.pdf", width=7, height=7)
      plot(x)
      dev.off()
      On Linux, the package “Cairo” is recommended for a device that renders high-quality vector and raster images (alpha blending!). The command would read Cairo("mygraphic.pdf", …
      Common gotcha: lattice and ggplot2 functions return plot objects, which are drawn only when printed. In non-interactive sessions (and inside functions or loops), you should explicitly invoke print to send the plot object to an open device. For example:
      print(p)   # p holds a plot object, e.g. the result of qplot() or xyplot()
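The open, draw, close cycle and the print() gotcha from slide 16 can be sketched together. The example below assumes the built-in cars data and a temporary file name; it relies on the fact that base plot() draws as a side effect, whereas lattice's xyplot() returns an object that must be printed:

```r
library(lattice)

out <- tempfile(fileext = ".pdf")   # hypothetical output path for this sketch
pdf(out, width = 7, height = 7)     # 1. open the device

plot(cars)                          # 2a. base graphics draw immediately (page 1)

p <- xyplot(dist ~ speed, data = cars)
print(p)                            # 2b. a lattice object is drawn only when printed (page 2)

dev.off()                           # 3. close the device to flush the file
```

Forgetting dev.off() is the other common failure mode: the file exists but stays empty or truncated until the device is closed.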
  17. library(ggplot2)
  18. ggplot2 = grammar of graphics
  19. ggplot2 = grammar of graphics
  20. Visualizing 50,000 Diamonds with ggplot2
  21. qplot(carat, price, data = diamonds)
  22. qplot(log(carat), log(price), data = diamonds)
      OR
      qplot(carat, price, log="xy", data = diamonds)
  23. qplot(log(carat), log(price), data = diamonds,
            alpha = I(1/20))
  24. qplot(log(carat), log(price), data = diamonds,
            alpha = I(1/20), colour=color)
  25. Achieving small multiples with “facets”
      qplot(log(carat), log(price), data = diamonds, alpha=I(1/20)) + facet_grid(. ~ color)
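qplot() has been deprecated since ggplot2 3.4.0, so on a current installation the faceted plot from slide 25 is usually written with ggplot(). A sketch of an equivalent call, assuming ggplot2 is installed (the guard simply skips the example otherwise):

```r
# Runs only when ggplot2 is available; the diamonds data set ships with it.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  facet_plot <- ggplot(diamonds, aes(log(carat), log(price))) +
    geom_point(alpha = 1/20) +   # heavy overplotting, so draw faint points
    facet_grid(. ~ color)        # one column of panels per diamond colour

  # print(facet_plot)            # uncomment to draw on the current device
}
```

The layered form makes the slide's point about the grammar explicit: data and aesthetics, then a geom, then a faceting specification, each added with `+`.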
  26. old
      new
      qplot(color, price/carat,
            data = diamonds, alpha = I(1/20), geom="jitter")
      qplot(color, price/carat,
            data = diamonds,
            geom="boxplot")
  28. library(lattice)
  29. lattice = trellis
      (source: http://lmdvr.r-forge.r-project.org )
  30. visualizing six dimensions of MLB pitches with lattice
  31. xyplot(x ~ y, data=pitch)
  32. xyplot(x ~ y, groups=type, data=pitch)
  33. xyplot(x ~ y | type, data=pitch)
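Slides 31 to 33 step through three uses of the lattice formula interface: a plain scatterplot, grouping within one panel, and conditioning into separate panels. The pitch data is not included in the transcript, so this sketch assumes the built-in iris data set, with Species playing the role of type:

```r
library(lattice)

# One panel, all points drawn alike.
plain <- xyplot(Sepal.Length ~ Sepal.Width, data = iris)

# One panel, points distinguished by group (colour and legend).
grouped <- xyplot(Sepal.Length ~ Sepal.Width, groups = Species,
                  data = iris, auto.key = TRUE)

# One panel per level of the conditioning variable after the '|'.
paneled <- xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris)

out <- tempfile(fileext = ".pdf")
pdf(out)
print(plain)      # each print() call starts a new page
print(grouped)
print(paneled)
dev.off()
```

The only change between the three calls is where the categorical variable appears: absent, in `groups=`, or after the `|` in the formula.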
  34. xyplot(x ~ y | type, data=pitch,
             fill.color = pitch$color,
             panel = function(x, y, fill.color, ..., subscripts) {
               fill <- fill.color[subscripts]
               panel.xyplot(x, y, fill = fill, ...)
             })
  35. xyplot(x ~ y | type, data=pitch,
             fill.color = pitch$color,
             panel = function(x, y, fill.color, ..., subscripts) {
               fill <- fill.color[subscripts]
               panel.xyplot(x, y, fill = fill, ...)
             })
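The custom panel function on slides 34 and 35 works because lattice hands each panel the row indices of its own points in `subscripts`, which lets a full-length colour vector be cut down to the right rows. A runnable sketch of the same pattern, assuming the iris data set and a made-up two-colour rule in place of pitch$color:

```r
library(lattice)

# Hypothetical per-row fill colours: long petals vs. short petals.
point_fill <- ifelse(iris$Petal.Length > median(iris$Petal.Length),
                     "tomato", "steelblue")

p <- xyplot(Sepal.Length ~ Sepal.Width | Species, data = iris,
            fill.color = point_fill,          # extra argument, passed on to the panel function
            panel = function(x, y, fill.color, ..., subscripts) {
              fill <- fill.color[subscripts]  # keep only this panel's rows
              panel.xyplot(x, y, pch = 21, col = "black", fill = fill, ...)
            })

out <- tempfile(fileext = ".pdf")
pdf(out)
print(p)          # the panel function runs once per panel at print time
dev.off()
```

The sketch sets `pch = 21` because only the filled plotting symbols (21 to 25) honour a fill colour; whether the original slides set this elsewhere is not visible in the transcript.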
  36. A Story of Two Pitchers
      Hamels
      Webb
  37. list of lattice functions
      densityplot(~ speed | type, data=pitch)
  38. plotting big data
  39. xyplot with 1m points = Bad Idea Jeans
      xyplot(log(price)~log(carat), data=diamonds)
  40. efficient plotting with hexbinplot
      hexbinplot(log(price)~log(carat), data=diamonds, xbins=40)
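The reason binned plots scale is that the device draws one cell per bin rather than one glyph per point. The sketch below illustrates that idea with rectangular bins in base R; it is not the hexbin package's hexagonal algorithm, and the simulated data and bin count are assumptions chosen for illustration:

```r
set.seed(42)
n <- 100000                     # simulated "big" data set
x <- rnorm(n)
y <- x + rnorm(n)

bins <- 40                      # same bin count as xbins=40 on the slide
# Count how many points fall in each cell of a bins x bins grid.
counts <- table(cut(x, breaks = bins), cut(y, breaks = bins))
z <- matrix(as.vector(counts), nrow = bins)

out <- tempfile(fileext = ".pdf")
pdf(out)
# Draw 1,600 coloured cells instead of 100,000 points; log1p tames the range.
image(log1p(z), col = heat.colors(20),
      xlab = "x (binned)", ylab = "y (binned)")
dev.off()
```

The drawing cost now depends on the number of bins, not the number of observations, which is the property hexbinplot() exploits.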
  41. 100 thousand gene measures
  42. efficient plotting with geneplotter
  43. beautiful colors with Colorspace
      library("colorspace")
      red <- LAB(50, 64, 64)
      blue <- LAB(50, -48, -48)
      mixcolor(10, red, blue)
  44. R --> web
  45. Linux Apache MySQL R
      http://labs.dataspora.com/gameday
  48. Configuring rapache
      Hello world script
      setContentType("text/html")
      png("/var/www/hello.png")
      plot(sample(100,100), col=1:8, pch=19)
      dev.off()
      cat("<html>")
      cat("<body>")
      cat("<h1>hello world</h1>")
      cat('<img src="../hello.png">')
      cat("</body>")
      cat("</html>")
  49. Data Visualization References
      ggplot2: Elegant Graphics for Data Analysis
      by Hadley Wickham
      http://had.co.nz/ggplot2
      Lattice: Multivariate Data Visualization with R
      by Deepayan Sarkar
      http://lmdvr.r-forge.r-project.org/
