Your SlideShare is downloading. ×
0
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Ggplot2 1outof3
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Ggplot2 1outof3

268

Published on

This is the slides I made for NYC Open Data meetup.event detail: http://www.meetup.com/NYC-Open-Data/events/137235202/ …

This is the slides I made for NYC Open Data meetup.event detail: http://www.meetup.com/NYC-Open-Data/events/137235202/

NYC Open Data meetup group offers free woekshop twice per week in NYC. Everyone is welcome to come and learn data science and open data topics. Sin up at http://www.meetup.com/NYC-Open-Data/

Published in: Business, Technology
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
268
On Slideshare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
11
Comments
0
Likes
1
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Learnggplot2 AsKungfu Skill A K f Skills Givenby KaiXiao(DataSciencetist) VivianZhang(Co-founder&CTO) Vi i Zh (C f d & CTO) Contact:vivian.zhang@supstat.com Contact: vivian zhang@supstat com
  • 2. • I: Point • II: Bar • III:Histogram g • IV:Line • V: Tile • VI:Map
  • 3. Introduction • ggplot2 is a plotting system for R • based on the《The Grammar of  Graphics》 • which tries to take the good parts  of base and lattice graphics and  none of the bad parts none of the bad parts • It takes care of many of the fiddly  details that make plotting a hassle details that make plotting a hassle • It becomes easy to produce  complex multi‐layered graphics p y g p
  • 4. Whyweloveggplot2? Why we love ggplot2? • controltheplotasabstractlayersandmakecreativitybecomereality; • getusedtostructuralthinking; d l hi ki • getbeautifulgraphicswhileavoidingcomplicateddetails 1973murdercasesinUSA
  • 5. 7BasicConcepts 7 Basic Concepts • Mapping • Scale • Geometric Stat st cs • Statistics • Coordinate • L Layer • Facet
  • 6. Mapping Mappingcontrolsrelationsbetweenvariables M i t l l ti b t i bl
  • 7. Scale Scalewillpresentmappingoncoordinatescales. Scale will present mapping on coordinate scales ScaleandMappingiscloselyrelatedconcepts.
  • 8. Geometric Geom means the graphical elements such as meansthegraphicalelements,suchas points,linesandpolygons.
  • 9. Statistics Statenablesustocalculateanddostatistical Stat enables us to calculate and do statistical analysisbased,suchasaddingaregressionline. Stat
  • 10. Coordinate Cood will affect how we observe graphical willaffecthowweobservegraphical elements.Transformationofcoordinatesisuseful. Stat Coord
  • 11. Layer Component:data,mapping,geom,stat Usinglayerwillallowuserstoestablishplotsstep i l ill ll bli h l bystep.Itbecomemucheasiertomodifyaplot.
  • 12. Facet Facetsplitsdataintogroupsanddraweach groupseparately.Usually,thereisaorder.
  • 13. 7BasicConcepts 7 Basic Concepts • Mapping • Scale • Geometric Stat st cs • Statistics • Coordinate • L Layer • Facet
  • 14. SkillI:Point Skill I:Point
  • 15. Sampledata mpg Sample data--mpg • Fuel economy data from 1999 and 2008 for 38 popular  y p p models of car • • • • • • • Details D il Displ :               engine displacement, in litres Cyl:                    number of cylinders  Cyl: number of cylinders Trans:                type of transmission  Drv:                   front‐wheel, rear wheel drive, 4wd  , , Cty:                    city miles per gallon  Hwy:                  highway miles per gallon 
  • 16. >library(ggplot2) >str(mpg) 'data.frame': 234obs.of14variables: $manufacturer:Factorw/15levels"audi","chevrolet",..: $model:Factorw/38levels"4runner4wd",..: $displ :num 1.81.8222.82.83.11.81.82... $year:int 1999 1999 2008 2008 1999 1999 2008 1999 $ year : int 19991999200820081999199920081999 $cyl :int 4444666444... $trans:Factorw/10levels"auto(av)","auto(l3)",..: $drv $d :Factorw/3levels"4","f","r": 3l l f $cty :int 18212021161818181620... $hwy $fl :int 29293130262627262528... :Factorw/5levels"c","d","e","p",..: $class:Factorw/7levels"2seater","compact",..:
  • 17. aesthetics p < ggplot(data mpg, mapping aes(x cty, y hwy)) p <‐ ggplot(data=mpg mapping=aes(x=cty y=hwy)) p + geom_point()
  • 18. > summary(p)  data: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy,  fl, class [234x11]  fl l [ ] mapping: x = cty, y = hwy faceting: facet_null()  g _ () > summary(p+geom_point()) data: manufacturer, model, displ, year, cyl, trans, drv, cty, hwy, data: manufacturer model displ year cyl trans drv cty hwy fl, class [234x11] mapping:  x = cty, y = hwy faceting: facet_null()  f i f ll() ‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐ g geom_point: na.rm = FALSE  _p stat_identity:   position_identity: (width = NULL, height = NULL)
  • 19. p + geom_point(color red4 size=3) p + geom point(color='red4',size 3)
  • 20. #addonemorelayer--color p p<- ggplot(mpg,aes(x=cty,y=hwy,colour=factor(year))) ggp ( pg, ( y,y y, (y ))) p+geom_point()
  • 21. #addonemorestat(loess:localpartialpolynomialregression) > p + geom_point() + stat_smooth()
  • 22. p <‐ ggplot(data=mpg, mapping=aes(x=cty,y=hwy)) p g p + geom_point(aes(colour=factor(year)))+ _p ( ( (y ))) stat_smooth()
  • 23. Twoequallywaystodraw q y y p  <‐ ggplot(mpg, aes(x=cty,y=hwy)) p <‐ ggplot(mpg aes(x=cty y=hwy)) p  + geom_point(aes(colour=factor(year)))+ () stat_smooth() d <‐ ggplot() + () geom_point(data=mpg, aes(x=cty, y=hwy, colour=factor(year)))+ stat_smooth(data=mpg, aes(x=cty, y=hwy)) print(d)
  • 24. Besidethe“whitepaper”canvas,wewillfindgeom andstat canvas. > summary(d) data: [0x0] [ ] faceting: facet_null()  ‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐ mapping: x = cty, y = hwy, colour = factor(year)  mapping: x = cty y = hwy colour = factor(year) geom_point: na.rm = FALSE  stat_identity:   position_identity: (width = NULL, height = NULL) pp g y, y y mapping: x = cty, y = hwy  geom_smooth:   stat_smooth: method = auto, formula = y ~ x, se = TRUE,  n = 80, fullrange = FALSE, level = 0.95, na.rm = FALSE  n = 80 fullrange = FALSE level = 0 95 na rm = FALSE position_identity: (width = NULL, height = NULL)
  • 25. #Usingscale()function,wecancontrolcolorofscale. p+geom_point(aes(colour=factor(year)))+ stat_smooth()+ h() scale_color_manual(values=c('steelblue','red4'))
  • 26. #Wecanmap“displ”to thesizeofpoint p+geom_point(aes(colour=factor(year),size=displ))+ stat_smooth()+ h() scale_color_manual(values=c('steelblue','red4'))
  • 27. # We solve the problem with overlapping and point being too small p + geom_point(aes(colour=factor(year),size=displ), alpha=0.5,position = "jitter")+ stat_smooth()+ scale_color_manual(values =c('steelblue','red4'))+ scale_size_continuous(range = c(4, 10))
  • 28. # We change the coordinate system. p + geom_point(aes(colour=factor(year),size=displ), alpha=0.5,position = "jitter")+ stat_smooth()+ scale_color_manual(values =c('steelblue','red4'))+ scale_size_continuous(range = c(4, 10)) +    coord_flip()
  • 29. p + geom_point(aes(colour=factor(year),size=displ), alpha=0.5,position = "jitter")+ stat_smooth()+ () scale_color_manual(values =c('steelblue','red4'))+ scale_size_continuous(range = c(4, 10))  +    coord_polar()
  • 30. p + geom_point(aes(colour=factor(year),size=displ), alpha=0.5,position = "jitter")  +    stat_smooth()+ scale_color_manual(values =c('steelblue','red4'))+ ( ( , )) scale_size_continuous(range = c(4, 10))+                    coord_cartesian(xlim = c(15, 25), ylim=c(15,40))
  • 31. #Usingfacet()function,wenowsplitdataanddrawthembygroup p+geom_point(aes(colour=class,size=displ), alpha=0.5,position="jitter")+ stat_smooth()+ h scale_size_continuous(range=c(4,10))+ facet_wrap(~year,ncol=1)
  • 32. #Addplotnameandspecifyallinformationyouwanttoadd p<- ggplot(mpg,aes(x=cty,y=hwy)) p+geom_point(aes(colour=class,size=displ), alpha=0.5,position="jitter")+stat_smooth()+ scale_size_continuous(range=c(4,10))+ l i ti ( (4 10)) facet_wrap(~year,ncol=1)+opts(title='modelofcarandmpg')+ labs(y='drivingdistancepergallononhighway',x='drivingdistancepergallononcityroad', size='displacement',colour ='model')
  • 33. #scatterplotfordiamonddataset p p<- ggplot(diamonds,aes(carat,price)) ggp ( , ( ,p )) p+geom_point()
  • 34. #usetransparencyandsmallsizepoints p g p+geom_point(size=0.1,alpha=0.1) _p ( , p )
  • 35. #usebincharttoobserveintensityofpoints p p+stat_bin2d(bins=40) _ ( )
  • 36. #estimatedatadentisy p+stat_density2d(aes(fill=..level..),geom="polygon")+ coord_cartesian(xlim =c(0,1.5),ylim=c(0,6000)) coord cartesian(xlim = c(0 1 5) ylim=c(0 6000))
  • 37. SkillII:Bar Skill II:Bar
  • 38. SkillIII:Histogram Skill III:Histogram
  • 39. SkillIV:Line Skill IV:Line
  • 40. SkillV:Tile Skill V:Tile
  • 41. SkillVI:Map Skill VI:Map
  • 42. Resources http://learnr.wordpress.com Redrawallthelatticegraph Redraw all the lattice graph byggplot2
  • 43. Resources Alltheexamplesaredoneby ggplot2. ggplot2
  • 44. Resources • http://wiki stdout org/rcookbook/Graphs/ http://wiki.stdout.org/rcookbook/Graphs/ • http://r-blogger.com • htt //St k http://Stackoverflow.com fl • http://xccds1977.blogspot.com • http://r-ke.info/ • http://www.youtube.com/watch?v=vnVJJYi1 mbw
  • 45. Thankyou!Comebackformore! Signupat:www.meetup.com/nyc-open-data Givefeedbackat:www.bit.ly/nycopen

×