Analyzing MLB data with ggplot

Making basic, good-looking plots in Python is tough. Matplotlib gives you fine-grained control, but at the cost of verbose, detail-heavy code. The rise of pandas has made Python the go-to language for data wrangling and munging, but many people are still reluctant to leave R because of its outstanding data visualization packages.


ggplot is a port of the popular R package ggplot2 into Python. It provides a high-level grammar that allows users to quickly and easily make good-looking plots. An example may be found here:
http://blog.yhathq.com/posts/ggplot-for-python.html

Greg will show you how to use ggplot to analyze data from MLB's open data source, PITCHf/x. He will take you through the basics of ggplot and show how easy it is to create histograms, plot smoothed curves, and customize colors and shapes.
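As a rough illustration of the kind of plots described above, here is a minimal sketch using the Python ggplot package's layered syntax. The function name `build_speed_plots`, the CSV path, and the column names `start_speed` and `pitch_count` are assumptions made for this example; they are not taken from the talk. It also assumes pandas and ggplot are installed.

```python
def build_speed_plots(csv_path):
    """Return (histogram, smoothed scatter) for a pitch-level CSV file.

    Hypothetical helper: the column names below are assumed, not confirmed
    by the talk.
    """
    # Local imports keep this module loadable where the libraries are absent.
    import pandas as pd
    from ggplot import ggplot, aes, geom_histogram, geom_point, stat_smooth

    # Assumed columns: 'start_speed' (mph) and 'pitch_count' (pitches so far).
    df = pd.read_csv(csv_path)

    # Histogram: map one column to the x-axis, then add a histogram layer.
    histogram = ggplot(aes(x='start_speed'), data=df) + geom_histogram()

    # Smoothed curve: scatter the points, then layer a smoother on top.
    smoothed = (ggplot(aes(x='pitch_count', y='start_speed'), data=df)
                + geom_point()
                + stat_smooth())
    return histogram, smoothed
```

Calling `build_speed_plots("pitches.csv")` (a placeholder file name) would return two plot objects, which can then be displayed or saved from an interactive session.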

http://www.meetup.com/PyData-Boston/events/184382092/

Published in Technology, Education
Transcript

  • 1. analyzing MLB data with ggplot Greg Lamp
  • 2. ggplot ● What is it? ● Alternatives ● How it works ● Why should I use it? ● Brief case study ● Questions
  • 3. Hi, I’m Greg! Founder/CTO @ Yhat. Here I am on the Internet.
  • 4. What is ggplot?
  • 5. DSL for graphics
  • 6. DSL for graphics: scatterplot, histogram, labels, color, shape
  • 7. What about matplotlib?
  • 8. a quick example
  • 9. matplotlib ggplot
  • 10. it’s not all bad!
  • 11. matplotlib: syntax, api, default themes, learning curve
  • 12. matplotlib: maturity, ipython, customization, community; syntax, api, default themes, learning curve
  • 13. What about d3.js?
  • 14. d3.js
  • 15. ggplot
  • 16. ggplot d3.js
  • 17. How it works
  • 18. Format
  • 19. ggplot
  • 20. data frame
  • 21. “aesthetics”
  • 22. Aesthetics
  • 23. color
  • 24. shape
  • 25. size
  • 26. ...fill, alpha, slope, intercept, ymin, ymax, ...
  • 27. Geoms, Stats, & Scales
  • 28. geom_point
  • 29. geom_area
  • 30. ...there are many
  • 31. stat_smooth
  • 32. ...there are a few
  • 33. scale_color_brewer
  • 34. scale_color_gradient
  • 35. ...there are many
  • 36. Layers
  • 37. ggplot()
  • 38. ggplot() + geom_point()
  • 39. ggplot() + geom_point() + stat_smooth()
  • 40. ggplot() + geom_point() + stat_smooth()
  • 41. ggplot() + geom_point() + stat_smooth()
  • 42. Why is this good?
  • 43. Makes “reasonable assumptions”
  • 44. not real colors
  • 45. matplotlib freaks
  • 46. still not real colors ...but i can guess what you mean
  • 47. Concise yet expressive
  • 48. Looks pretty good (and is easy to customize)
  • 49. Seaborn: github.com/mwaskom/seaborn
  • 50. Case Study
  • 51. pitch speed
  • 52. 103.4 mph
  • 53. Load ggplot and pandas
  • 54. Read in our PITCHf/x data
  • 55. define the x-axis; pass in your data frame
  • 56. add a histogram
  • 57. How does fatigue impact velocity?
  • 58. ...not helpful
  • 59. What about at the individual level?
  • 60. Justin Verlander
  • 61. ggplot lets you fail quicker
  • 62. Finding Help
  • 63. /tagged/python-ggplot
  • 64. http://ggplot.yhathq.com
  • 65. What’s next?
  • 66. Thanks! @theglamp greg@yhathq.com
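
The layering slides (36 to 41) build a plot by adding components with `+`. The toy sketch below shows one way such a grammar can be expressed in plain Python through operator overloading. It is not the ggplot library's implementation; the `Plot` and `Layer` classes and the column names are hypothetical and exist only for illustration.

```python
class Layer:
    """One named plot component, such as a geom or a stat (toy stand-in)."""

    def __init__(self, name, **params):
        self.name = name
        self.params = params

    def __repr__(self):
        return f"{self.name}()"


class Plot:
    """Holds data, aesthetic mappings, and an ordered list of layers."""

    def __init__(self, data=None, aesthetics=None, layers=()):
        self.data = data
        self.aesthetics = dict(aesthetics or {})
        self.layers = list(layers)

    def __add__(self, layer):
        # Return a new Plot so the earlier, partial plot stays reusable.
        return Plot(self.data, self.aesthetics, self.layers + [layer])

    def describe(self):
        # Spell out the plot as the chain of components that built it.
        return " + ".join(["ggplot()"] + [repr(layer) for layer in self.layers])


# Hypothetical aesthetic mapping; no real data is attached in this sketch.
base = Plot(data=[], aesthetics={"x": "pitch_count", "y": "start_speed"})
plot = base + Layer("geom_point") + Layer("stat_smooth")
print(plot.describe())
```

Because `__add__` returns a new object instead of mutating the existing one, `base` can be reused to try a different geom without rebuilding the data and aesthetics, which fits the "fail quicker" point on slide 61.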