
Introduction to Matplotlib for Data Analysis

Abstract
This talk covers what matplotlib is, why you might use it, and how to install it. It works through simple examples that will get you started and that also show the strengths of the matplotlib package. I spend most of my day writing queries using Jade and SQL, and I use matplotlib as part of my work for exploratory data analysis and for displaying and explaining complicated data. I presented this topic at my local Linux group. This month, after my talk, someone showed me a graph they had made using matplotlib.
Outline
10 slides of an OpenOffice presentation, mixed with demonstrations using pylab interactively and running a script to output a chart.
- Introduction: Why I use matplotlib.
- Installation: Which packages are required and where to obtain them.
- Website: URL and description.
- Short display using pylab interactively: Start up pylab in ipython and bring up a simple scatter chart. Explanation of the functionality of the show window.
- Basic bar graph script: Explanation of the code. Difference between axes, axis and figure.
- Adding labels: Adding title, axis labels, legend, ticks and tick labels.
- How to import data: genfromtxt. Splitting the imported data.
- Multiple plots on the same figure: Using subplot. Using GridSpec, which requires matplotlib 1.0.0.
- Twin axes: How to plot two different datasets on the same plot with different scales.

[Presentation by Catherine Thwaites, uploaded by ewblen on behalf of kiwipycon]

Published in: Technology, Business

Transcript of "Introduction to Matplotlib for Data Analysis"

  1. Introduction to Matplotlib for Data Analysis: What is matplotlib?
  2. Why do I use it?
  3. Installation
     What you need:
       - Python version 2.5, 2.6 or 2.7
       - NumPy version 1.3+
       - Matplotlib version 1.0.1
     Linux: Matplotlib 0.99 is the latest version in the Debian repositories; the latest release, 1.0.1, needs to be installed from source. Instructions: http://matplotlib.sourceforge.net/users/installing.html
     Windows: Download and install.
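A quick way to confirm which versions Python actually picks up after installation is to print them. This is a minimal sketch that assumes both packages import cleanly; it relies only on their `__version__` attributes.

```python
import numpy as np
import matplotlib

# Collect the versions of the two packages the slide lists as requirements.
versions = {"numpy": np.__version__, "matplotlib": matplotlib.__version__}
for name, version in sorted(versions.items()):
    print(name, version)
```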
  4. Website for documentation: http://matplotlib.sourceforge.net/
     The gallery has a large number of examples.
  5. Ways to run matplotlib: interactively using pylab and ipython
  6. Interactively in shell
  7. File
  8. As part of a larger program
     Interactively using pylab in ipython
         catherine@catherine-HP-Mini-110-3100:~$ ipython -pylab
       - Imports the modules required to plot into one namespace
       - Chart is updated as you enter commands
  9. In [1]: plot([1,2,3,4],[56,45,58,32])
     Out[1]: [<matplotlib.lines.Line2D object at 0xa9a3a2c>]
     Show window:
       - Save in various open formats
       - Change plot size in window
       - Zoom to inspect
       - Pan to move along
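The same plot can also be produced from a file rather than interactively. The sketch below assumes a run without a display, so it selects the Agg backend and saves a PNG instead of opening the show window. The file name `line_plot.png` and the temporary directory are illustrative choices, not part of the talk.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: no display available, render to file only
import matplotlib.pyplot as plt

# Same data as the interactive pylab session on this slide.
x = [1, 2, 3, 4]
y = [56, 45, 58, 32]

fig = plt.figure()
ax = fig.add_subplot(111)
lines = ax.plot(x, y)  # a list holding one Line2D object, as Out[1] shows

# Hypothetical output location; change it to wherever the chart should go.
out_path = os.path.join(tempfile.mkdtemp(), "line_plot.png")
fig.savefig(out_path)
plt.close(fig)
```

Dropping the `matplotlib.use("Agg")` line and calling `plt.show()` instead of `savefig` should bring back the interactive window on a desktop machine.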
  10. Simple bar graph using bar
          import numpy as np
          import matplotlib.pyplot as plt

          data1=[12,23,38,42,41]

          # Figure number 1, 6 by 6 inches; clear it in case it already exists
          fig = plt.figure(1,(6,6))
          fig.clf()
          ax = fig.add_subplot(111)

          # One x position per data value: 0, 1, 2, ...
          ind = np.arange(len(data1))
          rects = ax.bar(ind+0.125, data1, width=0.75, color='thistle')
          plt.show()
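One caveat for readers on newer releases: the script on this slide was written for matplotlib 1.0.1, where `bar` treated the x value as the left edge of each bar. From matplotlib 2.0 the default changed to centring each bar on x, so the `ind+0.125` offset no longer lines up with ticks placed at `ind+0.5`. A minimal sketch, assuming a 2.0 or later install and the Agg backend for a headless run, passes `align='edge'` to keep the original layout:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, nothing is shown on screen
import matplotlib.pyplot as plt
import numpy as np

data1 = [12, 23, 38, 42, 41]
ind = np.arange(len(data1))

fig = plt.figure(figsize=(6, 6))
ax = fig.add_subplot(111)

# align='edge' makes ind+0.125 the left edge of each bar, as in matplotlib 1.0.1.
rects = ax.bar(ind + 0.125, data1, width=0.75, color='thistle', align='edge')

# Each bar spans ind+0.125 to ind+0.875, so its centre sits at ind+0.5.
centres = [r.get_x() + r.get_width() / 2.0 for r in rects]
```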
  11. Titles and labels
      Add title
          ax.set_title('Simple bar graph', size=20)
      Change the plot range
          ax.set_ylim(0,180)
      Axis labels
          ax.set_xlabel('Data',size=14)
          ax.set_ylabel('Places', size=14)
      Axis ticks and labels
          ax.set_xticks(ind+0.5)
          labels = ['west','east','centre', 'north','south']
          ax.set_xticklabels(labels, size=14)
      Add bar labels
          def bar_label(rects):
              # Write each bar's height just above the top of the bar
              for rect in rects:
                  height = rect.get_height()
                  ax.text(rect.get_x()+rect.get_width()/2., 1.05*height,
                          '%d'%int(height), ha='center', va='bottom')
          bar_label(rects)
  12. Two datasets on same axes
      Side by side
          ax.bar(ind+0.125, data1, width=0.25, color='pink', label='A1')
          ax.bar(ind+0.375, data2, width=0.25, color='thistle', label='A2')
          ax.bar(ind+0.625, data3, width=0.25, color='salmon', label='A3')
          ax.legend(loc='upper left')
      Cumulative
          rects1 = ax.bar(ind+0.125, data1, width=0.75, color='lightblue', label='A1')
          rects2 = ax.bar(ind+0.125, data2, width=0.75, bottom=data1, color='thistle', label='A2')
          ax.legend(loc='upper left')
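The cumulative chart stacks the second dataset on the first through `bottom=data1`. A third layer needs the running total of everything beneath it. The sketch below uses made-up values for `data2` and `data3` (the slides do not show them), NumPy arrays so the totals can be added element by element, and `align='edge'` on the assumption of a matplotlib 2.0 or later install.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import numpy as np

data1 = np.array([12, 23, 38, 42, 41])
data2 = np.array([10, 15, 20, 25, 30])  # hypothetical values
data3 = np.array([5, 8, 13, 21, 34])    # hypothetical values
ind = np.arange(len(data1))

fig = plt.figure()
ax = fig.add_subplot(111)
common = dict(width=0.75, align='edge')

rects1 = ax.bar(ind + 0.125, data1, color='lightblue', label='A1', **common)
# Second layer starts where the first one ends.
rects2 = ax.bar(ind + 0.125, data2, bottom=data1,
                color='thistle', label='A2', **common)
# Third layer starts at the combined height of the first two.
rects3 = ax.bar(ind + 0.125, data3, bottom=data1 + data2,
                color='salmon', label='A3', **common)
ax.legend(loc='upper left')

# Top edge of the stack for each category.
tops = [r.get_y() + r.get_height() for r in rects3]
```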
  13. Importing data using numpy genfromtxt
          import numpy as np
          import matplotlib.pyplot as plt

          infile = open("data.csv", "r")
          # names=True takes the column names from the first row of the file
          data = np.genfromtxt(infile, delimiter=",", dtype=("S20,S20,f8"), names=True)
          infile.close()

          # -- Split into colours
          yellow = data[data['Colour']=='Yellow']
          blue = data[data['Colour']!='Yellow']

          # -- Plot histogram of data
          fig = plt.figure(1, figsize=(12,8))
          ax = fig.add_subplot(111)
          ax.hist(yellow['Length'], color='gold')
          ax.tick_params('both',labelsize=16)
          plt.show()
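To try the import step without the original `data.csv`, the sketch below reads an in-memory stand-in. Everything in `csv_text` is invented for illustration, including the `Name` column; only the `Colour` and `Length` column names come from the slide. It also assumes Python 3 and NumPy 1.14 or later, so it uses `U20` text fields and an explicit `encoding` in place of the slide's `S20` byte strings, which on Python 3 hold bytes and are unlikely to match a comparison with the text `'Yellow'`.

```python
import io

import numpy as np

# Hypothetical stand-in for data.csv: a header row, then one record per line.
csv_text = u"""Name,Colour,Length
item1,Yellow,12.5
item2,Blue,30.0
item3,Yellow,8.25
item4,Purple,15.0
"""

# names=True takes the field names from the header row.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                     dtype="U20,U20,f8", names=True, encoding="utf-8")

# Boolean masks split the structured array by the value of one field.
yellow = data[data['Colour'] == 'Yellow']
blue = data[data['Colour'] != 'Yellow']

print(data.dtype.names)
print(yellow['Length'])
```

Note that `blue` here means "everything that is not yellow", exactly as on the slide.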
  14. Multiple plots on the same figure using add_subplot
      The three digits passed to add_subplot are numrows, numcols and fignum, so 231 is the first plot in a grid of 2 rows by 3 columns.
          ax1 = fig.add_subplot(231)
          ax2 = fig.add_subplot(232)
          ax3 = fig.add_subplot(233)
          ax4 = fig.add_subplot(234)
          ax5 = fig.add_subplot(235)
          ax6 = fig.add_subplot(236)
          ax1.plot([12,13,25.5,15.2,19], 'bo-')
          ax2.plot([13,18.5,1.5,2,21], 'ro-')
          ax3.plot([10,12,11.5,16,23], 'go-')
          ax4.plot([6,11,5,12,21,32], 'ko-')
          ax5.plot([1.9,13,19.5,16.2,5], 'mo-')
          ax6.plot([13,13.2,26,18,14], 'yo-')
      Use to compare measurements across different categories.
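An alternative to six separate `add_subplot` calls is `plt.subplots`, which creates the figure and the whole grid at once and returns the axes as a NumPy array. This is not what the talk used; it is a sketch of the same 2 by 3 layout with the slide's data, assuming a headless run with the Agg backend and an arbitrary figure size.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt

# The six series and line styles from the slide, in plot order.
series = [
    [12, 13, 25.5, 15.2, 19],
    [13, 18.5, 1.5, 2, 21],
    [10, 12, 11.5, 16, 23],
    [6, 11, 5, 12, 21, 32],
    [1.9, 13, 19.5, 16.2, 5],
    [13, 13.2, 26, 18, 14],
]
styles = ['bo-', 'ro-', 'go-', 'ko-', 'mo-', 'yo-']

# 2 rows by 3 columns; axes has shape (2, 3).
fig, axes = plt.subplots(2, 3, figsize=(9, 6))

# axes.flat walks the grid row by row, matching subplot numbers 1 to 6.
for ax, ys, style in zip(axes.flat, series, styles):
    ax.plot(ys, style)
```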
  15. Multiple plots on the same figure using GridSpec
          import matplotlib.pyplot as plt
          import matplotlib.gridspec as gridspec

          fig = plt.figure(1,(6,6))
          # 3 by 2 grid, double height for bottom row
          gs = gridspec.GridSpec(3, 2, width_ratios=[1,1], height_ratios=[1,1,2],
                                 hspace=0.2, bottom=0.1)
          ax1 = fig.add_subplot(gs[0,0])
          ax2 = fig.add_subplot(gs[0,1])
          ax3 = fig.add_subplot(gs[1,0])
          ax4 = fig.add_subplot(gs[1,1])
          # Span bottom row
          ax5 = fig.add_subplot(gs[2,:])
      Easier to use for complicated plot layouts.
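To see what `height_ratios` and the `gs[2,:]` slice do, the resulting axes positions can be compared directly. A small sketch, assuming a headless run with the Agg backend, builds only the top-left and bottom axes from the slide's grid and prints how their sizes relate:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

fig = plt.figure(figsize=(6, 6))
gs = gridspec.GridSpec(3, 2, width_ratios=[1, 1], height_ratios=[1, 1, 2],
                       hspace=0.2, bottom=0.1)

ax1 = fig.add_subplot(gs[0, 0])  # one cell in the top row
ax5 = fig.add_subplot(gs[2, :])  # the whole bottom row

# Positions are bounding boxes in figure coordinates (0 to 1).
top_box = ax1.get_position()
bottom_box = ax5.get_position()

height_ratio = bottom_box.height / top_box.height
print("bottom row height relative to top row:", height_ratio)
print("bottom row wider than one cell:", bottom_box.width > top_box.width)
```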
  16. Multiple datasets on the same axes using twin axes
          import numpy as np
          import matplotlib.pyplot as plt

          fig = plt.figure()
          ax = fig.add_subplot(111)
          # Second y axis that shares the x axis with ax
          twin_ax = ax.twinx()

          sales = [45,69,60,67]
          returns = [82,91,89,78.5]
          ind = np.arange(len(sales))

          rects1 = ax.bar(ind+0.125, sales, width=0.75, color='thistle')
          p1 = twin_ax.plot(ind+0.5, returns,'gs-')

          ax.set_ylim(0, 75)
          twin_ax.set_ylim(0,100)
          ax.set_xticks(ind+0.5)
          ax.set_xticklabels(['North','South','East','West'])
          ax.set_ylabel('Sales')
          twin_ax.set_ylabel('% Returned')
          ax.set_title('Sales v Returns')

          # plot returns a list of lines, so pass its first element as the handle
          plt.figlegend( (rects1[0], p1[0]), ('Sales', '% Returned'), loc='upper left')
          plt.show()
  17. Questions?