Plotting data with python and pylab

A short introduction to reading a CSV file and plotting it with python and pylab.

Published in: Technology

  1. Barcelona Python Meetup: Plotting data with python and pylab, by Giovanni M. Dall'Olio
  2. Problem statement
     - Let's say we have a table of data like this:

           name      country  apples  pears
           Giovanni  Italy    31      13
           Mario     Italy    23      33
           Luigi     Italy    0       5
           Margaret  England  22      13
           Albert    Germany  15      6

     - How do we read it in python?
     - How do we do some basic plotting?
  3. Alternatives for plotting data in python
     - Pylab (Enthought) → Matlab/Octave approach
     - Enthought → extended version of Pylab (free for academic use)
     - rpy/rpy2 → lets you run R commands within python
     - Sage → interfaces python with Matlab, R, Octave, Mathematica, ...
  4. The Pylab system
     - pylab is a system of three libraries plus an interactive interpreter, which together turn python into a Matlab-like environment
     - It is composed of:
       - Numpy (arrays, matrices, complex numbers, etc. in python)
       - Scipy (extended scientific/statistics functions)
       - Matplotlib (plotting library)
       - iPython (extended interactive interpreter)
  5. How to install pylab
     - There are several ways to install PyLab:
       - use the package manager of your linux distro
       - use Enthought's distribution (http://www.enthought.com/products/epd.php) (free for academic use)
       - compile from source, and google for help!
     - Numpy and scipy contain some Fortran libraries, so easy_install doesn't work well with them
  6. ipython -pylab
     - Ipython is an extended version of the standard python interpreter
     - It has a mode designed especially for pylab
     - The standard python interpreter doesn't support plotting very well (it is not multi-threaded)
     - So if you want an interactive interpreter, use ipython with the pylab option:

           $ alias pylab="ipython -pylab"
           $ pylab
           In [1]:
  7. Why the python interpreter is not the best for plotting
     - It gets blocked when you create a plot
  8. How to read a CSV file with python
     - To read a file like this in pylab:

           name      country  apples  pears
           Giovanni  Italy    31      13
           Mario     Italy    23      33
           Luigi     Italy    0       5
           Margaret  England  22      13
           Albert    Germany  15      6

     - → Use the function 'matplotlib.mlab.csv2rec':

           >>> data = csv2rec('exampledata.txt', delimiter='\t')
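
Where matplotlib.mlab is not at hand, the same table can be read with the standard library alone. The following is a minimal sketch rather than the approach from the slides: it swaps in `csv.DictReader`, the helper name `read_table` is hypothetical, and the in-memory string `TABLE` stands in for 'exampledata.txt', which is assumed here to be tab-separated with a header row.

```python
import csv
import io

# Stand-in for 'exampledata.txt' (assumed: tab-separated, with a header row).
TABLE = (
    "name\tcountry\tapples\tpears\n"
    "Giovanni\tItaly\t31\t13\n"
    "Mario\tItaly\t23\t33\n"
    "Luigi\tItaly\t0\t5\n"
    "Margaret\tEngland\t22\t13\n"
    "Albert\tGermany\t15\t6\n"
)


def read_table(handle):
    """Return the rows as a list of dicts, with the numeric columns converted to int."""
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        row["apples"] = int(row["apples"])
        row["pears"] = int(row["pears"])
        rows.append(row)
    return rows


rows = read_table(io.StringIO(TABLE))
apples = [row["apples"] for row in rows]
print(rows[1]["name"], apples)
```

With a real file, the same helper could be called as `read_table(open('exampledata.txt', newline=''))`, ideally inside a `with` block.
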
  9. Numpy - record arrays
     - csv2rec stores data in a numpy recarray object, where you can access columns and rows easily:

           >>> print data['name']
           ['Giovanni' 'Mario' 'Luigi' 'Margaret' 'Albert']
           >>> data['apples']
           array([31, 23, 0, 22, 15])
           >>> data[1]
           ('Mario', 'Italy', 23, 33)
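
The column and row access shown on this slide can be tried without any input file. The sketch below builds the same table by hand as a numpy structured array; the field widths (`U10`) and integer type (`i8`) are assumptions chosen for illustration, and the filtering step is an extra that the slides do not show.

```python
import numpy as np

# Same table as in the slides, built by hand as a structured array.
data = np.array(
    [
        ("Giovanni", "Italy", 31, 13),
        ("Mario", "Italy", 23, 33),
        ("Luigi", "Italy", 0, 5),
        ("Margaret", "England", 22, 13),
        ("Albert", "Germany", 15, 6),
    ],
    dtype=[("name", "U10"), ("country", "U10"), ("apples", "i8"), ("pears", "i8")],
)

# Columns are accessed by field name, rows by integer index.
apples = data["apples"]
second_row = data[1]

# A boolean mask selects rows, here everyone from Italy.
italians = data[data["country"] == "Italy"]
italian_apples = int(italians["apples"].sum())

# Viewing the array as a recarray adds attribute-style column access.
records = data.view(np.recarray)
total_pears = int(records.pears.sum())

print(second_row, italian_apples, total_pears)
```
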
  10. Alternative to csv2rec
      - numpy.genfromtxt (new in 2009)
      - More options than csv2rec, and included in numpy
      - Tricky default parameters: you need to specify dtype=None

            >>> data = numpy.genfromtxt('datafile.txt', dtype=None)
            >>> data
            array....
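
To see why `dtype=None` matters, the sketch below reads the same small input twice. The two-column `TABLE` string is a made-up stand-in for 'datafile.txt', and the `names=True`, `delimiter` and `encoding` arguments are additions that the slide does not use.

```python
import io

import numpy as np

# Made-up stand-in for 'datafile.txt': a header row and two data rows.
TABLE = "name\tapples\nMario\t23\nLuigi\t0\n"

# Default dtype is float: the header and the text column cannot be
# converted, so they come back as nan.
as_float = np.genfromtxt(io.StringIO(TABLE), delimiter="\t", encoding="utf-8")

# dtype=None asks genfromtxt to guess a type per column; names=True
# takes the field names from the first row.
typed = np.genfromtxt(
    io.StringIO(TABLE),
    delimiter="\t",
    dtype=None,
    names=True,
    encoding="utf-8",
)

print(as_float)
print(typed["name"], typed["apples"])
```
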
  11. Barchart

          >>> data = csv2rec('exampledata.txt', delimiter='\t')
          >>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
          >>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')
          >>> xticks(range(len(data)), data['name'])
          >>> legend()
          >>> grid(True)

  12. Barchart

          >>> data = csv2rec('exampledata.txt', delimiter='\t')
          >>> figure()
          >>> clf()

      - Read a CSV file and store it in a recarray object
      - Use figure() and clf() to reset the graphic device
  13. Barchart

          >>> data = csv2rec('exampledata.txt', delimiter='\t')
          >>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')

      - The bar function creates a barchart
  14. Barchart

          >>> data = csv2rec('exampledata.txt', delimiter='\t')
          >>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
          >>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')

      - This is the second barchart
  15. Barchart

          >>> data = csv2rec('exampledata.txt', delimiter='\t')
          >>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
          >>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')
          >>> xticks(range(len(data)), data['name'])

      - Re-defining the labels on the X axis (xticks)
  16. Barchart

          >>> data = csv2rec('exampledata.txt', delimiter='\t')
          >>> bar(arange(len(data)), data['apples'], color='red', width=0.1, label='apples')
          >>> bar(arange(len(data))+0.1, data['pears'], color='blue', width=0.1, label='pears')
          >>> xticks(range(len(data)), data['name'])
          >>> legend()
          >>> grid(True)
          >>> title('apples and pears by person')

      - Adding legend, grid, title
  17. Barchart (result)
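
The barchart steps can also be written as a standalone script outside the pylab shell. This is a sketch under a few assumptions: the data is typed in by hand instead of being read with csv2rec, the non-interactive "Agg" backend is selected so that no window is needed, the bar width of 0.35 is an arbitrary choice, and the figure is written to an in-memory buffer rather than displayed.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

# Table from the slides, typed in by hand.
names = ["Giovanni", "Mario", "Luigi", "Margaret", "Albert"]
apples = np.array([31, 23, 0, 22, 15])
pears = np.array([13, 33, 5, 13, 6])

positions = np.arange(len(names))
width = 0.35  # arbitrary bar width

fig, ax = plt.subplots()
# Two bar series side by side: the second one is shifted by one bar width.
ax.bar(positions, apples, width=width, color="red", label="apples")
ax.bar(positions + width, pears, width=width, color="blue", label="pears")

# Put each name midway between its two bars.
ax.set_xticks(positions + width / 2)
ax.set_xticklabels(names)

ax.legend()
ax.grid(True)
ax.set_title("apples and pears by person")

# Render to memory; pass a filename instead to write a PNG file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```
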
  18. Pie Chart

          >>> pie(data['pears'], labels=data['name'])
          >>> pie(data['pears'], labels=['%s\n(%s pears)' % (i, j) for (i, j) in zip(data['name'], data['pears'])])
  19. Pie chart (result)
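
The label-building step of the pie chart is easier to follow when split out. The sketch below assumes hand-typed data and the non-interactive "Agg" backend; it is an illustration, not the code behind the slide's figure.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

names = ["Giovanni", "Mario", "Luigi", "Margaret", "Albert"]
pears = [13, 33, 5, 13, 6]

# One two-line label per wedge: the name, then the count in parentheses.
labels = ["%s\n(%s pears)" % (name, count) for name, count in zip(names, pears)]

fig, ax = plt.subplots()
result = ax.pie(pears, labels=labels)
wedges = result[0]  # the first element holds the wedge patches
ax.set_title("pears by person")
```
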
  20. A simple plot

          >>> x = linspace(1, 10, 10)
          >>> y = randn(10)
          >>> plot(x, y, 'r.', ms=15)
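
Outside the pylab shell, names such as linspace, randn and plot have to be imported explicitly. A minimal sketch of the same plot follows; the seeded random generator and the axis labels are additions for illustration, and the "Agg" backend is assumed so that the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
x = np.linspace(1, 10, 10)      # ten evenly spaced points from 1 to 10
y = rng.standard_normal(10)     # ten draws from a standard normal

fig, ax = plt.subplots()
# 'r.' means red point markers with no connecting line; ms sets the marker size.
(line,) = ax.plot(x, y, "r.", ms=15)
ax.set_xlabel("x")
ax.set_ylabel("random value")
```
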
  21. A histogram

          >>> x = randn(1000)
          >>> hist(x, bins=40)
          >>> title('histogram of random numbers')
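
The hist call also returns the bin counts and bin edges, which is handy when the numbers are needed as well as the picture. The sketch below assumes a seeded generator and the non-interactive "Agg" backend, neither of which appears in the slides.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
x = rng.standard_normal(1000)

fig, ax = plt.subplots()
# counts has one entry per bin; edges has one more entry than counts.
counts, edges, patches = ax.hist(x, bins=40)
ax.set_title("histogram of random numbers")

print(int(counts.sum()), len(edges))
```
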
  22. Matplotlib gallery
  23. Scipy Cookbook
  24. Thanks for the attention!!
      - PyLab - http://www.scipy.org/PyLab
      - matplotlib - http://matplotlib.sourceforge.net/
      - scipy - http://www.scipy.org/
      - numpy - http://numpy.scipy.org/
      - ipython - http://ipython.scipy.org/moin/
      - These slides: http://bioinfoblog.it
