Parallel Processing with IPython



          January 22, 2010
Enthought Python Distribution (EPD)

 MORE THAN SIXTY INTEGRATED PACKAGES

  • Python 2.6                     • Repository...
Enthought Training Courses




                 Python Basics, NumPy,
                 SciPy, Matplotlib, Chaco,
         ...
PyCon


    http://us.pycon.org/2010/tutorials/

        Introduction to Traits
        Introduction to Enthought Tool Sui...
Upcoming Training Classes
  March 1 – 5, 2009
       Python for Scientists and Engineers
       Austin, Texas, USA

  Marc...
Parallel Processing
        with
      IPython



                      6
IPython.kernel

 • IPython's interactive kernel provides a
   simple (but powerful) interface for task-
   based parallel ...
Getting started --- local cluster
UNIX and OSX (and now WINDOWS)                               manually WINDOWS
# run ipcl...
Getting started -- distributed
 • Run ipcontroller on a host and create .furl files
    • Creates separate .furl files to ...
Initialize client
 >>> from IPython.kernel import client
MULTIENGINECLIENT                        TASKCLIENT
# * allows fi...
MultiEngineClient
SCALAR FUNCTION  PARALLEL VECTORIZED FUNCTION

# Using map
>>> def func(x):
...     return x**2.5 * (3*...
TaskClient – Load Balancing
SCALAR FUNCTION  PARALLEL VECTORIZED FUNCTION

# Using map
>>> def func(x):
...     return x*...
MultiEngineClient
EXECUTE CODESTRING IN PARALLEL

>>> from enthought.blocks.api import func2str
# decorator that turns pyt...
TaskClient – Load Balancing Queue
EXECUTE CODESTRING IN PARALLEL

>>> from enthought.blocks.api import func2str
# decorato...
Parallel FFT On Memory Mapped File




                Time
 Processors               Speed Up
              (seconds)
   ...
EPD
http://www.enthought.com/products/epd.php


Webinars
http://www.enthought.com/training/webinars.php


Enthought Traini...
In this screencast, Travis Oliphant gives an introduction to IPython, an extremely useful tool for task-based parallel processing with Python.
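As a rough illustration of the map-style, task-based interface the screencast covers, here is a minimal sketch that uses only the Python standard library. It is a stand-in, not the IPython.kernel API: `parallel_map` and its `workers` parameter are hypothetical names chosen for this example, and the workers are threads in one process, whereas IPython engines are separate processes that may run on other hosts.

```python
from concurrent.futures import ThreadPoolExecutor


def func(x):
    # Scalar function applied independently to every input value.
    return x ** 2.5 * (3 * x - 2)


def parallel_map(function, sequence, workers=4):
    # Hypothetical helper (not part of IPython): hand each element to a
    # pool of workers and return the results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(function, sequence))


serial_result = list(map(func, range(32)))
parallel_result = parallel_map(func, range(32))
```

The sketch only mirrors the calling pattern: each input becomes an independent task and the results come back in input order. It says nothing about performance on a real cluster.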

Parallel Processing with IPython

  1. Parallel Processing with IPython (January 22, 2010)
  2. Enthought Python Distribution (EPD): more than sixty integrated packages
       • Python 2.6
       • Repository access
       • Science (NumPy, SciPy, etc.)
       • Data Storage (HDF, NetCDF, etc.)
       • Plotting (Chaco, Matplotlib)
       • Networking (twisted)
       • Visualization (VTK, Mayavi)
       • User Interface (wxPython, Traits UI)
       • Multi-language Integration (SWIG, Pyrex, f2py, weave)
       • Enthought Tool Suite (Application Development Tools)
  3. Enthought Training Courses: Python Basics, NumPy, SciPy, Matplotlib, Chaco, Traits, TraitsUI, …
  4. PyCon tutorials (http://us.pycon.org/2010/tutorials/)
       • Introduction to Traits
       • Introduction to Enthought Tool Suite
       • Fantastic deal: normally $700; at PyCon, get the same material for $275
       • Corran Webster
  5. Upcoming Training Classes (http://www.enthought.com/training/)
       • March 1 – 5, 2009: Python for Scientists and Engineers, Austin, Texas, USA
       • March 1 – 5, 2009: Python for Quants, London, UK
  6. Parallel Processing with IPython
  7. IPython.kernel
       • IPython's interactive kernel provides a simple (but powerful) interface for task-based parallel programming.
       • Allows fast development and tuning of task-parallel algorithms to better utilize resources.
  8. Getting started: local cluster

       UNIX and OSX (and now WINDOWS):
         # run ipcluster to start up a
         # controller and a set of engines
         $ ipcluster local -n 4
         Your cluster is up and running.

       You can then cleanly stop the cluster from IPython using:
         mec.kill(controller=True)
       You can also hit Ctrl-C to stop it, or use from the cmd line:
         kill -INT 20465

       Creates several key-files in ~/.ipython/security :
         ipcontroller-engine.furl
         ipcontroller-mec.furl
         ipcontroller-tc.furl

       Manually on WINDOWS:
         # run ipcontroller and then
         # ipengine for each desired engine
         > start /B C:\Python25\Scripts\ipcontroller.exe
         > start /B C:\Python25\Scripts\ipengine.exe
         > start /B C:\Python25\Scripts\ipengine.exe
         ...
         > start /B C:\Python25\Scripts\ipengine.exe
         ...
         2009-02-11 23:58:26-0600 [-] Log opened.
         2009-02-11 23:58:28-0600 [-] Using furl file: C:\Documents and Settings\demo\_ipython\security\ipcontroller-engine.furl
         2009-02-11 23:58:28-0600 [-] registered engine with id: 3
         2009-02-11 23:58:28-0600 [-] distributing Tasks
         2009-02-11 23:58:28-0600 [Negotiation,client] engine registration succeeded, got id: 3

       Creates several key-files in %HOME%\_ipython\security :
         ipcontroller-engine.furl
         ipcontroller-mec.furl
         ipcontroller-tc.furl
  9. Getting started: distributed
       • Run ipcontroller on a host and create .furl files.
           • Creates separate .furl files to be used by the different connections (engine, multiengine client, task client).
           • Places .furl files by default in ~/.ipython/security (UNIX or Mac OSX) or %HOME%\_ipython\security (Windows).
           • Takes --<connection>-furl-file=FILENAME options, where <connection> is engine, multiengine, or task, to place the .furl files somewhere else.
       • Ensure the ipcontroller-engine.furl file is available to each host that will run an engine, and run ipengine on these hosts.
           • Either place it in the default security directory,
           • or use the --furl-file=FILENAME option to ipengine.
       • Ensure the multiengine (task) .furl file is available to each host that will run a multiengine (task) client.
           • Either place it in the default security directory,
           • or pass the FILENAME as the first argument to the constructor.
  10. Initialize client

        >>> from IPython.kernel import client

        MULTIENGINECLIENT
          # * allows fine-grained control
          # * each engine has an id number
          # * more intuitive for beginners
          # optional argument can be
          # location of mec furl-file
          # created by the controller
          >>> mec = client.MultiEngineClient()
          >>> mec.get_ids()
          [0, 1, 2, 3]

          mec.map      -- parallel map
          mec.parallel -- parallel function
          mec.execute  -- execute in parallel
          mec.push     -- push data
          mec.pull     -- pull data
          mec.scatter  -- spread out
          mec.gather   -- collect back
          mec.kill     -- kill engines and controller

        TASKCLIENT
          # * does not expose individual engines
          # * presents a load-balanced,
          #   fault-tolerant queue
          # optional argument can be
          # location of tc furl-file
          # created by the controller
          >>> tc = client.TaskClient()

          tc.map             -- parallel map
          tc.parallel        -- function decorator
          tc.run             -- run Tasks
          tc.get_task_result -- get result
          client.MapTask     -- function-like
          client.StringTask  -- code-string
  11. MultiEngineClient: scalar function -> parallel vectorized function

        # Using map
        >>> def func(x):
        ...     return x**2.5 * (3*x - 2)

        # standard map
        >>> result = map(func, range(32))

        # mec.map
        >>> parallel_result = mec.map(func, range(32))

        # mec.parallel
        >>> pfunc = mec.parallel()(func)

        # or using decorators (mec.parallel() returns the decorator)
        @mec.parallel()
        def pfunc(x):
            return x**2.5 * (3*x - 2)

        >>> parallel_result2 = pfunc(range(32))
  12. TaskClient (load balancing): scalar function -> parallel vectorized function

        # Using map
        >>> def func(x):
        ...     return x**2.5 * (3*x - 2)

        # standard map
        >>> result = map(func, range(32))

        # tc.map
        >>> parallel_result = tc.map(func, range(32))

        # tc.parallel
        >>> pfunc = tc.parallel()(func)

        # or using decorators (tc.parallel() returns the decorator)
        @tc.parallel()
        def pfunc(x):
            return x**2.5 * (3*x - 2)

        >>> parallel_result2 = pfunc(range(32))
  13. MultiEngineClient: execute code string in parallel

        >>> from enthought.blocks.api import func2str
        # decorator that turns python-code into a string
        >>> @func2str
        ... def code():
        ...     import numpy as np
        ...     a = np.random.randn(N,N)
        ...     # eig returns (eigenvalues, eigenvectors)
        ...     eigs, vecs = np.linalg.eig(a)
        ...     maxeig = max(abs(eigs))

        # push N to every engine, run the code, then pull maxeig back
        >>> mec['N'] = 100
        >>> result = mec.execute(code)
        >>> print mec['maxeig']
        [10.471428625885835, 10.322386155553213, 10.237638983818622, 10.614715948426941]
  14. TaskClient (load-balancing queue): execute code string in parallel

        >>> from enthought.blocks.api import func2str
        # decorator that turns python-code into a string
        >>> @func2str
        ... def code():
        ...     import numpy as np
        ...     a = np.random.randn(N,N)
        ...     # eig returns (eigenvalues, eigenvectors)
        ...     eigs, vecs = np.linalg.eig(a)
        ...     maxeig = max(abs(eigs))

        # each task pushes N before running and pulls maxeig afterwards
        >>> task = client.StringTask(str(code), push={'N':100}, pull='maxeig')
        >>> ids = [tc.run(task) for i in range(4)]
        >>> res = [tc.get_task_result(id) for id in ids]
        >>> print [x['maxeig'] for x in res]
        [10.439989436983467, 10.250842410862729, 10.040835983392991, 10.603885977189803]
  15. Parallel FFT On Memory Mapped File

        Processors   Time (seconds)   Speed Up
        1            11.75            1.0
        2             6.06            1.9
        4             3.36            3.5
        8             2.50            4.7
  16. Links
       • EPD: http://www.enthought.com/products/epd.php
       • Webinars: http://www.enthought.com/training/webinars.php
       • Enthought Training: http://www.enthought.com/training/
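
The speed-up column on the "Parallel FFT On Memory Mapped File" slide can be reproduced from the timings alone, assuming speed-up is defined as the one-processor time divided by the p-processor time. The sketch below also derives parallel efficiency (speed-up divided by processor count), which the slides do not report; `speedup_table` is a name invented for this example.

```python
# Timings in seconds, copied from the "Parallel FFT On Memory Mapped File" slide.
timings = {1: 11.75, 2: 6.06, 4: 3.36, 8: 2.50}


def speedup_table(timings):
    # Returns one (processors, seconds, speedup, efficiency) row per entry.
    # Assumption: speedup = T(1) / T(p) and efficiency = speedup / p.
    baseline = timings[1]
    rows = []
    for processors in sorted(timings):
        seconds = timings[processors]
        speedup = baseline / seconds
        efficiency = speedup / processors
        rows.append((processors, seconds, speedup, efficiency))
    return rows


rows = speedup_table(timings)
for processors, seconds, speedup, efficiency in rows:
    print("%2d  %6.2f  %4.1f  %5.1f%%" % (processors, seconds, speedup, 100 * efficiency))
```

Rounded to one decimal place, the computed speed-ups match the slide (1.0, 1.9, 3.5, 4.7). Efficiency drops to roughly 59% on eight processors, so the gain from each added processor shrinks for this workload.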