Parallel Processing with IPython

In this screencast, Travis Oliphant gives an introduction to IPython, an extremely useful tool for task-based parallel processing with Python.

Transcript

  • 1. Parallel Processing with IPython, January 22, 2010
  • 2. Enthought Python Distribution (EPD): more than sixty integrated packages, including Python 2.6, repository access, science (NumPy, SciPy, etc.), data storage (HDF, NetCDF, etc.), plotting (Chaco, Matplotlib), networking (Twisted), visualization (VTK, Mayavi), user interface (wxPython, Traits UI), multi-language integration (SWIG, Pyrex, f2py, weave), and the Enthought Tool Suite (application development tools).
  • 3. Enthought Training Courses: Python Basics, NumPy, SciPy, Matplotlib, Chaco, Traits, TraitsUI, …
  • 4. PyCon tutorials (http://us.pycon.org/2010/tutorials/): Introduction to Traits and Introduction to Enthought Tool Suite, taught by Corran Webster. Fantastic deal: normally $700; at PyCon you get the same material for $275.
  • 5. Upcoming Training Classes (http://www.enthought.com/training/): March 1-5, 2010, Python for Scientists and Engineers, Austin, Texas, USA; March 1-5, 2010, Python for Quants, London, UK.
  • 6. Parallel Processing with IPython
  • 7. IPython.kernel
    • IPython's interactive kernel provides a simple (but powerful) interface for task-based parallel programming.
    • It allows fast development and tuning of task-parallel algorithms to better utilize resources.

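    As a preview of the workflow the next slides walk through, here is a minimal sketch, assuming a local cluster has already been started with ipcluster (next slide) and using the IPython 0.10-era IPython.kernel API shown on slides 10 and 11:

        from IPython.kernel import client

        mec = client.MultiEngineClient()   # connect to the running controller
        print mec.get_ids()                # one id per engine, e.g. [0, 1, 2, 3]

        def f(x):
            return x ** 2

        # the parallel counterpart of the builtin map(f, range(16))
        print mec.map(f, range(16))
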
  • 8. Getting started: local cluster

    UNIX and OS X (and now Windows), using ipcluster:

        # run ipcluster to start up a controller and a set of engines
        $ ipcluster local -n 4
        Your cluster is up and running.

    You can then cleanly stop the cluster from IPython using:

        mec.kill(controller=True)

    You can also hit Ctrl-C to stop it, or use from the command line:

        kill -INT 20465

    This creates several key files in ~/.ipython/security:
        ipcontroller-engine.furl
        ipcontroller-mec.furl
        ipcontroller-tc.furl

    Windows, manually:

        # run ipcontroller and then ipengine for each desired engine
        > start /B C:\Python25\Scripts\ipcontroller.exe
        > start /B C:\Python25\Scripts\ipengine.exe
        > start /B C:\Python25\Scripts\ipengine.exe
        > start /B C:\Python25\Scripts\ipengine.exe
        ...
        2009-02-11 23:58:26-0600 [-] Log opened.
        2009-02-11 23:58:28-0600 [-] Using furl file: C:\Documents and Settings\demo\_ipython\security\ipcontroller-engine.furl
        2009-02-11 23:58:28-0600 [-] registered engine with id: 3
        2009-02-11 23:58:28-0600 [-] distributing Tasks
        2009-02-11 23:58:28-0600 [Negotiation,client] engine registration succeeded, got id: 3

    This creates several key files in %HOME%\_ipython\security:
        ipcontroller-engine.furl
        ipcontroller-mec.furl
        ipcontroller-tc.furl

  • 9. Getting started: distributed
    • Run ipcontroller on a host to create the .furl files.
      • It creates separate .furl files to be used by the different connections (engine, multiengine client, task client).
      • It places the .furl files by default in ~/.ipython/security (UNIX or Mac OS X) or %HOME%\_ipython\security (Windows).
      • It takes --<connection>-furl-file=FILENAME options, where <connection> is engine, multiengine, or task, to place the .furl files somewhere else.
    • Ensure the ipcontroller-engine.furl file is available to each host that will run an engine, and run ipengine on those hosts. Either:
      • place it in the default security directory, or
      • use the --furl-file=FILENAME option to ipengine.
    • Ensure the multiengine (task) .furl file is available to each host that will run a multiengine (task) client. Either:
      • place it in the default security directory, or
      • pass the FILENAME as the first argument to the client constructor.

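    On the client side, that last option looks like the sketch below. The path /home/user/cluster/ is purely illustrative (not from the slides); it stands for wherever you copied the controller's ipcontroller-mec.furl and ipcontroller-tc.furl files:

        from IPython.kernel import client

        # pass the furl file as the first argument when it is not in the
        # default ~/.ipython/security directory
        mec = client.MultiEngineClient('/home/user/cluster/ipcontroller-mec.furl')
        tc = client.TaskClient('/home/user/cluster/ipcontroller-tc.furl')

        print mec.get_ids()   # e.g. [0, 1, 2, 3] once the remote engines have registered
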
  • 10. Initialize client

    >>> from IPython.kernel import client

    MULTIENGINECLIENT
    # * allows fine-grained control
    # * each engine has an id number
    # * more intuitive for beginners
    # optional argument can be the location of the mec furl-file created by the controller
    >>> mec = client.MultiEngineClient()
    >>> mec.get_ids()
    [0, 1, 2, 3]

    mec.map      -- parallel map
    mec.parallel -- parallel function
    mec.execute  -- execute in parallel
    mec.push     -- push data
    mec.pull     -- pull data
    mec.scatter  -- spread out
    mec.gather   -- collect back
    mec.kill     -- kill engines and controller

    TASKCLIENT
    # * does not expose individual engines
    # * presents a load-balanced, fault-tolerant queue
    # optional argument can be the location of the tc furl-file created by the controller
    >>> tc = client.TaskClient()

    tc.map             -- parallel map
    tc.parallel        -- function decorator
    tc.run             -- run Tasks
    tc.get_task_result -- get result

    client.MapTask     -- function-like task
    client.StringTask  -- code-string task

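    The push/pull and scatter/gather calls listed above are not demonstrated later in the deck, so here is a short sketch of how they fit together (variable names and data are illustrative, and the outputs assume four engines):

        # push/pull: send the same value to, and read it back from, every engine
        mec.push(dict(scale=3.0))
        print mec.pull('scale')            # [3.0, 3.0, 3.0, 3.0]

        # scatter/gather: partition a sequence across engines and collect it back
        mec.scatter('chunk', range(16))    # each engine gets a contiguous piece
        mec.execute('scaled = [scale * x for x in chunk]')
        print mec.gather('scaled')         # [0.0, 3.0, 6.0, ...] reassembled in order
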
  • 11. MultiEngineClient: SCALAR FUNCTION -> PARALLEL VECTORIZED FUNCTION

    # Using map
    >>> def func(x):
    ...     return x**2.5 * (3*x - 2)

    # standard map
    >>> result = map(func, range(32))

    # mec.map
    >>> parallel_result = mec.map(func, range(32))

    # mec.parallel
    >>> pfunc = mec.parallel()(func)

    or using decorators:

    @mec.parallel
    def pfunc(x):
        return x**2.5 * (3*x - 2)

    >>> parallel_result2 = pfunc(range(32))

  • 12. TaskClient (load balancing): SCALAR FUNCTION -> PARALLEL VECTORIZED FUNCTION

    # Using map
    >>> def func(x):
    ...     return x**2.5 * (3*x - 2)

    # standard map
    >>> result = map(func, range(32))

    # tc.map
    >>> parallel_result = tc.map(func, range(32))

    # tc.parallel
    >>> pfunc = tc.parallel()(func)

    or using decorators:

    @tc.parallel
    def pfunc(x):
        return x**2.5 * (3*x - 2)

    >>> parallel_result2 = pfunc(range(32))

  • 13. MultiEngineClient: EXECUTE CODE STRING IN PARALLEL

    >>> from enthought.blocks.api import func2str

    # decorator that turns python code into a string
    >>> @func2str
    ... def code():
    ...     import numpy as np
    ...     a = np.random.randn(N, N)
    ...     eigs, vals = np.linalg.eig(a)
    ...     maxeig = max(abs(eigs))

    >>> mec['N'] = 100
    >>> result = mec.execute(code)
    >>> print mec['maxeig']
    [10.471428625885835, 10.322386155553213, 10.237638983818622, 10.614715948426941]

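    The same pattern works without the enthought.blocks helper by passing a plain code string to mec.execute and moving data with the push/pull calls from slide 10. A minimal sketch (the string literal replaces the func2str-decorated function):

        code = ("import numpy as np\n"
                "a = np.random.randn(N, N)\n"
                "eigs, vals = np.linalg.eig(a)\n"
                "maxeig = max(abs(eigs))\n")

        mec.push(dict(N=100))       # equivalent to mec['N'] = 100
        mec.execute(code)           # run the string on every engine
        print mec.pull('maxeig')    # one maximum eigenvalue per engine
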
  • 14. TaskClient (load-balancing queue): EXECUTE CODE STRING IN PARALLEL

    >>> from enthought.blocks.api import func2str

    # decorator that turns python code into a string
    >>> @func2str
    ... def code():
    ...     import numpy as np
    ...     a = np.random.randn(N, N)
    ...     eigs, vals = np.linalg.eig(a)
    ...     maxeig = max(abs(eigs))

    >>> task = client.StringTask(str(code), push={'N': 100}, pull='maxeig')
    >>> ids = [tc.run(task) for i in range(4)]
    >>> res = [tc.get_task_result(id) for id in ids]
    >>> print [x['maxeig'] for x in res]
    [10.439989436983467, 10.250842410862729, 10.040835983392991, 10.603885977189803]

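    Slide 10 also lists client.MapTask as a function-like alternative to StringTask, but the deck never shows its call signature. The sketch below assumes MapTask(function, args=(...)) and that get_task_result hands back the function's return value; treat it as a rough outline to check against the IPython 0.10 documentation rather than verified usage:

        def maxeig(n):
            import numpy as np
            a = np.random.randn(n, n)
            eigs, vals = np.linalg.eig(a)
            return max(abs(eigs))

        # one task per matrix; the queue load-balances them across the engines
        ids = [tc.run(client.MapTask(maxeig, args=(100,))) for i in range(4)]
        print [tc.get_task_result(id) for id in ids]
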
  • 15. Parallel FFT on a Memory-Mapped File

    Processors   Time (seconds)   Speed-up
        1            11.75          1.0
        2             6.06          1.9
        4             3.36          3.5
        8             2.50          4.7

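    The slide reports timings only; the benchmark code itself is not shown. A rough sketch of how such a run could be set up with the MultiEngineClient calls from slide 10, assuming a hypothetical file data.bin holding a float64 array of shape (8192, 4096) visible to every engine:

        import time

        nrows, ncols = 8192, 4096             # assumed shape of the memory-mapped array
        mec.scatter('rows', range(nrows))     # give each engine its own block of row indices

        code = ("import numpy as np\n"
                "data = np.memmap('data.bin', dtype='float64', mode='r', "
                "shape=(%d, %d))\n" % (nrows, ncols) +
                "result = np.fft.fft(data[rows[0]:rows[-1] + 1], axis=-1)\n")

        t0 = time.time()
        mec.execute(code)                     # each engine FFTs only its own rows
        print 'elapsed:', time.time() - t0
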
  • 16. EPD: http://www.enthought.com/products/epd.php
        Webinars: http://www.enthought.com/training/webinars.php
        Enthought Training: http://www.enthought.com/training/
