Advancing Python Performance Closer to Native Code
Mayank Tiwari
Technical Consulting Engineer, Intel
Copyright © 2017, Intel Corporation. All rights reserved.
*Other names and brands may be claimed as the property of others.
Optimization Notice
Python* Landscape
Adoption of Python continues to grow among domain experts & developers for its productivity benefits.
Challenge #1: Domain experts are not professional software programmers.
Challenge #2: Python performance limits migration to production systems.
Intel’s Python Tools
- Accelerate Python performance
- Enable easy access
- Empower the community
[Figure: Most Popular Coding Languages of 2016]
What’s Inside Intel® Distribution for Python
High Performance Python* for Scientific Computing, Data Analytics, Machine Learning
Faster Performance: Performance Libraries, Parallelism, Multithreading, Language Extensions
- Accelerated NumPy/SciPy/scikit-learn with Intel® MKL (Intel® Math Kernel Library) & Intel® DAAL (Intel® Data Analytics Acceleration Library)
- Data analytics, machine learning & deep learning with scikit-learn, pyDAAL
- Scale with Numba* & Cython*
- Includes optimized mpi4py, works with Dask* & PySpark*
- Optimized for latest Intel® architecture
Greater Productivity: Prebuilt & Accelerated Packages
- Prebuilt & optimized packages for numerical computing, machine/deep learning, HPC, & data analytics
- Drop-in replacement for existing Python - no code changes required
- Jupyter* notebooks, Matplotlib included
- Conda build recipes included in packages
- Free download & free for all uses including commercial deployment
Ecosystem Compatibility: Supports Python 2.7 & 3.6, conda, pip
- Compatible & powered by Anaconda*, supports conda & pip
- Distribution & individual optimized packages also available at conda & Anaconda.org, YUM/APT, Docker image on DockerHub
- Optimizations upstreamed to main Python trunk
- Commercial support through Intel® Parallel Studio XE 2017
Operating System: Windows*, Linux*, MacOS* (MacOS available only in Intel® Parallel Studio Composer Edition)
Intel® Architecture Platforms
Installing Intel® Distribution for Python* 2018
Python versions: 2.7 & 3.6
Standalone Installer
Download full installer from https://software.intel.com/en-us/intel-distribution-for-python
Anaconda.org (Anaconda.org/intel channel)
> conda config --add channels intel
> conda install intelpython3_full
> conda install intelpython3_core
Docker Hub
docker pull intelpython/intelpython3_full
YUM/APT
Access for yum/apt: https://software.intel.com/en-us/articles/installing-intel-free-libs-and-python
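After any of the install routes above, a quick sanity check is to ask the running interpreter about itself. The sketch below is an illustration and not part of the original deck: `describe_python` is a hypothetical helper, and treating the string "Intel" in `sys.version` as a sign of an Intel build is an assumption about how the distribution labels its version banner.

```python
import importlib.util
import sys


def describe_python():
    """Collect a few facts about the interpreter that is currently running."""
    banner = sys.version
    return {
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "banner": banner,
        # Assumption: Intel builds carry the vendor name in the version banner.
        "looks_like_intel_build": "Intel" in banner,
        # find_spec only locates the package; it does not import it.
        "numpy_available": importlib.util.find_spec("numpy") is not None,
    }


if __name__ == "__main__":
    for key, value in describe_python().items():
        print(f"{key}: {value}")
```

If `looks_like_intel_build` comes back `False` after installation, a likely cause is that a different Python is first on the PATH or a different conda environment is active.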
But Wait… There’s More!
Outside of optimized Python*, how efficient is your Python/C/C++ application code?
Are there any non-obvious sources of performance loss?
Performance analysis gives the answer!
Intel® Parallel Studio XE
Create Faster Code…Faster
More Power for Your Code - software.intel.com/intel-parallel-studio-xe
Editions: Composer Edition, Professional Edition, Cluster Edition
BUILD: Compilers & Libraries
- C / C++ Compiler: Optimizing Compiler
- Fortran Compiler: Optimizing Compiler
- Intel® Distribution for Python*: High Performance Scripting
- Intel® MKL: Fast Math Kernel Library
- Intel® IPP: Image, Signal & Data Processing
- Intel® TBB: C++ Threading Library
- Intel® DAAL: Data Analytics Library
ANALYZE: Analysis Tools
- Intel® VTune™ Amplifier: Performance Profiler
- Intel® Advisor: Vectorization Optimization & Thread Prototyping
- Intel® Inspector: Memory & Thread Debugger
SCALE: Cluster Tools
- Intel® MPI Library: Message Passing Interface Library
- Intel® Trace Analyzer & Collector: MPI Tuning & Analysis
- Intel® Cluster Checker: Cluster Diagnostic Expert System
Operating System: Windows*, Linux*, MacOS* (MacOS in Composer Edition only)
Intel® Architecture Platforms
A Two-Pronged Approach to Faster Python* Performance
High Performance Python Distribution + Performance Profiling
Step 1: Use Intel® Distribution for Python
- Leverage optimized native libraries for performance
- Drop-in replacement for your current Python - no code changes required
- Optimized for multi-core and latest Intel processors
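The idea behind Step 1, letting optimized native code do the heavy lifting, can be illustrated with the standard library alone. The sketch below is not from the deck and does not use Intel's libraries: it compares a loop executed as interpreted bytecode with the same reduction delegated to the built-in `sum()`, which CPython implements in C. The function names and the data size are arbitrary choices for illustration.

```python
import timeit


def loop_sum(values):
    """Add the values one at a time in interpreted Python bytecode."""
    total = 0
    for v in values:
        total += v
    return total


def native_sum(values):
    """Delegate the same loop to the built-in sum(), implemented in C in CPython."""
    return sum(values)


data = list(range(100_000))

# Both versions must agree before timing them is meaningful.
assert loop_sum(data) == native_sum(data)

t_loop = timeit.timeit(lambda: loop_sum(data), number=20)
t_native = timeit.timeit(lambda: native_sum(data), number=20)
print(f"interpreted loop: {t_loop:.4f} s")
print(f"built-in sum():   {t_native:.4f} s")
```

The same reasoning suggests where a drop-in distribution can help: time spent inside accelerated library calls may improve, while time spent in hand-written Python loops is interpreted either way.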
Step 2: Use Intel® VTune™ Amplifier for profiling
- Get detailed summary of entire application execution profile
- Auto-detects & profiles Python/C/C++ mixed code & extensions with low overhead
- Accurately detect hotspots - line level analysis helps you make smart optimization decisions fast!
- Available in Intel® Parallel Studio XE Professional & Cluster Edition
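Step 2 relies on Intel® VTune™ Amplifier, whose interface is not shown in the deck. As a stand-in, the sketch below uses `cProfile` and `pstats` from the Python standard library to show what hotspot detection looks like in practice. It is a different tool: it reports at function level rather than line level and does not profile native C/C++ code. The workload functions are hypothetical and exist only to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    """Deliberately heavy function: the expected hotspot."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_constant():
    """Cheap function that should barely register in the profile."""
    return 42


def workload():
    fast_constant()
    return slow_square_sum(200_000)


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(workload)
print(report)
```

Because the report is sorted by cumulative time, `workload` and `slow_square_sum` appear ahead of `fast_constant`, which points at the loop as the place to optimize first.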
FAQs
- Do I need to make any changes to my code?
- Can I use a particular framework or tool with Intel® Distribution for Python (IDP)?
- How much performance can I expect on non-Intel architecture?
- How much performance gain can I expect in my application if I use IDP instead of stock Python?
- I installed IDP but I can’t see any performance improvement in my application. Why?
- Are Conda Python and Intel Python the same?
More Resources
Intel® Distribution for Python
- Product page – overview, features, FAQs…
- Training materials – movies, tech briefs, documentation, evaluation guides…
- Support – forums, secure support…
Intel® VTune™ Amplifier
- Product page – overview, features, FAQs…
- Training materials – movies, tech briefs, documentation, evaluation guides…
- Reviews
- Support – forums, secure support…
Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR
OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO
LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS
INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE,
MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.
Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software,
operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information
and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when
combined with other products.
Copyright © 2017, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are
trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel
microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent
optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are
reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the
specific instruction sets covered by this notice.
Notice revision #20110804
Ready access to high performance Python with Intel Distribution for Python 2018


Editor's Notes

  • #3 Reducing the performance gap between Python and native code requires programming skills. Python limitations: it runs slower than C, C++, and Fortran for CPU-intensive workloads; the GIL prevents efficient execution of multithreaded applications; and building Python packages with dependencies needs some level of expertise.
  • #4 Suited for HPC, enterprise, and cloud workloads running on Intel Xeon, Xeon Phi, and Core processor-based platforms. The suite includes: optimizing C++ and Fortran compilers; high performance Python; ready-to-use, high performance libraries (mention Intel® TBB is not an Intel-proprietary programming model); performance analysis tools; a memory and threading debugger; a vectorization optimization and thread prototyping tool; an MPI library; MPI analysis tools; and a cluster diagnostic expert system.
  • #8 Ease of use starts with installation. There are 2 options: the stand-alone installer, or conda if one already has it. With conda, you need to add the intel channel to install and update meta-packages or individual packages. A nice thing about Python: it is the same on all platforms (Intel supports 3). It is easy to use after installation; no further settings or changes are necessary.
  • #10 I would like to introduce Intel® Parallel Studio XE, a suite of development tools for technical computing, enterprise, and cloud developers. The suite includes: optimizing C++ and Fortran compilers; high performance Python; ready-to-use, high performance libraries (mention Intel® TBB is not an Intel-proprietary programming model); performance analysis tools; a memory and threading debugger; a vectorization optimization and thread prototyping tool; an MPI library; MPI analysis tools; and a cluster diagnostic expert system. The suite enables C, C++, Fortran, and Python* software developers to get high performance on today’s Intel platforms and to scale efficiently on tomorrow’s. The tools simplify the creation of fast, reliable parallel code. Intel® Parallel Studio XE integrates into leading development environments, leveraging investment in existing code. Developers can choose whichever of the three editions meets their needs and requirements.