
PyHEP 2018: Tools to bind to Python


Invited talk for the PyHEP workshop in Sofia, Bulgaria.

Published in: Software


  1. 1. Tools to Bind to Python. Henry Schreiner, PyHEP 2018. This talk is interactive, and can be run in SWAN (https://cern.ch/swanserver/cgi-bin/go?projurl=https://github.com/henryiii/pybindings_cc.git). If you want to run it manually, just download the repository: github.com/henryiii/pybindings_cc (https://github.com/henryiii/pybindings_cc). Either use the menu option CELL -> Run All or run all code cells in order (don't skip one!)
  2. 2. Focus What Python bindings do How Python bindings work What tools are available
  3. 3. Caveats Will cover C++ and C binding only Will not cover every tool available Will not cover cppyy in detail (but see Enric's talk) Python 2 is dying, long live Python 3! but this talk is Py2 compatible also
  4. 4. Overview:
      Part one
      • ctypes, CFFI: Pure Python, C only
      • CPython: How all bindings work
      • SWIG: Multi-language, automatic
      • Cython: New language
      • Pybind11: Pure C++11
      • CPPYY: From ROOT's JIT engine
      Part two
      • An advanced binding in Pybind11
  5. 5. Since this talk is an interactive notebook, no code will be hidden. Here are the required packages:
      • Not on SWAN: cython, cppyy
      • SWIG is also needed but not a python module
      • Using Anaconda recommended for users not using SWAN
      In [1]:
          !pip install --user cffi pybind11 numba
          # Other requirements: cython cppyy (SWIG is also needed but not a python module)
          # Using Anaconda recommended for users not using SWAN
      Requirement already satisfied: cffi in /eos/user/h/hschrein/.local/lib/python3.6/site-packages
      Requirement already satisfied: pybind11 in /eos/user/h/hschrein/.local/lib/python3.6/site-packages
      Requirement already satisfied: numba in /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3python3/Wed/x86_64-slc6-gcc62-opt/lib/python3.6/site-packages
      Requirement already satisfied: pycparser in /eos/user/h/hschrein/.local/lib/python3.6/site-packages (from cffi)
      Requirement already satisfied: llvmlite in /eos/user/h/hschrein/.local/lib/python3.6/site-packages (from numba)
      Requirement already satisfied: numpy in /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3python3/Wed/x86_64-slc6-gcc62-opt/lib/python3.6/site-packages (from numba)
      You are using pip version 9.0.3, however version 10.0.1 is available.
      You should consider upgrading via the 'pip install --upgrade pip' command.
  6. 6. And, here are the standard imports. We will also add two variables to help with compiling:
      In [2]:
          from __future__ import print_function
          import os
          import sys
          from pybind11 import get_include
          inc = '-I ' + get_include(user=True) + ' -I ' + get_include(user=False)
          plat = '-undefined dynamic_lookup' if 'darwin' in sys.platform else '-fPIC'
          print('inc:', inc)
          print('plat:', plat)
      inc: -I /eos/user/h/hschrein/.local/include/python3.6m -I /cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3python3/Wed/Python/3.6.5/x86_64-slc6-gcc62-opt/include/python3.6m
      plat: -fPIC
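The `inc` variable above depends on `pybind11.get_include` reporting the Python header directory, which is what the printed output shows for the pybind11 version used in this talk. I believe newer pybind11 releases return the directory of the pybind11 headers instead, so the sketch below is an alternative that takes the `Python.h` location from the standard library. The use of `sysconfig` here is my substitution, not part of the original notebook.

```python
import sys
import sysconfig

# Directory containing Python.h for the running interpreter (standard library only).
python_inc = sysconfig.get_paths()["include"]

# Same two helper variables as in the talk, built without pybind11.
inc = "-I " + python_inc
plat = "-undefined dynamic_lookup" if "darwin" in sys.platform else "-fPIC"

print("inc:", inc)
print("plat:", plat)
```

For the Pybind11 examples later on, the pybind11 header directory would still need to be added to `inc`.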
  7. 7. What is meant by bindings? Bindings allow a function (or other functionality) in a library to be accessed from Python. We will start with this example:
      In [3]:
          %%writefile simple.c
          float square(float x) {
              return x*x;
          }
      Overwriting simple.c
      Desired usage in Python:
          y = square(x)
  8. 8. ctypes (https://docs.python.org/3.7/library/ctypes.html). C bindings are very easy. Just compile into a shared library, then open it in python with the built in ctypes module:
      In [4]:
          !cc simple.c -shared -o simple.so
      In [5]:
          from ctypes import cdll, c_float
          lib = cdll.LoadLibrary('./simple.so')
          lib.square.argtypes = (c_float,)
          lib.square.restype = c_float
          lib.square(2.0)
      Out[5]: 4.0
      This may be all you need! Example: the AmpGen (https://gitlab.cern.ch/lhcb/Gauss/blob/LHCBGAUSS-1058.AmpGenDev/Gen/AmpGen/options/ampgen.py) Python interface. In Pythonista (http://omz-software.com/pythonista/) for iOS, we can even use ctypes to access Apple's public APIs!
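The same ctypes pattern can be tried without compiling anything by loading a library that is already on the system. This is a minimal sketch that assumes a POSIX system where the C math library provides `sqrt`; the choice of `libm` is mine and is not part of the talk.

```python
import ctypes
import ctypes.util

# Locate the C math library. find_library may return None on some systems;
# on POSIX, CDLL(None) then exposes the symbols already loaded in the process.
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Declare the C signature: double sqrt(double).
# Without restype, ctypes would assume the function returns a C int.
libm.sqrt.argtypes = (ctypes.c_double,)
libm.sqrt.restype = ctypes.c_double

root = libm.sqrt(9.0)
print(root)  # 3.0
```

Setting `argtypes` and `restype` does the same job as the `c_float` declarations for `square` above: it tells ctypes how to convert between Python objects and C values.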
  9. 9. CFFI (http://cffi.readthedocs.io/en/latest/overview.html): The C Foreign Function Interface for Python
      • Still C only
      • Developed for PyPy, but available in CPython too
      The same example as before:
      In [6]:
          from cffi import FFI
          ffi = FFI()
          ffi.cdef("float square(float);")
          C = ffi.dlopen('./simple.so')
          C.square(2.0)
      Out[6]: 4.0
  10. 10. CPython (python.org)
      • Let's see how bindings work before going into C++ binding tools
      • This is how CPython itself is implemented
      C reminder: static means visible in this file only
  11. 11. In [7]:
          %%writefile pysimple.c
          #include <Python.h>
          float square(float x) {return x*x; }
          static PyObject* square_wrapper(PyObject* self, PyObject* args) {
              float input, result;
              if (!PyArg_ParseTuple(args, "f", &input)) {return NULL;}
              result = square(input);
              return PyFloat_FromDouble(result);}
          static PyMethodDef pysimple_methods[] = {
              { "square", square_wrapper, METH_VARARGS, "Square function" },
              { NULL, NULL, 0, NULL } };
          #if PY_MAJOR_VERSION >= 3
          static struct PyModuleDef pysimple_module = {
              PyModuleDef_HEAD_INIT, "pysimple", NULL, -1, pysimple_methods};
          PyMODINIT_FUNC PyInit_pysimple(void) {
              return PyModule_Create(&pysimple_module); }
          #else
          DL_EXPORT(void) initpysimple(void) {
              Py_InitModule("pysimple", pysimple_methods); }
          #endif
      Overwriting pysimple.c
  12. 12. Build:
      In [8]:
          !cc {inc} -shared -o pysimple.so pysimple.c {plat}
      Run:
      In [9]:
          import pysimple
          pysimple.square(2.0)
      Out[9]: 4.0
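To make the three steps inside `square_wrapper` explicit (unpack the argument tuple, call the C function, box the result), here is a rough pure-Python imitation. It is only an analogy: `ctypes.c_float` stands in for storing a value in a C `float`, and the type check is a simplified stand-in for what `PyArg_ParseTuple` does with the `"f"` format.

```python
import ctypes


def square(x):
    """Stand-in for the C function; c_float mimics single-precision storage."""
    return ctypes.c_float(x * x).value


def square_wrapper(args):
    """Imitates the C wrapper: parse the argument tuple, call, box the result."""
    # Step 1: unpack the tuple and convert the Python object to a C float.
    if not isinstance(args, tuple) or len(args) != 1:
        raise TypeError("square() takes exactly one argument")
    if not isinstance(args[0], (int, float)):
        raise TypeError("a float is required")
    c_input = ctypes.c_float(args[0]).value

    # Step 2: call the wrapped function.
    result = square(c_input)

    # Step 3: convert the C result back into a Python float.
    return float(result)


print(square_wrapper((2.0,)))  # 4.0
```

The binding tools in the rest of the talk generate this kind of glue for you.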
  13. 13. C++: Why do we need more?
      • Sometimes simple is enough!
      • extern "C" allows a C++ backend behind a C interface
      • A C++ API can have: overloading, classes, memory management, etc...
      • We could manually translate everything using the C API
      Solution: C++ binding tools!
  14. 14. This is our C++ example:
      In [10]:
          %%writefile SimpleClass.hpp
          #pragma once
          class Simple {
              int x;
            public:
              Simple(int x): x(x) {}
              int get() const {return x;}
          };
      Overwriting SimpleClass.hpp
  15. 15. SWIG (swig.org): Produces "automatic" bindings
      • Works with many output languages
      • Has supporting module built into CMake
      • Very mature
      Downsides:
      • Can be all or nothing
      • Hard to customize
      • Customizations tend to be language specific
      • Slow development
  16. 16. In [11]:
          %%writefile SimpleSWIG.i
          %module simpleswig
          %{
          /* Includes the header in the wrapper code */
          #include "SimpleClass.hpp"
          %}
          /* Parse the header file to generate wrappers */
          %include "SimpleClass.hpp"
      Overwriting SimpleSWIG.i
      In [12]:
          !swig -swiglib
      /build/jenkins/workspace/install/swig/3.0.12/x86_64-slc6-gcc62-opt/share/swig/3.0.12
  17. 17. SWAN/LxPlus only: We need to fix the SWIG_LIB path if we are using LCG's version of SWIG (such as on SWAN)
      In [13]:
          if 'LCG_VIEW' in os.environ:
              swiglibold = !swig -swiglib
              swigloc = swiglibold[0].split('/')[-3:]
              swiglib = os.path.join(os.environ['LCG_VIEW'], *swigloc)
              os.environ['SWIG_LIB'] = swiglib
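The path fix keeps only the last three components of the directory reported by `swig -swiglib` and re-roots them under the LCG view. A small worked example, using the path printed on the previous slide and a hypothetical value for `LCG_VIEW` (the real value depends on the SWAN session):

```python
import posixpath

# Path reported by `swig -swiglib` on the previous slide.
reported = ("/build/jenkins/workspace/install/swig/3.0.12/"
            "x86_64-slc6-gcc62-opt/share/swig/3.0.12")

# Hypothetical stand-in for os.environ['LCG_VIEW'].
lcg_view = "/cvmfs/example/view"

# Keep the trailing 'share/swig/<version>' components only.
swigloc = reported.split("/")[-3:]

# Re-root them under the view directory.
swiglib = posixpath.join(lcg_view, *swigloc)

print(swigloc)  # ['share', 'swig', '3.0.12']
print(swiglib)  # /cvmfs/example/view/share/swig/3.0.12
```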
  18. 18. In [14]:
          !swig -python -c++ SimpleSWIG.i
      In [15]:
          !c++ -shared SimpleSWIG_wrap.cxx {inc} -o _simpleswig.so {plat}
      In [16]:
          import simpleswig
          x = simpleswig.Simple(2)
          x.get()
      Out[16]: 2
  19. 19. Cython (http://cython.org)
      • Built to be a Python+C language for high performance computations
      • Performance computation space in competition with Numba
      • Due to design, also makes binding easy
      • Easy to customize result
      • Can write Python 2 or 3, regardless of calling language
      Downsides:
      • Requires learning a new(ish) language
      • Have to think with three hats
      • Very verbose
  20. 20. Aside: Speed comparison Python, Cython, Numba (https://numba.pydata.org)
      In [17]:
          def f(x):
              for _ in range(100000000):
                  x=x+1
              return x
      In [18]:
          %%time
          f(1)
      CPU times: user 6.88 s, sys: 0 ns, total: 6.88 s
      Wall time: 6.88 s
      Out[18]: 100000001
  21. 21. In [19]:
          %load_ext Cython
      In [20]:
          %%cython
          def f(int x):
              for _ in range(10000000):
                  x=x+1
              return x
      In [21]:
          %%timeit
          f(23)
      69.7 ns ± 9.78 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
  22. 22. In [22]:
          import numba
          @numba.jit
          def f(x):
              for _ in range(10000000):
                  x=x+1
              return x
      In [23]:
          %time f(41)
      CPU times: user 0 ns, sys: 11 µs, total: 11 µs
      Wall time: 56.3 µs
      Out[23]: 10000041
      In [24]:
          %%timeit
          f(41)
      268 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
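Two things are worth keeping in mind when reading these numbers. The pure-Python cell loops 100000000 times while the Cython and Numba cells loop 10000000 times, so the timings are not directly comparable. Also, results in the nanosecond range for ten million iterations suggest that the compilers reduced the loop to something much cheaper, although that is my reading and not something the notebook states. Outside of a notebook, the `%%timeit` measurement can be approximated with the standard `timeit` module; the sketch below uses a smaller, hypothetical iteration count `n` so that it finishes quickly.

```python
import timeit


def f(x, n=100000):
    """Pure-Python version of the benchmark loop; n is shortened for this sketch."""
    for _ in range(n):
        x = x + 1
    return x


# Run the call several times and keep the best wall-clock time.
repeats = 5
best = min(timeit.repeat(lambda: f(1), number=1, repeat=repeats))

print("result:", f(1))
print("best of %d runs: %.6f s" % (repeats, best))
```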
  23. 23. Binding with Cython (https://cython.org)
      In [25]:
          %%writefile simpleclass.pxd
          # distutils: language = c++
          cdef extern from "SimpleClass.hpp":
              cdef cppclass Simple:
                  Simple(int x)
                  int get()
      Overwriting simpleclass.pxd
  24. 24. In [26]:
          %%writefile cythonclass.pyx
          # distutils: language = c++
          from simpleclass cimport Simple as cSimple
          cdef class Simple:
              cdef cSimple *cself
              def __cinit__(self, int x):
                  self.cself = new cSimple(x)
              def get(self):
                  return self.cself.get()
              def __dealloc__(self):
                  del self.cself
      Overwriting cythonclass.pyx
  25. 25. In [27]:
          !cythonize cythonclass.pyx
      Compiling /eos/user/h/hschrein/SWAN_projects/pybindings_cc/cythonclass.pyx because it changed.
      [1/1] Cythonizing /eos/user/h/hschrein/SWAN_projects/pybindings_cc/cythonclass.pyx
      In [28]:
          !g++ cythonclass.cpp -shared {inc} -o cythonclass.so {plat}
      In [29]:
          import cythonclass
          x = cythonclass.Simple(3)
          x.get()
      Out[29]: 3
  26. 26. Pybind11 (http://pybind11.readthedocs.io/en/stable/)
      • Similar to Boost::Python, but easier to build
      • Pure C++11 (no new language required), no dependencies
      • Builds remain simple and don't require preprocessing
      • Easy to customize result
      • Great Gitter community
      • Used in GooFit 2.1+ (https://goofit.github.io) for CUDA too [CHEP talk] (https://indico.cern.ch/event/587955/contributions/2938087/)
      Downsides:
  27. 27. In [30]:
          %%writefile pybindclass.cpp
          #include <pybind11/pybind11.h>
          #include "SimpleClass.hpp"
          namespace py = pybind11;
          PYBIND11_MODULE(pybindclass, m) {
              py::class_<Simple>(m, "Simple")
                  .def(py::init<int>())
                  .def("get", &Simple::get)
              ;
          }
      Overwriting pybindclass.cpp
  28. 28. In [31]:
          !c++ -std=c++11 pybindclass.cpp -shared {inc} -o pybindclass.so {plat}
      In [32]:
          import pybindclass
          x = pybindclass.Simple(4)
          x.get()
      Out[32]: 4
  29. 29. CPPYY (http://cppyy.readthedocs.io/en/latest/)
      • Born from ROOT bindings
      • Built on top of Cling JIT, so can handle templates
      • See Enric's talk for more
      Downsides:
      • Header code runs in Cling
      • Heavy user requirements (Cling)
      • ROOT vs. pip version
      • Broken on SWAN (so will not show working example here)
      In [1]:
          import cppyy
  30. 30. In [2]:
          cppyy.include('SimpleClass.hpp')
          x = cppyy.gbl.Simple(5)
          x.get()
      ---------------------------------------------------------------------------
      AttributeError                            Traceback (most recent call last)
      <ipython-input-2-d0b91c309081> in <module>()
      ----> 1 cppyy.include('SimpleClass.hpp')
            2 x = cppyy.gbl.Simple(5)
            3 x.get()
      AttributeError: module 'cppyy' has no attribute 'include'
  31. 31. Continue to part 2
  32. 32. Binding detailed example: Minuit2. Let's try a non-trivial example: Minuit2 (6.14.0 standalone edition)
      Requirements
      • Minuit2 6.14.0 standalone edition (included)
      • Pybind11 (included)
      • NumPy
      • C++11 compatible compiler
      • CMake 3
      Expectations
      • Be able to minimize a very simple function and get some parameters
  33. 33. Step 1: Get source Download Minuit2 source (provided in minuit2src) Install Pybind11 or add as submodule (provided in pybind11)
  34. 34. Step 2: Plan interface
      • You should know what the C++ looks like, and know what you want the Python to look like
      • For now, let's replicate the C++ experience
      For example: a simple minimizer for f(x) = x^2 (should quickly find 0 as minimum):
      • Define FCN
      • Setup parameters
      • Minimize
      • Print result
      Will use print out for illustration (instead of MnPrint::SetLevel)
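The four planned steps can be tried in plain Python before touching Minuit2. The minimizer below is a finite-difference Newton iteration that I am using as a stand-in; it is not Migrad and it is not part of the talk. The function name `minimize_1d` and its parameters are hypothetical.

```python
def minimize_1d(fcn, start, step, max_iter=20, tol=1e-10):
    """Finite-difference Newton iteration (a stand-in, not Migrad)."""
    x = start
    for _ in range(max_iter):
        f0 = fcn([x])
        f_plus = fcn([x + step])
        f_minus = fcn([x - step])
        grad = (f_plus - f_minus) / (2.0 * step)        # central first derivative
        curv = (f_plus - 2.0 * f0 + f_minus) / step ** 2  # second derivative
        if curv <= 0.0:
            break  # not locally convex; this sketch simply stops
        new_x = x - grad / curv
        converged = abs(new_x - x) < tol
        x = new_x
        if converged:
            break
    return x


# 1. Define FCN: takes a list of parameters, like the C++ operator().
def fcn(v):
    return v[0] ** 2


# 2. Set up parameters (start value 1.0, step 0.1) and 3. minimize.
x_min = minimize_1d(fcn, start=1.0, step=0.1)

# 4. Print result.
print("x_min =", x_min, "f(x_min) =", fcn([x_min]))
```

Taking the parameters as a list mirrors the `std::vector<double>` argument of the C++ interface, which is the shape the Python binding will keep.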
  35. 35. In [1]:
          %%writefile SimpleFCN.h
          #pragma once
          #include <Minuit2/FCNBase.h>
          #include <Minuit2/FunctionMinimum.h>
          #include <Minuit2/MnPrint.h>
          #include <Minuit2/MnMigrad.h>
          using namespace ROOT::Minuit2;
          class SimpleFCN : public FCNBase {
              double Up() const override {return 0.5;}
              double operator()(const std::vector<double> &v) const override {
                  std::cout << "val = " << v.at(0) << std::endl;
                  return v.at(0)*v.at(0);
              }
          };
      Overwriting SimpleFCN.h
  36. 36. In [2]:
          %%writefile simpleminuit.cpp
          #include "SimpleFCN.h"
          int main() {
              SimpleFCN fcn;
              MnUserParameters upar;
              upar.Add("x", 1., 0.1);
              MnMigrad migrad(fcn, upar);
              FunctionMinimum min = migrad();
              std::cout << min << std::endl;
          }
      Overwriting simpleminuit.cpp
  37. 37. In [3]:
          %%writefile CMakeLists.txt
          cmake_minimum_required(VERSION 3.4)
          project(Minuit2SimpleExample LANGUAGES CXX)
          add_subdirectory(minuit2src)
          add_executable(simpleminuit simpleminuit.cpp SimpleFCN.h)
          target_link_libraries(simpleminuit PRIVATE Minuit2::Minuit2)
      Overwriting CMakeLists.txt
  38. 38. Standard CMake configure and build (using Ninja instead of Make for speed)
      In [4]:
          !cmake -GNinja .
          !cmake --build .
      -- Configuring done
      -- Generating done
      -- Build files have been written to: /eos/user/h/hschrein/SWAN_projects/pybindings_cc
      [2/2] Linking CXX executable simpleminuit
  39. 39. In [5]:
          !./simpleminuit
      val = 1
      val = 1.001
      val = 0.999
      val = 1.0006
      val = 0.999402
      val = -8.23008e-11
      val = 0.000345267
      val = -0.000345267
      val = -8.23008e-11
      val = 0.000345267
      val = -0.000345267
      val = 6.90533e-05
      val = -6.90535e-05
      Minuit did successfully converge.
      # of function calls: 13
      minimum function Value: 6.773427082119e-21
      minimum edm: 6.773427081817e-21
      minimum internal state vector: LAVector parameters: -8.230083281546e-11
      minimum internal covariance matrix: LASymMatrix parameters: 1
      # ext. || Name || type || Value || Error +/-
      0 || x || free || -8.230083281546e-11 ||0.7071067811865
  40. 40. Step 3: Bind parts we need
      • subclassable FCNBase
      • MnUserParameters (constructor and Add(string, double, double))
      • MnMigrad (constructor and operator())
      • FunctionMinimum (cout)
  41. 41. Recommended structure of a Pybind11 program
      main.cpp
      • Builds module
      • Avoids heavy includes (fast compile)
          #include <pybind11/pybind11.h>
          namespace py = pybind11;
          void init_part1(py::module &);
          void init_part2(py::module &);
          PYBIND11_MODULE(mymodule, m) {
              m.doc() = "Real code would never have such poor documentation...";
              init_part1(m);
              init_part2(m);
          }
  42. 42. In [6]:
          mkdir -p pyminuit2
      In [7]:
          %%writefile pyminuit2/pyminuit2.cpp
          #include <pybind11/pybind11.h>
          namespace py = pybind11;
          void init_FCNBase(py::module &);
          void init_MnUserParameters(py::module &);
          void init_MnMigrad(py::module &);
          void init_FunctionMinimum(py::module &);
          PYBIND11_MODULE(minuit2, m) {
              init_FCNBase(m);
              init_MnUserParameters(m);
              init_MnMigrad(m);
              init_FunctionMinimum(m);
          }
      Overwriting pyminuit2/pyminuit2.cpp
  43. 43. We will put all headers in a collective header (not a good idea unless you are trying to show files one per slide).
      In [8]:
          %%writefile pyminuit2/PyHeader.h
          #pragma once
          #include <pybind11/pybind11.h>
          #include <pybind11/functional.h>
          #include <pybind11/numpy.h>
          #include <pybind11/stl.h>
          #include <Minuit2/FCNBase.h>
          #include <Minuit2/MnMigrad.h>
          #include <Minuit2/MnApplication.h>
          #include <Minuit2/MnUserParameters.h>
          #include <Minuit2/FunctionMinimum.h>
          namespace py = pybind11;
          using namespace pybind11::literals;
          using namespace ROOT::Minuit2;
      Overwriting pyminuit2/PyHeader.h
  44. 44. Overloads
      • A class with pure virtual methods cannot be instantiated in C++
      • We have to provide a "trampoline class" so that a Python class can supply those methods
      In [9]:
          %%writefile pyminuit2/FCNBase.cpp
          #include "PyHeader.h"
          class PyFCNBase : public FCNBase {
          public:
              using FCNBase::FCNBase;
              double operator()(const std::vector<double> &v) const override {
                  PYBIND11_OVERLOAD_PURE_NAME(
                      double, FCNBase, "__call__", operator(), v);}
              double Up() const override {
                  PYBIND11_OVERLOAD_PURE(double, FCNBase, Up, );}
          };
          void init_FCNBase(py::module &m) {
              py::class_<FCNBase, PyFCNBase>(m, "FCNBase")
                  .def(py::init<>())
                  .def("__call__", &FCNBase::operator())
                  .def("Up", &FCNBase::Up);
          }
      Overwriting pyminuit2/FCNBase.cpp
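From the Python side, the trampoline makes `FCNBase` behave much like an abstract base class: the methods must be supplied by a subclass before the object is useful. The sketch below shows the same idea using only the standard `abc` module. The class names ending in `Sketch` are hypothetical stand-ins, and the exact error behaviour of the real pybind11 class may differ from what `abc` does here.

```python
from abc import ABC, abstractmethod


class FCNBaseSketch(ABC):
    """Pure-Python stand-in for the bound FCNBase (illustration only)."""

    @abstractmethod
    def __call__(self, v):
        """Return the function value for the parameter list v."""

    @abstractmethod
    def Up(self):
        """Return the error definition."""


class SimpleFCNSketch(FCNBaseSketch):
    def Up(self):
        return 0.5

    def __call__(self, v):
        return v[0] ** 2


# The abstract base cannot be instantiated, like a C++ class with pure virtuals.
try:
    FCNBaseSketch()
except TypeError as err:
    print("cannot instantiate:", err)

# The subclass provides both methods, so it can be created and called.
fcn = SimpleFCNSketch()
print(fcn([3.0]), fcn.Up())  # 9.0 0.5
```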
  45. 45. Overloaded function signatures: an overloaded member function has to be disambiguated before it can be bound.
      • C++11 syntax: (bool (MnUserParameters::*)(const std::string &, double)) &MnUserParameters::Add
      • C++14 syntax: py::overload_cast<const std::string &, double>(&MnUserParameters::Add)
      In [10]:
          %%writefile pyminuit2/MnUserParameters.cpp
          #include "PyHeader.h"
          void init_MnUserParameters(py::module &m) {
              py::class_<MnUserParameters>(m, "MnUserParameters")
                  .def(py::init<>())
                  .def("Add", (bool (MnUserParameters::*)(const std::string &, double)) &MnUserParameters::Add)
                  .def("Add", (bool (MnUserParameters::*)(const std::string &, double, double)) &MnUserParameters::Add)
              ;
          }
      Overwriting pyminuit2/MnUserParameters.cpp
  46. 46. Adding default arguments (and named arguments)
      • Using ""_a literal, names and even defaults can be added
      In [11]:
          %%writefile pyminuit2/MnMigrad.cpp
          #include "PyHeader.h"
          void init_MnMigrad(py::module &m) {
              py::class_<MnApplication>(m, "MnApplication")
                  .def("__call__",
                       &MnApplication::operator(),
                       "Minimize the function, returns a function minimum",
                       "maxfcn"_a = 0,
                       "tolerance"_a = 0.1);
              py::class_<MnMigrad, MnApplication>(m, "MnMigrad")
                  .def(py::init<const FCNBase &, const MnUserParameters &, unsigned int>(),
                       "fcn"_a, "par"_a, "stra"_a = 1)
              ;
          }
      Overwriting pyminuit2/MnMigrad.cpp
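The effect of the `_a` literals is easiest to see from the Python side: the bound `__call__` should accept `maxfcn` and `tolerance` by name, with the defaults given above. The class below is a hypothetical pure-Python stand-in with that signature; it does no minimization and only echoes its arguments.

```python
import inspect


class MnApplicationSketch(object):
    """Stand-in showing the Python-visible signature (illustration only)."""

    def __call__(self, maxfcn=0, tolerance=0.1):
        # A real binding would run the minimizer; this sketch echoes the inputs.
        return ("minimize", maxfcn, tolerance)


app = MnApplicationSketch()

print(inspect.signature(app.__call__))  # (maxfcn=0, tolerance=0.1)
print(app())                            # ('minimize', 0, 0.1)
print(app(tolerance=0.5))               # ('minimize', 0, 0.5)
```

Without the named arguments, a caller would have to pass both values positionally and could not skip `maxfcn` to change only `tolerance`.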
  47. 47. Lambda functions
      • Pybind11 accepts lambda functions, as well
      In [12]:
          %%writefile pyminuit2/FunctionMinimum.cpp
          #include "PyHeader.h"
          #include <sstream>
          #include <Minuit2/MnPrint.h>
          void init_FunctionMinimum(py::module &m) {
              py::class_<FunctionMinimum>(m, "FunctionMinimum")
                  .def("__str__", [](const FunctionMinimum &self) {
                      std::stringstream os;
                      os << self;
                      return os.str();
                  })
              ;
          }
      Overwriting pyminuit2/FunctionMinimum.cpp
  48. 48. In [13]:
          %%writefile CMakeLists.txt
          cmake_minimum_required(VERSION 3.4)
          project(Minuit2SimpleExample LANGUAGES CXX)
          set(CMAKE_POSITION_INDEPENDENT_CODE ON)
          add_subdirectory(minuit2src)
          add_executable(simpleminuit simpleminuit.cpp SimpleFCN.h)
          target_link_libraries(simpleminuit PRIVATE Minuit2::Minuit2)
          add_subdirectory(pybind11)
          file(GLOB OUTPUT pyminuit2/*.cpp)
          pybind11_add_module(minuit2 ${OUTPUT})
          target_link_libraries(minuit2 PUBLIC Minuit2::Minuit2)
      Overwriting CMakeLists.txt
  49. 49. In [14]:
          !cmake .
          !cmake --build .
      -- pybind11 v2.2.3
      -- Configuring done
      -- Generating done
      -- Build files have been written to: /eos/user/h/hschrein/SWAN_projects/pybindings_cc
      [85/85] Linking CXX shared module minuit2.cpython-36m-x86_64-linux-gnu.so
  50. 50. Usage: We can now use our module! (Built in the current directory by CMake)
      In [15]:
          import sys
          if '.' not in sys.path:
              sys.path.append('.')
          import minuit2
      In [16]:
          class SimpleFCN (minuit2.FCNBase):
              def Up(self):
                  return 0.5
              def __call__(self, v):
                  print("val =", v[0])
                  return v[0]**2
  51. 51. In [17]:
          fcn = SimpleFCN()
          upar = minuit2.MnUserParameters()
          upar.Add("x", 1., 0.1)
          migrad = minuit2.MnMigrad(fcn, upar)
          min = migrad()
      val = 1.0
      val = 1.001
      val = 0.999
      val = 1.0005980198587356
      val = 0.9994019801412644
      val = -8.230083281546285e-11
      val = 0.00034526688527999595
      val = -0.0003452670498816616
      val = -8.230083281546285e-11
      val = 0.00034526688527999595
      val = -0.0003452670498816616
      val = 6.905331121533294e-05
      val = -6.905347581699857e-05
  52. 52. In [18]:
          print(min)
      Minuit did successfully converge.
      # of function calls: 13
      minimum function Value: 6.773427082119e-21
      minimum edm: 6.773427081817e-21
      minimum internal state vector: LAVector parameters: -8.230083281546e-11
      minimum internal covariance matrix: LASymMatrix parameters: 1
      # ext. || Name || type || Value || Error +/-
      0 || x || free || -8.230083281546e-11 ||0.7071067811865
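The reported error of 0.7071067811865 can be cross-checked by hand. This sketch assumes the usual Minuit meaning of `Up`: the amount by which the function rises above its minimum at one standard error. For a parabola with second derivative f'' that gives an error of sqrt(2 * Up / f''), and for f(x) = x^2 with Up = 0.5 this is sqrt(0.5).

```python
import math


def f(x):
    return x * x


up = 0.5                 # value returned by SimpleFCN.Up()
second_derivative = 2.0  # d^2/dx^2 of x^2

# Error definition: f(x_min + error) - f(x_min) == up, with x_min = 0 here.
error = math.sqrt(2.0 * up / second_derivative)

print(error)             # close to the 0.7071067811865 printed by Minuit2
print(f(error) - f(0.0)) # close to 0.5, the value of up
```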
  53. 53. Done
      • See GooFit's built in Minuit2 bindings (https://github.com/GooFit/GooFit/tree/master/python/Minuit2) for a more complete example
      • Pybind11 bindings can talk to each other at the C level!
      Overall topics covered:
      • ctypes, CFFI: Pure Python, C only
      • CPython: How all bindings work
      • SWIG: Multi-language, automatic
      • Cython: New language
      • Pybind11: Pure C++11
      • CPPYY: From ROOT's JIT engine
      • An advanced binding in Pybind11
  54. 54. Backup: This is the setup.py file for the Minuit2 bindings. With this, you can use the standard Python tools to build! (but slower and more verbose than CMake)
      In [19]:
          %%writefile setup.py
          from setuptools import setup, Extension
          from setuptools.command.build_ext import build_ext
          import sys
          import setuptools
          from pathlib import Path  # Python 3 or Python 2 backport: pathlib2
          import pybind11  # Real code should defer this import
  55. 55.     sources = set(str(p) for p in Path('Minuit2-6.14.0-Source/src').glob('**/*.cxx'))
          sources.remove('Minuit2-6.14.0-Source/src/TMinuit2TraceObject.cxx')
          ## Add your sources to `sources`
          sources |= set(str(p) for p in Path('pyminuit2').glob('*.cpp'))
          ext_modules = [
              Extension(
                  'minuit2',
                  list(sources),
                  include_dirs=[
                      pybind11.get_include(False),
                      pybind11.get_include(True),
                      'Minuit2-6.14.0-Source/inc',
                  ],
                  language='c++',
                  define_macros=[('WARNINGMSG', None),
                                 ('MATH_NO_PLUGIN_MANAGER', None),
                                 ('ROOT_Math_VecTypes', None)
                                 ],
              ),
          ]
          class BuildExt(build_ext):
              """A custom build extension for adding compiler-specific options."""
              c_opts = {
                  'msvc': ['/EHsc'],
                  'unix': [],
              }
              if sys.platform == 'darwin':
                  c_opts['unix'] += ['-stdlib=libc++', '-mmacosx-version-min=10.7']
              def build_extensions(self):
                  ct = self.compiler.compiler_type
                  opts = self.c_opts.get(ct, [])
                  if ct == 'unix':
  56. 56.                 opts.append('-DVERSION_INFO="%s"' % self.distribution.get_version())
                      opts.append('-std=c++14')
                      opts.append('-fvisibility=hidden')
                  elif ct == 'msvc':
                      opts.append('/DVERSION_INFO="%s"' % self.distribution.get_version())
                  for ext in self.extensions:
                      ext.extra_compile_args = opts
                  build_ext.build_extensions(self)
          setup(
              name='minuit2',
              version='6.14.0',
              author='Henry Schreiner',
              author_email='hschrein@cern.ch',
              url='https://github.com/GooFit/Minuit2',
              description='A Pybind11 Minuit2 binding',
              long_description='',
              ext_modules=ext_modules,
              install_requires=['pybind11>=2.2', 'numpy>=1.10'],
              cmdclass={'build_ext': BuildExt},
              zip_safe=False,
          )
      Overwriting setup.py
  57. 57. In [20]: #!python setup.py build_ext
