Hybrid programming with C++ and Python. PyConZA 2012 - Cape Town. James Saunders - BusinessOptics
PyConZA 2012 - Hybrid programming in C++ and Python

Slides from the PyConZA 2012 talk entitled "Hybrid Programming with C++ and Python"

"We run a fairly complicated stack that includes a C++ computation library, all served from a Django-based web server. The process of wrapping a complex C++ library for easy use in Python turned out to be both intricate and interesting. We not only use C++ code from Python but also use Python libraries from C++. What I really want to talk about is the basics of how we did it, from how we started trying to do it (which was a monolithic nightmare) to our current system, which is fully automated and uses our own Python DSL on top of Py++ (a C++ wrapping code generator) on top of Boost.Python on top of our code. The talk should be accessible for people without a great understanding of either Python or C++, but will have enough interesting subtleties for more advanced users to also learn something. In the end we will also briefly talk about alternative solutions and how we might have done things differently if we started again.

Python is a great language, but sometimes you need to use a lower-level library, such as a C++ library. One of the great things about Python is that you can. However, if you are trying to wrap (expose to Python) a fairly complex library, it can be a lot trickier. We faced exactly this problem and will share our solution as well as all the little tricks and gotchas along the way. We will focus on the broad principles of hybrid coding (using multiple languages), the technologies we used to communicate between the languages, the systems we built to automate the process, as well as some of the broad lessons and reflections on what we could have done differently and how it has made us think differently about development. By the end of the talk the audience should have a reasonably practical understanding of how to wrap and use a complex C++ library in Python and vice versa."


  1. Hybrid programming with C++ and Python
     PyConZA 2012 - Cape Town
     James Saunders - BusinessOptics
  2. Overview
     ● Talk should be:
       ○ Not super technical
       ○ Some C++ knowledge required
       ○ No advanced understanding of Python required
     ● Won't dive too deeply into specific code
     ● Will try to be pragmatic
     ● Not exhaustive
     ● Should give an idea of how to expose a complex C++ library to Python and vice versa
  3. Some Provisos
     I am talking about CPython (not applicable to other Python implementations)
     We use Linux (though most things are cross-platform)
     We compile our code with GCC (which shouldn't matter but probably does sometimes)
     Mileage may vary (like always)
  4. Why would you want to do this?
     ● Write performance intensive code in C++, but use Python for everything else
     ● Use cool libraries in C++/Python from Python/C++
     ● Embed a scripting language
     ● Easily extend complex behaviours in Python
     ● Interact with legacy code (especially when webifying something)
  5. Our Problem
     [Diagram with three components:]
     ● Django web stack
     ● C++ computation engine
     ● Machine learning in scikit-learn
  6. Using C++ in Python (extending)
     C++:
         class FortuneTeller {
         public:
             FortuneTeller(int luckiness);
             vector<int> get_lottery();
         };
     Python:
         from mycpplib import FortuneTeller

         obj = FortuneTeller(5)
         lucky_numbers = obj.get_lottery()
         for num in lucky_numbers:
             print(num)
  7. The Fundamentals
     ● Python extension modules are shared libraries
       ○ *.so on Linux
       ○ *.pyd on Windows (a DLL under another extension)
       ○ I know nothing about Macs
     ● CPython is written in C and has "native" support for being extended in C
     ● Accessed through Python.h
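The "extension modules are shared libraries" point can be checked from the Python side. The sketch below uses only the standard library to list the filename suffixes the running CPython accepts for compiled modules; the helper name `extension_suffixes` is made up for illustration and is not from the talk.

```python
import importlib.machinery


def extension_suffixes():
    """Return the filename suffixes this interpreter accepts for extension modules."""
    # On Linux builds the list typically includes ".so"; on Windows it
    # includes ".pyd". The exact entries depend on the interpreter build.
    return list(importlib.machinery.EXTENSION_SUFFIXES)


if __name__ == "__main__":
    for suffix in extension_suffixes():
        print(suffix)
```

A wrapped library such as `mycpplib` has to be compiled to a file whose name ends in one of these suffixes before `import mycpplib` can find it.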
  8. Python.h
     BLEHG!
     ● Low level
     ● C based
     ● Lots of boilerplate

         static PyObject *my_callback = NULL;

         static PyObject *
         my_set_callback(PyObject *dummy, PyObject *args)
         {
             PyObject *result = NULL;
             PyObject *temp;

             if (PyArg_ParseTuple(args, "O:set_callback", &temp)) {
                 if (!PyCallable_Check(temp)) {
                     PyErr_SetString(PyExc_TypeError, "parameter must be callable");
                     return NULL;
                 }
                 Py_XINCREF(temp);         /* Add a reference to new callback */
                 Py_XDECREF(my_callback);  /* Dispose of previous callback */
                 my_callback = temp;       /* Remember new callback */
                 /* Boilerplate to return "None" */
                 Py_INCREF(Py_None);
                 result = Py_None;
             }
             return result;
         }
  9. boost::python
     ● Higher level pure C++
       ○ No silly IDL
     ● Works nicely with the rest of boost (Awesome C++ libraries for everything)
     ● Takes care of lots of details for you
  10. boost::python example

          #include <boost/python.hpp>
          using namespace boost::python;

          BOOST_PYTHON_MODULE(mycpplib)
          {
              // init<int>() exposes the FortuneTeller(int luckiness) constructor
              class_<FortuneTeller>("FortuneTeller", init<int>())
                  .def("get_lottery", &FortuneTeller::get_lottery);
          }

      But when you have a big library this is still pretty boring
  11. Py++ and automated wrapper generation
      ● Python package
      ● Takes in C++ header files
      ● Generates boost::python code (C++) to wrap the given header files.
      ● Pretty comprehensive
      ● Reasonably stable (v1.0.0)
      ● Not very active (New maintainer anyone?)
      ● Stack Overflow is your friend
  12. Basic Strategy
      [Diagram:] C++ header files (*.hpp) -> Py++ script -> C++ wrapping source files (*.cpp) -> Compile -> Python extension module (*.so)
  13. Our first attempt (The horror)
      ● 2500 line monolithic C++ module
      ● Took forever to compile
      ● Had to be hand tweaked to get it to compile
      ● Changes involved generating a new version and copying over sections of code from the old version
      ● Terrifying to make changes to the C++
      ● Massively slowed down development
      Worst thing ever!
  14. Making things actually work
      Find a better solution
      or
      Hire a full time trauma counselor for the dev team
  15. Write a little (sort of) DSL
      ● Declarative
      ● Abstract the lower-level Py++ methods
      ● Explicit and clear
      ● Basically a couple of Python functions
        ○ def limit_class(classname, function_names): ...
        ○ def exclude_class(classname): ...
      ● Clear process to add and change wrappings
      ● Leave full capabilities of Py++ when needed
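A minimal sketch of what such a declarative layer might look like. Only the two function signatures come from the slide; the rule storage, the `is_exposed` helper and the `InternalCache` class name are assumptions made for illustration, and the translation of the rules into Py++ calls is deliberately left out.

```python
# Hypothetical sketch of a declarative wrapping DSL. It only records rules in
# plain Python data; a real version would translate them into Py++ calls.

EXPOSURE_RULES = {}       # class name -> set of member function names to expose
EXCLUDED_CLASSES = set()  # class names that must not be wrapped at all


def limit_class(classname, function_names):
    """Expose `classname`, but only the listed member functions."""
    EXCLUDED_CLASSES.discard(classname)
    EXPOSURE_RULES[classname] = set(function_names)


def exclude_class(classname):
    """Do not expose `classname` at all."""
    EXPOSURE_RULES.pop(classname, None)
    EXCLUDED_CLASSES.add(classname)


def is_exposed(classname, function_name):
    """Decide whether a given member function should be wrapped."""
    if classname in EXCLUDED_CLASSES:
        return False
    # Anything not mentioned stays hidden: exposure is opt-in.
    return function_name in EXPOSURE_RULES.get(classname, set())


# The "configuration file" is then just a list of declarations.
limit_class("FortuneTeller", ["get_lottery"])
exclude_class("InternalCache")
```

Keeping the rules as plain data makes the wrapped public interface easy to review in one place, which is the property the slide is after.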
  16. Expose only what you need
      ● By default Py++ will expose every class and all public/protected methods
      ● This means even small changes to the C++ can mess with the wrappings
      ● Explicitly choose what to expose to Python
        ○ A public interface
      ● Makes changes easier to reason about
      ● Limits unexpected changes in the Python interface (use of which is not statically type checked)
  17. Convert where you can
      ● Sometimes it's easier to automatically convert between C++ types and Python types
        ○ Some types are just too difficult to wrap
        ○ Often types have natural Python analogs
      ● Done for many built-in types, e.g. strings
      ● Can set automatic converters
        ○ From C++ return values to Python values
        ○ From Python arguments to C++ arguments
      ● Consider performance
      ● See http://misspent.wordpress.com/2009/09/27/how-to-write-boost-python-converters/
  18. Conversion Examples
      Worked well: Python datetime to boost::posix_time
      ● Lots of utilities in Python for datetime
      ● boost::posix_time is super hard to wrap
      Failed: Python set to C++ set
      ● Python sets are hashsets, C++ sets are trees
      ● Different semantics (ordering), subtle errors
      ● Expensive to convert
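The ordering mismatch behind the failed set conversion can be illustrated in pure Python. This is a sketch, not the converter from the talk: `to_cpp_set_order` is a made-up helper that mimics how a `std::set` of plain ASCII strings would iterate (sorted), next to the hash-based order of a Python set.

```python
def to_cpp_set_order(py_set):
    """Return the elements in the order a C++ std::set would iterate them.

    std::set keeps its elements sorted, so mimicking it from a Python set
    needs an explicit O(n log n) sort on every conversion.
    """
    return sorted(py_set)


names = {"pear", "apple", "fig"}

# Iteration order of a Python set depends on hashing and is not sorted in
# general (for strings it can even differ between interpreter runs).
python_order = list(names)

# A std::set of these strings would always iterate in sorted order.
cpp_order = to_cpp_set_order(names)

print(python_order)
print(cpp_order)
```

Code that relies on one ordering silently misbehaves when handed the other, and the sort shows where part of the conversion cost comes from.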
  19. Use the preprocessor
      ● The C preprocessor is fast
      ● gccxml (which powers Py++) is slow
      ● Use it to aggregate all the headers you need into one header file (all.hpp)
      ● Makes things way faster

      What is the C preprocessor?
      It is the thing that interprets statements like these:
          #include "fileA.hpp"
      or
          #ifndef FILEA
          #define FILEA
          ...
          #endif
      It is run on your C++ source files before compilation.
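One way to produce the aggregate header is a few lines of Python in the build. The sketch below assumes the list of headers is already known; `build_aggregate_header`, the guard name and the file names are illustrative and not taken from the talk.

```python
def build_aggregate_header(header_names, guard="ALL_HPP"):
    """Return the text of a single header that includes every given header."""
    lines = ["#ifndef " + guard, "#define " + guard, ""]
    for name in header_names:
        lines.append('#include "%s"' % name)
    lines += ["", "#endif  // " + guard, ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical header names; a real script would collect them from the
    # source tree and write the result to all.hpp.
    print(build_aggregate_header(["fileA.hpp", "fileB.hpp"]))
```

The preprocessor then expands this one file, so the slow parser only has to be started once.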
  20. Custom wrapper functions
      ● Sometimes functions just don't wrap nicely
        ○ e.g. when they take a vector<Something> and you want to pass a regular Python list to them
      ● Write some custom code that does the marshalling between the types you want to work with in Python and the types in C++
      ● Inject this into the wrappers (Py++ allows you to do this)
      ● !!! Don't do this by hand on the generated files
      ● Can make wrapped objects more pythonic
  21. Custom wrapper example
      To wrap a method:
          int my_sum_method(vector<int> numbers)
      to take a Python list:

          // Same name as underlying function, uses overloading.
          // Python list as a parameter.
          int my_sum_method(bp::list& pylist_numbers) {
              ::std::vector<int> vector_numbers;
              // Extract contents of Python list and place it in vector
              for (int i = 0; i < len(pylist_numbers); ++i) {
                  int number = bp::extract<int>(pylist_numbers[i]);
                  vector_numbers.push_back(number);
              }
              // Call original method
              return my_sum_method(vector_numbers);
          }
  22. Call policies
      ● Methods can return objects that have to be treated in different ways.
        ○ Objects by value
        ○ Raw pointers
        ○ References
      ● Sometimes Py++ can figure out what to do, sometimes you need to help it.
      ● You can set the call policy for a method in Py++, e.g.:
          myObj.member_functions("get_child").call_policies = call_policies.return_internal_reference()
  23. The GIL
      ● The Global Interpreter Lock ensures only one Python instruction runs at one time.
      ● "the thing in CPython that prevents multiple threads from actually running in your Python code in parallel." -- PyPy Blog
      ● But it treats a call out to a C/C++ routine as a single atomic operation
      ● Bad if your methods are long running. Locks up all other threads.
  24. Releasing the GIL
      ● You can release the GIL to allow other threads to run.
      ● But then you have to reacquire it when your method ends
      ● Don't screw this up
        ○ Think about exceptions
      ● Useful macros
      Diagram (top to bottom):
        Python code
        Your long running C++ method
          Py_BEGIN_ALLOW_THREADS
          Your C++ code
          Py_END_ALLOW_THREADS
        Python code
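The effect of releasing the GIL can be observed without writing any C++. In the sketch below, `time.sleep` stands in for a long running extension method that releases the GIL (it is implemented in C and does so while waiting); `count_ticks_during` is a made-up helper that counts how often a pure-Python background thread gets to run in the meantime.

```python
import threading
import time


def count_ticks_during(blocking_call):
    """Run `blocking_call` while a background thread counts in pure Python."""
    ticks = [0]
    stop = threading.Event()

    def ticker():
        # Each increment needs the GIL, so it can only happen while the
        # main thread is not holding it.
        while not stop.is_set():
            ticks[0] += 1
            time.sleep(0.001)

    worker = threading.Thread(target=ticker)
    worker.start()
    try:
        blocking_call()
    finally:
        stop.set()
        worker.join()
    return ticks[0]


# time.sleep releases the GIL while it waits, so the ticker keeps running.
ticks = count_ticks_during(lambda: time.sleep(0.2))
print("ticks while blocked in C:", ticks)
```

A wrapped C++ method that never releases the GIL would leave the ticker starved for the whole call, which is the "locks up all other threads" problem from the previous slide.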
  25. Write your code with wrapping in mind
      ● Sometimes you have to change the way you write code
      ● Should try to avoid this but BE PRAGMATIC
      ● Some constructs do not translate easily
      ● Don't use exotic structures (unions, weird memory maps, etc.)
      ● Return types that are easily wrapped or converted (have natural analogs)
      ● Keep your code simple
  26. Debugging through the layers
      ● Wrapped code can be hard to debug
      ● You can run Python under GDB
      ● Step through the Python VM and eventually into your own extension module
      ● Takes some setting up but works very nicely
      ● Worth doing!
      ● Check out the Stripe blog: https://stripe.com/blog/exploring-python-using-gdb
  27. Automate everything
      ● Customisation, wrapper generation and compilation should all be automated
      ● Use a decent build system (SCons, CMake)
      ● Py++ won't regenerate files that haven't changed, works well with Make
      ● Don't check generated code into your source control system (git, bzr, hg, svn)
        ○ Make the generation easily reproducible
      ● Don't let anything slow your team down
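Skipping unchanged files matters because Make-style tools decide what to recompile from file modification times. The sketch below shows the general idea only; it is not Py++'s implementation, and `write_if_changed` and the file name are assumptions for illustration.

```python
import os
import tempfile


def write_if_changed(path, new_text):
    """Write `new_text` to `path` only when the content differs.

    Returns True if the file was (re)written. Leaving unchanged files alone
    keeps their modification time, so Make-style tools skip recompiling them.
    """
    if os.path.exists(path):
        with open(path) as existing:
            if existing.read() == new_text:
                return False
    with open(path, "w") as out:
        out.write(new_text)
    return True


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "wrapper.cpp")
    print(write_if_changed(target, "// generated\n"))  # first write
    print(write_if_changed(target, "// generated\n"))  # identical content
```

With a generator that behaves like this, touching one C++ header only triggers recompilation of the wrapper files whose generated text actually changed.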
  28. The final system
      [Diagram, driven by CMake (build system) and Makefiles:]
      C++ header files (*.hpp) -> Single header file (all.hpp) -> Wrapping DSL script (Py++) -> C++ wrapping source files (*.cpp) -> Compile -> Python extension module (*.so)
  29. The end result
      ● Single simple configuration file
      ● Completely automated generation and compilation
      ● Speedy compilation
      ● Easy to update
  30. Using Python from C++ (Embedding)
      [Diagram:] C++ computation engine, machine learning in scikit-learn
  31. Embedding vs Passing in objects
      ● Two ways to go about it
        ○ Embed an interpreter in your C++, run a script and use the results.
        ○ Pass Python objects to your C++ (Extension Module) and do something with them.
      ● If your code is already an extension module the latter is easier to reason about (IMHO)
  32. boost::python again
      ● Makes it easy to use Python objects in C++, kinda feels like Python
      ● Has high level objects that mimic Python objects
        ○ bp::object, bp::list, etc.
      ● No need to touch PyObject*

      Python:
          def f(x, y):
              if (y == 'foo'):
                  x[3:7] = 'bar'
              else:
                  x.items += 3
              return x

      C++ using boost::python:
          object f(object x, object y) {
              if (y == "foo")
                  x.slice(3,7) = "bar";
              else
                  x.attr("items") += 3;
              return x;
          }
  33. Calling methods
      Simple ways to call methods and get C++ types back.

          string id_number = "7803705112068";
          object y = x.attr("get_name")(id_number);
          string name = extract<string>(y);

      or automatically do the type conversion

          bp::call_method<string>(x, "get_name", id_number);

      Pretty simple hey?
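For readers more at home in Python, the C++ calls above correspond roughly to the following. The `Registry` class and the name it returns are invented stand-ins for whatever object `x` is passed into the C++ code; only `get_name` and the ID string come from the slide.

```python
class Registry(object):
    """Hypothetical stand-in for the Python object `x` handed to the C++ code."""

    def __init__(self, names_by_id):
        self._names_by_id = names_by_id

    def get_name(self, id_number):
        return self._names_by_id[id_number]


x = Registry({"7803705112068": "Example Name"})
id_number = "7803705112068"

# C++: object y = x.attr("get_name")(id_number);
y = getattr(x, "get_name")(id_number)

# C++: string name = extract<string>(y);
# extract<> checks and converts the type; str() is only a rough analog here.
name = str(y)
print(name)
```

The attribute lookup happens at run time in both languages, so a misspelled method name only fails when the call is made, not at compile time.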
  34. The GIL again
      If you run Python code from C++, make sure you still have the GIL acquired.
      Diagram (top to bottom):
        Python code
        Your C++ method
          Your C++ code
          Python code
          Your C++ code
        Python code
  35. The GIL again, fixed
      ● More macros
      Diagram (top to bottom):
        Python code
        Your long running C++ method
          Py_BEGIN_ALLOW_THREADS
          Your C++ code
          Py_BLOCK_THREADS
          Python code
          Py_UNBLOCK_THREADS
          Your C++ code
          Py_END_ALLOW_THREADS
        Python code
  36. Lessons and thoughts
  37. The sliding scale between Python and C++
      What's right for you?
      Python:
        ● Speed of development
        ● Elegance
        ● joie de vivre
        Actually pretty fast
      C++:
        ● Performance
        ● Huge amount of existing code
        Not that bad to use
  38. If you are gonna do it, do it right
      ● Done wrong, wrappings are a nightmare
      ● Done right, they can be quite manageable
      ● Is the extra performance worth the development overhead?
      ● If you are writing the C++, start the wrapping process early
  39. Alternatives
      ● Just use libraries: NumPy, etc.
      ● PyPy (cppyy)
      ● Cython
      ● Weave
      ● SWIG
      ● ctypes
  40. Contact
      James Saunders
      james@businessoptics.biz
      @james.h.saunders
      http://blog.jamessaunders.co.za/
      www.businessoptics.biz
      techblog.businessoptics.biz