Take advantage of C++
from Python
Yung-Yu Chen
PyCon Kyushu
30th June 2018
Why C++
❖ Python is slow
❖ Everything is on heap
❖ Always dynamic types
❖ Hard to access assembly
❖ Convoluted algorithms with ndarray
❖ Access external code written in any language
❖ Detail control and abstraction
Hard problems take time
• Supersonic jet in cross flow; density contour
• 264 cores with 53 hours for 1.3 B variables (66 M elements) by 12,000 time steps
• At OSC, 2011 (10 Gbps InfiniBand)
HPC (high-performance computing) is hard. Physics is harder. Don’t mingle.
Best of both worlds
❖ C++: fast runtime, strong static type checking, industrial grade
❖ Slow to code
❖ Python: fast prototyping, batteries included, easy to use
❖ Slow to run
❖ Hybrid system is everywhere.
❖ TensorFlow, Blender, OpenCV, etc.
❖ C++ crunches numbers. Python controls the flow.
❖ Applications work like libraries, libraries like applications.
pybind11
❖ https://github.com/pybind/pybind11: C++11
❖ Expose C++ entities to Python
❖ Use Python from C++
❖ list, tuple, dict, and str
❖ handle, object, and none
C++11(/14/17/20)
New language features: auto and decltype, defaulted and deleted
functions, final and override, trailing return type, rvalue references,
move constructors/move assignment, scoped enums, constexpr and
literal types, list initialization, delegating and inherited constructors,
brace-or-equal initializers, nullptr, long long, char16_t and char32_t,
type aliases, variadic templates, generalized unions, generalized
PODs, Unicode string literals, user-defined literals, attributes,
lambda expressions, noexcept, alignof and alignas, multithreaded
memory model, thread-local storage, GC interface, range for (based
on a Boost library), static assertions (based on a Boost library)
http://en.cppreference.com/w/cpp/language/history
Python’s friends
❖ Shared pointer: manage resource ownership between
C++ and Python
❖ Move semantics: speed
❖ Lambda expression: ease the wrapping code
Ownership
❖ All Python objects are dynamically allocated on the
heap. Python uses reference counting to know who
should deallocate the object when it is no longer used.
❖ An owner of a reference to an object is responsible for
deallocating the object. With multiple owners, the last
owner (at which point the reference count is 1) calls the
destructor and deallocates the object. Other owners simply
decrement the count by 1.
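The counting rule above can be mimicked in plain C++ with std::shared_ptr, whose use_count() plays a role similar to the Python reference count. This is only a rough sketch of the idea: the names Resource and ownership_demo are made up for illustration, and the analogy to CPython is approximate.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical resource that records whether its destructor has run.
struct Resource {
    explicit Resource(bool & destroyed) : m_destroyed(destroyed) { m_destroyed = false; }
    ~Resource() { m_destroyed = true; }
    bool & m_destroyed;
};

// Returns the owner counts observed at each step. On return, `destroyed`
// tells whether the last owner freed the resource.
std::vector<long> ownership_demo(bool & destroyed) {
    std::vector<long> counts;
    auto first = std::make_shared<Resource>(destroyed);
    counts.push_back(first.use_count());           // one owner
    {
        std::shared_ptr<Resource> second = first;  // a second owner appears
        counts.push_back(first.use_count());       // two owners
    }   // `second` goes away: it only decrements the count
    counts.push_back(first.use_count());           // back to one owner
    assert(!destroyed);                            // the object is still alive
    first.reset();                                 // the last owner releases: destructor runs
    return counts;
}
```

Calling ownership_demo() should report the counts 1, 2, 1 and leave the flag set, which matches the "last owner deallocates" rule.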
Shared pointer
#include <memory>
#include <vector>
#include <numeric> // std::accumulate
#include <iostream>
class Series {
std::vector<int> m_data;
public:
int sum() const {
const int ret = std::accumulate(
m_data.begin(), m_data.end(), 0);
std::cout << "Series::sum() = " << ret << std::endl;
return ret;
}
static size_t count;
Series(size_t size, int lead) : m_data(size) {
for (size_t it=0; it<size; it++) { m_data[it] = lead+it; }
count++;
}
~Series() { count--; }
};
size_t Series::count = 0;
void use_raw_pointer() {
Series * series_ptr = new Series(10, 2);
series_ptr->sum(); // call member function
// OUT: Series::sum() = 65
// remember to delete the object or we leak memory
std::cout << "before explicit deletion, Series::count = "
<< Series::count << std::endl;
// OUT: before explicit deletion, Series::count = 1
delete series_ptr;
std::cout << "after the resource is manually freed, Series::count = "
<< Series::count << std::endl;
// OUT: after the resource is manually freed, Series::count = 0
}
void use_shared_pointer() {
std::shared_ptr<Series> series_sptr(new Series(10, 3));
series_sptr->sum(); // call member function
// OUT: Series::sum() = 75
// note shared_ptr handles deletion for series_sptr
}
int main(int argc, char ** argv) {
// the common raw pointer
use_raw_pointer();
// now, shared_ptr
use_shared_pointer();
std::cout << "no memory leak: Series::count = "
<< Series::count << std::endl;
// OUT: no memory leak: Series::count = 0
return 0;
}
Move semantics
❖ Number-crunching code needs large arrays as memory buffers.
They aren’t supposed to be copied frequently.
❖ A 50,000 × 50,000 array of 8-byte values takes 20 GB.
❖ Shared pointers should manage large chunks of memory.
❖ New reference to an object: copy constructor of shared pointer
❖ Borrowed reference to an object: const reference to the shared
pointer
❖ Stolen reference to an object: move constructor of shared
pointer
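The three kinds of reference listed above can be told apart by watching use_count(). The following is a minimal sketch under stated assumptions: BigArray is a stand-in for a large buffer, and the function names take_new, take_borrowed, and take_stolen are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

using BigArray = std::vector<double>;  // stand-in for a large memory buffer

// New reference: pass by value; the copy constructor adds an owner.
long take_new(std::shared_ptr<BigArray> sp) { return sp.use_count(); }

// Borrowed reference: pass by const reference; no owner is added.
long take_borrowed(std::shared_ptr<BigArray> const & sp) { return sp.use_count(); }

// Stolen reference: the move constructor transfers ownership to the callee.
long take_stolen(std::shared_ptr<BigArray> && sp) {
    std::shared_ptr<BigArray> mine(std::move(sp));
    assert(!sp);  // the caller's pointer is now empty
    return mine.use_count();
}
```

With a single owner buf, take_new(buf) sees two owners, take_borrowed(buf) sees one, and take_stolen(std::move(buf)) sees one while leaving buf empty. In none of the three cases is the array itself copied.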
Lambda
❖ Put the code at the place it should be shown
namespace py = pybind11;
cls = py::class_< wrapped_type, holder_type >(mod, pyname, clsdoc);
cls
.def(
py::init([](block_type & block, index_type icl, bool init_sentinel) {
return wrapped_type(block, icl, init_sentinel);
}),
py::arg("block"), py::arg("icl"), py::arg("init_sentinel")=true
)
.def("repr", &wrapped_type::repr, py::arg("indent")=0, py::arg("precision")=0)
.def("__repr__", [](wrapped_type & self){ return self.repr(); })
.def("init_sentinel", &wrapped_type::init_sentinel)
.def_readwrite("cnd", &wrapped_type::cnd)
.def_readwrite("vol", &wrapped_type::vol)
.def_property_readonly(
"nbce",
[](wrapped_type & self) { return self.bces.size(); }
)
.def(
"get_bce",
[](wrapped_type & self, index_type ibce) { return self.bces.at(ibce); }
)
;
Lambda, cont’d
❖ Code as free as Python, as fast as C
#include <unordered_map>
#include <functional>
#include <cstdio>
int main(int argc, char ** argv) {
// Python: fmap = dict()
std::unordered_map<int, std::function<void(int)>> fmap;
// Python: fmap[1] = lambda v: print("v = %d" % v)
fmap.insert({
1, [](int v) -> void { std::printf("v = %d\n", v); }
});
// Python: fmap[5] = lambda v: print("v*5 = %d" % (v*5))
fmap.insert({
5, [](int v) -> void { std::printf("v*5 = %d\n", v*5); }
});
std::unordered_map<int, std::function<void(int)>>::iterator search;
// Python: fmap[1](100)
search = fmap.find(1);
search->second(100);
// OUT: v = 100
// Python: fmap[5](500)
search = fmap.find(5);
search->second(500);
// OUT: v*5 = 2500
return 0;
}
Manipulate Python
❖ Don’t mingle Python with C++
❖ Python has GIL
❖ Don’t include Python.h if you don’t intend to run
Python
❖ Once it enters your core, it’s hard to get it out
#include <Python.h>
class Core {
private:
int m_value;
PyObject * m_pyobject;
};
Do it in the wrapping layer
cls
.def(
py::init([](py::object pyblock) {
block_type * block = py::cast<block_type *>(pyblock.attr("_ustblk"));
std::shared_ptr<wrapped_type> svr = wrapped_type::construct(block->shared_from_this());
for (auto bc : py::list(pyblock.attr("bclist"))) {
std::string name = py::str(bc.attr("__class__").attr("__name__").attr("lstrip")("GasPlus"));
BoundaryData * data = py::cast<BoundaryData *>(bc.attr("_data"));
std::unique_ptr<gas::TrimBase<NDIM>> trim;
if ("Interface" == name) {
trim = make_unique<gas::TrimInterface<NDIM>>(*svr, *data);
} else if ("NoOp" == name) {
trim = make_unique<gas::TrimNoOp<NDIM>>(*svr, *data);
} else if ("NonRefl" == name) {
trim = make_unique<gas::TrimNonRefl<NDIM>>(*svr, *data);
} else if ("SlipWall" == name) {
trim = make_unique<gas::TrimSlipWall<NDIM>>(*svr, *data);
} else if ("Inlet" == name) {
trim = make_unique<gas::TrimInlet<NDIM>>(*svr, *data);
} else {
/* do nothing for now */ // throw std::runtime_error("BC type unknown");
}
svr->trims().push_back(std::move(trim));
}
if (report_interval) { svr->make_qty(); }
return svr;
}),
py::arg("block")
);
pybind11::list
❖ Read a list and cast contents:
❖ Populate:
#include <pybind11/pybind11.h> // must be first
#include <string>
#include <iostream>
namespace py = pybind11;
PYBIND11_MODULE(_pylist, mod) {
mod.def(
"do",
[](py::list & l) {
// convert contents to std::string and send to cout
std::cout << "std::cout:" << std::endl;
for (py::handle o : l) {
std::string s = py::cast<std::string>(o);
std::cout << s << std::endl;
}
}
);
mod.def(
"do2",
[](py::list & l) {
// create a new list
std::cout << "py::print:" << std::endl;
py::list l2;
for (py::handle o : l) {
std::string s = py::cast<std::string>(o);
s = "elm:" + s;
py::str s2(s);
l2.append(s2); // populate contents
}
py::print(l2);
}
);
} /* end PYBIND11_MODULE(_pylist) */
>>> import _pylist
>>> # print the input list
>>> _pylist.do(["a", "b", "c"])
std::cout:
a
b
c
>>> _pylist.do2(["d", "e", "f"])
py::print:
['elm:d', 'elm:e', 'elm:f']
pybind11::tuple
❖ A tuple is immutable, thus
it behaves as read-only. It is
constructed from another
iterable object.
❖ Read the contents of a tuple:
#include <pybind11/pybind11.h> // must be first
#include <vector>
namespace py = pybind11;
PYBIND11_MODULE(_pytuple, mod) {
mod.def(
"do",
[](py::args & args) {
// build a list using py::list::append
py::list l;
for (py::handle h : args) {
l.append(h);
}
// convert it to a tuple
py::tuple t(l);
// print it out
py::print(py::str("{} len={}").format(t, t.size()));
// print the element one by one
for (size_t it=0; it<t.size(); ++it) {
py::print(py::str("{}").format(t[it]));
}
}
);
} /* end PYBIND11_MODULE(_pytuple) */
>>> import _pytuple
>>> _pytuple.do("a", 7, 5.6)
('a', 7, 5.6) len=3
a
7
5.6
pybind11::dict
❖ Dictionary is one of the
most useful containers in
Python.
❖ Populate a dictionary:
❖ Manipulate it:
#include <pybind11/pybind11.h> // must be first
#include <string>
#include <stdexcept>
#include <iostream>
namespace py = pybind11;
PYBIND11_MODULE(_pydict, mod) {
mod.def(
"do",
[](py::args & args) {
if (args.size() % 2 != 0) {
throw std::runtime_error("argument number must be even");
}
// create a dict from the input tuple
py::dict d;
for (size_t it=0; it<args.size(); it+=2) {
d[args[it]] = args[it+1];
}
return d;
}
);
mod.def(
"do2",
[](py::dict d, py::args & args) {
for (py::handle h : args) {
if (d.contains(h)) {
std::cout << py::cast<std::string>(h)
<< " is in the input dictionary" << std::endl;
} else {
std::cout << py::cast<std::string>(h)
<< " is not found in the input dictionary" << std::endl;
}
}
std::cout << "remove everything in the input dictionary!" << std::endl;
d.clear();
return d;
}
);
} /* end PYBIND11_MODULE(_pydict) */
>>> import _pydict
>>> d = _pydict.do("a", 7, "b", "name", 10, 4.2)
>>> print(d)
{'a': 7, 'b': 'name', 10: 4.2}
>>> d2 = _pydict.do2(d, "b", "d")
b is in the input dictionary
d is not found in the input dictionary
remove everything in the input dictionary!
>>> print("The returned dictionary is empty:", d2)
The returned dictionary is empty: {}
>>> print("The first dictionary becomes empty too:", d)
The first dictionary becomes empty too: {}
>>> print("Are the two dictionaries the same?", d2 is d)
Are the two dictionaries the same? True
pybind11::str
❖ One more trick with
Python strings in
pybind11; user-defined
literal:



#include <pybind11/pybind11.h> // must be first
#include <iostream>
namespace py = pybind11;
using namespace py::literals; // to bring in the `_s` literal
PYBIND11_MODULE(_pystr, mod) {
mod.def(
"do",
[]() {
py::str s("python string {}"_s.format("formatting"));
py::print(s);
}
);
} /* end PYBIND11_MODULE(_pystr) */
>>> import _pystr
>>> _pystr.do()
python string formatting
Generic Python objects
❖ Pybind11 defines two generic types for representing
Python objects:
❖ “handle”: base class of all pybind11 classes for Python
types
❖ “object” derives from handle and adds automatic
reference counting
pybind11::handle and object
#include <pybind11/pybind11.h> // must be first
#include <iostream>
namespace py = pybind11;
using namespace py::literals; // to bring in the `_s` literal
PYBIND11_MODULE(_pyho, mod) {
    mod.def(
        "do",
        [](py::object const & o) {
            std::cout << "refcount in the beginning: "
                      << o.ptr()->ob_refcnt << std::endl;
            // a handle does not own a reference
            py::handle h(o);
            std::cout << "no increase of refcount with a new pybind11::handle: "
                      << h.ptr()->ob_refcnt << std::endl;
            {
                // an object owns a reference while it is alive
                py::object o2(o);
                std::cout << "increased refcount with a new pybind11::object: "
                          << o2.ptr()->ob_refcnt << std::endl;
            }
            std::cout << "decreased refcount after the new pybind11::object destructed: "
                      << o.ptr()->ob_refcnt << std::endl;
            h.inc_ref();
            std::cout << "manually increases refcount after h.inc_ref(): "
                      << h.ptr()->ob_refcnt << std::endl;
            h.dec_ref();
            std::cout << "manually decreases refcount after h.dec_ref(): "
                      << h.ptr()->ob_refcnt << std::endl;
        }
    );
} /* end PYBIND11_MODULE(_pyho) */
>>> import _pyho
>>> _pyho.do(["name"])
refcount in the beginning: 3
no increase of refcount with a new pybind11::handle: 3
increased refcount with a new pybind11::object: 4
decreased refcount after the new pybind11::object destructed: 3
manually increases refcount after h.inc_ref(): 4
manually decreases refcount after h.dec_ref(): 3
pybind11::none
❖ It’s worth noting that
pybind11 has “none”
type. In Python, None is
a singleton, and
accessible as Py_None in
the C API.
❖ Access the None singleton from
C++:
#include <pybind11/pybind11.h> // must be first
#include <iostream>
namespace py = pybind11;
using namespace py::literals; // to bring in the `_s` literal
PYBIND11_MODULE(_pynone, mod) {
mod.def(
"do",
[](py::object const & o) {
if (o.is(py::none())) {
std::cout << "it is None" << std::endl;
} else {
std::cout << "it is not None" << std::endl;
}
}
);
} /* end PYBIND11_MODULE(_pynone) */
>>> import _pynone
>>> _pynone.do(None)
it is None
>>> _pynone.do(False)
it is not None
Fast Code with C++
Never loop in Python
❖ Sum 100,000,000 integers
$ python -m timeit -s 'data = range(100000000)' 'sum(data)'
10 loops, best of 3: 2.36 sec per loop
❖ The C++ version:
#include <cstdio>
int main(int argc, char ** argv) {
    long value = 0;
    for (long it=0; it<100000000; ++it) { value += it; }
    return 0;
}
$ time ./run
real 0m0.010s
user 0m0.002s
sys 0m0.004s
❖ Numpy is better, but not enough
$ python -m timeit -s 'import numpy as np ; data = np.arange(100000000, dtype="int64")' 'data.sum()'
10 loops, best of 3: 74.9 msec per loop
Wisely use arrays
❖ Python calls are expensive. Data need to be transferred
from Python to C++ in batch. Use arrays.
❖ C++ code may use arrays as internal representation. For
example, matrices are arrays having a 2-D view.
❖ Arrays are used as both
❖ interface between Python and C++, and
❖ internal storage in the C++ engine
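The remark that matrices are arrays with a 2-D view can be sketched in a few lines of standard C++. The Matrix class below is a hypothetical illustration, not code from the talk: it keeps one contiguous buffer and maps (row, column) to a flat index in row-major order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D view: row-major indexing on top of one flat buffer.
class Matrix {
public:
    Matrix(std::size_t nrow, std::size_t ncol)
      : m_nrow(nrow), m_ncol(ncol), m_data(nrow * ncol, 0.0) {}
    double & operator()(std::size_t i, std::size_t j) {
        assert(i < m_nrow && j < m_ncol);
        return m_data[i * m_ncol + j];
    }
    double operator()(std::size_t i, std::size_t j) const {
        assert(i < m_nrow && j < m_ncol);
        return m_data[i * m_ncol + j];
    }
    std::size_t nrow() const { return m_nrow; }
    std::size_t ncol() const { return m_ncol; }
    // The flat storage is what would cross the Python boundary in one batch.
    double * data() { return m_data.data(); }
    std::size_t size() const { return m_data.size(); }
private:
    std::size_t m_nrow;
    std::size_t m_ncol;
    std::vector<double> m_data;
};
```

For a 2 × 3 matrix, element (1, 2) lives at flat position 1*3 + 2 = 5, so the same memory can serve both as the interface array and as the internal storage.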
Arrays in Python
❖ What we really mean is numpy(.ndarray)
❖ 12 lines to create vertices for zig-zagging mesh
❖ They get things done, although they sometimes look convoluted
# create nodes.
nodes = []
for iy, yloc in enumerate(np.arange(y0, y1+dy/4, dy/2)):
    if iy % 2 == 0:
        meshx = np.arange(x0, x1+dx/4, dx, dtype='float64')
    else:
        meshx = np.arange(x0+dx/2, x1-dx/4, dx, dtype='float64')
    nodes.append(np.vstack([meshx, np.full_like(meshx, yloc)]).T)
nodes = np.vstack(nodes)
assert nodes.shape[0] == nnode
blk.ndcrd[:,:] = nodes
assert (blk.ndcrd == nodes).all()
Expose memory buffer
class Buffer: public std::enable_shared_from_this<Buffer> {
private:
size_t m_length = 0;
char * m_data = nullptr;
struct ctor_passkey {};
public:
Buffer(size_t length, const ctor_passkey &)
: m_length(length) { m_data = new char[length](); }
static std::shared_ptr<Buffer> construct(size_t length) {
return std::make_shared<Buffer>(length, ctor_passkey());
}
~Buffer() {
if (nullptr != m_data) {
delete[] m_data;
m_data = nullptr;
}
}
/** Backdoor */
template< typename T >
T * data() const { return reinterpret_cast<T*>(m_data); }
};
py::array from(array_flavor flavor) {
// ndarray shape and stride
npy_intp shape[m_table.ndim()];
std::copy(m_table.dims().begin(),
m_table.dims().end(),
shape);
npy_intp strides[m_table.ndim()];
strides[m_table.ndim()-1] = m_table.elsize();
for (ssize_t it = m_table.ndim()-2; it >= 0; --it) {
strides[it] = shape[it+1] * strides[it+1];
}
// create ndarray
void * data = m_table.data();
py::object tmp = py::reinterpret_steal<py::object>(
PyArray_NewFromDescr(
&PyArray_Type,
PyArray_DescrFromType(m_table.datatypeid()),
m_table.ndim(),
shape,
strides,
data,
NPY_ARRAY_WRITEABLE,
nullptr));
// link lifecycle to the underneath buffer
py::object buffer = py::cast(m_table.buffer());
py::array ret;
if (PyArray_SetBaseObject((PyArrayObject *)tmp.ptr(),
buffer.inc_ref().ptr()) == 0) {
ret = tmp;
}
return ret;
}
Internal buffer (the Buffer class) and exposing the buffer as an ndarray (the from() function)
❖ Numpy arrays provide the most common construct: a
contiguous memory buffer, and tons of code
❖ N-dimensional arrays (ndarray)
❖ There are variants, but less useful in C++: masked
array, sparse matrices, etc.
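The stride loop in the from() function on this slide follows the usual rule for a C-contiguous buffer: the last axis advances by one element, and every earlier axis advances by the full extent of the axes after it. A standalone sketch of that rule, using hypothetical helper names (contiguous_strides, byte_offset) and no NumPy headers:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Byte strides of a C-contiguous (row-major) array with the given shape.
std::vector<std::ptrdiff_t> contiguous_strides(
    std::vector<std::ptrdiff_t> const & shape, std::ptrdiff_t elsize)
{
    std::vector<std::ptrdiff_t> strides(shape.size());
    if (shape.empty()) { return strides; }
    strides.back() = elsize;  // the last axis moves one element at a time
    for (std::ptrdiff_t it = static_cast<std::ptrdiff_t>(shape.size()) - 2; it >= 0; --it) {
        strides[it] = shape[it+1] * strides[it+1];
    }
    return strides;
}

// Byte offset of the element at a multi-dimensional index.
std::ptrdiff_t byte_offset(
    std::vector<std::ptrdiff_t> const & strides,
    std::vector<std::ptrdiff_t> const & index)
{
    assert(strides.size() == index.size());
    std::ptrdiff_t offset = 0;
    for (std::size_t it = 0; it < strides.size(); ++it) {
        offset += strides[it] * index[it];
    }
    return offset;
}
```

For a shape of (2, 3, 4) with 8-byte elements the strides come out as (96, 32, 8), so the element at index (1, 2, 3) sits 96 + 64 + 24 = 184 bytes into the buffer.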
Define your meta data
❖ Free to define how the memory is used
class LookupTableCore {
private:
std::shared_ptr<Buffer> m_buffer;
std::vector<index_type> m_dims;
index_type m_nghost = 0;
index_type m_nbody = 0;
index_type m_ncolumn = 0;
index_type m_elsize = 1; ///< Element size in bytes.
DataTypeId m_datatypeid = MH_INT8;
public:
index_type ndim() const { return m_dims.size(); }
index_type nghost() const { return m_nghost; }
index_type nbody() const { return m_nbody; }
index_type nfull() const { return m_nghost + m_nbody; }
index_type ncolumn() const { return m_ncolumn; }
index_type nelem() const { return nfull() * ncolumn(); }
index_type elsize() const { return m_elsize; }
DataTypeId datatypeid() const { return m_datatypeid; }
size_t nbyte() const { return buffer()->nbyte(); }
};
(Figure: array layout diagram labeled with index 0, ghost, and body.)
Organize arrays
❖ LookupTable is a class
template providing static
information for the dynamic
array core
❖ Now we can put together a
class that keeps track of all
data for computation
template< size_t NDIM >
class UnstructuredBlock {
private:
// geometry arrays.
LookupTable<real_type, NDIM> m_ndcrd;
LookupTable<real_type, NDIM> m_fccnd;
LookupTable<real_type, NDIM> m_fcnml;
LookupTable<real_type, 0> m_fcara;
LookupTable<real_type, NDIM> m_clcnd;
LookupTable<real_type, 0> m_clvol;
// meta arrays.
LookupTable<shape_type, 0> m_fctpn;
LookupTable<shape_type, 0> m_cltpn;
LookupTable<index_type, 0> m_clgrp;
// connectivity arrays.
LookupTable<index_type, FCMND+1> m_fcnds;
LookupTable<index_type, FCNCL > m_fccls;
LookupTable<index_type, CLMND+1> m_clnds;
LookupTable<index_type, CLMFC+1> m_clfcs;
// boundary information.
LookupTable<index_type, 2> m_bndfcs;
std::vector<BoundaryData> m_bndvec;
};
(This case is for unstructured meshes of mixed elements in 2-/3-dimensional Euclidean space)
Fast and hideous
❖ In theory we can write
beautiful and fast code in
C++, and we should.
❖ In practice, as long as it’s
fast, it’s not too hard to
compromise on elegance.
❖ Testability is the bottom
line.
const index_type *
pclfcs = reinterpret_cast<const index_type *>(clfcs().row(0));
prcells = reinterpret_cast<index_type *>(rcells.row(0));
for (icl=0; icl<ncell(); icl++) {
for (ifl=1; ifl<=pclfcs[0]; ifl++) {
ifl1 = ifl-1;
ifc = pclfcs[ifl];
const index_type *
pfccls = reinterpret_cast<const index_type *>(fccls().row(0))
+ ifc*FCREL;
if (ifc == -1) { // NOT A FACE!? SHOULDN'T HAPPEN.
prcells[ifl1] = -1;
continue;
} else if (pfccls[0] == icl) {
if (pfccls[2] != -1) { // has neighboring block.
prcells[ifl1] = -1;
} else { // is interior.
prcells[ifl1] = pfccls[1];
};
} else if (pfccls[1] == icl) { // I am the neighboring cell.
prcells[ifl1] = pfccls[0];
};
// count rcell number.
if (prcells[ifl1] >= 0) {
rcellno[icl] += 1;
} else {
prcells[ifl1] = -1;
};
};
// advance pointers.
pclfcs += CLMFC+1;
prcells += CLMFC;
};
(This looks like C since it really was C.)
Final notes
❖ Avoid Python when you need speed; use it as a shell to
your high-performance library from day one
❖ Resource management is at the core of the hybrid
architecture; do it in C++
❖ Use arrays (look-up tables) to keep large data
❖ Don’t access PyObject from your core
❖ Always keep in mind the differences in typing systems

More Related Content

What's hot

OpenStackで始めるクラウド環境構築入門 Havana&DevStack編
OpenStackで始めるクラウド環境構築入門 Havana&DevStack編OpenStackで始めるクラウド環境構築入門 Havana&DevStack編
OpenStackで始めるクラウド環境構築入門 Havana&DevStack編
VirtualTech Japan Inc.
 
Ceph アーキテクチャ概説
Ceph アーキテクチャ概説Ceph アーキテクチャ概説
Ceph アーキテクチャ概説
Emma Haruka Iwao
 
LineairDB: Fast and Embedded Transactional Key-Value Storage
LineairDB: Fast and Embedded Transactional Key-Value StorageLineairDB: Fast and Embedded Transactional Key-Value Storage
LineairDB: Fast and Embedded Transactional Key-Value Storage
Sho Nakazono
 
Linuxのsemaphoreとmutexを見る 
Linuxのsemaphoreとmutexを見る Linuxのsemaphoreとmutexを見る 
Linuxのsemaphoreとmutexを見る 
wata2ki
 
Onieで遊んでみようとした話
Onieで遊んでみようとした話Onieで遊んでみようとした話
Onieで遊んでみようとした話
Masaru Oki
 
NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」
NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」
NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」
Genaris Omics, Inc.
 
Transaction Management on Cassandra
Transaction Management on CassandraTransaction Management on Cassandra
Transaction Management on Cassandra
Scalar, Inc.
 
Cephfs jewel mds performance benchmark
Cephfs jewel mds performance benchmarkCephfs jewel mds performance benchmark
Cephfs jewel mds performance benchmark
Xiaoxi Chen
 
詳説データベース輪読会: 分散合意その2
詳説データベース輪読会: 分散合意その2詳説データベース輪読会: 分散合意その2
詳説データベース輪読会: 分散合意その2
Sho Nakazono
 
OpenvswitchでVPS
OpenvswitchでVPSOpenvswitchでVPS
OpenvswitchでVPS
Daisuke Nakajima
 
iostat await svctm の 見かた、考え方
iostat await svctm の 見かた、考え方iostat await svctm の 見かた、考え方
iostat await svctm の 見かた、考え方
歩 柴田
 
Isolation Level について
Isolation Level についてIsolation Level について
Isolation Level について
Takashi Hoshino
 
LineairDBの紹介
LineairDBの紹介LineairDBの紹介
LineairDBの紹介
Sho Nakazono
 
PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)
PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)
PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)
NTT DATA Technology & Innovation
 
PostgreSQL Unconference #29 Unicode IVS
PostgreSQL Unconference #29 Unicode IVSPostgreSQL Unconference #29 Unicode IVS
PostgreSQL Unconference #29 Unicode IVS
Noriyoshi Shinoda
 
Unboundの最適化(OSC2011 Tokyo/Spring)
Unboundの最適化(OSC2011 Tokyo/Spring)Unboundの最適化(OSC2011 Tokyo/Spring)
Unboundの最適化(OSC2011 Tokyo/Spring)
Takashi Takizawa
 
VPP事始め
VPP事始めVPP事始め
VPP事始め
npsg
 
Docker活用パターンの整理 ― どう組み合わせるのが正解?!
Docker活用パターンの整理 ― どう組み合わせるのが正解?!Docker活用パターンの整理 ― どう組み合わせるのが正解?!
Docker活用パターンの整理 ― どう組み合わせるのが正解?!
Etsuji Nakai
 
Raspberry Piで始める自宅サーバー超入門
Raspberry Piで始める自宅サーバー超入門Raspberry Piで始める自宅サーバー超入門
Raspberry Piで始める自宅サーバー超入門
VirtualTech Japan Inc./Begi.net Inc.
 
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
Kuniyasu Suzaki
 

What's hot (20)

OpenStackで始めるクラウド環境構築入門 Havana&DevStack編
OpenStackで始めるクラウド環境構築入門 Havana&DevStack編OpenStackで始めるクラウド環境構築入門 Havana&DevStack編
OpenStackで始めるクラウド環境構築入門 Havana&DevStack編
 
Ceph アーキテクチャ概説
Ceph アーキテクチャ概説Ceph アーキテクチャ概説
Ceph アーキテクチャ概説
 
LineairDB: Fast and Embedded Transactional Key-Value Storage
LineairDB: Fast and Embedded Transactional Key-Value StorageLineairDB: Fast and Embedded Transactional Key-Value Storage
LineairDB: Fast and Embedded Transactional Key-Value Storage
 
Linuxのsemaphoreとmutexを見る 
Linuxのsemaphoreとmutexを見る Linuxのsemaphoreとmutexを見る 
Linuxのsemaphoreとmutexを見る 
 
Onieで遊んでみようとした話
Onieで遊んでみようとした話Onieで遊んでみようとした話
Onieで遊んでみようとした話
 
NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」
NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」
NGS現場の会 第3回 モーニング教育セッション 配布用資料 「いまさら聞けない NGS超!入門」
 
Transaction Management on Cassandra
Transaction Management on CassandraTransaction Management on Cassandra
Transaction Management on Cassandra
 
Cephfs jewel mds performance benchmark
Cephfs jewel mds performance benchmarkCephfs jewel mds performance benchmark
Cephfs jewel mds performance benchmark
 
詳説データベース輪読会: 分散合意その2
詳説データベース輪読会: 分散合意その2詳説データベース輪読会: 分散合意その2
詳説データベース輪読会: 分散合意その2
 
OpenvswitchでVPS
OpenvswitchでVPSOpenvswitchでVPS
OpenvswitchでVPS
 
iostat await svctm の 見かた、考え方
iostat await svctm の 見かた、考え方iostat await svctm の 見かた、考え方
iostat await svctm の 見かた、考え方
 
Isolation Level について
Isolation Level についてIsolation Level について
Isolation Level について
 
LineairDBの紹介
LineairDBの紹介LineairDBの紹介
LineairDBの紹介
 
PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)
PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)
PostgreSQLをKubernetes上で活用するためのOperator紹介!(Cloud Native Database Meetup #3 発表資料)
 
PostgreSQL Unconference #29 Unicode IVS
PostgreSQL Unconference #29 Unicode IVSPostgreSQL Unconference #29 Unicode IVS
PostgreSQL Unconference #29 Unicode IVS
 
Unboundの最適化(OSC2011 Tokyo/Spring)
Unboundの最適化(OSC2011 Tokyo/Spring)Unboundの最適化(OSC2011 Tokyo/Spring)
Unboundの最適化(OSC2011 Tokyo/Spring)
 
VPP事始め
VPP事始めVPP事始め
VPP事始め
 
Docker活用パターンの整理 ― どう組み合わせるのが正解?!
Docker活用パターンの整理 ― どう組み合わせるのが正解?!Docker活用パターンの整理 ― どう組み合わせるのが正解?!
Docker活用パターンの整理 ― どう組み合わせるのが正解?!
 
Raspberry Piで始める自宅サーバー超入門
Raspberry Piで始める自宅サーバー超入門Raspberry Piで始める自宅サーバー超入門
Raspberry Piで始める自宅サーバー超入門
 
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
3種類のTEE比較(Intel SGX, ARM TrustZone, RISC-V Keystone)
 

Similar to Take advantage of C++ from Python

Start Wrap Episode 11: A New Rope
Start Wrap Episode 11: A New RopeStart Wrap Episode 11: A New Rope
Start Wrap Episode 11: A New Rope
Yung-Yu Chen
 
Boost.Python: C++ and Python Integration
Boost.Python: C++ and Python IntegrationBoost.Python: C++ and Python Integration
Boost.Python: C++ and Python Integration
GlobalLogic Ukraine
 
C++totural file
C++totural fileC++totural file
C++totural filehalaisumit
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
Steffen Wenz
 
Python and Pytorch tutorial and walkthrough
Python and Pytorch tutorial and walkthroughPython and Pytorch tutorial and walkthrough
Python and Pytorch tutorial and walkthrough
gabriellekuruvilla
 
Intro To C++ - Class #17: Pointers!, Objects Talking To Each Other
Intro To C++ - Class #17: Pointers!, Objects Talking To Each OtherIntro To C++ - Class #17: Pointers!, Objects Talking To Each Other
Intro To C++ - Class #17: Pointers!, Objects Talking To Each Other
Blue Elephant Consulting
 
PHP 8: Process & Fixing Insanity
PHP 8: Process & Fixing InsanityPHP 8: Process & Fixing Insanity
PHP 8: Process & Fixing Insanity
GeorgePeterBanyard
 
Return of c++
Return of c++Return of c++
Return of c++
Yongwei Wu
 
CS225_Prelecture_Notes 2nd
CS225_Prelecture_Notes 2ndCS225_Prelecture_Notes 2nd
CS225_Prelecture_Notes 2ndEdward Chen
 
C++primer
C++primerC++primer
C++primer
leonlongli
 
Why learn Internals?
Why learn Internals?Why learn Internals?
Why learn Internals?
Shaul Rosenzwieg
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloud
Andrea Righi
 
Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020
Yung-Yu Chen
 
C language introduction
C language introduction C language introduction
C language introduction
musrath mohammad
 
SRAVANByCPP
SRAVANByCPPSRAVANByCPP
SRAVANByCPP
aptechsravan
 
Introduction Of C++
Introduction Of C++Introduction Of C++
Introduction Of C++
Sangharsh agarwal
 
C++ theory
C++ theoryC++ theory
C++ theory
Shyam Khant
 

Similar to Take advantage of C++ from Python (20)

Start Wrap Episode 11: A New Rope
Start Wrap Episode 11: A New RopeStart Wrap Episode 11: A New Rope
Start Wrap Episode 11: A New Rope
 
Boost.Python: C++ and Python Integration
Boost.Python: C++ and Python IntegrationBoost.Python: C++ and Python Integration
Boost.Python: C++ and Python Integration
 
C++ tutorial
C++ tutorialC++ tutorial
C++ tutorial
 
C++totural file
C++totural fileC++totural file
C++totural file
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
 
Python and Pytorch tutorial and walkthrough
Python and Pytorch tutorial and walkthroughPython and Pytorch tutorial and walkthrough
Python and Pytorch tutorial and walkthrough
 
Intro To C++ - Class #17: Pointers!, Objects Talking To Each Other
Intro To C++ - Class #17: Pointers!, Objects Talking To Each OtherIntro To C++ - Class #17: Pointers!, Objects Talking To Each Other
Intro To C++ - Class #17: Pointers!, Objects Talking To Each Other
 
tokyotalk
tokyotalktokyotalk
tokyotalk
 
PHP 8: Process & Fixing Insanity
PHP 8: Process & Fixing InsanityPHP 8: Process & Fixing Insanity
PHP 8: Process & Fixing Insanity
 
Return of c++
Return of c++Return of c++
Return of c++
 
Apache Thrift
Apache ThriftApache Thrift
Apache Thrift
 
CS225_Prelecture_Notes 2nd
CS225_Prelecture_Notes 2ndCS225_Prelecture_Notes 2nd
CS225_Prelecture_Notes 2nd
 
C++primer
C++primerC++primer
C++primer
 
Why learn Internals?
Why learn Internals?Why learn Internals?
Why learn Internals?
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloud
 
Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020
 
C language introduction
C language introduction C language introduction
C language introduction
 
SRAVANByCPP
SRAVANByCPPSRAVANByCPP
SRAVANByCPP
 
Introduction Of C++
Introduction Of C++Introduction Of C++
Introduction Of C++
 
C++ theory
C++ theoryC++ theory
C++ theory
 

More from Yung-Yu Chen

Write Python for Speed
Write Python for SpeedWrite Python for Speed
Write Python for Speed
Yung-Yu Chen
 
SimpleArray between Python and C++
SimpleArray between Python and C++SimpleArray between Python and C++
SimpleArray between Python and C++
Yung-Yu Chen
 
Write code and find a job
Write code and find a jobWrite code and find a job
Write code and find a job
Yung-Yu Chen
 
On the necessity and inapplicability of python
On the necessity and inapplicability of pythonOn the necessity and inapplicability of python
On the necessity and inapplicability of python
Yung-Yu Chen
 
Harmonic Stack for Speed
Harmonic Stack for SpeedHarmonic Stack for Speed
Harmonic Stack for Speed
Yung-Yu Chen
 
Your interactive computing
Your interactive computingYour interactive computing
Your interactive computing
Yung-Yu Chen
 
Engineer Engineering Software
Engineer Engineering SoftwareEngineer Engineering Software
Engineer Engineering Software
Yung-Yu Chen
 
Craftsmanship in Computational Work
Craftsmanship in Computational WorkCraftsmanship in Computational Work
Craftsmanship in Computational Work
Yung-Yu Chen
 

More from Yung-Yu Chen (8)

Write Python for Speed
Write Python for SpeedWrite Python for Speed
Write Python for Speed
 
SimpleArray between Python and C++
SimpleArray between Python and C++SimpleArray between Python and C++
SimpleArray between Python and C++
 
Write code and find a job
Write code and find a jobWrite code and find a job
Write code and find a job
 
On the necessity and inapplicability of python
On the necessity and inapplicability of pythonOn the necessity and inapplicability of python
On the necessity and inapplicability of python
 
Harmonic Stack for Speed
Harmonic Stack for SpeedHarmonic Stack for Speed
Harmonic Stack for Speed
 
Your interactive computing
Your interactive computingYour interactive computing
Your interactive computing
 
Engineer Engineering Software
Engineer Engineering SoftwareEngineer Engineering Software
Engineer Engineering Software
 
Craftsmanship in Computational Work
Craftsmanship in Computational WorkCraftsmanship in Computational Work
Craftsmanship in Computational Work
 

Take advantage of C++ from Python

  • 1. Take advantage of C++ from Python Yung-Yu Chen PyCon Kyushu 30th June 2018
  • 2. Why C++ ❖ Python is slow ❖ Everything is on heap ❖ Always dynamic types ❖ Hard to access assembly ❖ Convoluted algorithms with ndarray ❖ Access external code written in any language ❖ Detail control and abstraction
  • 3. Hard problems take time • Supersonic jet in cross flow; density contour • 264 cores with 53 hours for 1.3 B variables (66 M elements) by 12,000 time steps • At OSC, 2011 (10 Gbps InfiniBand) HPC (high-performance computing) is hard. Physics is harder. Don’t mingle.
  • 4. Best of both worlds ❖ C++: fast runtime, strong static type checking, industrial grade ❖ Slow to code ❖ Python: fast prototyping, batteries included, easy to use ❖ Slow to run ❖ Hybrid system is everywhere. ❖ TensorFlow, Blender, OpenCV, etc. ❖ C++ crunches numbers. Python controls the flow. ❖ Applications work like libraries, libraries like applications.
  • 5. pybind11 ❖ https://github.com/pybind/pybind11: C++11 ❖ Expose C++ entities to Python ❖ Use Python from C++ ❖ list, tuple, dict, and str ❖ handle, object, and none
  • 6. C++11(/14/17/20) New language features: auto and decltype, defaulted and deleted functions, final and override, trailing return type, rvalue references, move constructors/move assignment, scoped enums, constexpr and literal types, list initialization, delegating and inherited constructors, brace-or-equal initializers, nullptr, long long, char16_t and char32_t, type aliases, variadic templates, generalized unions, generalized PODs, Unicode string literals, user-defined literals, attributes, lambda expressions, noexcept, alignof and alignas, multithreaded memory model, thread-local storage, GC interface, range for (based on a Boost library), static assertions (based on a Boost library) http://en.cppreference.com/w/cpp/language/history
  • 7. Python’s friends ❖ Shared pointer: manage resource ownership between C++ and Python ❖ Move semantics: speed ❖ Lambda expression: ease the wrapping code
  • 8. Ownership ❖ All Python objects are dynamically allocated on the heap. Python uses reference counting to decide who deallocates an object once it is no longer used. ❖ An owner of a reference to an object is responsible for releasing it. With multiple owners, the last owner (at that point the reference count is 1) calls the destructor and deallocates the object. The other owners simply decrement the count by 1.
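The counting rule on the Ownership slide can be observed from Python itself. The following is a minimal sketch, assuming CPython; the `Resource` class and `demo` function are hypothetical names used only for illustration, and absolute values of `sys.getrefcount` vary between interpreter versions, so only the difference is examined.

```python
import sys
import weakref


class Resource:
    """Hypothetical stand-in for any heap-allocated Python object."""


def demo():
    freed = []
    first_owner = Resource()
    # Record the moment the object is actually deallocated.
    weakref.finalize(first_owner, freed.append, "deallocated")

    base = sys.getrefcount(first_owner)
    second_owner = first_owner          # a new owner: the count goes up by 1
    shared = sys.getrefcount(first_owner)

    del first_owner                     # not the last owner: only decrements
    still_alive = not freed
    del second_owner                    # last owner: the object is deallocated
    return shared - base, still_alive, freed


delta, still_alive, freed = demo()
print(delta, still_alive, freed)
```

The object survives the first `del` because another owner still holds a reference; it is released only when the last owner lets go.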
  • 9. Shared pointer
      #include <memory>
      #include <vector>
      #include <numeric> // std::accumulate
      #include <iostream>

      class Series {
        std::vector<int> m_data;
      public:
        int sum() const {
          const int ret = std::accumulate(m_data.begin(), m_data.end(), 0);
          std::cout << "Series::sum() = " << ret << std::endl;
          return ret;
        }
        static size_t count;
        Series(size_t size, int lead) : m_data(size) {
          for (size_t it=0; it<size; it++) { m_data[it] = lead+it; }
          count++;
        }
        ~Series() { count--; }
      };
      size_t Series::count = 0;

      void use_raw_pointer() {
        Series * series_ptr = new Series(10, 2);
        series_ptr->sum(); // call member function
        // OUT: Series::sum() = 65
        // remember to delete the object or we leak memory
        std::cout << "before explicit deletion, Series::count = " << Series::count << std::endl;
        // OUT: before explicit deletion, Series::count = 1
        delete series_ptr;
        std::cout << "after the resource is manually freed, Series::count = " << Series::count << std::endl;
        // OUT: after the resource is manually freed, Series::count = 0
      }

      void use_shared_pointer() {
        std::shared_ptr<Series> series_sptr(new Series(10, 3));
        series_sptr->sum(); // call member function
        // OUT: Series::sum() = 75
        // note shared_ptr handles deletion for series_sptr
      }

      int main(int argc, char ** argv) {
        // the common raw pointer
        use_raw_pointer();
        // now, shared_ptr
        use_shared_pointer();
        std::cout << "no memory leak: Series::count = " << Series::count << std::endl;
        // OUT: no memory leak: Series::count = 0
        return 0;
      }
  • 10. Move semantics ❖ Number-crunching code needs large arrays as memory buffers. They aren’t supposed to be copied frequently. ❖ A 50,000 × 50,000 array of 8-byte values takes 20 GB. ❖ Shared pointers should manage large chunks of memory. ❖ New reference to an object: copy constructor of shared pointer ❖ Borrowed reference to an object: const reference to the shared pointer ❖ Stolen reference to an object: move constructor of shared pointer
  • 11. Lambda ❖ Put the code at the place it should be shown
      namespace py = pybind11;
      cls = py::class_< wrapped_type, holder_type >(mod, pyname, clsdoc);
      cls
        .def(
          py::init([](block_type & block, index_type icl, bool init_sentinel) {
            return wrapped_type(block, icl, init_sentinel);
          }),
          py::arg("block"), py::arg("icl"), py::arg("init_sentinel")=true
        )
        .def("repr", &wrapped_type::repr, py::arg("indent")=0, py::arg("precision")=0)
        .def("__repr__", [](wrapped_type & self){ return self.repr(); })
        .def("init_sentinel", &wrapped_type::init_sentinel)
        .def_readwrite("cnd", &wrapped_type::cnd)
        .def_readwrite("vol", &wrapped_type::vol)
        .def_property_readonly(
          "nbce",
          [](wrapped_type & self) { return self.bces.size(); }
        )
        .def(
          "get_bce",
          [](wrapped_type & self, index_type ibce) { return self.bces.at(ibce); }
        )
      ;
  • 12. Lambda, cont’d ❖ Code as free as Python, as fast as C
      #include <unordered_map>
      #include <functional>
      #include <cstdio>

      int main(int argc, char ** argv) {
        // Python: fmap = dict()
        std::unordered_map<int, std::function<void(int)>> fmap;
        // Python: fmap[1] = lambda v: print("v = %d" % v)
        fmap.insert({
          1, [](int v) -> void { std::printf("v = %d\n", v); }
        });
        // Python: fmap[5] = lambda v: print("v*5 = %d" % (v*5))
        fmap.insert({
          5, [](int v) -> void { std::printf("v*5 = %d\n", v*5); }
        });
        std::unordered_map<int, std::function<void(int)>>::iterator search;
        // Python: fmap[1](100)
        search = fmap.find(1);
        search->second(100);
        // OUT: v = 100
        // Python: fmap[5](500)
        search = fmap.find(5);
        search->second(500);
        // OUT: v*5 = 2500
        return 0;
      }
  • 13. Manipulate Python ❖ Don’t mingle Python with C++ ❖ Python has the GIL ❖ Don’t include Python.h if you don’t intend to run Python ❖ Once it enters your core, it’s hard to get it out
      #include <Python.h>
      class Core {
      private:
        int m_value;
        PyObject * m_pyobject;
      };
  • 14. Do it in the wrapping layer
      cls
        .def(
          py::init([](py::object pyblock) {
            block_type * block = py::cast<block_type *>(pyblock.attr("_ustblk"));
            std::shared_ptr<wrapped_type> svr = wrapped_type::construct(block->shared_from_this());
            for (auto bc : py::list(pyblock.attr("bclist"))) {
              std::string name = py::str(bc.attr("__class__").attr("__name__").attr("lstrip")("GasPlus"));
              BoundaryData * data = py::cast<BoundaryData *>(bc.attr("_data"));
              std::unique_ptr<gas::TrimBase<NDIM>> trim;
              if ("Interface" == name) {
                trim = make_unique<gas::TrimInterface<NDIM>>(*svr, *data);
              } else if ("NoOp" == name) {
                trim = make_unique<gas::TrimNoOp<NDIM>>(*svr, *data);
              } else if ("NonRefl" == name) {
                trim = make_unique<gas::TrimNonRefl<NDIM>>(*svr, *data);
              } else if ("SlipWall" == name) {
                trim = make_unique<gas::TrimSlipWall<NDIM>>(*svr, *data);
              } else if ("Inlet" == name) {
                trim = make_unique<gas::TrimInlet<NDIM>>(*svr, *data);
              } else {
                /* do nothing for now */ // throw std::runtime_error("BC type unknown");
              }
              svr->trims().push_back(std::move(trim));
            }
            if (report_interval) { svr->make_qty(); }
            return svr;
          }),
          py::arg("block")
        );
  • 15. pybind11::list ❖ Read a list and cast contents (do): ❖ Populate (do2):
      #include <pybind11/pybind11.h> // must be first
      #include <string>
      #include <iostream>
      namespace py = pybind11;

      PYBIND11_MODULE(_pylist, mod) {
        mod.def(
          "do",
          [](py::list & l) {
            // convert contents to std::string and send to cout
            std::cout << "std::cout:" << std::endl;
            for (py::handle o : l) {
              std::string s = py::cast<std::string>(o);
              std::cout << s << std::endl;
            }
          }
        );
        mod.def(
          "do2",
          [](py::list & l) {
            // create a new list
            std::cout << "py::print:" << std::endl;
            py::list l2;
            for (py::handle o : l) {
              std::string s = py::cast<std::string>(o);
              s = "elm:" + s;
              py::str s2(s);
              l2.append(s2); // populate contents
            }
            py::print(l2);
          }
        );
      } /* end PYBIND11_MODULE(_pylist) */

      >>> import _pylist
      >>> # print the input list
      >>> _pylist.do(["a", "b", "c"])
      std::cout:
      a
      b
      c
      >>> _pylist.do2(["d", "e", "f"])
      py::print:
      ['elm:d', 'elm:e', 'elm:f']
  • 16. pybind11::tuple ❖ A tuple is immutable, so it behaves as read-only. It is constructed from another iterable object. ❖ Read the contents of a tuple:
      #include <pybind11/pybind11.h> // must be first
      #include <vector>
      namespace py = pybind11;

      PYBIND11_MODULE(_pytuple, mod) {
        mod.def(
          "do",
          [](py::args & args) {
            // build a list using py::list::append
            py::list l;
            for (py::handle h : args) {
              l.append(h);
            }
            // convert it to a tuple
            py::tuple t(l);
            // print it out
            py::print(py::str("{} len={}").format(t, t.size()));
            // print the element one by one
            for (size_t it=0; it<t.size(); ++it) {
              py::print(py::str("{}").format(t[it]));
            }
          }
        );
      } /* end PYBIND11_MODULE(_pytuple) */

      >>> import _pytuple
      >>> _pytuple.do("a", 7, 5.6)
      ('a', 7, 5.6) len=3
      a
      7
      5.6
  • 17. pybind11::dict ❖ Dictionary is one of the most useful containers in Python. ❖ Populate a dictionary (do): ❖ Manipulate it (do2):
      #include <pybind11/pybind11.h> // must be first
      #include <string>
      #include <stdexcept>
      #include <iostream>
      namespace py = pybind11;

      PYBIND11_MODULE(_pydict, mod) {
        mod.def(
          "do",
          [](py::args & args) {
            if (args.size() % 2 != 0) {
              throw std::runtime_error("argument number must be even");
            }
            // create a dict from the input tuple
            py::dict d;
            for (size_t it=0; it<args.size(); it+=2) {
              d[args[it]] = args[it+1];
            }
            return d;
          }
        );
        mod.def(
          "do2",
          [](py::dict d, py::args & args) {
            for (py::handle h : args) {
              if (d.contains(h)) {
                std::cout << py::cast<std::string>(h) << " is in the input dictionary" << std::endl;
              } else {
                std::cout << py::cast<std::string>(h) << " is not found in the input dictionary" << std::endl;
              }
            }
            std::cout << "remove everything in the input dictionary!" << std::endl;
            d.clear();
            return d;
          }
        );
      } /* end PYBIND11_MODULE(_pydict) */

      >>> import _pydict
      >>> d = _pydict.do("a", 7, "b", "name", 10, 4.2)
      >>> print(d)
      {'a': 7, 'b': 'name', 10: 4.2}
      >>> d2 = _pydict.do2(d, "b", "d")
      b is in the input dictionary
      d is not found in the input dictionary
      remove everything in the input dictionary!
      >>> print("The returned dictionary is empty:", d2)
      The returned dictionary is empty: {}
      >>> print("The first dictionary becomes empty too:", d)
      The first dictionary becomes empty too: {}
      >>> print("Are the two dictionaries the same?", d2 is d)
      Are the two dictionaries the same? True
  • 18. pybind11::str ❖ One more trick with Python strings in pybind11; user-defined literal:
      #include <pybind11/pybind11.h> // must be first
      #include <iostream>
      namespace py = pybind11;
      using namespace py::literals; // to bring in the `_s` literal

      PYBIND11_MODULE(_pystr, mod) {
        mod.def(
          "do",
          []() {
            py::str s("python string {}"_s.format("formatting"));
            py::print(s);
          }
        );
      } /* end PYBIND11_MODULE(_pystr) */

      >>> import _pystr
      >>> _pystr.do()
      python string formatting
  • 19. Generic Python objects ❖ pybind11 defines two generic types for representing Python objects: ❖ “handle”: base class of all pybind11 classes for Python types; it does not touch the reference count by itself ❖ “object”: derives from handle and adds automatic reference counting
  • 20. pybind11::handle and object
      #include <pybind11/pybind11.h> // must be first
      #include <iostream>
      namespace py = pybind11;
      using namespace py::literals; // to bring in the `_s` literal

      PYBIND11_MODULE(_pyho, mod) {
        mod.def(
          "do",
          [](py::object const & o) {
            std::cout << "refcount in the beginning: " << o.ptr()->ob_refcnt << std::endl;
            py::handle h(o);
            std::cout << "no increase of refcount with a new pybind11::handle: " << h.ptr()->ob_refcnt << std::endl;
            {
              py::object o2(o);
              std::cout << "increased refcount with a new pybind11::object: " << o2.ptr()->ob_refcnt << std::endl;
            }
            std::cout << "decreased refcount after the new pybind11::object destructed: " << o.ptr()->ob_refcnt << std::endl;
            h.inc_ref();
            std::cout << "manually increases refcount after h.inc_ref(): " << h.ptr()->ob_refcnt << std::endl;
            h.dec_ref();
            std::cout << "manually decreases refcount after h.dec_ref(): " << h.ptr()->ob_refcnt << std::endl;
          }
        );
      } /* end PYBIND11_MODULE(_pyho) */

      >>> import _pyho
      >>> _pyho.do(["name"])
      refcount in the beginning: 3
      no increase of refcount with a new pybind11::handle: 3
      increased refcount with a new pybind11::object: 4
      decreased refcount after the new pybind11::object destructed: 3
      manually increases refcount after h.inc_ref(): 4
      manually decreases refcount after h.dec_ref(): 3
  • 21. pybind11::none ❖ It’s worth noting that pybind11 has a “none” type. In Python, None is a singleton, accessible as Py_None in the C API. ❖ Access the None singleton from C++:
      #include <pybind11/pybind11.h> // must be first
      #include <iostream>
      namespace py = pybind11;
      using namespace py::literals; // to bring in the `_s` literal

      PYBIND11_MODULE(_pynone, mod) {
        mod.def(
          "do",
          [](py::object const & o) {
            if (o.is(py::none())) {
              std::cout << "it is None" << std::endl;
            } else {
              std::cout << "it is not None" << std::endl;
            }
          }
        );
      } /* end PYBIND11_MODULE(_pynone) */

      >>> import _pynone
      >>> _pynone.do(None)
      it is None
      >>> _pynone.do(False)
      it is not None
  • 23. Never loop in Python ❖ Sum 100,000,000 integers
      $ python -m timeit -s 'data = range(100000000)' 'sum(data)'
      10 loops, best of 3: 2.36 sec per loop
    ❖ The C++ version:
      #include <cstdio>
      int main(int argc, char ** argv) {
        long value = 0;
        for (long it=0; it<100000000; ++it) { value += it; }
        return 0;
      }

      $ time ./run
      real 0m0.010s
      user 0m0.002s
      sys 0m0.004s
    ❖ Numpy is better, but not enough
      $ python -m timeit -s 'import numpy as np ; data = np.arange(100000000, dtype="int64")' 'data.sum()'
      10 loops, best of 3: 74.9 msec per loop
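The cost of interpreter-level iteration can also be seen without leaving Python. The sketch below is not from the slides: it compares a hand-written `for` loop with the built-in `sum`, using a much smaller `N` (an arbitrary choice) so that it finishes quickly. The function names are hypothetical, and the timings depend on the machine and interpreter, so no figures are claimed here.

```python
import timeit

N = 1_000_000  # far smaller than the slide's 100,000,000, to keep the run short


def loop_sum(n):
    """Sum 0..n-1 with every iteration executed by the interpreter."""
    total = 0
    for value in range(n):
        total += value
    return total


def builtin_sum(n):
    """Sum 0..n-1 with the iteration done inside the built-in sum()."""
    return sum(range(n))


# Both must agree with the closed form n*(n-1)/2.
assert loop_sum(N) == builtin_sum(N) == N * (N - 1) // 2

loop_time = timeit.timeit(lambda: loop_sum(N), number=3)
builtin_time = timeit.timeit(lambda: builtin_sum(N), number=3)
print(f"explicit loop: {loop_time:.3f} s, built-in sum: {builtin_time:.3f} s")
```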
  • 24. Wisely use arrays ❖ Python calls are expensive. Data need to be transferred from Python to C++ in batch. Use arrays. ❖ C++ code may use arrays as internal representation. For example, matrices are arrays having a 2-D view. ❖ Arrays are used as both ❖ interface between Python and C++, and ❖ internal storage in the C++ engine
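The idea that a matrix is an array with a 2-D view can be sketched with the standard library alone. This example is an illustration rather than the talk's code: it uses `array.array` for the contiguous buffer and `memoryview.cast` for the view, where the slides would use numpy. The sizes are arbitrary, and the printed strides assume 8-byte doubles.

```python
from array import array

nrow, ncol = 2, 3
# One contiguous buffer of 6 doubles, filled in a single batch.
flat = array("d", [float(i) for i in range(nrow * ncol)])

# Reinterpret the same memory as a row-major 2-D matrix. memoryview.cast()
# needs a byte format on one side, hence the intermediate cast to "B".
matrix = memoryview(flat).cast("B").cast("d", shape=[nrow, ncol])

print(matrix.tolist())
print(matrix.strides)

# The view shares memory with the buffer: a write through the view
# is visible in the flat array.
matrix[1, 2] = 42.0
print(flat[1 * ncol + 2])
```

No data is copied when the view is created, which is what makes arrays suitable both as the interface and as the internal storage.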
  • 25. Arrays in Python ❖ What we really mean is numpy(.ndarray) ❖ 12 lines to create vertices for zig-zagging mesh ❖ They get things done, although sometimes look convoluted
      # create nodes.
      nodes = []
      for iy, yloc in enumerate(np.arange(y0, y1+dy/4, dy/2)):
          if iy % 2 == 0:
              meshx = np.arange(x0, x1+dx/4, dx, dtype='float64')
          else:
              meshx = np.arange(x0+dx/2, x1-dx/4, dx, dtype='float64')
          nodes.append(np.vstack([meshx, np.full_like(meshx, yloc)]).T)
      nodes = np.vstack(nodes)
      assert nodes.shape[0] == nnode
      blk.ndcrd[:,:] = nodes
      assert (blk.ndcrd == nodes).all()
  • 26. Expose memory buffer ❖ Numpy arrays provide the most common construct: a contiguous memory buffer, and tons of code ❖ N-dimensional arrays (ndarray) ❖ There are variants, but less useful in C++: masked array, sparse matrices, etc.
    Internal buffer:
      class Buffer: public std::enable_shared_from_this<Buffer> {
      private:
        size_t m_length = 0;
        char * m_data = nullptr;
        struct ctor_passkey {};
      public:
        Buffer(size_t length, const ctor_passkey &) : m_length(length) {
          m_data = new char[length]();
        }
        static std::shared_ptr<Buffer> construct(size_t length) {
          return std::make_shared<Buffer>(length, ctor_passkey());
        }
        ~Buffer() {
          if (nullptr != m_data) {
            delete[] m_data;
            m_data = nullptr;
          }
        }
        /** Backdoor */
        template< typename T >
        T * data() const { return reinterpret_cast<T*>(m_data); }
      };
    Expose the buffer as ndarray:
      py::array from(array_flavor flavor) {
        // ndarray shape and stride
        npy_intp shape[m_table.ndim()];
        std::copy(m_table.dims().begin(), m_table.dims().end(), shape);
        npy_intp strides[m_table.ndim()];
        strides[m_table.ndim()-1] = m_table.elsize();
        for (ssize_t it = m_table.ndim()-2; it >= 0; --it) {
          strides[it] = shape[it+1] * strides[it+1];
        }
        // create ndarray
        void * data = m_table.data();
        py::object tmp = py::reinterpret_steal<py::object>(
          PyArray_NewFromDescr(
            &PyArray_Type, PyArray_DescrFromType(m_table.datatypeid()),
            m_table.ndim(), shape, strides, data, NPY_ARRAY_WRITEABLE, nullptr));
        // link lifecycle to the underneath buffer
        py::object buffer = py::cast(m_table.buffer());
        py::array ret;
        if (PyArray_SetBaseObject((PyArrayObject *)tmp.ptr(), buffer.inc_ref().ptr()) == 0) {
          ret = tmp;
        }
        return ret;
      }
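The stride loop in `from()` is the usual row-major rule: the last axis steps by one element, and each earlier axis steps by the next axis's extent times the next axis's stride. A small Python transcription follows, assuming at least one dimension; `row_major_strides` and `byte_offset` are hypothetical helper names, not part of the talk's code.

```python
def row_major_strides(shape, elsize):
    """Byte strides of a C-contiguous array (shape must have >= 1 dimension)."""
    strides = [0] * len(shape)
    strides[-1] = elsize
    # Walk from the second-to-last axis back to the first, as the C++ loop does.
    for it in range(len(shape) - 2, -1, -1):
        strides[it] = shape[it + 1] * strides[it + 1]
    return tuple(strides)


def byte_offset(index, strides):
    """Offset in bytes of the element at a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


strides = row_major_strides((3, 4, 5), 8)
print(strides)                          # (160, 40, 8)
print(byte_offset((1, 2, 3), strides))  # 264
```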
  • 27. Define your meta data ❖ Free to define how the memory is used
      class LookupTableCore {
      private:
        std::shared_ptr<Buffer> m_buffer;
        std::vector<index_type> m_dims;
        index_type m_nghost = 0;
        index_type m_nbody = 0;
        index_type m_ncolumn = 0;
        index_type m_elsize = 1; ///< Element size in bytes.
        DataTypeId m_datatypeid = MH_INT8;
      public:
        index_type ndim() const { return m_dims.size(); }
        index_type nghost() const { return m_nghost; }
        index_type nbody() const { return m_nbody; }
        index_type nfull() const { return m_nghost + m_nbody; }
        index_type ncolumn() const { return m_ncolumn; }
        index_type nelem() const { return nfull() * ncolumn(); }
        index_type elsize() const { return m_elsize; }
        DataTypeId datatypeid() const { return m_datatypeid; }
        size_t nbyte() const { return buffer()->nbyte(); }
      };
    [Figure: table layout marking index 0, the body part, and the ghost part]
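One way such metadata might be used is sketched below. The slide only names `nghost`, `nbody`, and `nfull` and shows a figure with index 0 between a ghost part and a body part. The convention that ghost entries sit at negative indices in front of the body is an assumption of this sketch, and `GhostTable` is a hypothetical toy class, not the talk's `LookupTableCore`.

```python
class GhostTable:
    """Toy 1-D table. ASSUMPTION: ghost entries live at indices -nghost..-1,
    body entries at 0..nbody-1, all in one contiguous storage."""

    def __init__(self, nghost, nbody, fill=0):
        self.nghost = nghost
        self.nbody = nbody
        self._data = [fill] * (nghost + nbody)

    @property
    def nfull(self):
        return self.nghost + self.nbody

    def _offset(self, index):
        # Shift the logical index so that the first ghost entry maps to slot 0.
        if not -self.nghost <= index < self.nbody:
            raise IndexError(f"index {index} out of range")
        return index + self.nghost

    def __getitem__(self, index):
        return self._data[self._offset(index)]

    def __setitem__(self, index, value):
        self._data[self._offset(index)] = value


table = GhostTable(nghost=2, nbody=3)
table[-2] = "ghost"
table[0] = "first body"
print(table.nfull, table[-2], table[0])
```

Under this assumption the metadata alone decides how a flat buffer is addressed, which is the freedom the slide refers to.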
  • 28. Organize arrays ❖ LookupTable is a class template providing static information for the dynamic array core ❖ Now we can put together a class that keeps track of all data for computation
      template< size_t NDIM >
      class UnstructuredBlock {
      private:
        // geometry arrays.
        LookupTable<real_type, NDIM> m_ndcrd;
        LookupTable<real_type, NDIM> m_fccnd;
        LookupTable<real_type, NDIM> m_fcnml;
        LookupTable<real_type, 0> m_fcara;
        LookupTable<real_type, NDIM> m_clcnd;
        LookupTable<real_type, 0> m_clvol;
        // meta arrays.
        LookupTable<shape_type, 0> m_fctpn;
        LookupTable<shape_type, 0> m_cltpn;
        LookupTable<index_type, 0> m_clgrp;
        // connectivity arrays.
        LookupTable<index_type, FCMND+1> m_fcnds;
        LookupTable<index_type, FCNCL > m_fccls;
        LookupTable<index_type, CLMND+1> m_clnds;
        LookupTable<index_type, CLMFC+1> m_clfcs;
        // boundary information.
        LookupTable<index_type, 2> m_bndfcs;
        std::vector<BoundaryData> m_bndvec;
      };
    (This case is for unstructured meshes of mixed elements in 2-/3-dimensional Euclidean space)
  • 29. Fast and hideous ❖ In theory we can write beautiful and fast code in C++, and we should. ❖ In practice, as long as it’s fast, it’s not too hard to compromise on elegance. ❖ Testability is the bottom line.
      const index_type * pclfcs = reinterpret_cast<const index_type *>(clfcs().row(0));
      prcells = reinterpret_cast<index_type *>(rcells.row(0));
      for (icl=0; icl<ncell(); icl++) {
        for (ifl=1; ifl<=pclfcs[0]; ifl++) {
          ifl1 = ifl-1;
          ifc = pclfcs[ifl];
          const index_type * pfccls = reinterpret_cast<const index_type *>(fccls().row(0)) + ifc*FCREL;
          if (ifc == -1) { // NOT A FACE!? SHOULDN'T HAPPEN.
            prcells[ifl1] = -1;
            continue;
          } else if (pfccls[0] == icl) {
            if (pfccls[2] != -1) { // has neighboring block.
              prcells[ifl1] = -1;
            } else { // is interior.
              prcells[ifl1] = pfccls[1];
            };
          } else if (pfccls[1] == icl) { // I am the neighboring cell.
            prcells[ifl1] = pfccls[0];
          };
          // count rcell number.
          if (prcells[ifl1] >= 0) {
            rcellno[icl] += 1;
          } else {
            prcells[ifl1] = -1;
          };
        };
        // advance pointers.
        pclfcs += CLMFC+1;
        prcells += CLMFC;
      };
    (This looks like C since it really was C.)
  • 30. Final notes ❖ Avoid Python when you need speed; use it as a shell to your high-performance library from day one ❖ Resource management is at the core of the hybrid architecture; do it in C++ ❖ Use arrays (look-up tables) to keep large data ❖ Don’t access PyObject from your core ❖ Always keep in mind the differences between the two type systems
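One concrete difference between the type systems is integer width: a Python `int` has arbitrary precision, while C++ integers have a fixed number of bits. The sketch below is an illustration that is not in the slides. It takes the sum from slide 23 (computed by its closed form) and passes it through `ctypes` fixed-width types, which truncate silently instead of raising.

```python
import ctypes

n = 100_000_000
total = n * (n - 1) // 2       # sum of 0..n-1 as an arbitrary-precision Python int

# ctypes integer types perform no overflow checking: the value is truncated
# to the width of the C type, much as a narrowing conversion would do in C++.
as_int64 = ctypes.c_int64(total).value
as_int32 = ctypes.c_int32(total).value

print(total, as_int64, as_int32)
print("fits in 64 bits:", as_int64 == total)
print("fits in 32 bits:", as_int32 == total)
```

The 64-bit value round-trips while the 32-bit one does not, so the width of the accumulator on the C++ side has to be chosen with the data range in mind.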