Building binary extensions is easier than ever thanks to several key libraries. Pybind11 provides a natural C++ language for extensions without requiring pre-processing or special dependencies. Scikit-build ties the premier C++ build system, CMake, into the Python extension build process. And cibuildwheel makes it easy to build highly compatible wheels for over 80 different platforms using CI or on your local machine. We will look at advancements to all three libraries over the last year, as well as future plans.
The document discusses PyTorch tensor internals. It describes how tensors are the fundamental data structure in PyTorch and are multi-dimensional arrays that contain elements of a single data type. It also discusses how PyTorch tensors integrate C++ and Python code, with tensors built on the ATen library. The document covers how tensors can be constructed from NumPy arrays to avoid data copying using torch.from_numpy().
Implementation of multicast communication in internet
Individual hosts are configured as members of different multicast groups
One particular user may a member of many multicast groups
For a one multicast can be few members/nodes
IP Multicast group is identified by Class D address (224.0.0.0 – 239.255.255.255)
Every IP datagram send to a multicast group is transferred to all members of group
An overview of how to consume 3rd party C++ libraries with CMake.
Methods covered include: find_package, pkg-config and writing a custom CMake Find Module.
IPv6 Segment Routing is a major IPv6 extension that provides a modern version of source routing that is currently being developed within the Internet Engineering Task Force (IETF). We propose the first open-source implementation of IPv6 Segment Routing in the Linux kernel. We first describe it in details and explain how it can be used on both endhosts and routers. We then evaluate and compare its performance with plain IPv6 packet forwarding in a lab environment. Our measurements indicate that the performance penalty of inserting IPv6 Segment Routing Headers or encapsulat- ing packets is limited to less than 15%. On the other hand, the optional HMAC security feature of IPv6 Segment Routing is costly in a pure software implementation. Since our implementation has been included in the official Linux 4.10 kernel, we expect that it will be extended by other researchers for new use cases.
Presented at ANRW'17 https://irtf.org/anrw/2017/program.html on behalf of David Lebrun
The document discusses various methods for hyperparameter optimization, including random search, Bayesian optimization, Hyperband, and cyclical learning rates. Hyperband extends successive halving to more efficiently allocate resources to promising hyperparameter configurations. It was shown to outperform random search and Bayesian optimization in experiments tuning a CNN on image classification tasks. Keras Tuner and PyTorch implementations provide APIs for hyperparameter search including Hyperband.
Faster R-CNN improves object detection by introducing a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. The RPN slides over feature maps and predicts object bounds and objectness at each position. During training, anchors are assigned positive or negative labels based on Intersection over Union with ground truth boxes. Faster R-CNN runs the RPN in parallel with Fast R-CNN for detection, end-to-end in a single network and stage. This achieves state-of-the-art object detection speed and accuracy while eliminating computationally expensive selective search for proposals.
Gradient Steepest method application on Griewank Function Imane Haf
The document discusses applying the gradient method to minimize the Griewank test function. It outlines the gradient method algorithm, shows simulation results for different starting points that converge in few iterations, and improvements made to ensure the algorithm finds the global minimum regardless of starting point by continuing to search after reaching local minima. The conclusions state the gradient method performs well locally but extensions were needed to locate the true global minimum for the Griewank function.
The document discusses PyTorch tensor internals. It describes how tensors are the fundamental data structure in PyTorch and are multi-dimensional arrays that contain elements of a single data type. It also discusses how PyTorch tensors integrate C++ and Python code, with tensors built on the ATen library. The document covers how tensors can be constructed from NumPy arrays to avoid data copying using torch.from_numpy().
Implementation of multicast communication in internet
Individual hosts are configured as members of different multicast groups
One particular user may a member of many multicast groups
For a one multicast can be few members/nodes
IP Multicast group is identified by Class D address (224.0.0.0 – 239.255.255.255)
Every IP datagram send to a multicast group is transferred to all members of group
An overview of how to consume 3rd party C++ libraries with CMake.
Methods covered include: find_package, pkg-config and writing a custom CMake Find Module.
IPv6 Segment Routing is a major IPv6 extension that provides a modern version of source routing that is currently being developed within the Internet Engineering Task Force (IETF). We propose the first open-source implementation of IPv6 Segment Routing in the Linux kernel. We first describe it in details and explain how it can be used on both endhosts and routers. We then evaluate and compare its performance with plain IPv6 packet forwarding in a lab environment. Our measurements indicate that the performance penalty of inserting IPv6 Segment Routing Headers or encapsulat- ing packets is limited to less than 15%. On the other hand, the optional HMAC security feature of IPv6 Segment Routing is costly in a pure software implementation. Since our implementation has been included in the official Linux 4.10 kernel, we expect that it will be extended by other researchers for new use cases.
Presented at ANRW'17 https://irtf.org/anrw/2017/program.html on behalf of David Lebrun
The document discusses various methods for hyperparameter optimization, including random search, Bayesian optimization, Hyperband, and cyclical learning rates. Hyperband extends successive halving to more efficiently allocate resources to promising hyperparameter configurations. It was shown to outperform random search and Bayesian optimization in experiments tuning a CNN on image classification tasks. Keras Tuner and PyTorch implementations provide APIs for hyperparameter search including Hyperband.
Faster R-CNN improves object detection by introducing a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. The RPN slides over feature maps and predicts object bounds and objectness at each position. During training, anchors are assigned positive or negative labels based on Intersection over Union with ground truth boxes. Faster R-CNN runs the RPN in parallel with Fast R-CNN for detection, end-to-end in a single network and stage. This achieves state-of-the-art object detection speed and accuracy while eliminating computationally expensive selective search for proposals.
Gradient Steepest method application on Griewank Function Imane Haf
The document discusses applying the gradient method to minimize the Griewank test function. It outlines the gradient method algorithm, shows simulation results for different starting points that converge in few iterations, and improvements made to ensure the algorithm finds the global minimum regardless of starting point by continuing to search after reaching local minima. The conclusions state the gradient method performs well locally but extensions were needed to locate the true global minimum for the Griewank function.
codecentric AG: CQRS and Event Sourcing Applications with CassandraDataStax Academy
CQRS (Command Query Responsibility Segregation) is a pattern, which separates the process of querying and updating data. As a query only returns data without any side effects, a command is designed to change data. CQRS is often combined with Event Sourcing. This is an architecture in which all changes to an application state are stored as a sequence of events.
Because of its great capability to store time series data Cassandra is the perfect fit for implementing the event store. But there a still a lot of open questions: What about the data modeling? What techniques will be used to process and store data in the Cassandra database? How to access the current state of the application, without replaying every event? And what about failure handling?
In this talk, I will give a brief introduction to CQRS and the Event Sourcing pattern and will then answer the questions above using a real life example of a data store for customer data.
This document discusses techniques for mitigating distributed denial of service (DDoS) attacks, including remotely triggered black hole filtering (RTBH) and BGP FlowSpec. It provides an overview of DDoS attack trends, types, and impacts. It also introduces the open-source FastNetMon tool for DDoS detection using network telemetry and introducing mitigation actions like flow blocking through integration with tools like ExaBGP.
Matrix factorization techniques can be used to address some of the limitations of traditional collaborative filtering approaches for recommender systems. Matrix factorization decomposes the user-item rating matrix into the product of two lower-dimensional matrices, one representing latent factors for users and the other for items. This reduced dimensionality addresses data sparsity and scalability issues. Specifically, singular value decomposition is often used to perform this matrix factorization, which can approximate the original rating matrix while ignoring less important singular values and factor vectors. The decomposed matrices can then be multiplied to predict unknown user ratings.
Slides by Amaia Salvador at the UPC Computer Vision Reading Group.
Source document on GDocs with clickable links:
https://docs.google.com/presentation/d/1jDTyKTNfZBfMl8OHANZJaYxsXTqGCHMVeMeBe5o1EL0/edit?usp=sharing
Based on the original work:
Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In Advances in Neural Information Processing Systems, pp. 91-99. 2015.
MongoDB World 2019: MongoDB Read Isolation: Making Your Reads Clean, Committe...MongoDB
Isolation, the I in ACID, determines how/when the changes made by one operation become visible to another. Relational databases provide four isolation levels (uncommitted, committed, repeatable reads, and serialiable) to enable the trade off of performance versus the level of cross operation change visibility. In contrast, MongoDB’s isolation levels are controlled by using readConcerns and transactions. This talk will describe how the relational isolation levels compare to MongoDB’s isolation guarantees, how you configure MongoDB to provide the desired isolation level, and the performance implications.
Git is a distributed version control system designed to be efficient, support non-linear development (e.g. branching and merging), and work well on large projects with many developers collaborating simultaneously on the same files. It stores content in a data model of immutable objects (blobs, trees, commits) accessed through references like branches and tags. This allows it to efficiently track file history and versions across branches without duplicating data.
Replacing iptables with eBPF in Kubernetes with CiliumMichal Rostecki
Cilium is an open source project which provides networking, security and load balancing for application services that are deployed using Linux container technologies by using the native eBPF technology in the Linux kernel. In this presentation we talked about:
- The evolution of the BPF filters and explained the advantages of eBPF Filters and its use cases today in Linux especially on how Cilium networking utilizes the eBPF Filters to secure the Kubernetes workload with increased performance when compared to legacy iptables.
- How Cilium uses SOCKMAP for layer 7 policy enforcement - How Cilium integrates with Istio and handles L7 Network Policies with Envoy Proxies.
- The new features since the last release such as running Kubernetes cluster without kube-proxy, providing clusterwide NetworkPolicies, providing fully distributed networking and security observability platform for cloud native workloads etc.
This document provides an overview of fuzzy logic and fuzzy sets. It defines key concepts such as membership functions, operations on fuzzy sets like intersection and union, properties of fuzzy sets including equality and inclusion, and alpha cuts. It also discusses fuzzy rules and how they differ from classical rules by allowing partial truth values. Examples are provided to illustrate fuzzy set concepts and operations. The document is intended as lecture material on fuzzy logic for a course on artificial intelligence and computer science.
http://imatge-upc.github.io/telecombcn-2016-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
BGP communities allow networks to attach additional routing information and instructions to BGP routes. They are defined as 32-bit integers that can be used to tag routes with information like the source of the route or to trigger actions like changing route attributes. Common uses of BGP communities include controlling route exports, influencing attributes like local preference, and providing informational tags about factors like geography.
This document provides an overview of geospatial analytics in Spark. It discusses the challenges of geospatial analysis including projections, indexing, data curation, and system libraries. It then presents case studies on large-scale geospatial joins, spatial disaggregation, and pattern of life analysis. Live demos are shown for each case study. Key lessons learned are to standardize data formats, leverage datalakes, use domain-driven design, test scaling, and leverage existing work from others.
This document discusses using a loopback interface as the update source for BGP sessions. It explains that when there are multiple paths between BGP neighbors, using a loopback interface ensures the BGP session will not go down if the physical interface fails. It provides the configuration to enable this by specifying the loopback interface in the neighbor update-source command. An example topology is shown connecting routers with EIGRP and configuring BGP between the routers using a loopback interface as the update source.
Tutorial: Using GoBGP as an IXP connecting routerShu Sugimoto
- Show you how GoBGP can be used as a software router in conjunction with quagga
- (Tutorial) Walk through the setup of IXP connecting router using GoBGP
Lightweight DNN Processor Design (based on NVDLA)Shien-Chun Luo
https://sites.google.com/view/itri-icl-dla/
(Public Information Share) This is our lightweight DNN inference processor presentation, including a system solution (from Caffe prototxt to HW controls files), hardware features, and an example of object detection (Tiny YOLO) RTL simulation results. We modified open-source NVDLA, small configuration, and developed a RISC-V MCU in this accelerating system.
OpenVPN as a WAN - pfSense Hangout October 2016Netgate
This document summarizes an OpenVPN hangout discussing using OpenVPN as a WAN connection. It covers why to use OpenVPN as a WAN, how to set up an OpenVPN client, assigning the client as an interface, outbound NAT, firewall rules, failover scenarios, policy routing, preventing DNS leaks, and hosting your own OpenVPN server. The document provides guidance on obtaining connection requirements from VPN providers, importing certificates, configuring clients and servers, and other implementation details.
This document lists and describes various JavaScript methods, functions, objects, and concepts including:
- Common methods for built-in objects like String, Array, Number, Date, and Regular Expressions
- DOM methods for manipulating the document object model like getElementById()
- Event handlers for responding to browser events
- Regular expression patterns and modifiers
- Built-in functions like eval(), parseInt(), and setTimeout()
- Using the XMLHttpRequest object to make AJAX requests
- Embedding JavaScript in HTML pages
The document summarizes Henry Schreiner's work on several Python and C++ scientific computing projects. It describes a scientific Python development guide built from the Scikit-HEP summit. It also outlines Henry's work on pybind11 for C++ bindings, scikit-build for building extensions, cibuildwheel for building wheels on CI, and several other related projects.
codecentric AG: CQRS and Event Sourcing Applications with CassandraDataStax Academy
CQRS (Command Query Responsibility Segregation) is a pattern, which separates the process of querying and updating data. As a query only returns data without any side effects, a command is designed to change data. CQRS is often combined with Event Sourcing. This is an architecture in which all changes to an application state are stored as a sequence of events.
Because of its great capability to store time series data Cassandra is the perfect fit for implementing the event store. But there a still a lot of open questions: What about the data modeling? What techniques will be used to process and store data in the Cassandra database? How to access the current state of the application, without replaying every event? And what about failure handling?
In this talk, I will give a brief introduction to CQRS and the Event Sourcing pattern and will then answer the questions above using a real life example of a data store for customer data.
This document discusses techniques for mitigating distributed denial of service (DDoS) attacks, including remotely triggered black hole filtering (RTBH) and BGP FlowSpec. It provides an overview of DDoS attack trends, types, and impacts. It also introduces the open-source FastNetMon tool for DDoS detection using network telemetry and introducing mitigation actions like flow blocking through integration with tools like ExaBGP.
Matrix factorization techniques can be used to address some of the limitations of traditional collaborative filtering approaches for recommender systems. Matrix factorization decomposes the user-item rating matrix into the product of two lower-dimensional matrices, one representing latent factors for users and the other for items. This reduced dimensionality addresses data sparsity and scalability issues. Specifically, singular value decomposition is often used to perform this matrix factorization, which can approximate the original rating matrix while ignoring less important singular values and factor vectors. The decomposed matrices can then be multiplied to predict unknown user ratings.
Slides by Amaia Salvador at the UPC Computer Vision Reading Group.
Source document on GDocs with clickable links:
https://docs.google.com/presentation/d/1jDTyKTNfZBfMl8OHANZJaYxsXTqGCHMVeMeBe5o1EL0/edit?usp=sharing
Based on the original work:
Ren, Shaoqing, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In Advances in Neural Information Processing Systems, pp. 91-99. 2015.
MongoDB World 2019: MongoDB Read Isolation: Making Your Reads Clean, Committe...MongoDB
Isolation, the I in ACID, determines how/when the changes made by one operation become visible to another. Relational databases provide four isolation levels (uncommitted, committed, repeatable reads, and serialiable) to enable the trade off of performance versus the level of cross operation change visibility. In contrast, MongoDB’s isolation levels are controlled by using readConcerns and transactions. This talk will describe how the relational isolation levels compare to MongoDB’s isolation guarantees, how you configure MongoDB to provide the desired isolation level, and the performance implications.
Git is a distributed version control system designed to be efficient, support non-linear development (e.g. branching and merging), and work well on large projects with many developers collaborating simultaneously on the same files. It stores content in a data model of immutable objects (blobs, trees, commits) accessed through references like branches and tags. This allows it to efficiently track file history and versions across branches without duplicating data.
Replacing iptables with eBPF in Kubernetes with CiliumMichal Rostecki
Cilium is an open source project which provides networking, security and load balancing for application services that are deployed using Linux container technologies by using the native eBPF technology in the Linux kernel. In this presentation we talked about:
- The evolution of the BPF filters and explained the advantages of eBPF Filters and its use cases today in Linux especially on how Cilium networking utilizes the eBPF Filters to secure the Kubernetes workload with increased performance when compared to legacy iptables.
- How Cilium uses SOCKMAP for layer 7 policy enforcement - How Cilium integrates with Istio and handles L7 Network Policies with Envoy Proxies.
- The new features since the last release such as running Kubernetes cluster without kube-proxy, providing clusterwide NetworkPolicies, providing fully distributed networking and security observability platform for cloud native workloads etc.
This document provides an overview of fuzzy logic and fuzzy sets. It defines key concepts such as membership functions, operations on fuzzy sets like intersection and union, properties of fuzzy sets including equality and inclusion, and alpha cuts. It also discusses fuzzy rules and how they differ from classical rules by allowing partial truth values. Examples are provided to illustrate fuzzy set concepts and operations. The document is intended as lecture material on fuzzy logic for a course on artificial intelligence and computer science.
http://imatge-upc.github.io/telecombcn-2016-dlcv/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.
BGP communities allow networks to attach additional routing information and instructions to BGP routes. They are defined as 32-bit integers that can be used to tag routes with information like the source of the route or to trigger actions like changing route attributes. Common uses of BGP communities include controlling route exports, influencing attributes like local preference, and providing informational tags about factors like geography.
This document provides an overview of geospatial analytics in Spark. It discusses the challenges of geospatial analysis including projections, indexing, data curation, and system libraries. It then presents case studies on large-scale geospatial joins, spatial disaggregation, and pattern of life analysis. Live demos are shown for each case study. Key lessons learned are to standardize data formats, leverage datalakes, use domain-driven design, test scaling, and leverage existing work from others.
This document discusses using a loopback interface as the update source for BGP sessions. It explains that when there are multiple paths between BGP neighbors, using a loopback interface ensures the BGP session will not go down if the physical interface fails. It provides the configuration to enable this by specifying the loopback interface in the neighbor update-source command. An example topology is shown connecting routers with EIGRP and configuring BGP between the routers using a loopback interface as the update source.
Tutorial: Using GoBGP as an IXP connecting routerShu Sugimoto
- Show you how GoBGP can be used as a software router in conjunction with quagga
- (Tutorial) Walk through the setup of IXP connecting router using GoBGP
Lightweight DNN Processor Design (based on NVDLA)Shien-Chun Luo
https://sites.google.com/view/itri-icl-dla/
(Public Information Share) This is our lightweight DNN inference processor presentation, including a system solution (from Caffe prototxt to HW controls files), hardware features, and an example of object detection (Tiny YOLO) RTL simulation results. We modified open-source NVDLA, small configuration, and developed a RISC-V MCU in this accelerating system.
OpenVPN as a WAN - pfSense Hangout October 2016Netgate
This document summarizes an OpenVPN hangout discussing using OpenVPN as a WAN connection. It covers why to use OpenVPN as a WAN, how to set up an OpenVPN client, assigning the client as an interface, outbound NAT, firewall rules, failover scenarios, policy routing, preventing DNS leaks, and hosting your own OpenVPN server. The document provides guidance on obtaining connection requirements from VPN providers, importing certificates, configuring clients and servers, and other implementation details.
This document lists and describes various JavaScript methods, functions, objects, and concepts including:
- Common methods for built-in objects like String, Array, Number, Date, and Regular Expressions
- DOM methods for manipulating the document object model like getElementById()
- Event handlers for responding to browser events
- Regular expression patterns and modifiers
- Built-in functions like eval(), parseInt(), and setTimeout()
- Using the XMLHttpRequest object to make AJAX requests
- Embedding JavaScript in HTML pages
The document summarizes Henry Schreiner's work on several Python and C++ scientific computing projects. It describes a scientific Python development guide built from the Scikit-HEP summit. It also outlines Henry's work on pybind11 for C++ bindings, scikit-build for building extensions, cibuildwheel for building wheels on CI, and several other related projects.
Modern binary build systems have made shipping binary packages for Python much easier than ever before. This talk discusses three of the most popular build systems for Python packages using the new standards developed for packaging.
Digital RSE: automated code quality checks - RSE group meetingHenry Schreiner
Given at a local RSE group meeting. Covers code quality practices, focusing on Python but over multiple languages, with useful tools highlighted throughout.
Talk at PyCon2022 over building binary packages for Python. Covers an overview and an in-depth look into pybind11 for binding, scikit-build for creating the build, and build & cibuildwheel for making the binaries that can be distributed on PyPI.
Flake8 is a Python linter that is fast, simple, and extensible. It can be configured through setup.cfg or .flake8 files to ignore certain checks or select others. The summary recommends using the flake8-bugbear plugin and avoiding all print statements with flake8-print. Linters like Flake8 help find errors, improve code quality, and avoid historical baggage, but one does not need every check and it is okay to build a long ignore list.
This document discusses software quality assurance tooling, focusing on pre-commit. It introduces pre-commit as a tool for running code quality checks before code is committed. Pre-commit allows configuring hooks that run checks and fixers on files matching certain patterns. Hooks can be installed from repositories and support many languages including Python. The document provides examples of pre-commit checks such as disallowing improper capitalization in code comments and files. It also discusses how to configure, run, update and install pre-commit hooks.
This document discusses setting up a CI/CD pipeline using GitHub Actions. It begins with an introduction to CI/CD pipelines and their importance. It then provides an overview of GitHub Actions and how they can be used to automate builds, tests, releases and deployments. The document demonstrates a sample GitHub Actions workflow file and explains its key components like jobs, steps and actions. It also covers topics like workflow events, jobs and steps/actions that can be used in GitHub Actions.
Package a PyApp as a Flatpak Package: An HTTP Server for Example @ PyCon APAC...Jian-Hong Pan
Flatpak is a framework for distributing desktop applications and supported by most of Linux distributions. This talk shares how to package a HTTP server written in Python as a Flatpak app. And, runs it like a desktop application by launching a browser connecting to the server automatically.
https://hackmd.io/@starnight/Have_an_HTTP_Server_in_Flatpak
This document outlines steps for contributing to an open source project called coala, including forking and cloning a repository, making commits, and creating pull requests. It discusses workflows for git, coding, code review, and rebasing. The document also provides an overview of coala and asks for feedback on the workshop.
This document discusses the history and development of Python packages for high energy physics (HEP) analysis. It describes how experiments initially used ROOT and C++, but Python gained popularity for configuration and analysis. This led to the creation of packages like Scikit-HEP, Uproot, and Awkward Array to bridge the gap between ROOT files and the Python data science stack. Scikit-HEP grew to include many related packages and provides best practices through its developer pages. The future may include adopting Scikit-build for building Python packages with C/C++ extensions and running packages in the browser via WebAssembly.
[GSoC 2017] gopy: Updating gopy to support Python3 and PyPyDong-hee Na
gopy is an excellent tool which generates (and compiles) a CPython extension module from a go package. And I hope more developers could make full use of gopy to migrate their go code into python code. To make gopy more advanced, It is necessary to provide APIs for various Python compiler versions, such as CPython 2/3 and PyPy. This can be improved with CFFI or ctypes. Moreover, many go’s implementations/features are not yet implemented in gopy. So we need to implement implementations such as slices, interfaces, and maps in the go.
My goal is to update gopy by using CFFI to support Python3 and PyPy and write detailed documents
Take your CI to the next level! Learn how to optimize your pipelines for faster and more efficient builds through parallelization, caching, failing early, and more.
carrow - Go bindings to Apache Arrow via C++-APIYoni Davidson
Apache Arrow is a cross-language development platform for in-memory data that specifies a standardized columnar memory format. It provides libraries and messaging for moving data between languages and services without serialization. The presenter discusses their motivation for creating Go bindings for Apache Arrow via C++ to share data between Go and Python programs using the same memory format. They explain several challenges of this approach, such as different memory managers in Go and C++, and solutions like generating wrapper code and handling memory with finalizers.
Pybind11 is a lightweight header-only library for binding C++ types to Python. It allows exposing C++ functions and classes to Python with minimal code. Compared to Boost.Python, pybind11 has simpler binding code, smaller binaries, and supports newer C++ features. The document provides examples of binding basic functions and classes to Python using pybind11.
Facebook is a company that operates at massive scale. In this talk we’ll talk about how we use Python at Facebook.
Be it building back-end services, fast prototyping, automation, scaling operations, or simply gluing together various pieces of our infrastructure, Python is at the heart of it and allows our engineers to quickly deliver working solutions.
We’ll talk about our review process, unit testing, deployment workflow and various open-source framework we use.
Euro python2011 High Performance PythonIan Ozsvald
I ran this as a 4 hour tutorial at EuroPython 2011 to teach High Performance Python coding.
Techniques covered include bottleneck analysis by profiling, bytecode analysis, converting to C using Cython and ShedSkin, use of the numerical numpy library and numexpr, multi-core and multi-machine parallelisation and using CUDA GPUs.
Write-up with 49 page PDF report: http://ianozsvald.com/2011/06/29/high-performance-python-tutorial-v0-1-from-my-4-hour-tutorial-at-europython-2011/
This document provides instructions for installing IBM Connections Component Pack 6.0.0.6 on a Kubernetes cluster. It describes how to set up the prerequisites like Docker and Kubernetes, initialize the Kubernetes master, join worker nodes, install Helm, set up a Docker registry, create persistent volumes, and install each Component Pack microservice using Helm charts. It also covers labeling nodes, pushing images, bootstrapping, and installing monitoring dashboards.
Similar to SciPy22 - Building binary extensions with pybind11, scikit build, and cibuildwheel (20)
The document describes various productivity tools for Python development, including:
- Pre-commit hooks to run checks before committing code
- Hot code reloading in Jupyter notebooks using the %load_ext and %autoreload magic commands
- Cookiecutter for generating project templates
- SSH configuration files and escape sequences for easier remote access
- Autojump to quickly navigate frequently visited directories
- Terminal tips like command history search and referencing the last argument
- Options for tracking Jupyter notebooks with git like stripping outputs or synchronizing notebooks and Python files.
PyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packagingHenry Schreiner
This was a PyCon 2022 lightning talk over the Scikit-HEP developer pages. It highlights best practices and guides shown there, and the quick package creation cookiecutter. And finally it demos the Pyodide WebAssembly app embedded into the Scikit-HEP developer pages!
This document provides best practices for using CMake, including:
- Set the cmake_minimum_required version to ensure modern features while maintaining backward compatibility.
- Use targets to define executables and libraries, their properties, and dependencies.
- Fetch remote dependencies at configure time using FetchContent or integrate with package managers like Conan.
- Import library targets rather than reimplementing Find modules when possible.
- Treat CUDA as a first-class language in CMake projects.
HOW 2019: Machine Learning for the Primary Vertex ReconstructionHenry Schreiner
The document describes a machine learning approach for primary vertex reconstruction in high-energy physics experiments. A hybrid method is proposed that uses a 1D convolutional neural network to analyze histograms produced from tracking data. The network is able to find primary vertices with high efficiency and tunable false positive rates, demonstrating the potential of machine learning for this task. Future work involves adding more tracking information and iterating between track association and vertex finding to improve performance.
HOW 2019: A complete reproducible ROOT environment in under 5 minutesHenry Schreiner
The document discusses setting up a ROOT environment using Conda in under 5 minutes. It describes downloading and installing Miniconda and then using Conda commands to create a new environment and install ROOT and its dependencies from the conda-forge channel. The ROOT package provides full ROOT functionality, including compilation and graphics, and supports Linux, macOS, and multiple Python versions.
ACAT 2019: A hybrid deep learning approach to vertexingHenry Schreiner
This document presents a hybrid deep learning approach for vertex finding in high-energy physics experiments. It uses a 1D convolutional neural network to analyze kernel density estimates of track information in order to identify primary vertex positions. The approach achieves primary vertex finding efficiencies of 88-94% with low false positive rates comparable to traditional algorithms. The authors demonstrate tuning of the efficiency-false positive rate tradeoff and discuss plans to improve performance by incorporating additional track information and iterative refinement.
2019 CtD: A hybrid deep learning approach to vertexingHenry Schreiner
This document presents a hybrid deep learning approach for vertex finding using 1D convolutional neural networks. It describes generating 1D kernel densities from tracking information, building target distributions, and using a CNN architecture with an adjustable cost function to optimize the false positive rate versus efficiency. The approach achieves 93.87% efficiency with a 0.251 false positive rate on test data. Future work includes incorporating additional xy information and exploring full 2D kernel densities.
2019 IRIS-HEP AS workshop: Boost-histogram and histHenry Schreiner
The document discusses the current state of histograms in Python and the need for a new histogramming library. It introduces boost-histogram, a C++ histogramming library, and its new Python bindings. The bindings aim to provide a fast, flexible and easily distributable histogram object for Python. Key features discussed include histogram design that treats it as a first-class object, fast filling via multi-threading, a variety of axis and storage types, and performance benchmarks showing it can be over 10x faster than NumPy for filling histograms. Distribution is focused on providing binary wheels for many platforms via continuous integration.
The document discusses the current state of histograms in Python and the need for a new library. It introduces boost-histogram, a C++ histogram library, and its new Python bindings. The bindings aim to provide a fast, flexible, and easily distributable histogram object for Python with support for multiple axis types and storage options. It also discusses plans for an additional wrapper library called hist for easy plotting and interfacing with other tools.
2019 IRIS-HEP AS workshop: Particles and decaysHenry Schreiner
The Scikit-HEP project aims to create an ecosystem for particle physics data analysis in Python. It includes packages like Particle and DecayLanguage that provide tools for working with particle data and decay descriptions. Particle allows users to easily access and search particle property data from sources like the PDG. DecayLanguage allows parsing decay file formats, representing and manipulating decay chains, and converting between decay model representations. Future work includes expanding particle ID support and improving visualization of decay trees.
The document discusses plans for the boost-histogram and hist Python libraries. Boost-histogram is a multidimensional histogram library inspired by ROOT that provides flexibility through many axis and storage types. Hist will provide plotting and analysis functionality by interfacing with libraries like mpl-hep. Future plans include improved indexing, slicing, and NumPy conversions for boost-histogram as well as statistical functions, serialization, and integration with fitters for hist.
The document discusses the new features of Python 3.8, which was recently released. Some key updates include positional-only arguments, the walrus operator for variable assignment, improved static typing support, and performance enhancements. The document also notes additional developer changes and provides resources for obtaining Python 3.8.
This document provides an overview of histograms and various histogram libraries. It introduces boost-histogram, a C++ histogram library that is fast and header-only. It then describes the new Python bindings for boost-histogram, which are designed to be fast and easy to use while resembling the C++ version. Finally, it outlines plans for additional Python histogram tools like hist, Aghast, and Unified Histogram Indexing to integrate boost-histogram into the wider ecosystem.
2019 IML workshop: A hybrid deep learning approach to vertexingHenry Schreiner
A hybrid deep learning approach is proposed for vertex finding using 1D convolutional neural networks on kernel density estimates from tracking data. The approach generates 1D histograms from 3D tracking data and uses a CNN to classify primary vertex positions. In a proof-of-concept on simulated data, it achieves primary vertex finding efficiencies and false positive rates comparable to traditional algorithms, with tunable efficiency-false positive tradeoffs. Future work includes incorporating additional tracking features, associating tracks to vertices, and deploying the inference engine for the LHCb trigger.
CHEP 2019: Recent developments in histogram librariesHenry Schreiner
This document discusses recent developments in Python histogram libraries. It describes Boost.Histogram, a C++ histogramming library that serves as the foundation for the boost-histogram Python package. Boost.Histogram provides fast, customizable histogram filling and manipulation. The document also outlines plans for hist, a Python analysis frontend, and aghast, a library for converting between histogram formats. Together, boost-histogram, hist, and aghast comprise the Scikit-HEP histogramming framework.
LHCb Computing Workshop 2018: PV finding with CNNsHenry Schreiner
The document discusses using a convolutional neural network (CNN) to quickly find primary vertices (PVs) in high-energy physics events recorded by the LHCb experiment. A prototype tracking algorithm is used to generate a 1D kernel density estimate (KDE) histogram from hit triplets. This histogram is then used to train a CNN to predict the locations of PVs. Initial results show the CNN approach can find PVs with 70-75% efficiency and a false positive rate of 0.08-0.13, outperforming current algorithms. Further work aims to improve resolution, find secondary vertices, and integrate the approach into iterative tracking.
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
UiPath Test Automation using UiPath Test Suite series, part 6DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Infrastructure Challenges in Scaling RAG with Custom AI modelsZilliz
Building Retrieval-Augmented Generation (RAG) systems with open-source and custom AI models is a complex task. This talk explores the challenges in productionizing RAG systems, including retrieval performance, response synthesis, and evaluation. We’ll discuss how to leverage open-source models like text embeddings, language models, and custom fine-tuned models to enhance RAG performance. Additionally, we’ll cover how BentoML can help orchestrate and scale these AI components efficiently, ensuring seamless deployment and management of RAG systems in the cloud.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceIndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
HCL Notes and Domino License Cost Reduction in the World of DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Speck&Tech
ABSTRACT: A prima vista, un mattoncino Lego e la backdoor XZ potrebbero avere in comune il fatto di essere entrambi blocchi di costruzione, o dipendenze di progetti creativi e software. La realtà è che un mattoncino Lego e il caso della backdoor XZ hanno molto di più di tutto ciò in comune.
Partecipate alla presentazione per immergervi in una storia di interoperabilità, standard e formati aperti, per poi discutere del ruolo importante che i contributori hanno in una comunità open source sostenibile.
BIO: Sostenitrice del software libero e dei formati standard e aperti. È stata un membro attivo dei progetti Fedora e openSUSE e ha co-fondato l'Associazione LibreItalia dove è stata coinvolta in diversi eventi, migrazioni e formazione relativi a LibreOffice. In precedenza ha lavorato a migrazioni e corsi di formazione su LibreOffice per diverse amministrazioni pubbliche e privati. Da gennaio 2020 lavora in SUSE come Software Release Engineer per Uyuni e SUSE Manager e quando non segue la sua passione per i computer e per Geeko coltiva la sua curiosità per l'astronomia (da cui deriva il suo nickname deneb_alpha).
Presentation of the OECD Artificial Intelligence Review of Germany
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuildwheel
1. Henry Schreiner
Princeton University
Joe Rickerby
Nord Projects
Ralf Grosse-Kunstleve
Google
Wenzel Jakob
Ecole Polytechnique Fédérale de Lausanne
Matthieu Darbois
PyPA
Aaron Gokaslan
Facebook Research
Jean-Christophe
Fillion-Robin
Kitware
Matt McCormick
Kitware
Building binary
extensions
with pybind11, scikit-build,
and cibuildwheel
Maintainers track
SciPy 2022 - Austin
July 15, 2022
2. Package: pybind11
2
Header-only pure C++11 CPython/PyPy interface
Trivial to add to a project
No special build requirements
No dependencies
No precompile phase
Not a new language
Think of it like the missing C++ API for CPython
Designed for one purpose!
3. Example of usage
3
#include <pybind11/pybind11.h>
int add(int i, int j) {
return i + j;
}
PYBIND11_MODULE(example, m) {
m.def("add", &add);
}
Standard include
Normal C++ to bind
Create a Python module
Signature statically inferred
Docs and parameter names optional
g++ -shared -fPIC example.cpp $(pipx run pybind11 --includes) -o example$(python3-config --extension-suffix)
Complete, working, no-install example (linux version)!
4. Many great features
4
Simple enough for tiny projects
714K code mentions of pybind11 on GitHub
Powerful enough for huge projects
SciPy, PyTorch, dozens of Google projects
Small binaries prioritized
Perfect for WebAssembly with Pyodide
Powerful object lifetime controls
py::keep_alive, std::shared_ptr, and more
NumPy support without NumPy headers
No need to lock minimum NumPy at build
Supports interop in both directions
Can even be used to embed Python in C++
Most STL containers and features supported
Including C++17 additions, like std::variant (or boost)
Vectorize methods or functions
py::vectorize, even on a lambda function
Trampoline classes and multiple inheritance
Complex C++ is supported
Complete control of the Python representation
Special methods, inheritance, pickling, and more
Bu
ff
er protocol and array classes
Includes Eigen support too
Cross-extension ABI
One extension can return a type wrapped in another one
5. New in 2.8
5
2.7 was released at last SciPy!
py::raise_from
Chaining exceptions
Local exception mapping
Custom rules for each module
Array improvements
View and reshape arrays
Custom __new__ support
Fixing old bug
Python builtins as callbacks
Better consistency
Embedded interpreter improvements
argv setting, __
fi
le__ set
py::slice None support
Using std::optional
Rerun CMake with di
ff
erent Python
Updates cache if stale
mingw support added
Tested in CI now too
py::custom_type_setup
Advanced GC tricks and more
6. New in 2.9
6
py::args can be followed
No longer required to be last
py::const_name replaces py::_
More readable, gettext clash removed
More built-in exception chaining
Using recent py::raise_from addition
py::multiple_inheritance detection
Only explicitly required if bases are hidden
Codebase formatted with clang-format
Using clang-format wheel & pre-commit
Better support for std::string_view
Interoperable with built-in types
Lots of new checks, small
fi
xes, better support for PyPy, CPython 3.11, etc.
7. New in 2.10
7
Removed Python 2.7 & 3.5 support
Removed 1K+ LoC, simpler, cleaner
Removed MSVC 2015 (& limited 2017)
Hard to
fi
nd & not very useful with 3.5 gone
New docs theme (Furo)
Cleaner, ToC for pages, dark-mode
Support std::monostate
std::variant as optional or non-constructible
Better exception handling
Easier without Python 2.7
py::any set and py::frozenset
Copy to set (like py::set)
py::capsule::set_name
Can manipulate later (DLPack)
NumPy type enhancements
More accessors and type number constructor
Python 3.11 beta support
First version working with all changes
CMake
fi
xes
Needed for Pyodide (web assembly) & VCPKG
8. New in 2.10
7
Removed Python 2.7 & 3.5 support
Removed 1K+ LoC, simpler, cleaner
Removed MSVC 2015 (& limited 2017)
Hard to
fi
nd & not very useful with 3.5 gone
New docs theme (Furo)
Cleaner, ToC for pages, dark-mode
Support std::monostate
std::variant as optional or non-constructible
Better exception handling
Easier without Python 2.7
py::any set and py::frozenset
Copy to set (like py::set)
py::capsule::set_name
Can manipulate later (DLPack)
NumPy type enhancements
More accessors and type number constructor
Python 3.11 beta support
First version working with all changes
CMake
fi
xes
Needed for Pyodide (web assembly) & VCPKG
2.10.0 releasing today
9. Upcoming plans
8
Refactors
Decoupling unit tests
Breaking into smaller, IWUY headers
https://github.com/pybind/pybind11/wiki/Roadmap
Optional Precompilation
Would dramatically speed up compile
Huge change, so being held back by other work
Merge the smart holder branch
Full interoperability with std::unique_ptr and std::shared_ptr
Opt-in with <pybind11/smart_holder.h>
Safe passing to C++, including trampolines
Further ideas (not promised yet)
Better MyPy integration (some minor work done)
Better Scikit-build integration
Upstreaming some improvements from nanobind (next slide)
10. Maintenance: Release notes
9
Somewhat interesting release note system
Pull requests get a template to
fi
ll out
<!--
Title (above): please place [branch_name] at the beginning if you are targeting a branch other than master.
*Do not target stable*. It is recommended to use conventional commit format, see conventionalcommits.org, but
not required.
-->
## Description
<!-- Include relevant issues or PRs here, describe what changed and why -->
## Suggested changelog entry:
<!-- Fill in the below block with the expected RestructuredText entry. Delete if no entry needed;
but do not delete header or rst block if an entry is needed! Will be collected via a script. -->
```rst
```
<!-- If the upgrade guide needs updating, note that here too -->
11. Maintenance: Release notes (2)
10
GHA bot tags PR’s after they get merged
name: Labeler
on:
pull_request_target:
types: [closed]
jobs:
label:
name: Labeler
runs-on: ubuntu-latest
steps:
- uses: actions/labeler@main
if: github.event.pull_request.merged == true
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
configuration-path: .github/labeler_merged.yml + some labeler con
fi
g
fi
les
You might need
permissions: contents: …
12. Maintenance: Release notes (3)
11
Script runner in nox
@nox.session(reuse_venv=True)
def make_changelog(session: nox.Session) -> None:
"""
Inspect the closed issues and make entries for a changelog.
"""
session.install("ghapi", "rich")
session.run("python", "tools/make_changelog.py")
Controls script dependencies
13. PRs can be updated,
fi
nal polish on copy/paste
Labels manually removed (in bulk)
14. Bonus package: nanobind
13
C++17+ & Python 3.8+ only
Similar API to pybind11
Intentionally more limited than pybind11
Focus on small, e
ffi
cient bindings
Some ideas can be backported to pybind11
https://github.com/wjakob/nanobind
15. Bonus package: build
14
SDist and wheel builder
pipx run build Builds SDist from source, wheel from SDist
pipx run build --sdist --wheel Builds SDist and wheel from source
New features this year Planned
Prettier printout
Better error messages
Python 3.11 support
Use Flit (better bootstrapping)
Redesign environment support
Threadsafe API
.github/work
fl
ows/wheels.yml
on: [push, pull_request]
jobs:
build_wheels:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build wheels
run: pipx run build
- uses: actions/upload-artifact@v3
with:
path: ./wheelhouse/*.whl
16. Package: 🎡 cibuildwheel
15
Multiplatform redistributable wheel builder .github/work
fl
ows/wheels.yml
on: [push, pull_request]
jobs:
build_wheels:
runs-on: ${{ matrix.os }}
strategy:
matrix:
os:
- ubuntu-22.04
- windows-2022
- macos-11
steps:
- uses: actions/checkout@v4
- name: Build wheels
uses: pypa/cibuildwheel@v2.8.0
- uses: actions/upload-artifact@v3
with:
path: ./wheelhouse/*.whl
Supports all major CI providers
GitHub Action provided too!
Can run locally, on servers, etc.
2.0 was released at last SciPy!
Supports:
Targeting macOS 10.9+
Apple Silicon cross-compiling 3.8+
All variants of manylinux (including emulation)
musllinux
PyPy 3.7-3.9
Repairing and testing wheels
Reproducible pinned defaults (can unpin)
A
ffi
liated (shared maintainer) with manylinux
Close collaboration with PyPy devs
Joined the PyPA in 2021
Used by matplotlib, mypy, scikit-learn, and more
18. Cibuildwheel: what’s new
17
New in cibuildwheel 2.1-2.4
Local Windows & MacOS runs
TOML overrides array
manylinux2014 default
musllinux
Environment variable passthrough
Experimental Windows ARM
New in cibuildwheel 2.5-2.8
ABI3 wheel support
Build from SDist
tomllib on Python 3.11 (host)
CPython 3.11 beta support
manylinux_2_28 support (RHEL 8)
Linux builds from Windows
Podman support
Support for Pythonless wheels
In Development
New setup-python environment
Python 3.6 removed as host
Python 3.11b4 in progress as of today
19. Cibuildwheel: what might be
18
Cross compiling Windows ARM
Supporting setuptools, hopefully more
Cross compiling Linux
Stuck with missing RH dev toolkit trick
Local run improvements
Skip an environment variable
20. cibuildwheel tips
19
Build wheels locally
pipx run cibuildwheel --platform linux
i
i
i
i
i
i
:
d
d
d
d
d
d
d
d
d
d
d
d
d
d
d
: "
"
"
"
"
"
"
"
"
"
"
u
u
u
u
u
u
u
u
u
u
u
u
:
Keep the GitHub Action up-to-date!
.github/dependabot.yml
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
Use pyproject.toml con
fi
g
21. cibuildwheel tips
19
Build wheels locally
pipx run cibuildwheel --platform linux
Keep the GitHub Action up-to-date!
.github/dependabot.yml
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
No longer needed,
Dependabot respects version level!
(v1 -> v2, not v2.0.0)
Use pyproject.toml con
fi
g
22. GHA: building a composite action
20
GitHub Action’s new setup-python@v4.1.0 feature is perfect for composite pipx actions!
name: mypkg
description: 'Runs a package'
runs:
using: composite
steps:
- uses: actions/setup-python@v4
id: python
with:
python-version: "3.7 - 3.10"
update-environment: false
- run: >
pipx run
--python '${{ steps.python.outputs.python-path }}'
--spec '${{ github.action_path }}'
mypkg
shell: bash
No side e
ff
ects!
Used in nox &
cibuildwheel
Package you want to run
23. Package: scikit-build
21
CMake based build backend for Python
Scikit-build is a CMake-setuptools adaptor from KitWare, the makers of CMake
First introduced as PyCMake at SciPy 2014 and renamed in 2016
Includes CMake for Python and ninja for Python
CMake and Ninja for Python
All manylinux archs • Apple Silicon • cibuildwheel • nox
Currently designed as a setuptools wrapper
But that is changing…
24. Pure Python Packaging
22
# pyproject.toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "example"
version = "0.1.0"
Many pure-Python backends today!
Hatch, PDM, Flit, and more!
Even modern setuptools supports this
But what if you want to add compiled extensions?
Fall back 8+ years into distutils/setuptools hacks
Not tied to a single solution!
Adding distutils to the stdlib was a disaster
Setuptools extending distutils was also a disaster
But modern standards free us from this!
See https://scikit-hep.org/developer
Or https://packaging.python.org
25. Packaging timeline
23
2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026
Scikit-build
introduced
(as PyCMake)
PEP 517/518 enables
non-setuptools backends
to be developed and used
PEP 621 de
fi
nes
a standard
con
fi
g format
PEP 632 removes distutils
from the standard library
in Python 3.12
Proposed project
PEP 660
editable
installs
Python Enhancement Proposals (PEPs) drive Python development - and provide standards for packaging now too!
Before 2015, the only method to distribute packages was distutils/setuptools
Good pure Python backends exist now (hatchling) - scikit-build could be a great compiled / CMake one!
26. Scikit-build (current design)
24
setup.py
setup(…)
Extensions
Command dict
Config settings
distutils
setuptools
MANIFEST.in
setup.cfg
pyproject.toml
CMakeLists.txt
Raw setuptools for C++?
No C++ standard support
No parallel compiles
No common libraries
No custom libraries
Hacks for cross compiling
NumPy alone needs 13K
lines to support this!
27. Scikit-build (current design)
24
setup.py
setup(…)
Extensions
Command dict
Config settings
distutils
setuptools
MANIFEST.in
setup.cfg
pyproject.toml
CMakeLists.txt
Raw setuptools for C++?
No C++ standard support
No parallel compiles
No common libraries
No custom libraries
Hacks for cross compiling
NumPy alone needs 13K
lines to support this!
scikit-build
Scikit-build wraps CMake!
Best in class feature support
~60% of all C++ projects
Very popular in the sciences (incl. HEP)
Common and custom library support
28. Scikit-build (current design)
24
setup.py
setup(…)
Extensions
Command dict
Config settings
distutils
setuptools
Problems with setuptools bridge
Inherits problems from setuptools
Deep dependencies on internals
Poor interactions with new standards
Deep nesting is hard to work with / debug
Internals poorly documented
Not composable with other plugins
MANIFEST.in
setup.cfg
pyproject.toml
CMakeLists.txt
Raw setuptools for C++?
No C++ standard support
No parallel compiles
No common libraries
No custom libraries
Hacks for cross compiling
NumPy alone needs 13K
lines to support this!
scikit-build
Scikit-build wraps CMake!
Best in class feature support
~60% of all C++ projects
Very popular in the sciences (incl. HEP)
Common and custom library support
29. Scikit-build (new design)
25
pyproject.toml
CMakeLists.txt
build-system
require
scikit-build-core
project configuration
Name, etc.
tool.scikit-build
Optional custom config
Planned other interfaces:
Extension-based setuptools wrapper
Can be used to support classic “scikit-build”
Plugin mode (see next page)
Design considerations:
Good testing suite
Excellent error messages
Public API
CMake code:
Redesign with modern “FindPython” support
Provide extension discovery system
You could use pybind11
from pip in CMake!
(Scikit-build-core may be a placeholder name)
30. Extensionlib (w/ PEP?)
26
Since the proposal, “extensionlib” and/or a PEP
has been discussed at PyCon 2022. A PoC lib by
@ofek exists. Stay tuned!
Provides a common API for build-
backends to extension creation tools
We could be a plugin for all builders via a single interface!
31. Extensionlib (w/ PEP?)
26
Since the proposal, “extensionlib” and/or a PEP
has been discussed at PyCon 2022. A PoC lib by
@ofek exists. Stay tuned!
Provides a common API for build-
backends to extension creation tools
We could be a plugin for all builders via a single interface!
[build-system]
requires = ["hatchling", “scikit-build-core”, "mypyc"]
build-backend = "hatchling.build"
[project]
name = "example"
version = "0.1.0"
[[build-system.extensions]]
build-backend = "scikit_build_core.extension"
src = "src1/CMakeLists.txt"
[[build-system.extensions]]
build-backend = "scikit_build_core.extension"
src = "src2/CMakeLists.txt"
[[build-system.extensions]]
build-backend = "mypyc.extension"
src = "src/*.py"
(Rough example of idea)
32. NSF Proposal
27
Scikit-build family:
scikit-build, cmake, ninja,
moderncmakedomain,
scikit-build-examples
pybind family:
pybind11,
python_example,
cmake_example,
scikit_build_example
PyPA family:
cibuildwheel,
build
scikit-hep family:
Developer pages, cookie
Updates can touch many repos!
Year 1 plans:
Introduce scikit-build-core: modern PEP 517 backend
Support PEP 621 con
fi
guration
Support use as plugin (possibly via extesionlib)
Tighter CMake integration (con
fi
g from PyPI packages)
Distutils-free code ready for Python 3.12
Year 2 plans:
Convert projects to Scikit-build
Year 3 plans:
Website, tutorials, outreach
https://iscinumpy.dev/post/scikit-build-proposal/
33. NSF Proposal
27
Scikit-build family:
scikit-build, cmake, ninja,
moderncmakedomain,
scikit-build-examples
pybind family:
pybind11,
python_example,
cmake_example,
scikit_build_example
PyPA family:
cibuildwheel,
build
scikit-hep family:
Developer pages, cookie
Updates can touch many repos!
Year 1 plans:
Introduce scikit-build-core: modern PEP 517 backend
Support PEP 621 con
fi
guration
Support use as plugin (possibly via extesionlib)
Tighter CMake integration (con
fi
g from PyPI packages)
Distutils-free code ready for Python 3.12
Year 2 plans:
Convert projects to Scikit-build
Year 3 plans:
Website, tutorials, outreach
On Wednesday, NSF 2209877 was awarded!
https://iscinumpy.dev/post/scikit-build-proposal/
34. Scikit-build, since last time
28
CMake
Current release (3.23) will likely drop manylinux1 wheels
Scikit-build 0.13
ARM on Windows support
MSVC 2022 support
Target link libraries support via keyword
Scikit-build 0.14
Install target
Scikit-build 0.15
Cygwin support
(Limited) FindPython support
Final Python 2.7 & 3.5 release
Final MSVC 2015 release
+ lots of
fi
xes
& minor improvements
35. Questions?
29
Special thank you to Google for sponsoring our expanded CI, and for Ralf Grosse-Kunstleve’s time.
Henry Schreiner supported by IRIS-HEP and the NSF under Cooperative Agreement OAC-1836650
To get in touch, currently we are using GitHub Discussions on all three packages.
36. henryiii
henryschreiner3
Plumbum • POVM • PyTest GHA annotate-failures
https://iscinumpy.dev
https://scikit-hep.org
https://iris-hep.org
C++ & Python
Building Python Packages
Scikit-HEP: Other
Other C++
Scikit-HEP: Histograms
pybind11 (python_example, cmake_example, scikit_build_example) • Conda-Forge ROOT
cibuildwheel • build • scikit-build (cmake, ninja, sample-projects) • Scikit-HEP/cookie
boost-histogram • Hist • UHI • uproot-browser
Vector • Particle • DecayLanguage • repo-review
Other Python
Jekyll-Indico
Other Ruby
CLI11 • GooFit
Modern CMake • CMake Workshop
Computational Physics Class
Python CPU, GPU, Compiled minicourses
Level Up Your Python
My books and workshops
42. Cookiecutter
36
pipx run cookiecutter gh:scikit-hep/cookie
11 backends to pick from
Generation tested by nox
In sync with the developer pages
Setuptools
Setuptools PEP 621
Flit Hatch PDM
Poetry
Scikit-build
Setuptools C++
Maturin (Rust)
+
more!