This document discusses the history and development of Python packages for high energy physics (HEP) analysis. It describes how experiments initially used ROOT and C++, but Python gained popularity for configuration and analysis. This led to the creation of packages like Scikit-HEP, Uproot, and Awkward Array to bridge the gap between ROOT files and the Python data science stack. Scikit-HEP grew to include many related packages and provides best practices through its developer pages. The future may include adopting Scikit-build for building Python packages with C/C++ extensions and running packages in the browser via WebAssembly.
Modern binary build systems have made shipping binary packages for Python much easier than ever before. This talk discusses three of the most popular build systems for Python packages using the new standards developed for packaging.
Digital RSE: automated code quality checks - RSE group meeting (Henry Schreiner)
Given at a local RSE group meeting. Covers code quality practices, focusing on Python but over multiple languages, with useful tools highlighted throughout.
Flake8 is a Python linter that is fast, simple, and extensible. It can be configured through setup.cfg or .flake8 files to ignore certain checks or select others. The summary recommends using the flake8-bugbear plugin and avoiding all print statements with flake8-print. Linters like Flake8 help find errors, improve code quality, and avoid historical baggage, but one does not need every check and it is okay to build a long ignore list.
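As an illustration of the kind of defect a linter plugin can catch, the sketch below shows a mutable default argument, a pattern that flake8-bugbear reports (as check B006, to the best of my knowledge). The function names are hypothetical and do not come from the talk.

```python
def append_bad(item, bucket=[]):
    """Buggy: the default list is created once and shared across calls."""
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    """Fixed: create a fresh list on each call when none is supplied."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)      # same list object as `first`
fresh_a = append_good(1)
fresh_b = append_good(2)    # independent list
```

Running a linter in CI or as a pre-commit hook surfaces this class of bug before the shared state causes confusing behavior at runtime.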
Talk at PyCon 2022 on building binary packages for Python. Covers an overview and an in-depth look at pybind11 for binding, scikit-build for creating the build, and build and cibuildwheel for making the binaries that can be distributed on PyPI.
This document discusses software quality assurance tooling, focusing on pre-commit. It introduces pre-commit as a tool for running code quality checks before code is committed. Pre-commit allows configuring hooks that run checks and fixers on files matching certain patterns. Hooks can be installed from repositories and support many languages including Python. The document provides examples of pre-commit checks such as disallowing improper capitalization in code comments and files. It also discusses how to configure, run, update and install pre-commit hooks.
Python modules allow programmers to split code into multiple files for easier maintenance. A module is simply a Python file with a .py extension. The import statement is used to include modules. Modules can be organized into packages, which are directories containing an __init__.py file. Popular third party modules like ElementTree, Psyco, EasyGUI, SQLObject, and py.test make Python even more powerful.
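The module and package layout described above can be sketched in a single runnable script. The package name `demo_pkg`, the module `greet`, and the function `hello` are hypothetical names chosen for illustration; the script writes them to a temporary directory so it does not touch any real project.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk: a directory with an __init__.py
# plus one module file. All names here are hypothetical.
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "greet.py").write_text(
    "def hello(name):\n"
    "    return f'Hello, {name}!'\n"
)

# Make the parent directory importable, then import the module by name.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
greet = importlib.import_module("demo_pkg.greet")
message = greet.hello("world")
print(message)
```

In a normal project the files would simply live next to the script, and `from demo_pkg import greet` would replace the `importlib` call.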
PyCon 2013: Scripting to PyPi to GitHub and More (Matt Harrison)
This document discusses various aspects of developing and distributing Python projects, including versioning, configuration, logging, file input, shell invocation, environment layout, project layout, documentation, automation with Makefiles, packaging, testing, GitHub, Travis CI, and PyPI. It recommends using semantic versioning, the logging module, parsing files with the file object interface, invoking shell commands with subprocess, using virtualenv for sandboxed environments, Sphinx for documentation, Makefiles to automate tasks, setuptools for packaging, and GitHub, Travis CI and PyPI for distribution.
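Two of the recommendations above, the logging module and shell invocation through subprocess, can be combined in a short sketch. The logger name `demo` and the helper `run` are hypothetical; the command invokes the current interpreter so the example does not depend on any external tool.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")


def run(cmd):
    """Run a command without a shell and return its stripped stdout."""
    log.info("running %s", cmd)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Use the current interpreter so the example works without extra programs.
output = run([sys.executable, "-c", "print('ok')"])
```

Passing the command as a list avoids shell quoting problems, and `check=True` turns a non-zero exit status into an exception instead of a silent failure.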
PyCon 2022 - Scikit-HEP Developer Pages: Guidelines for modern packaging (Henry Schreiner)
This was a PyCon 2022 lightning talk on the Scikit-HEP developer pages. It highlights the best practices and guides shown there, along with the cookiecutter for quick package creation. Finally, it demos the Pyodide WebAssembly app embedded in the Scikit-HEP developer pages.
This document discusses getting started with a first Python project. It covers installing Python and choosing an IDE, following coding best practices like PEP8 style guidelines, using built-in data structures, testing tools, virtual environments, project structure, and deployment tools like Supervisor. The goal is to help new Python programmers understand the basics of starting their first project.
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild... (Henry Schreiner)
Building binary extensions is easier than ever thanks to several key libraries. Pybind11 provides a natural C++ language for extensions without requiring pre-processing or special dependencies. Scikit-build ties the premier C++ build system, CMake, into the Python extension build process. And cibuildwheel makes it easy to build highly compatible wheels for over 80 different platforms using CI or on your local machine. We will look at advancements to all three libraries over the last year, as well as future plans.
Python is a dynamic, object-oriented programming language that can be used for many types of software development. It has strong support for integration with other languages and tools, extensive standard libraries, and can be learned quickly. Python plays a key role in production pipelines and workflows due to its productivity, clear code, strong libraries, and integration capabilities. It is widely used at companies like Google and in applications like web frameworks.
Python is a popular programming language created by Guido van Rossum in 1991. It is easy to use, powerful, and versatile, making it suitable for beginners and experts alike. Python code can be written and executed in the browser using Google Colab, which provides a Jupyter notebook environment and access to computing resources like GPUs. The document then discusses installing Python using Anaconda and basic Python concepts like indentation, variables, strings, conditionals, and loops.
Python packaging: how did we get here, and where are we going? (takluyver)
An overview of Python packaging history, with details of what PEP 517 is, some of the new tools using it, and how you might write PEP 517 backends and frontends.
See also notes taken by an attendee: https://twitter.com/drakekin/status/1173195932151746561
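To give a feel for what writing a PEP 517 backend involves, here is a minimal sketch of the `build_wheel` hook using only the standard library. The project name `demo_pkg`, its version, and the `demo-backend` generator string are hypothetical, the companion `build_sdist` hook is omitted, and this is not taken from the talk itself.

```python
import base64
import hashlib
import tempfile
import zipfile
from pathlib import Path

NAME, VERSION = "demo_pkg", "0.1.0"   # hypothetical project


def _record_line(path, data):
    """Format one RECORD entry: path, urlsafe-base64 sha256, size in bytes."""
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
    return f"{path},sha256={digest.decode()},{len(data)}"


def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
    """PEP 517 hook: write a wheel into wheel_directory, return its file name."""
    dist_info = f"{NAME}-{VERSION}.dist-info"
    files = {
        f"{NAME}/__init__.py": b"__version__ = '0.1.0'\n",
        f"{dist_info}/METADATA": (
            f"Metadata-Version: 2.1\nName: {NAME}\nVersion: {VERSION}\n".encode()
        ),
        f"{dist_info}/WHEEL": (
            b"Wheel-Version: 1.0\nGenerator: demo-backend\n"
            b"Root-Is-Purelib: true\nTag: py3-none-any\n"
        ),
    }
    # RECORD lists every file with its hash; its own entry carries no hash.
    record = [_record_line(p, d) for p, d in files.items()]
    record.append(f"{dist_info}/RECORD,,")
    files[f"{dist_info}/RECORD"] = ("\n".join(record) + "\n").encode()

    filename = f"{NAME}-{VERSION}-py3-none-any.whl"
    with zipfile.ZipFile(Path(wheel_directory) / filename, "w") as zf:
        for path, data in files.items():
            zf.writestr(path, data)
    return filename


# A frontend would call the hook roughly like this.
out_dir = tempfile.mkdtemp()
wheel_name = build_wheel(out_dir)
```

A frontend finds the backend through the `build-backend` key in `pyproject.toml` and calls the hook in an isolated environment; real backends also handle source discovery, metadata from the project table, and platform tags.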
Introduction to underlying technologies, the rationale of using Python and Qt as a development platform on Maemo and a short demo of a few projects built with these tools. Comparison of different bindings (PyQt vs PySide). PyQt/PySide development environments, how to develop most efficiently, how to debug, how to profile and optimize, platform caveats and gotchas.
This document provides an introduction to Python for .NET developers. Python is an interpreted, object-oriented programming language that is portable, extensible, and combines power with clear syntax. It differs from .NET in that it uses dynamic typing, whitespace is significant, and functions must be defined before use. Python code is often organized into modules stored in files with a .py extension. The document discusses Python implementations, modules, strengths, weaknesses and success stories. It also introduces the IronPython implementation and some Python web frameworks.
The document discusses best practices for writing a C/C++ Python extension in 2017. It covers available options like ctypes, cffi, Cython, and SWIG. It then focuses on building a binary Python extension using ctypes, including debugging crashes by generating core files and using lldb/gdb. It also discusses memory issues and using valgrind and clang sanitizers. It recommends abusing Python unit tests for testing C code. Finally, it covers shipping the extension, including manylinux wheels, testing wheels on different Linux distributions with Docker, and publishing source and wheel distributions.
Christian Strappazzon - Python Milano Presentation - Codemotion Milano 2017 (Codemotion)
PyMI: we are a group of developers and Python enthusiasts in Milan. We meet once a month at Mikamai/LinkMe. We run popular recurring events: "Pillole di Python" and "PyBirra". * Introduction to the group * Python Blueprint: the language, the tools, the packages and the ecosystem.
This document summarizes several tools for Python dependency management: pip-tools, Pipenv, poetry, and hatch. It discusses how each tool handles specifying dependencies, installing packages, creating reproducible environments, and publishing packages. While pip-tools and Pipenv focus on dependencies, poetry aims to be a single tool for all project tasks including building and publishing. Hatch simplifies the development workflow by wrapping multiple common tools. The document concludes that the best tool depends on whether a library or application is being built, and which fits the user's infrastructure.
Rust + python: lessons learnt from building a toy filesystem (ChengHui Weng)
In these slides I list what I learned while working on my toy FUSE-based file system in Rust for Python. PyO3 makes binding Rust to Python really easy, but the unavoidable type conversions affect the design and efficiency of the whole Rust code base.
This was a lightning talk for the WebHack#16 meetup: https://webhack.connpass.com/event/99735/
The document discusses py7zr, a Python library that provides generic compression, decompression, and archiving capabilities for the 7zip format. Py7zr utilizes Python's built-in lzma support for decompression and allows compression and decompression of 7z files without external dependencies. It aims to provide a pure Python implementation of 7zip's compression algorithms with a focus on code quality, testing, and documentation.
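The built-in lzma support mentioned above can be exercised directly. This sketch shows only a raw compress and decompress round trip with the standard library `lzma` module; it does not read or write 7z archives and is not py7zr's own API. The sample payload is made up.

```python
import lzma

# Repetitive sample data compresses well; the content is arbitrary.
original = b"high energy physics " * 100

compressed = lzma.compress(original)
restored = lzma.decompress(compressed)

print(len(original), "bytes ->", len(compressed), "bytes")
```

An archive library layers the container format (headers, file lists, multiple streams) on top of codec calls like these.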
This document discusses different approaches for creating Python extensions and bindings to C/C++ libraries. It summarizes the author's experience using ctypes to create a minimal binding called PyMiniRacer to the V8 JavaScript engine. The author argues that combining ctypes, which allows shipping a single Python-independent binary, with pre-built wheel distributions can provide an optimal solution for packaging and distributing Python extensions.
Intro to Pinax: Kickstarting Your Django Apps (Roger Barnes)
This document provides an overview and introduction to Pinax, an open source framework built on Django that aims to provide common functionality needed for web applications out of the box. It discusses what Pinax is, when it should be used, its key features like reusable apps and starter projects, alternatives to Pinax, and tips for development using Pinax like testing, deployment and source control best practices.
This is a Python course for beginners, intended for both classroom teaching and self-study.
The course is designed for two days of instruction followed by a week of homework assignments.
CHEP 2018: A Python upgrade to the GooFit package for parallel fitting (Henry Schreiner)
9 Jul 2018, 15:30 (15 min), Hall 3 (National Palace of Culture)
Presentation, Track 5: Software development
Speaker: Henry Fredrick Schreiner (University of Cincinnati (US))
Description:
The GooFit highly parallel fitting package for GPUs and CPUs has been substantially upgraded in the past year. Python bindings have been added to allow simple access to the fitting configuration, setup, and execution. A Python tool to write custom GooFit code given a (compact and elegant) MINT3/AmpGen amplitude description allows the corresponding C++ code to be written quickly and correctly. New PDFs have been added. The most recent release was built on top of the December 2017 2.0 release that added easier builds, new platforms, and a more robust and efficient underlying function evaluation engine.
This document discusses SWIG (Simplified Wrapper and Interface Generator), which is a tool that takes C/C++ declarations as input and generates bindings to other languages like Python, Tcl, Perl, and Guile. SWIG allows functions, variables, constants, and C++ classes to be accessed from these scripting languages. It handles data type conversions and run-time type checking. The document provides examples of using SWIG to expose a simple C function and C++ class to Python.
The PyConTW (http://tw.pycon.org) organizer wishes to improve the quality and quantity of the programming communities in Taiwan. Though Python is their core tool and methodology, they know it is worthwhile to learn from and communicate with a wide range of communities. Understanding the culture and ecosystem of a language takes me about three to six months. This six-hour course wraps up what I, an experienced Java developer, have learned from the Python ecosystem and the agendas of past PyConTW events.
You can find the Chinese content at the following link:
http://www.codedata.com.tw/python/python-tutorial-the-1st-class-1-preface
The document summarizes Henry Schreiner's work on several Python and C++ scientific computing projects. It describes a scientific Python development guide built from the Scikit-HEP summit. It also outlines Henry's work on pybind11 for C++ bindings, scikit-build for building extensions, cibuildwheel for building wheels on CI, and several other related projects.
The document describes various productivity tools for Python development, including:
- Pre-commit hooks to run checks before committing code
- Hot code reloading in Jupyter notebooks using the %load_ext and %autoreload magic commands
- Cookiecutter for generating project templates
- SSH configuration files and escape sequences for easier remote access
- Autojump to quickly navigate frequently visited directories
- Terminal tips like command history search and referencing the last argument
- Options for tracking Jupyter notebooks with git like stripping outputs or synchronizing notebooks and Python files.
This document provides best practices for using CMake, including:
- Set the cmake_minimum_required version to ensure modern features while maintaining backward compatibility.
- Use targets to define executables and libraries, their properties, and dependencies.
- Fetch remote dependencies at configure time using FetchContent or integrate with package managers like Conan.
- Import library targets rather than reimplementing Find modules when possible.
- Treat CUDA as a first-class language in CMake projects.
HOW 2019: Machine Learning for the Primary Vertex Reconstruction (Henry Schreiner)
The document describes a machine learning approach for primary vertex reconstruction in high-energy physics experiments. A hybrid method is proposed that uses a 1D convolutional neural network to analyze histograms produced from tracking data. The network is able to find primary vertices with high efficiency and tunable false positive rates, demonstrating the potential of machine learning for this task. Future work involves adding more tracking information and iterating between track association and vertex finding to improve performance.
HOW 2019: A complete reproducible ROOT environment in under 5 minutes (Henry Schreiner)
The document discusses setting up a ROOT environment using Conda in under 5 minutes. It describes downloading and installing Miniconda and then using Conda commands to create a new environment and install ROOT and its dependencies from the conda-forge channel. The ROOT package provides full ROOT functionality, including compilation and graphics, and supports Linux, macOS, and multiple Python versions.
ACAT 2019: A hybrid deep learning approach to vertexing (Henry Schreiner)
This document presents a hybrid deep learning approach for vertex finding in high-energy physics experiments. It uses a 1D convolutional neural network to analyze kernel density estimates of track information in order to identify primary vertex positions. The approach achieves primary vertex finding efficiencies of 88-94% with low false positive rates comparable to traditional algorithms. The authors demonstrate tuning of the efficiency-false positive rate tradeoff and discuss plans to improve performance by incorporating additional track information and iterative refinement.
2019 CtD: A hybrid deep learning approach to vertexing (Henry Schreiner)
This document presents a hybrid deep learning approach for vertex finding using 1D convolutional neural networks. It describes generating 1D kernel densities from tracking information, building target distributions, and using a CNN architecture with an adjustable cost function to optimize the false positive rate versus efficiency. The approach achieves 93.87% efficiency with a 0.251 false positive rate on test data. Future work includes incorporating additional xy information and exploring full 2D kernel densities.
2019 IRIS-HEP AS workshop: Boost-histogram and hist (Henry Schreiner)
The document discusses the current state of histograms in Python and the need for a new histogramming library. It introduces boost-histogram, a C++ histogramming library, and its new Python bindings. The bindings aim to provide a fast, flexible and easily distributable histogram object for Python. Key features discussed include histogram design that treats it as a first-class object, fast filling via multi-threading, a variety of axis and storage types, and performance benchmarks showing it can be over 10x faster than NumPy for filling histograms. Distribution is focused on providing binary wheels for many platforms via continuous integration.
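The idea of a histogram as a first-class object with an axis and a storage can be sketched in plain Python. This is an illustrative toy, not boost-histogram's API: the class name `RegularAxisHistogram` and its methods are hypothetical, and it models only a single regular axis with integer counts plus underflow and overflow bins.

```python
class RegularAxisHistogram:
    """1D histogram with equal-width bins plus underflow and overflow counts."""

    def __init__(self, bins, start, stop):
        self.bins, self.start, self.stop = bins, start, stop
        # Storage layout: [underflow, bin 0 .. bin bins-1, overflow]
        self.counts = [0] * (bins + 2)

    def index(self, x):
        """Map a value to its slot in the storage list."""
        if x < self.start:
            return 0
        if x >= self.stop:
            return self.bins + 1
        width = (self.stop - self.start) / self.bins
        # Clamp guards against floating-point rounding at the upper edge.
        return 1 + min(int((x - self.start) / width), self.bins - 1)

    def fill(self, values):
        """Increment the count of the bin each value falls into."""
        for x in values:
            self.counts[self.index(x)] += 1
        return self


h = RegularAxisHistogram(bins=4, start=0.0, stop=1.0)
h.fill([-0.5, 0.1, 0.3, 0.3, 0.99, 1.0, 2.0])
print(h.counts)
```

Keeping the axis (value to index) separate from the storage (what each bin accumulates) is what lets a library swap in other axis kinds or weighted storages without changing the fill loop.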
The document discusses the current state of histograms in Python and the need for a new library. It introduces boost-histogram, a C++ histogram library, and its new Python bindings. The bindings aim to provide a fast, flexible, and easily distributable histogram object for Python with support for multiple axis types and storage options. It also discusses plans for an additional wrapper library called hist for easy plotting and interfacing with other tools.
2019 IRIS-HEP AS workshop: Particles and decays (Henry Schreiner)
The Scikit-HEP project aims to create an ecosystem for particle physics data analysis in Python. It includes packages like Particle and DecayLanguage that provide tools for working with particle data and decay descriptions. Particle allows users to easily access and search particle property data from sources like the PDG. DecayLanguage allows parsing decay file formats, representing and manipulating decay chains, and converting between decay model representations. Future work includes expanding particle ID support and improving visualization of decay trees.
The document discusses plans for the boost-histogram and hist Python libraries. Boost-histogram is a multidimensional histogram library inspired by ROOT that provides flexibility through many axis and storage types. Hist will provide plotting and analysis functionality by interfacing with libraries like mpl-hep. Future plans include improved indexing, slicing, and NumPy conversions for boost-histogram as well as statistical functions, serialization, and integration with fitters for hist.
The document discusses the new features of Python 3.8, which was recently released. Some key updates include positional-only arguments, the walrus operator for variable assignment, improved static typing support, and performance enhancements. The document also notes additional developer changes and provides resources for obtaining Python 3.8.
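Two of the Python 3.8 features named above, positional-only arguments and the walrus operator, fit in a few lines. The function `scale` and the sample data are hypothetical and serve only to show the syntax.

```python
def scale(value, /, factor=2):
    """`value` is positional-only because it precedes the `/` marker."""
    return value * factor


readings = [3, 8, 12, 5]

# The walrus operator binds `n` inside the condition itself.
if (n := len(readings)) > 3:
    summary = f"{n} readings, scaled first = {scale(readings[0])}"
else:
    summary = "too few readings"

try:
    scale(value=3)          # keyword use of a positional-only parameter
except TypeError:
    keyword_rejected = True
else:
    keyword_rejected = False
```

Positional-only parameters let a library rename an argument later without breaking callers, since no caller can have passed it by keyword.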
This document provides an overview of histograms and various histogram libraries. It introduces boost-histogram, a C++ histogram library that is fast and header-only. It then describes the new Python bindings for boost-histogram, which are designed to be fast and easy to use while resembling the C++ version. Finally, it outlines plans for additional Python histogram tools like hist, Aghast, and Unified Histogram Indexing to integrate boost-histogram into the wider ecosystem.
2019 IML workshop: A hybrid deep learning approach to vertexing (Henry Schreiner)
A hybrid deep learning approach is proposed for vertex finding using 1D convolutional neural networks on kernel density estimates from tracking data. The approach generates 1D histograms from 3D tracking data and uses a CNN to classify primary vertex positions. In a proof-of-concept on simulated data, it achieves primary vertex finding efficiencies and false positive rates comparable to traditional algorithms, with tunable efficiency-false positive tradeoffs. Future work includes incorporating additional tracking features, associating tracks to vertices, and deploying the inference engine for the LHCb trigger.
CHEP 2019: Recent developments in histogram libraries (Henry Schreiner)
This document discusses recent developments in Python histogram libraries. It describes Boost.Histogram, a C++ histogramming library that serves as the foundation for the boost-histogram Python package. Boost.Histogram provides fast, customizable histogram filling and manipulation. The document also outlines plans for hist, a Python analysis frontend, and aghast, a library for converting between histogram formats. Together, boost-histogram, hist, and aghast comprise the Scikit-HEP histogramming framework.
LHCb Computing Workshop 2018: PV finding with CNNs (Henry Schreiner)
The document discusses using a convolutional neural network (CNN) to quickly find primary vertices (PVs) in high-energy physics events recorded by the LHCb experiment. A prototype tracking algorithm is used to generate a 1D kernel density estimate (KDE) histogram from hit triplets. This histogram is then used to train a CNN to predict the locations of PVs. Initial results show the CNN approach can find PVs with 70-75% efficiency and a false positive rate of 0.08-0.13, outperforming current algorithms. Further work aims to improve resolution, find secondary vertices, and integrate the approach into iterative tracking.
3. Scikit-HEP
New languages: C++ & Python
Timeline axis: 1994 1998 2002 2006 2010 2014 2018 2022
Python 1.0
FORTRAN
ROOT (C++)
PyROOT (official bindings)
ROOTPy (third-party pythonizations)
New PyROOT
Experiments start adopting Python for configuration
(ROOT is both a toolkit and a file format)
4. ROOT: C++ toolkit (& interpreter)
Physicists only need to learn a single language
Python bindings introduced later as PyROOT
import ROOT
import numpy as np

file = ROOT.TFile("tree.root", "recreate")
tree = ROOT.TTree("name", "title")

# Creating memory for pointer
px = np.zeros(1, dtype=float)
phi = np.zeros(1, dtype=float)

# Assigning pointers to fill from
tree.Branch("px", px, "normal/D")
tree.Branch("phi", phi, "uniform/D")

for i in range(10000):
    px[0] = ROOT.gRandom.Gaus(20,2)
    phi[0] = ROOT.gRandom.Uniform(2*3.1416)
    tree.Fill()

# Write on file (tree is contained in file)
file.Write()
file.Close()
This is classic PyROOT!
More Pythonic methods exist now
Based on https://wiki.physik.uzh.ch/cms/root:pyroot_ttree
5. Python: a configuration language
C++ components (compiled)
Python configuration (runtime)
Experiments started driving their C++ applications with Python
6. Sneaking into analysis
More students were entering with Python knowledge
The Python data-science stack was growing
root-numpy & root-pandas
bridged the gap between ROOT & NumPy/Pandas
By 2015, most analysis work could be done in the Python data science stack!
A few domain specific things were “missing”
Data IO, fitting, jagged data, vectors, particle info
11. A new package
Eduardo Rodrigues created the Scikit-HEP org with the scikit-hep package in 2016
Initially all-in-one toolkit approach with topical modules (vectors, units, statistics)
But this would change…
Toolkit vs. toolset
12. Joining forces
He also created an organization and invited some other popular packages:
root-numpy, root-pandas: ROOTPy was winding down; we got several of the related packages
pyjet, numpythia: several simulation bindings
iminuit: the standalone MINUIT binding, the most popular package to join
13. First major new package
import uproot
import numpy as np

rng = np.random.default_rng(12345)
px = rng.normal(20, 2, 10_000)
phi = rng.uniform(0, 2*np.pi, 10_000)

with uproot.create("tree.root") as f:
    f["tree_name"] = {"px": px, "phi": phi}
Just* IO
Pure Python (pip install uproot)
Answer to a key need!
*: Missing functionality in the Python ecosystem became new packages (Awkward, Vector)
ROOT file reader (and then writer)
14. First new general package
V0: pure Python
V1+: Compiled
(Figure: regular array vs. jagged array)
Jagged structured data format
NumPy like manipulation
Assignable “behaviors”
Custom Numba support
Originally HEP focused,
now being developed for 6+ fields
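To make "jagged structured data" concrete, here is a minimal pure-Python sketch of one common way to store variable-length rows: a single flat content buffer plus an offsets list marking where each row starts and stops. The class name JaggedArray and its methods are hypothetical and only for illustration; this is not the Awkward Array API, and the real library is far more general.

```python
from typing import Callable, List, Sequence


class JaggedArray:
    """Toy jagged array (hypothetical): flat content plus row offsets."""

    def __init__(self, content: Sequence[float], offsets: Sequence[int]) -> None:
        self.content = list(content)
        self.offsets = list(offsets)

    @classmethod
    def from_lists(cls, rows: Sequence[Sequence[float]]) -> "JaggedArray":
        content: List[float] = []
        offsets = [0]
        for row in rows:
            content.extend(row)
            offsets.append(len(content))  # stop of this row, start of the next
        return cls(content, offsets)

    def __len__(self) -> int:
        return len(self.offsets) - 1

    def __getitem__(self, i: int) -> List[float]:
        if not 0 <= i < len(self):
            raise IndexError(i)
        return self.content[self.offsets[i]:self.offsets[i + 1]]

    def counts(self) -> List[int]:
        """Number of entries in each row."""
        return [stop - start for start, stop in zip(self.offsets[:-1], self.offsets[1:])]

    def map_flat(self, func: Callable[[float], float]) -> "JaggedArray":
        """Apply func to every element; the row structure (offsets) is reused."""
        return JaggedArray([func(x) for x in self.content], self.offsets)


# Three "events" with 3, 0, and 2 particles each.
events = JaggedArray.from_lists([[1.0, 2.0, 3.0], [], [4.0, 5.0]])
doubled = events.map_flat(lambda pt: 2 * pt)
```

Because elementwise operations only touch the flat buffer, they can run as a single pass over contiguous data, which is what makes NumPy-like manipulation of jagged data practical.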
16. Uptake (one experiment - CMS)
(Figure: use of Scientific Python in HEP. Labels: ROOT (C++ and PyROOT); Scientific Python; PyROOT; use of Scikit-HEP packages (as a baseline for scale); CMSSW config (Python but not data analysis); Scikit-HEP.)
18. Case study: Histograms
Development of reliable bindings
Development of reliable binary building tools
Development of a Protocol based ecosystem
Work developed here adapted to other packages
19. Bindings tools
Chose pybind11 over Cython / SWIG
Simpler build (no preprocessing)
Simpler distribution - no NumPy build dependence
Powerful generation capabilities via template metaprogramming
Integrated improvements upstream
Awkward 1.0 and iminuit 2 rewrites followed
20. Building tools
Building redistributable wheels is not straightforward
                   Linux                    macOS               Windows
Python source      Manylinux docker image   Official download   Anything
Post-process step  Auditwheel               Delocate            Delvewheel
Architectures      64/32/ARM/PPC/…          Intel/AS/Universal  32/64/ARM
+ Testing, PyPy, Musllinux, …
21. First attempt: azure-wheel-helpers
Early attempt at shared infrastructure
Built using git subtree (before Azure templates & remote config support)
Adopted by:
Boost-histogram
awkward-array
pyjet
numpythia
iminuit
Covered by blog-post series
https://iscinumpy.dev/categories/azure-devops/
Great learning experience - but some problems:
Tied to one CI system
Mostly YAML files - poor testability / Code QA
Small number of users
22. Using cibuildwheel
🎡cibuildwheel
We merged our improvements to cibuildwheel & joined that project!
All projects moved (and now pyhepmc added too)
Great positives:
Not tied to a CI system (easy move to GHA)
Python package - great testability / code QA
Large number of users
Contributions from Scikit-HEP:
Static configuration in TOML
Static overrides
Direct SDist support
build (the tool) support
Automatic Python version limit detection
Better globbing support
Full static typing & more Code QA
From azure-wheel-helpers:
Better Windows support
VCS versioning support
Better PEP 518 support
And we helped get cibuildwheel into the PyPA!
23. Protocol ecosystem
Boost::Histogram: C++ Boost library
boost-histogram: Python wrapper
Hist: Fully featured histograms
mplhep: Matplotlib HEP
Histoprint: Terminal plotting
Uproot: ROOT file IO
How should these be related?
24. Protocol ecosystem
Static, not runtime dependency!
PlottableHistogram
UHI
Static Protocol
Defines expected behaviors for producers
Consumers expect these behaviors
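As a rough illustration of how a static Protocol can decouple histogram producers from consumers, the sketch below defines a deliberately simplified protocol, a producer that never imports or inherits from it, and a consumer that is only annotated with it. The names PlottableLike, SimpleHist, and text_plot are hypothetical; the actual PlottableHistogram protocol in UHI has a different and richer interface.

```python
from typing import List, Protocol, Sequence


class PlottableLike(Protocol):
    """Hypothetical, simplified stand-in for a plottable-histogram protocol."""

    def edges(self) -> Sequence[float]: ...
    def values(self) -> Sequence[float]: ...


class SimpleHist:
    """Producer: a 1D histogram with regular bins. It does not inherit from
    PlottableLike; it only happens to provide the expected methods."""

    def __init__(self, bins: int, start: float, stop: float) -> None:
        self._start = start
        self._stop = stop
        self._counts = [0.0] * bins

    def _width(self) -> float:
        return (self._stop - self._start) / len(self._counts)

    def fill(self, x: float) -> None:
        if self._start <= x < self._stop:
            index = int((x - self._start) / self._width())
            # Guard against floating-point rounding at the upper edge.
            self._counts[min(index, len(self._counts) - 1)] += 1.0

    def edges(self) -> List[float]:
        return [self._start + i * self._width() for i in range(len(self._counts) + 1)]

    def values(self) -> List[float]:
        return list(self._counts)


def text_plot(hist: PlottableLike) -> List[str]:
    """Consumer: one text line per bin, relying only on the protocol methods."""
    edges = hist.edges()
    lines = []
    for low, high, value in zip(edges[:-1], edges[1:], hist.values()):
        lines.append(f"[{low:g}, {high:g}) " + "#" * int(value))
    return lines


h = SimpleHist(bins=2, start=0.0, stop=2.0)
for x in (0.5, 1.5, 1.7, 5.0):  # 5.0 is out of range and is ignored
    h.fill(x)
plot = text_plot(h)
```

In a real ecosystem the producer and consumer would live in separate packages, and neither would need the other installed at runtime; a type checker verifies that the producer satisfies what the consumer expects.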
25. Example of a Protocol
import typing
from typing import Protocol

class Duck(Protocol):
    def quack(self) -> str: ...
Definition

class MyDuck:
    def quack(self) -> str:
        return "quack"

# Checked only by the static type checker; never executed at runtime
if typing.TYPE_CHECKING:
    _: Duck = typing.cast(MyDuck, None)
Producer

def need_a_duck(duck: Duck) -> None:
    print(duck.quack())
Consumer
No runtime dependence, unlike ABC!
All static type annotations
Validation by checker
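The contrast with an ABC can be made visible at runtime with a small sketch. Note one assumption beyond the slide: runtime_checkable is added here only so that isinstance can demonstrate the structural match; the approach on the slide relies purely on static checking and needs no such decorator. AbstractDuck is a hypothetical name for the ABC counterpart.

```python
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable


class AbstractDuck(ABC):
    """Nominal typing: implementers must import and inherit from this class."""

    @abstractmethod
    def quack(self) -> str: ...


@runtime_checkable  # only needed for the isinstance demonstration below
class Duck(Protocol):
    """Structural typing: any class with a matching quack() qualifies."""

    def quack(self) -> str: ...


class MyDuck:
    """Inherits from neither AbstractDuck nor Duck."""

    def quack(self) -> str:
        return "quack"


duck = MyDuck()
nominal = isinstance(duck, AbstractDuck)  # MyDuck never subclassed the ABC
structural = isinstance(duck, Duck)       # MyDuck has the required method
```

Here nominal is False and structural is True: the ABC forces the producer to depend on the package that defines it, while the Protocol leaves the producer free of that import.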
26. Success of histograms
(Figure labels: YODA; histograms in Coffea; Boost::Histogram, hist, mplhep combo; ROOT; histograms in rootpy; histogram part of ROOT (395 C++ files). Annotation: mainstream Python adoption in HEP, when many histogram libraries lived and died.)
30. Many contributions upstream
We can all benefit!
Helping with maintaining several projects
Affiliated status helped
Upstreaming features and fixes
36. Cookiecutter
pipx run cookiecutter gh:scikit-hep/cookie
11 backends to pick from
Generation tested by nox
In sync with the developer pages
Setuptools, Setuptools PEP 621, Flit, Hatch, PDM, Poetry, Scikit-build, Setuptools C++, Maturin (Rust), + more!
39. Scikit-build
CMake bridge for Python packaging
Year 1 plans:
Introduce scikit-build-core: modern PEP 517 backend
Support PEP 621 configuration
Support use as plugin (possibly via extensionlib)
Tighter CMake integration (config from PyPI packages)
Distutils-free code ready for Python 3.12
Year 2 plans:
Convert selected projects to Scikit-build
Year 3 plans:
Website, tutorials, outreach
https://iscinumpy.dev/post/scikit-build-proposal/
Yesterday, NSF 2209877 was awarded!
41. Web Assembly
Already have boost-histogram (pybind11 project) added to pyodide!
iminuit (better CMake support) next, then work on Awkward Array!
numpy.org
42. Questions?
• History of Python in HEP
• ROOT, PyROOT, Analysis
• Beginnings of a Scikit
• New package, joining forces
• Uproot, Awkward Array
• Building Scikit-HEP
• Histograms case study
• All together: uproot-browser
• Broader ecosystem
• Scikit-HEP Developer Pages
• The Future
• Scikit-build
• Web Assembly
https://scikit-hep.org
https://iscinumpy.dev