Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

ACAT 2017: GooFit 2.0

35 views

Published on

The GooFit package provides physicists a simple, familiar syntax for manipulating probability density functions and performing fits, but is highly optimized for data analysis on NVIDIA GPUs and multithreaded CPU backends. GooFit is being updated to version 2.0, bringing a host of new features. A completely revamped and redesigned build system makes GooFit easier to install, develop with, and run on virtually any system. Unit testing, continuous integration, and advanced logging options are improving the stability and reliability of the system. Developing new PDFs now uses standard CUDA terminology and provides a lower barrier for new users. The system now has built-in support for multiple graphics cards or nodes using MPI, and is being tested on a wide range of different systems.
GooFit also has significant improvements in performance on some GPU architectures due to optimized memory access. Support for time-dependent four body amplitude analyses has also been added.

Published in: Science
  • Be the first to comment

  • Be the first to like this

ACAT 2017: GooFit 2.0

  1. 1. GooFit 2.0 Henry Schreiner1 Christoph Hasse2 Bradley Hittle3 Himadri Pandey1 Michael Sokoloff1 Karen Tomko3 August 22, 2017 1 University of Cincinnati 2 Technical University of Dortmund 3 Ohio Supercomputer Center ACAT 2017
  2. 2. /GooFit/GooFit LGPLv3 • Resembles the RooFit package • Built for CUDA GPUs • Also supports OpenMP on CPUs Classic features • PDF composition • PDFs for amplitude analyses • Binned and unbinned fits GooFit 2.0 H. Schreiner August 22, 2017 1/14 ACAT 2017
  3. 3. Features of a fit Variables PDF1 PDF2 CombinePDF DataSet Fitter Compose PDFs Core Powerful composition of PDFs • Long list of built-in PDFs • Combination PDFs using addition, multiplication, convolution, composition, etc. • 100+ Variables possible • Advanced users can create new PDFs • Expert users can extend GooFit core GooFit 2.0 H. Schreiner August 22, 2017 2/14 ACAT 2017
  4. 4. Common GPUs Classic CPUs DP 2.7 MHz × 2 cores × 16 FLOP/cyc = 86.4 MFLOP/s 2.4 MHz × 24 cores × 16 FLOP/cyc = 921.6 MFLOP/s NVIDIA GPUs • Programing language: CUDA • Massively parallel identical operations (SIMD) • Separate memory model (coprocessor) TFLOPS Name Stream processors Clock SP DP Cost Gamer GTX 1050 Ti 768 1290 Mhz 1.98 0.062 $150 GTX 1080 Ti 3,584 1596 Mhz 11.3 0.33 $850 Server Tesla K40 2,880 745 Mhz 4.29 1.43 $3,000 Tesla P100 3,584 1329 Mhz 9.3 4.7 $10,000 GooFit 2.0 H. Schreiner August 22, 2017 3/14 ACAT 2017
  5. 5. Why use GooFit? 1 2 4 12 24 48 44 85 213 418 843 OpenMP threads on 24 core Xeon Time[s] πππ0 , 16 time-dependent amplitudes • Original RooFit code: 19,489 s single core 2 Cores Core 2 Duo 1,159 s GPU GeForce GTX 1050 Ti 86.4 s GPU Tesla K40 64.0 s MPI Tesla K40 ×2 39.3 s GPU Tesla P100 20.3 s 1 2 4 12 24 48 36 60 118 336 625 1,240 OpenMP threads on 24 core Xeon Time[s] ZachFit: M(D∗+ ) − M(D0 ) • 142,576 events in unbinned fit 2 Cores Core 2 Duo 738 s GPU GeForce GTX 1050 Ti 60.3 s GPU Tesla K40 96.6 s MPI Tesla K40 ×2 54.3 s GPU Tesla P100 23.5 s [Phys.Rev.Lett. 111 (2013) no.11, 111801] [Phys.Rev. D93 (2016) no.11, 112014] [Phys.Rev. D88 (2013) no.5, 052003] [CHEP 2013] GooFit 2.0 H. Schreiner August 22, 2017 4/14 ACAT 2017
  6. 6. Why use GooFit? Reduce time to insight! 1 2 4 12 24 48 44 85 213 418 843 OpenMP threads on 24 core Xeon Time[s] πππ0 , 16 time-dependent amplitudes • Original RooFit code: 19,489 s single core 2 Cores Core 2 Duo 1,159 s GPU GeForce GTX 1050 Ti 86.4 s GPU Tesla K40 64.0 s MPI Tesla K40 ×2 39.3 s GPU Tesla P100 20.3 s 1 2 4 12 24 48 36 60 118 336 625 1,240 OpenMP threads on 24 core Xeon Time[s] ZachFit: M(D∗+ ) − M(D0 ) • 142,576 events in unbinned fit 2 Cores Core 2 Duo 738 s GPU GeForce GTX 1050 Ti 60.3 s GPU Tesla K40 96.6 s MPI Tesla K40 ×2 54.3 s GPU Tesla P100 23.5 s GooFit 2.0 H. Schreiner August 22, 2017 4/14 ACAT 2017
  7. 7. New build features Modernization 2013 2014 2015 2016 2017 0.1 0.2 0.3 OpenMP 0.4 Work in forks Minor updates 1.0 CMake 2.0 CMake enabled features • Supports IDEs like XCode and QtCreator • Support for macOS, CPP backend • Datafiles auto-download (New datasets!) • Auto-library download and discovery • Helpers at /CLIUtils/cmake Other build features • Initial unit tests added • Travis CI builds every push • Travis powered documentation system • Docker images available • Example setup on new systems GooFit 2.0 H. Schreiner August 22, 2017 5/14 ACAT 2017
  8. 8. It's easy to get started docker run -it ubuntu apt-get update apt-get install -y git cmake make g++ git clone --branch=stable https://github.com/GooFit/GooFit.git cd GooFit make Simple installation • More systems available on • Or use Docker images: goofit/goofit-omp and goofit/goofit-cuda GooFit 2.0 H. Schreiner August 22, 2017 6/14 ACAT 2017
  9. 9. New design features Compose PDFs Core New design features • C++11, extensive code cleanup • Color logging, optimization warnings • Simultaneous fit to disjoint datasets • MPI support (GPUs or nodes) • Optimizations for newer NVIDIA cards ./MyAnalysis generate_toy --params=file.ini --release_K892_mass --A12=0.3 --plot /CLIUtils/CLI11 BSD • Command line parser using C++11 • Help generation, Subcommands, validation, config files, 100% test coverage • Designed to support toolkit customization /GooFit/Minuit2 LGPLv2.1 • Extracted from ROOT 6.08, no changes • Full CMake build system GooFit 2.0 H. Schreiner August 22, 2017 7/14 ACAT 2017
  10. 10. New physics features Three body time-dependent amplitude analyses (TDAA) • Mixing in D0 → π+ π− π0 TDAA (BaBar) [Phys.Rev. D93 (2016) no.11, 112014] • Mixing and CP violation search in D0 → K0 S π− π+ , TDAA (LHCb) [CERN-THESIS-2015-348] (paper in preparation) Four body time-integrated and time-dependent amplitude analyses • Mixing parameters in D0 → K+ π− π+ π− TDAA (LHCb) [CHEP 2016] Toy Monte Carlo generation using /MultithreadCorner/MCBooster • MIPWA in GooFit, such as D+ → h− h + h + (LHCb) [CHEP 2016] GooFit 2.0 H. Schreiner August 22, 2017 8/14 ACAT 2017
  11. 11. GooFit 2.1: Python bindings Compose PDFs Core pip install scikit-build cmake pip install -v goofit Downloads and installs all of GooFit! Prototype in GooFit 2.0 • Intended to be interface to composition • All backends supported • Only one PDF in GooFit 2.0 • Using CMake and PyBind11 Working on features 2.1 Himadri Pandey • All PDFs added, including composition PDFs • Install development version with pip • Pythonization of objects ongoing • Converting/adding examples/documentation GooFit 2.0 H. Schreiner August 22, 2017 9/14 ACAT 2017
  12. 12. GooFit 2.1: Python bindings example from goofit import * import numpy as np xvar = Variable("xvar", -10, 10) xdata = UnbinnedDataSet(xvar) npdata = np.random.normal(1, 2.5, 100000) xdata.from_numpy([npdata], filter=True) mean = Variable("mean", 0, -10, 10) sigma = Variable("sigma", 1, 0, 5) gauss = GaussPdf("gauss", xvar, mean, sigma) exppdf.fitTo(data) grid, values = gauss.evaluatePdf(xval) GooFit 2.0 H. Schreiner August 22, 2017 10/14 ACAT 2017
  13. 13. GooFit 2.1: Python bindings example from goofit import * import numpy as np xvar = Variable("xvar", -10, 10) xdata = UnbinnedDataSet(xvar) npdata = np.random.normal(1, 2.5, 100000) xdata.from_numpy([npdata], filter=True) mean = Variable("mean", 0, -10, 10) sigma = Variable("sigma", 1, 0, 5) gauss = GaussPdf("gauss", xvar, mean, sigma) exppdf.fitTo(data) grid, values = gauss.evaluatePdf(xval) −10 −5 0 5 10 xvar Data for red line PDF plot GooFit 2.0 H. Schreiner August 22, 2017 10/14 ACAT 2017
  14. 14. GooFit future Compose PDFs Core New PDF indexing system • Designed by Bradley Hittle at OSC • Simpler PDF authoring • We can alter core later Other plans • /multithreadcorner/Hydra (optional inclusion) • GooFit 2torial: henryiii.gitbooks.io/GooFit • Drop support for CMake < 3.8 GooFit 2.0 H. Schreiner August 22, 2017 11/14 ACAT 2017
  15. 15. Summary GooFit 2.0 Available now • Much easier to start using Automated CMake builds More supported platforms including macOS Easy backend selection: OpenMP, CUDA, CPP, partial TBB • More examples and PDFs • Added example datasets • Added MPI and performance enhancements GooFit 2.1 Coming soon • Python bindings for composition • Development releases on PyPI now GooFit 2.x series • Simpler PDF authoring • Extended documentation • Hydra inclusion • CMake 3.8+ requirement GooFit 2.0 H. Schreiner August 22, 2017 12/14 ACAT 2017
  16. 16. Acknowledgments • A. Augusto Alves Jr. • Rolf Andreassen • Christoph Hasse • Bradley Hittle • Zachary Huard • Brian Maddock • Himadri Pandey • Michael Sokoloff • Liang Sun • Karen Tomko Development of GooFit 2.x is supported under NSF grant PHY-1414736. GooFit was originally developed under PHY-1005530. Any opinions, findings, and conclusions or recommendations expressed in this presentation are those of the developers and do not necessarily reflect the views of the National Science Foundation. goofit.github.io GooFit 2.0 H. Schreiner August 22, 2017 13/14 ACAT 2017
  17. 17. Questions? henry.schreiner@uc.edu Give our projects a try and a star on ! /GooFit/GooFit /GooFit/Minuit2 /CLIUtils/CLI11 /CLIUtils/cmake /multithreadcorner/Hydra GooFit 2.0 H. Schreiner August 22, 2017 14/14 ACAT 2017
  18. 18. Running Timing Examples General notes • You can pick cards with the prefix: CUDA_VISIBLE_DEVICES=0,1 πππ0 • time ./pipipi0DPFit canonical dataFiles/cocktail_pp_0.txt --blindSeed=0 • time mpiexec -np 2 ./pipipi0DPFit canonical dataFiles/cocktail_pp_0.txt --blindSeed=0 ZachFit • time ./zachFit 0 1 • time mpiexec -np 2 ./zachFit 0 1 GooFit 2.0 H. Schreiner August 22, 2017 15/14 ACAT 2017
  19. 19. Command line parsing ./MyAnalysis generate_toy --params=file.ini --release_K892_mass --A12=0.3 --plot Recurring theme • Analyses require 40+ options and multiple procedures • Found a lot of duplicated code for argument parsing • Many bugs related to parsing (usually segfaults) • Needed powerful solution, with direct access to values GooFit 2.0 H. Schreiner August 22, 2017 16/14 ACAT 2017
  20. 20. CLI11 /CLIUtils/CLI11 BSD • No dependencies • Compiles to single header file Features • Nested subcommands • Configuration files • 100% test coverage • CI tests on macOS/Linux/Windows • + GooFit’s features Use in GooFit • Testbed for new build features GooFit::Application • Auto logging • Optimization warnings • GPU switches • MPI support • Completely optional GooFit 2.0 H. Schreiner August 22, 2017 17/14 ACAT 2017

×