OpenACC and Hackathons Monthly Highlights: April 2023

  2. 2 WHO IS OPENACC? The OpenACC Organization is dedicated to helping the research and developer community advance science by expanding their accelerated and parallel computing skills. 3 Areas of Focus: Ecosystem Development: Participating in work that enables and/or advances interoperability of parallel programming models in compilers or tools with the vision of transitioning to standard language parallelism. Training/Education: Managing one of the leading hackathon series, our events have been instrumental in training thousands of domain experts and accelerating over 550 scientific applications using a variety of programming models and tools. OpenACC Specification: Developing and utilizing the OpenACC directives-based programming model to port, accelerate, or optimize scientific applications.
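As an illustration of what "directives-based" means here, the following is a minimal sketch rather than code from the OpenACC Organization: an ordinary C loop annotated with one OpenACC directive. The function name `saxpy` and its parameters are chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative example (hypothetical helper, not from the slides):
 * computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(int n, float a, const float *x, float *y)
{
    assert(n >= 0 && x != NULL && y != NULL);

    /* The directive asks an OpenACC compiler to offload the loop.
     * copyin: x is only read on the device.
     * copy:   y is sent to the device and copied back afterwards. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

When built with an OpenACC-enabled compiler (for example `nvc -acc` or `gcc -fopenacc`), the loop can run on an accelerator. Without such a flag the pragma is ignored and the same source runs serially, which is the main appeal of porting code with directives.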
  3. 3 OPENACC SPECIFICATION MOMENTUM Wide Adoption Across Key HPC Codes: ANSYS Fluent, Gaussian, VASP, LSDalton, MPAS, GAMERA, GTC, XGC, ACME, FLASH, COSMO, Numeca. 400+ APPS* USING OPENACC (* Applications in production and development). VASP: Top Quantum Chemistry and Material Science Code (benchmark shown: silica IFPEN, RMM-DIIS on P100). “For VASP, OpenACC is the way forward for GPU acceleration. Performance is similar to CUDA, and OpenACC dramatically decreases GPU development and maintenance efforts. We’re excited to collaborate with NVIDIA and PGI as an early adopter of Unified Memory.” Prof. Georg Kresse, Computational Materials Physics, University of Vienna
  4. 4 OPEN HACKATHONS MENTOR PROGRAM Help Shape Science with Technology. Develop skills, build community, and increase your opportunities: Become an Open Hackathons mentor and help scientists leverage supercomputers to optimize their codes, explore their AI projects, and accelerate their discoveries. “Becoming a mentor for the Open Hackathons has been a truly rewarding experience. The targeted pathways help bootstrap your knowledge in a specific domain while each hackathon challenges you to learn about different nuances associated with using HPC technologies and apply your knowledge to solve real-world problems. Expanding your professional network through interactions with other researchers and mentors is an added bonus!” Kevin Griffin, Senior Developer Technology Engineer, NVIDIA, Certified HPC Open Hackathons Mentor
  5. 5 UPCOMING OPEN HACKATHONS & BOOTCAMPS
     Event Date | Call Closes | Event
     June 8-9, 2023 | May 28, 2023 | Python GPU Programming Bootcamp
     June 28, 2023 | June 14, 2023 | Hackathon Readiness Bootcamp: Americas
     June 28, 2023 | June 14, 2023 | Hackathon Readiness Bootcamp: Europe
     June 28, 2023 | June 6, 2023 | Hong Kong AI for Science Bootcamp
     June 29, 2023 | June 14, 2023 | Hackathon Readiness Bootcamp: Asia
     July 12, 19-21, 2023 | May 24, 2023 | NERSC Open Hackathon
     July 27-28, 2023 | July 4, 2023 | NCHC N-Ways to GPU Programming Bootcamp
     July 25, August 1-4, 2023 | June 20, 2023 | Wuhan University Open Hackathon
     August 15, 22-24, 2023 | May 31, 2023 | BNL Open Hackathon
     Digital in 2023: Our virtual events continue to offer the same high-touch training and mentorship without the hassle of travel!
  6. 6 GLIMPSE FROM THE UK NATIONAL HACKATHON Driving Next-Gen Engineering. Team ASiMoV-CCS, with members from EPCC and Rolls-Royce, took part in the UK National Open Hackathon to demonstrate the capability to run the ASiMoV-CCS code, designed for running large-scale engineering simulations, on GPUs. The ASiMoV consortium, led by Rolls-Royce and EPCC, was awarded an EPSRC Prosperity Partnership worth £14.7m to develop the next generation of engineering simulation and modeling techniques. The aim is to develop the world’s first high-fidelity simulation of a complete gas-turbine engine during operation.
  7. 7 WHAT ARE THE RISKS OF RELYING ON FORTRAN? Los Alamos National Lab Issues Evaluation Report. Fortran still underpins some government systems that require extensive calculations, including those used for weather and climate science, medicine, and biomedical science, among others. Researchers at Los Alamos National Laboratory released a report on April 18th evaluating the long-term risks of relying on Fortran for mission-critical code supporting nuclear security. They found several challenges to using Fortran for another 15 years.
  8. 8 RESOURCES Paper: Accelerating Radiative Transfer Simulation on NVIDIA GPUs with OpenACC. Ryohei Kobayashi, Norihisa Fujita, Yoshiki Yamaguchi, Taisuke Boku, Kohji Yoshikawa, Makito Abe and Masayuki Umemura. To accelerate multiphysics applications, the use of not only GPUs but also FPGAs has been emerging. Multiphysics applications are simulations involving multiple physical models and multiple simultaneous physical phenomena. Operations with different performance characteristics appear in the simulation, making it difficult to accelerate the simulation using only GPUs. Therefore, we aim to improve the overall performance of the application by using FPGAs to accelerate operations whose characteristics cause lower GPU efficiency. However, the application is currently implemented through multilingual programming, where the computation kernel running on the GPU is written in CUDA and the computation kernel running on the FPGA is written in OpenCL. This method imposes a heavy burden on programmers; therefore, we are currently working on a programming environment that enables the use of both accelerators in a GPU–FPGA equipped high-performance computing (HPC) cluster system with OpenACC. To this end, we port the entire code from the CUDA-OpenCL mixture to OpenACC only. On this basis, this study quantitatively investigates the performance of the OpenACC GPU implementation compared to the CUDA implementation for ARGOT, a radiative transfer simulation code for fundamental astrophysics, which is a multiphysics application. We observe that the OpenACC implementation achieves performance and scalability comparable to the CUDA implementation on the Cygnus supercomputer equipped with NVIDIA V100 GPUs.
  9. 9 RESOURCES Paper: Thread-Safe Lattice Boltzmann for High-Performance Computing on GPUs. Andrea Montessori, Marco Lauricella, Adriano Tiribocchi, Mihir Durve, Michele La Rocca, Giorgio Amati, Fabio Bonaccorso, and Sauro Succi. We present thread-safe, highly-optimized lattice Boltzmann implementations, specifically aimed at exploiting the high memory bandwidth of GPU-based architectures. At variance with standard approaches to LB coding, the proposed strategy, based on the reconstruction of the post-collision distribution via Hermite projection, enforces data locality and avoids the onset of memory dependencies, which may arise during the propagation step, with no need to resort to more complex streaming strategies. The thread-safe lattice Boltzmann achieves peak performance in both two and three dimensions, and it considerably reduces the allocated memory (tens of gigabytes for simulations with on the order of billions of lattice nodes) while retaining the algorithmic simplicity of standard LB computing. Our findings open attractive prospects for high-performance simulations of complex flows on GPU-based architectures. Figure 4. Velocity field during droplet impact. The color maps in panels (a-f) stand for the vertical component of the flow field (values are normalized with the velocity of the droplet at impact) while the superimposed quivers denote the direction, verse, and magnitude of the local velocity field.
  10. 10 RESOURCES Paper: Optimization Strategies for Multi-Block Structured CFD Simulation Based on Sunway TaihuLight. Xiaojing Lv, Wenhao Leng, Zhao Liu, Chengsheng Wu, Fang Li, and Jiuxiu Xu. Decomposition and solver are the main performance bottlenecks of multi-block structured CFD simulation involving complex industrial configurations such as aero-engines, shock-boundary layer interactions, turbulence modeling and so on. In this article, we proposed several optimization strategies to improve the computing efficiency of multi-block structured CFD simulation based on the Sunway TaihuLight supercomputing system, including: (1) a load balancing decomposition approach combined with recursive segmentation of undirected graphs and block mapping for multi-structured blocks, (2) two-level parallelism that utilizes MPI + OpenACC 2.0* hybrid parallel paradigms with various performance optimizations such as data preprocessing, reducing unnecessary loops of subroutine calls, collapse and tile syntax, and memory access optimization between the main memory and local data memory (LDM), and (3) a carefully orchestrated pipeline and register communication strategy between computing processor elements (CPEs) to tackle the dependence of LU-SGS (Lower-Upper Symmetric Gauss–Seidel). Numerical simulations were conducted to evaluate the proposed optimization strategies. The results showed that our parallel implementation provides high load balance and efficiency, achieving a speedup of 8×+ for one loop step, and a speedup of 2×+ for strong correlation kernels. Figure 1. The architecture of SW26010 CPU.
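To make the "collapse" and "tile" optimizations mentioned in this abstract concrete, here is a small sketch using the standard OpenACC `collapse` and `tile` loop clauses on a generic 5-point averaging sweep. It is not the authors' code and does not target the Sunway toolchain; the function names, the grid layout (row-major, `ny` by `nx`), and the tile sizes 32 and 4 are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical example: one 5-point averaging sweep over the interior
 * of a row-major ny-by-nx grid. collapse(2) merges the two nested loops
 * into a single iteration space for the compiler to parallelize. */
void smooth_collapse(int ny, int nx, const double *in, double *out)
{
    assert(ny >= 3 && nx >= 3);
    #pragma acc parallel loop collapse(2) copyin(in[0:ny*nx]) copy(out[0:ny*nx])
    for (int j = 1; j < ny - 1; ++j) {
        for (int i = 1; i < nx - 1; ++i) {
            out[j * nx + i] = 0.25 * (in[j * nx + i - 1] + in[j * nx + i + 1]
                                    + in[(j - 1) * nx + i] + in[(j + 1) * nx + i]);
        }
    }
}

/* Same sweep, but tile(32, 4) asks the compiler to split the loop nest
 * into small blocks so neighbouring points are processed together.
 * The tile sizes are arbitrary placeholders, not tuned values. */
void smooth_tile(int ny, int nx, const double *in, double *out)
{
    assert(ny >= 3 && nx >= 3);
    #pragma acc parallel loop tile(32, 4) copyin(in[0:ny*nx]) copy(out[0:ny*nx])
    for (int j = 1; j < ny - 1; ++j) {
        for (int i = 1; i < nx - 1; ++i) {
            out[j * nx + i] = 0.25 * (in[j * nx + i - 1] + in[j * nx + i + 1]
                                    + in[(j - 1) * nx + i] + in[(j + 1) * nx + i]);
        }
    }
}
```

Both functions compute the same result; the clauses only change how the iteration space is mapped to hardware. Which variant is faster depends on the target device and the loop bounds, so it is best decided by measurement.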
  11. 11 RESOURCES Website: Technical Resources. Explore a wealth of resources for parallelization and accelerated computing across HPC, AI, and Big Data. Review a collection of videos, presentations, GitHub repos, tutorials, libraries, and more to help you advance your skills and expand your knowledge.
  12. 12 STAY IN THE KNOW: JOIN THE COMMUNITY. OPENACC AND HACKATHON UPDATES. The OpenACC Organization is dedicated to helping the research and developer community advance science by expanding their accelerated and parallel computing skills. Take an active role in influencing the future of both the OpenACC specification and the organization itself by becoming a member of the community. Keep abreast of new tools, the latest resources, recent research, and upcoming events.
  13. Learn more at WWW.OPENACC.ORG