Evaluating HPX and Kokkos on RISC-V using an
Astrophysics Application Octo-Tiger
Patrick Diehl
Joint work with: Gregor Daiß, Steven R. Brandt, Alireza Kheirkhahan,
Hartmut Kaiser, Christopher Taylor, and John Leidel
Louisiana State University
patrickdiehl@lsu.edu
April 25, 2024
P. Diehl (CCT/Physics/LSU) HPX and Kokkos on RISC-V April 25, 2024 1 / 27
Motivation
What is RISC-V?
RISC-V was introduced in 2015 as an open standard instruction set
architecture (ISA); RISC-V is an iteration on established reduced
instruction set computer (RISC) principles
Why is it interesting for the HPC community?
The RISC-V ISA is completely open for use by anyone and is
royalty-free.
RISC-V is extensible; processor features can be added to provide
customized capabilities (e.g., cache management, SIMD, and vector
machine support are optional).
The European Processor Initiative (EPI), which aims to develop a
vendor-independent European CPU for high-performance computing,
has identified RISC-V as a target for future investment.
Overview
1 Astrophysical application
2 Software stack
Octo-Tiger
Kokkos
HPX
3 Porting the software stack to RISC-V
4 In-house RISC-V Test System
5 Performance measurements
Node level scaling
Distributed scaling
6 Energy consumption
7 Conclusion and Outlook
Astrophysical application
Example simulation
Astrophysical event: merger of two stars. The flow on the surface
corresponds, in layman’s terms, to the weather on the stars.
Software stack
Octo-Tiger
Astrophysics open-source program simulating the evolution of star
systems based on the fast multipole method on adaptive octrees.
Modules
Hydro
Gravity
Radiation
Supports
Communication: MPI, libfabric, LCI, GASNet, and OpenSHMEM
Backends: CUDA, HIP, SYCL
Reference
Marcello, Dominic C., et al. "Octo-Tiger: a new, 3D hydrodynamic code for stellar mergers that uses HPX
parallelization." Monthly Notices of the Royal Astronomical Society 504.4 (2021): 5345-5382.
Kokkos: C++ Performance Portability Programming EcoSystem
Kokkos is a C++ library [2] for writing performance-portable applications
targeting all major HPC platforms.
CPU
OpenMP
HPX
GPU
Native: CUDA & HIP
SYCL: CUDA & HIP
Reference
Trott, Christian R., et al. "Kokkos 3: Programming model extensions for the exascale era." IEEE Transactions on
Parallel and Distributed Systems 33.4 (2021): 805-817.
[2] https://github.com/kokkos/kokkos
HPX
HPX is an open-source C++ Standard Library for Concurrency and
Parallelism [3].
Features
HPX exposes a uniform, standards-oriented API for ease of
programming parallel and distributed applications.
HPX provides unified syntax and semantics for local and remote
operations.
HPX exposes a uniform, flexible, and extendable performance counter
framework which can enable runtime adaptivity.
Reference
Kaiser, Hartmut, et al. "HPX - The C++ Standard Library for Parallelism and Concurrency." Journal of Open Source
Software 5.53 (2020): 2352.
[3] https://github.com/STEllAR-GROUP/hpx
Porting the software stack to RISC-V
Porting HPX
Most parts of HPX are implemented in ISO C++. However, small
portions of the runtime system are implemented in assembly.
HPX's context switching can use either Boost.Context or a native,
independently provided assembly implementation for the target ISA.
Note that HPX already depends on Boost.
We had to make a single source code modification in the HPX
timer. The RISC-V HPX port implements timing using the RISC-V
RDTIME instruction. RDTIME is a pseudo-instruction that reads
bits from the time Control and Status Register (CSR).
Because we had an ISO C++ compiler and Boost support, the
code changes were minimal.
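The timer change described above can be illustrated with a minimal sketch. This is not the code from the HPX port: the function names read_ticks and elapsed_ticks are hypothetical, the inline assembly is restricted to 64-bit RISC-V (on 32-bit targets the upper half would also need rdtimeh), and it assumes the operating system permits user-mode reads of the time CSR. On any other target the sketch falls back to std::chrono so that it still compiles.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper (not HPX's API): returns a tick count that does not
// run backwards between two calls on the same hart.
inline std::uint64_t read_ticks() {
#if defined(__riscv) && defined(__riscv_xlen) && (__riscv_xlen == 64)
    std::uint64_t ticks;
    // rdtime expands to a CSR read of the time register into `ticks`.
    asm volatile("rdtime %0" : "=r"(ticks));
    return ticks;
#else
    // Portable fallback for non-RISC-V builds: steady_clock ticks.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Hypothetical helper: difference between two readings from read_ticks().
inline std::uint64_t elapsed_ticks(std::uint64_t start, std::uint64_t stop) {
    assert(stop >= start && "tick counter must not run backwards");
    return stop - start;
}
```

A caller would take one reading before and one after the region of interest and pass both to elapsed_ticks. Converting ticks to seconds requires the platform's timebase frequency, which is outside the scope of this sketch.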
Porting Kokkos and Octo-Tiger
Kokkos
Building Kokkos required no changes to the code base, and GCC
compiled Kokkos without any issues.
However, Kokkos’s CMake build files required some minor
changes: the RISC-V architecture was not detected, and incorrect
compiler flags were added for the architecture and vectorization.
Octo-Tiger
Octo-Tiger needed no porting once HPX and Kokkos had been
ported.
Due to the abstraction levels provided by HPX and Kokkos, porting the
software stack was a walk in the park.
In-house RISC-V Test System
In-house RISC-V test system I
Image of the in-house cluster: two VisionFive2 open-source
RISC-V single-board computers, each with a quad-core StarFive
JH7110 64-bit CPU and 8 GB LPDDR4 system memory.
The official image is based on an older Ubuntu version.
The Ubuntu Linux image based on 23.04 had the versions we need
for the Slurm integration and recent compilers.
The Ubuntu image does not support USB and PCIe on the
VisionFive2.
In-house RISC-V test system II
Two MILK-V desktop computers, each with a 64-core SOPHON
SG2042 64-bit CPU and 128 GB DDR system memory.
Linux OS: Fedora Linux 38
Slurm integration
GNU Compiler Collection
MPI
We have the full HPC stack on RISC-V, at least enough to run
Kokkos and HPX!
Performance measurements
Node-level scaling (MILK-V)
Figure: DWD (initial mesh). Processed sub-grids per second (log scale, 2^10 to 2^16) versus number of cores (2^0 to 2^6) for Level 10, Level 10 (optimized), Level 11, and Level 11 (optimized). GFLOP/s annotations shown in the plot: 0.47, 1.79, 10.8, 61.33.
Distributed runs (Single-board computer)
Figure: Cells processed per second (axis from 0 to 1,000) for the configurations 1-RISC, 1-Fugaku, 2-RISC-TCP, 2-RISC-MPI, and 2-Fugaku-MPI. Values shown in the plot: 91, 168, 140, 778, 1,091. For comparison, runs on one and two Supercomputer Fugaku nodes are shown (each using only four of the 48 available cores for a better comparison).
Distributed runs (MILK-V) I
Figure: Level 11 (initial mesh). Processed sub-grids per time step (axis from 0 to 800) on one and two nodes, for RISC-V and A64FX. Values shown in the plot: 105.92, 176.11, 163.64, 225.14.
Distributed runs (MILK-V) II
Figure: v1309. Processed sub-grids per time step (axis from 0 to 1,000) for RISC-V and A64FX; node counts shown on the axis: 1 and 16. Values shown in the plot: 740.31, 635.74.
Energy consumption
How to measure energy consumption?
We want to compare the RISC-V boards and Supercomputer Fugaku for
the astrophysics application.
On Supercomputer Fugaku, the power consumption was measured
with the PowerAPI interface provided by RIKEN.
On the RISC-V boards, no hardware counters for power measurements
are present. Here, we attached a power meter to the USB power
source and measured the power consumption while running the Linux
command stress --cpu 4 and while running Octo-Tiger with four cores.
It would be nice to have hardware counters to get more sophisticated
measurements on RISC-V!
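One way to turn external power-meter readings into an energy figure in Wh is to integrate power over the run time. The sketch below is illustrative only and is not taken from the talk: the PowerSample layout and the energy_wh function are hypothetical, and it assumes the meter yields time-stamped wattage readings that are sorted by time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One reading from an external power meter (hypothetical data layout).
struct PowerSample {
    double seconds;  // time since the start of the run
    double watts;    // power draw at that moment
};

// Integrates power over time with the trapezoidal rule and returns
// the consumed energy in watt-hours.
inline double energy_wh(const std::vector<PowerSample>& samples) {
    double joules = 0.0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        const double dt = samples[i].seconds - samples[i - 1].seconds;
        assert(dt >= 0.0 && "samples must be sorted by time");
        // Area of the trapezoid between two neighbouring readings.
        joules += 0.5 * (samples[i].watts + samples[i - 1].watts) * dt;
    }
    return joules / 3600.0;  // 1 Wh = 3600 J
}
```

For example, a constant draw of 5 W sustained for 720 s amounts to 3600 J, that is, 1 Wh. With fewer than two samples the function returns zero, since no interval can be integrated.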
Energy consumption (Single-board computer)
Figure: Energy consumption in Wh (axis from 0 to 2) for the configurations 1-RISC, 1-Fugaku, 2-RISC-TCP, 2-RISC-MPI, and 2-Fugaku-MPI. Values shown in the plot: 1.19, 1.28, 1.53, 0.92, 1.46. On Supercomputer Fugaku, the power consumption was measured using PowerAPI. Due to missing hardware counters, the power consumption on RISC-V was measured using a power meter.
Energy consumption (MILK-V)
Figure: Level 11 (initial mesh). Energy consumption in Wh (axis from 0 to 3,000) on one and two nodes, for RISC-V and A64FX. Values shown in the plot: 1,854.7, 2,230.8, 2,000.7, 2,908.3.
Recall that an A64FX node has 48 cores and a RISC-V node has 64 cores.
Conclusion and Outlook
Conclusion
Porting the software stack was rather easy due to the advanced C++
compilers on RISC-V.
HPX and Octo-Tiger scaled from one up to four cores. However, more
RAM and more cores are needed for sophisticated benchmarking.
This work is licensed under a Creative Commons
“Attribution-NonCommercial-ShareAlike 3.0 Unported” license.
P. Diehl (CCT/Physics/LSU) HPX and Kokkos on RISC-V April 25, 2024 27 / 27

More Related Content

Similar to Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger

Deployment of an HPC Cloud based on Intel hardware
Deployment of an HPC Cloud based on Intel hardwareDeployment of an HPC Cloud based on Intel hardware
Deployment of an HPC Cloud based on Intel hardwareIntel IT Center
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Embedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practice
Embedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practiceEmbedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practice
Embedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practiceEmbeddedFest
 
Scientific Computing @ Fred Hutch
Scientific Computing @ Fred HutchScientific Computing @ Fred Hutch
Scientific Computing @ Fred HutchDirk Petersen
 
Berlin Embedded Linux meetup: How to Linux on RISC-V
Berlin Embedded Linux meetup: How to Linux on RISC-VBerlin Embedded Linux meetup: How to Linux on RISC-V
Berlin Embedded Linux meetup: How to Linux on RISC-VDrew Fustini
 
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptx
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptxOpenACC and Open Hackathons Monthly Highlights: September 2022.pptx
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptxOpenACC
 
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...Kento Aoyama
 
OpenACC Monthly Highlights: June 2020
OpenACC Monthly Highlights: June 2020OpenACC Monthly Highlights: June 2020
OpenACC Monthly Highlights: June 2020OpenACC
 
OpenACC Monthly Highlights: June 2021
OpenACC Monthly Highlights: June 2021OpenACC Monthly Highlights: June 2021
OpenACC Monthly Highlights: June 2021OpenACC
 
CV-RENJINIK-27062016
CV-RENJINIK-27062016CV-RENJINIK-27062016
CV-RENJINIK-27062016Renjini K
 
SX Aurora TSUBASA (Vector Engine) a Brand-new Vector Supercomputing power in...
SX Aurora TSUBASA  (Vector Engine) a Brand-new Vector Supercomputing power in...SX Aurora TSUBASA  (Vector Engine) a Brand-new Vector Supercomputing power in...
SX Aurora TSUBASA (Vector Engine) a Brand-new Vector Supercomputing power in...inside-BigData.com
 
OpenACC Monthly Highlights: February 2021
OpenACC Monthly Highlights: February 2021OpenACC Monthly Highlights: February 2021
OpenACC Monthly Highlights: February 2021OpenACC
 
Eric Theis resume61.1
Eric Theis resume61.1Eric Theis resume61.1
Eric Theis resume61.1Eric Theis
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...Edge AI and Vision Alliance
 
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open SourceRCCSRENKEI
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...Edge AI and Vision Alliance
 
OpenACC Monthly Highlights: January 2021
OpenACC Monthly Highlights: January 2021OpenACC Monthly Highlights: January 2021
OpenACC Monthly Highlights: January 2021OpenACC
 

Similar to Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger (20)

Deployment of an HPC Cloud based on Intel hardware
Deployment of an HPC Cloud based on Intel hardwareDeployment of an HPC Cloud based on Intel hardware
Deployment of an HPC Cloud based on Intel hardware
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Embedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practice
Embedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practiceEmbedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practice
Embedded Fest 2019. Wei Fu. Linux on RISC-V--Fedora and Firmware in practice
 
Scientific Computing @ Fred Hutch
Scientific Computing @ Fred HutchScientific Computing @ Fred Hutch
Scientific Computing @ Fred Hutch
 
Berlin Embedded Linux meetup: How to Linux on RISC-V
Berlin Embedded Linux meetup: How to Linux on RISC-VBerlin Embedded Linux meetup: How to Linux on RISC-V
Berlin Embedded Linux meetup: How to Linux on RISC-V
 
Japan's post K Computer
Japan's post K ComputerJapan's post K Computer
Japan's post K Computer
 
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptx
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptxOpenACC and Open Hackathons Monthly Highlights: September 2022.pptx
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptx
 
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
Journal Seminar: Is Singularity-based Container Technology Ready for Running ...
 
OpenACC Monthly Highlights: June 2020
OpenACC Monthly Highlights: June 2020OpenACC Monthly Highlights: June 2020
OpenACC Monthly Highlights: June 2020
 
HPC in higher education
HPC in higher educationHPC in higher education
HPC in higher education
 
OpenACC Monthly Highlights: June 2021
OpenACC Monthly Highlights: June 2021OpenACC Monthly Highlights: June 2021
OpenACC Monthly Highlights: June 2021
 
CV-RENJINIK-27062016
CV-RENJINIK-27062016CV-RENJINIK-27062016
CV-RENJINIK-27062016
 
SX Aurora TSUBASA (Vector Engine) a Brand-new Vector Supercomputing power in...
SX Aurora TSUBASA  (Vector Engine) a Brand-new Vector Supercomputing power in...SX Aurora TSUBASA  (Vector Engine) a Brand-new Vector Supercomputing power in...
SX Aurora TSUBASA (Vector Engine) a Brand-new Vector Supercomputing power in...
 
OpenACC Monthly Highlights: February 2021
OpenACC Monthly Highlights: February 2021OpenACC Monthly Highlights: February 2021
OpenACC Monthly Highlights: February 2021
 
Eric Theis resume61.1
Eric Theis resume61.1Eric Theis resume61.1
Eric Theis resume61.1
 
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre..."APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
"APIs for Accelerating Vision and Inferencing: Options and Trade-offs," a Pre...
 
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
 
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
“Khronos Standard APIs for Accelerating Vision and Inferencing,” a Presentati...
 
LPC4300_two_cores
LPC4300_two_coresLPC4300_two_cores
LPC4300_two_cores
 
OpenACC Monthly Highlights: January 2021
OpenACC Monthly Highlights: January 2021OpenACC Monthly Highlights: January 2021
OpenACC Monthly Highlights: January 2021
 

More from Patrick Diehl

D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsPatrick Diehl
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Patrick Diehl
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondPatrick Diehl
 
Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran
Framework for Extensible, Asynchronous Task Scheduling (FEATS) in FortranFramework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran
Framework for Extensible, Asynchronous Task Scheduling (FEATS) in FortranPatrick Diehl
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...Patrick Diehl
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsPatrick Diehl
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Patrick Diehl
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchPatrick Diehl
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Patrick Diehl
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Patrick Diehl
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...Patrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Patrick Diehl
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsPatrick Diehl
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersPatrick Diehl
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsPatrick Diehl
 
EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...Patrick Diehl
 
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...Patrick Diehl
 

More from Patrick Diehl (18)

D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff Hammond
 
Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran
Framework for Extensible, Asynchronous Task Scheduling (FEATS) in FortranFramework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran
Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local models
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task Bench
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics models
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi Clusters
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic models
 
EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...EMI 2021 - A comparative review of peridynamics and phase-field models for en...
EMI 2021 - A comparative review of peridynamics and phase-field models for en...
 
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
Google Summer of Code mentor summit 2020 - Session 2 - Open Science and Open ...
 

Recently uploaded

Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger

  • 1. Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger Patrick Diehl Joint work with: Gregor Daiß, Steven R. Brandt, Alireza Kheirkhahan, Hartmut Kaiser, Christopher Taylor, and John Leidel Louisiana State University patrickdiehl@lsu.edu April 25, 2024 P. Diehl (CCT/Physics/LSU) HPX and Kokkos on RISC-V April 25, 2024 1 / 27
  • 2. Motivation. What is RISC-V? RISC-V was introduced in 2015 as an open standard instruction set architecture (ISA); it is an iteration on established reduced instruction set computer (RISC) principles. Why is it interesting for the HPC community? The RISC-V ISA is completely open for use by anyone and is royalty-free. RISC-V is extensible: processor features can be added to provide customized capabilities (e.g., cache management, SIMD, and vector machine support are optional). The European Processor Initiative (EPI), which aims to develop a vendor-independent European CPU for high-performance computing, has identified RISC-V as a target for future investment.
  • 3. Overview. 1. Astrophysical application; 2. Software stack (Octo-Tiger, Kokkos, HPX); 3. Porting the software stack to RISC-V; 4. In-house RISC-V test system; 5. Performance measurements (node-level scaling, distributed scaling); 6. Energy consumption; 7. Conclusion and outlook.
  • 4. Astrophysical application
  • 5. Example simulation. Astrophysical event: merging of two stars. The flow on the surface corresponds, in layman's terms, to the weather on the stars.
  • 6. Software stack
  • 7. Octo-Tiger. Open-source astrophysics program simulating the evolution of star systems based on the fast multipole method on adaptive octrees. Modules: Hydro, Gravity, Radiation. Supports: Communication: MPI/libfabric/LCI/GASNet + OpenSHMEM; Backends: CUDA, HIP, SYCL. Reference: Marcello, Dominic C., et al. "Octo-Tiger: a new, 3D hydrodynamic code for stellar mergers that uses HPX parallelization." Monthly Notices of the Royal Astronomical Society 504.4 (2021): 5345-5382.
  • 8. Kokkos: C++ Performance Portability Programming EcoSystem. Kokkos is a C++ library (https://github.com/kokkos/kokkos) for writing performance-portable applications targeting all major HPC platforms. CPU: OpenMP, HPX. GPU: Native: CUDA & HIP; SYCL: CUDA & HIP. Reference: Trott, Christian R., et al. "Kokkos 3: Programming model extensions for the exascale era." IEEE Transactions on Parallel and Distributed Systems 33.4 (2021): 805-817.
  • 9. HPX. HPX is an open-source C++ Standard Library for Concurrency and Parallelism (https://github.com/STEllAR-GROUP/hpx). Features: HPX exposes a uniform, standards-oriented API for ease of programming parallel and distributed applications. HPX provides unified syntax and semantics for local and remote operations. HPX exposes a uniform, flexible, and extendable performance counter framework which can enable runtime adaptivity. Reference: Kaiser, Hartmut, et al. "HPX - the C++ standard library for parallelism and concurrency." Journal of Open Source Software 5.53 (2020): 2352.
  • 10. Porting the software stack to RISC-V
  • 11. Porting HPX. Most parts of HPX are implemented in ISO C++. However, small portions of the runtime system are implemented in assembly. The HPX context-switching implementation can optionally use Boost.Context or a native, independently provided assembly implementation for the targeted ISA. Note: HPX already relies on Boost. We had to make a single source-code modification, within the HPX timer. The RISC-V HPX port implements timing using the RISC-V RDTIME instruction. RDTIME is a pseudo-instruction that reads bits from the time Control and Status Register (CSR). Recall that since we had an ISO C++ compiler and Boost support, the code changes were minimal.
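The timer change described on slide 11 can be illustrated with a minimal sketch. This is not the code from the HPX port; the function name read_ticks is hypothetical, and the non-RISC-V branch is an assumed fallback added only so that the sketch builds on other machines. On a 64-bit RISC-V target, the rdtime pseudo-instruction copies the time CSR into a general-purpose register:

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper (not part of HPX): returns a monotonically
// increasing tick count.
inline std::uint64_t read_ticks() {
#if defined(__riscv) && defined(__riscv_xlen) && (__riscv_xlen == 64)
    std::uint64_t ticks;
    // rdtime is a pseudo-instruction that reads the `time` CSR.
    asm volatile("rdtime %0" : "=r"(ticks));
    return ticks;
#else
    // Assumed fallback for non-RISC-V builds: nanoseconds from the
    // standard steady clock.
    auto now = std::chrono::steady_clock::now().time_since_epoch();
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(now).count());
#endif
}
```

The tick frequency of the time CSR is platform-defined, so converting ticks to seconds needs the timebase frequency of the specific board; the sketch deliberately returns raw ticks.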
  • 12. Porting Kokkos and Octo-Tiger. Kokkos: Building Kokkos required no changes to the code base, and GCC compiled Kokkos without any issues. However, the CMake files of Kokkos's build system required some minor changes: the RISC-V architecture was not detected, and incorrect compiler flags were added for the architecture and vectorization. Octo-Tiger: Octo-Tiger needed no porting once HPX and Kokkos had been ported. Due to the abstraction levels provided by HPX and Kokkos, porting the software stack was a walk in the park.
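The detection problem on slide 12 was in the Kokkos CMake files, which are not reproduced here. As a rough C++ analogue, the sketch below shows the kind of check involved, using the compiler-predefined macros __riscv, __riscv_xlen, and __riscv_vector. The function name detected_isa and the returned strings are assumptions made for illustration only:

```cpp
#include <string>

// Hypothetical helper: reports the target ISA as seen by the compiler.
inline std::string detected_isa() {
#if defined(__riscv)
  #if defined(__riscv_xlen) && (__riscv_xlen == 64)
    std::string isa = "riscv64";
  #else
    std::string isa = "riscv32";
  #endif
  #if defined(__riscv_vector)
    // Defined only when the vector extension is enabled for this build.
    isa += "+v";
  #endif
    return isa;
#elif defined(__aarch64__)
    return "aarch64";
#elif defined(__x86_64__)
    return "x86_64";
#else
    return "unknown";
#endif
}
```

A build system that lacks such a branch for RISC-V falls through to a default and may then pick flags meant for another architecture, which matches the symptom described on the slide.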
  • 13. In-house RISC-V Test System
  • 14. In-house RISC-V test system I. Figure: the in-house cluster built from two VisionFive2 open-source RISC-V single-board computers, each with a quad-core StarFive JH7110 64-bit CPU and 8 GB LPDDR4 system memory. The official image is based on an older Ubuntu version. The Ubuntu Linux image based on 23.04 had the versions we need for the Slurm integration and recent compilers. The Ubuntu image does not support USB and PCIe on the VisionFive2.
  • 15. In-house RISC-V test system II. Two MILK-V desktop computers, each with a 64-core SOPHON SG2042 64-bit CPU and 128 GB DDR system memory. Linux OS: Fedora Linux 38; Slurm integration; GNU compiler collection; MPI. We have the full HPC stack on RISC-V, at least enough to run Kokkos and HPX!
  • 16. Performance measurements
  • 17. Node-level scaling (MILK-V). Figure: processed sub-grids per second (axis from 2^10 to 2^16) versus number of cores (2^0 to 2^6) for the DWD initial mesh; series: Level 10, Level 10 (optimized), Level 11, Level 11 (optimized); annotated GFLOP/s values: 0.47, 1.79, 10.8, 61.33.
  • 18. Distributed runs (Single-board computer). Figure: bar chart of cells processed per second for the configurations 1-RISC, 1-Fugaku, 2-RISC-TCP, 2-RISC-MPI, and 2-Fugaku-MPI; bar values as extracted: 91, 168, 140, 778, 1,091. For comparison, runs on one and two Supercomputer Fugaku nodes are shown (each using only four of the 48 available cores for a better comparison).
  • 19. Distributed runs (MILK-V) I. Figure: bar chart of processed sub-grids per time step on 1 and 2 nodes for Level 11 (initial mesh), comparing RISC-V and A64FX; bar values as extracted: 105.92, 176.11, 163.64, 225.14.
  • 20. Distributed runs (MILK-V) II. Figure: bar chart of processed sub-grids per time step for v1309, with node counts 1 and 16 on the axis, comparing RISC-V and A64FX; bar values as extracted: 740.31, 635.74.
  • 21. Energy consumption
  • 22. How to measure energy consumption? We want to compare the RISC-V boards and Supercomputer Fugaku for the astrophysics application. On Supercomputer Fugaku, the power consumption was measured with the PowerAPI interface provided by Riken. On the RISC-V boards, no hardware counters for power measurements are present. Here, we attached a power meter to the USB power source and measured the power consumption while running the Linux command stress --cpu 4 and while running Octo-Tiger with four cores. It would be nice to have hardware counters to get more sophisticated measurements for RISC-V!
  • 23. Energy consumption (Single-board computer). Figure: bar chart of energy in Wh for the configurations 1-RISC, 1-Fugaku, 2-RISC-TCP, 2-RISC-MPI, and 2-Fugaku-MPI; bar values as extracted: 1.19, 1.28, 1.53, 0.92, 1.46. On Supercomputer Fugaku, the power consumption was measured using PowerAPI. Due to missing hardware counters, the power consumption on RISC-V was measured using a power meter.
  • 24. Energy consumption (MILK-V). Figure: bar chart of energy in Wh on 1 and 2 nodes for Level 11 (initial mesh), comparing RISC-V and A64FX; bar values as extracted: 1,854.7, 2,230.8, 2,000.7, 2,908.3. Recall that an A64FX node has 48 cores and a RISC-V node has 64 cores.
  • 25. Conclusion and Outlook
  • 26. Conclusion and Outlook. Conclusion: Porting the software stack was rather easy due to the advanced C++ compilers on RISC-V. HPX and Octo-Tiger scaled from one up to four cores. However, more RAM and more cores are needed for sophisticated benchmarking.
  • 27. This work is licensed under a Creative Commons "Attribution-NonCommercial-ShareAlike 3.0 Unported" license.