SlideShare a Scribd company logo
1 of 62
Download to read offline
Brad Richardson
Framework for Extensible,
Asynchronous Task
Scheduling (FEATS) in Fortran
Co-contributors: Damian Rouson,
Harris Snyder and Robert Singleterry
Agenda
• Motivations
• Example/Demo Application
• Implementation Details
• Performance Studies
• Conclusions
Motivations
https://portal.nersc.gov/project/m888/nersc10/workload/N10_Workload_Analysis.latest.pdf
Motivations
● Target an under-served user base: Fortran
○ Enable scientists and engineers to develop efficient applications for HPC beyond the
“embarrassingly parallel” problems
● Explore the native parallel features of Fortran
○ Don’t force “reformulating” the problem to be able to interoperate with C/C++ or other
external libraries
Example Applications
if (this_image() == 1) then
write(output_unit, "(A)") &
"Enter values for a, b and c in `a*x**2 + b*x + c`:"
flush(output_unit)
read(input_unit, *) a, b, c
end if
call co_broadcast(a, 1)
call co_broadcast(b, 1)
call co_broadcast(c, 1)
solver = dag_t( &
[ vertex_t([integer::], a_t(a)) &
, vertex_t([integer::], b_t(b)) &
, vertex_t([integer::], c_t(c)) &
, vertex_t([2], b_squared_t()) &
, vertex_t([1, 3], four_ac_t()) &
, vertex_t([4, 5], square_root_t()) &
, vertex_t([2, 6], minus_b_pm_square_root_t()) &
, vertex_t([1], two_a_t()) &
, vertex_t([8, 7], division_t()) &
, vertex_t([9], printer_t()) &
])
Quadratic Solver
-b±√(b2
-4*a*c)
2*a
Enter values for a, b and c in `a*x**2 + b*x + c`:
1 1 -12
c = -12.0000000
b = 1.0000000
a = 1.0000000
2*a = 2.0000000
b**2 = 1.0000000
4*a*c = -48.0000000
sqrt(b**2 - 4*a*c) = 7.0000000 -7.0000000
-b +- sqrt(b**2 - 4*a*c) = 6.0000000 -8.0000000
(-b +- sqrt(b**2 - 4*a*c)) / (2*a) = 3.0000000 -4.0000000
The roots are 3.0000000 -4.0000000
Enter values for a, b and c in `a*x**2 + b*x + c`:
1 -1 -30
b = -1.0000000
c = -30.0000000
a = 1.0000000
b**2 = 1.0000000
4*a*c = -1.2000000E+02
2*a = 2.0000000
sqrt(b**2 - 4*a*c) = 11.0000000 -11.0000000
-b +- sqrt(b**2 - 4*a*c) = 12.0000000 -10.0000000
(-b +- sqrt(b**2 - 4*a*c)) / (2*a) = 6.0000000 -5.0000000
The roots are 6.0000000 -5.0000000
Quadratic Solver
LU Decomposition
In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix
as the product of a lower triangular matrix and an upper triangular matrix (see matrix decomposition). The
product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form
of Gaussian elimination.
- Wikipedia
I.e. A = LU =
call read_matrix_in(matrix, matrix_size)
num_tasks = 4 + (matrix_size-1)*2 + sum([((matrix_size-step)*3, step=1,matrix_size-1)])
allocate(vertices(num_tasks))
vertices(1) = vertex_t([integer::], initial_t(matrix))
vertices(2) = vertex_t([1], print_matrix_t(0))
do step = 1, matrix_size-1
do row = step+1, matrix_size
latest_matrix = 1 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) ! reconstructed matrix from last step
task_base = sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1))
vertices(3+task_base) = vertex_t([latest_matrix], calc_factor_t(row=row, step=step))
vertices(4+task_base) = vertex_t([latest_matrix, 3+task_base], row_multiply_t(step=step))
vertices(5+task_base) = vertex_t([latest_matrix, 4+task_base], row_subtract_t(row=row))
end do
reconstruction_step = 3 + sum([(3*(matrix_size-i), i = 1, step)]) + 2*(step-1)
vertices(reconstruction_step) = vertex_t( &
[ 1 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) &
, [(5 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1)) &
, row=step+1, matrix_size)] &
], &
reconstruct_t(step=step)) ! depends on previous reconstructed matrix and just subtracted rows
! print the just reconstructed matrix
vertices(reconstruction_step+1) = vertex_t([reconstruction_step], print_matrix_t(step))
end do
vertices(num_tasks-1) = vertex_t( &
[([(3 + sum([(3*(matrix_size-i), i = 1, step-1)]) + 2*(step-1) + 3*(row-(step+1)) &
, row=step+1, matrix_size)] &
, step=1, matrix_size-1)] &
, back_substitute_t(n_rows=matrix_size)) ! depends on all "factors"
vertices(num_tasks) = vertex_t([num_tasks-1], print_matrix_t(matrix_size))
dag = dag_t(vertices)
LU Decomposition
LU Decomposition - 1x1
LU Decomposition - 2x2
LU Decomposition - 3x3
LU Decomposition - 4x4
LU Decomposition - 5x5
LU Decomposition - 3x3
LU Decomposition - 3x3
Step: 0
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
9.7038173699999994 9.7847460899999999E+02 7.5112048300000004E+02
5.6446673599999997E+02 8.9235162400000002E+02 8.0637854000000004E+02
Step: 1
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
0.0000000000000000 6.1692409220031186E+02 2.4759655872112285E+02
0.0000000000000000 -2.0138880965761025E+04 -2.8483383952264314E+04
Step: 3
1.0000000000000000 0.0000000000000000 0.0000000000000000
0.8780400555586139 1.0000000000000000 0.0000000000000000
51.0751991036728938 -32.6440176682580301 1.0000000000000000
Step: 2
11.0516795999999999 4.1176995799999997E+02 5.7346350099999995E+02
0.0000000000000000 6.1692409220031186E+02 2.4759655872112285E+02
0.0000000000000000 0.0000000000000000 -2.0400837514772094E+04
Implementation
Derived Types
Run Implementation (#1)
• One scheduler image and multiple executor images
• Mailbox, and task assignment coarrays
• Events to signal ready for work and task completed
• Directed acyclic graph (DAG) to define task dependencies
Coarrays Needed
• type(payload_t), allocatable :: mailbox(:)[:]
• type(event_type), allocatable :: ready_for_next_task(:)[:]
• type(event_type) :: task_assigned[*]
• integer :: task_identifier[*]
• integer, allocatable :: task_assignment_history(:)[:]
Startup Procedure
● User:
○ Define DAG
■ Define tasks and their dependencies
○ call run(dag)
NOTE: All images must have the same dag to start
● Framework:
○ Allocate coarrays needed for communication
■ If we’re executing on more than one image
Data Layout
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Scheduler Steps
• Find an executor that has posted it is ready
• While we do this, we keep track of what tasks have been completed
• Find an uncompleted task with all dependencies completed
• “Wait” for the ready executor (balances posts/waits)
• Assign the task to the executor
• Post that the executor has been assigned a task
• Repeat
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Find Executor
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Find Next Task
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Wait for ready image
event wait(...)
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Assign task
Scheduler Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Post task assigned
event post(...)
Executor Steps
• Post ready for a task
• Wait until it has been assigned a task
• Collect payloads from executors that ran dependent tasks
• We access the history kept by the scheduler to determine this
• Execute task and store result in mailbox
• Repeat
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Post ready for task
event post(...)
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Wait for task
assignment
event wait(...)
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Collect inputs
Executor Steps
n_tasks
mailbox
ready_for_task
task_assigned
task_id
task_assign_hist
scheduler
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
n_tasks
executor
DAG
n_images
n_tasks
Execute task and
store result
Performance
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - single node (OpenCoarrays)
Compilation flags= -O3
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
• Every image attempts to claim a task as soon as possible
• Mailbox, task claimed and num tasks completed coarrays
• Events to signal task completed
• Directed acyclic graph (DAG) to define task dependencies
Run Implementation (#2)
Coarrays Needed
• integer(atomic_int_kind), allocatable :: taken_on(:)[:]
• integer(atomic_int_kind), allocatable :: num_tasks_completed[:]
• type(event_type), allocatable :: task_completed(:)[:]
• type(payload_t), allocatable :: mailbox(:)[:]
Startup Procedure
● User:
○ Define DAG
■ Define tasks and their dependencies
○ call run(dag)
NOTE: All images must have the same dag to start
● Framework:
○ Allocate coarrays needed for communication
Data Layout
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Execution Steps
• Find task that hasn’t been claimed, and all of its dependencies have been completed
• Attempt to claim that task
• another image may have claimed it by now
• Collect payloads from images that ran dependent tasks
• “Wait” on event that it was completed
• Execute task and store result in mailbox
• Post to all images that the task has been completed and increment task completed counter
• Repeat
Find Ready Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Claim Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Collect Payloads
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Execute Task
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Notify Task Completion
mailbox
task_completed
num_tasks_completed
taken_on
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
n_tasks
image
DAG
n_tasks
n_tasks
Performance
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - shared memory (nagfor)
Compilation flags= -O4 -Onoteams -Orounding -coarray
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - single node (OpenCoarrays)
Compilation flags= -O3
On system with: Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz and 16 GB RAM
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - crayftn
Compilation flags= -O3
On: Perlmutter
Performance - OpenCoarrays on Perlmutter
Compilation flags= -O3
On: Perlmutter
Conclusions
Fortran’s Advantages
• Coarrays and Events a great match
• Coarray to communicate task inputs and outputs
• Events to signal task start and completion
• Teams should allow for scalable implementation
• Partition DAG and have multiple schedulers work on independent regions with separate
teams of executors
• Polymorphism
• Different kinds of task can exist that capture different kinds of “input” data at startup
• Fortan’s History
• Likely lots of applications that could be easily adapted
Fortran’s Disadvantages
• Can’t use polymorphic objects in coarrays
• A strategic change to the standard could enable this
• Coindexed objects with polymorphic components not allowed in variable definition
context, but could be allowed in expressions
• See https://github.com/j3-fortran/fortran_proposals/issues/251
• No introspection
• Automatic task detection, fusion or splitting not possible
• Fortran’s History
• Many existing applications have shared global state
• Presents data races in task based execution
Conclusions
• It “works”
• Scaling/performance will depend not only on algorithm, but also compiler and platform
• There are limitations
• Future Work
• Propose changes to Fortran standard to improve utility/flexibility
• Investigate performance issues
• Explore alternative scheduling algorithms
• Find “beta” testers, i.e. target applications
• Questions?
• https://github.com/sourceryinstitute/feats
Thank You

More Related Content

Similar to Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran

06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptxMouDhara1
 
Functional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonFunctional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonCarlos V.
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Languagemspline
 
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Codemotion
 
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveThe Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveHostedbyConfluent
 
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxINFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxcarliotwaycave
 
C++ process new
C++ process newC++ process new
C++ process new敬倫 林
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Brendan Eich
 
“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1SethTisue
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryDaniel Cuneo
 
Functions in python
Functions in pythonFunctions in python
Functions in pythonIlian Iliev
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiDatabricks
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptAnishaJ7
 
Day 4a iteration and functions.pptx
Day 4a   iteration and functions.pptxDay 4a   iteration and functions.pptx
Day 4a iteration and functions.pptxAdrien Melquiond
 
dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)Mumtaz Ali
 

Similar to Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran (20)

06 Recursion in C.pptx
06 Recursion in C.pptx06 Recursion in C.pptx
06 Recursion in C.pptx
 
Dafunctor
DafunctorDafunctor
Dafunctor
 
Functional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with PythonFunctional Programming inside OOP? It’s possible with Python
Functional Programming inside OOP? It’s possible with Python
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Language
 
Metaprogramming
MetaprogrammingMetaprogramming
Metaprogramming
 
Learn Matlab
Learn MatlabLearn Matlab
Learn Matlab
 
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
Functional Programming You Already Know - Kevlin Henney - Codemotion Rome 2015
 
Matlab-1.pptx
Matlab-1.pptxMatlab-1.pptx
Matlab-1.pptx
 
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep DiveThe Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
The Nuts and Bolts of Kafka Streams---An Architectural Deep Dive
 
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docxINFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
INFORMATIVE ESSAYThe purpose of the Informative Essay assignme.docx
 
6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx6-Python-Recursion PPT.pptx
6-Python-Recursion PPT.pptx
 
C++ process new
C++ process newC++ process new
C++ process new
 
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)Value Objects, Full Throttle (to be updated for spring TC39 meetings)
Value Objects, Full Throttle (to be updated for spring TC39 meetings)
 
“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1“Tasks” in NetLogo 5.0beta1
“Tasks” in NetLogo 5.0beta1
 
Time Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal RecoveryTime Series Analysis:Basic Stochastic Signal Recovery
Time Series Analysis:Basic Stochastic Signal Recovery
 
Functions in python
Functions in pythonFunctions in python
Functions in python
 
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin HuaiA Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
A Deep Dive into Spark SQL's Catalyst Optimizer with Yin Huai
 
Python High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.pptPython High Level Functions_Ch 11.ppt
Python High Level Functions_Ch 11.ppt
 
Day 4a iteration and functions.pptx
Day 4a   iteration and functions.pptxDay 4a   iteration and functions.pptx
Day 4a iteration and functions.pptx
 
dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)dynamic programming complete by Mumtaz Ali (03154103173)
dynamic programming complete by Mumtaz Ali (03154103173)
 

More from Patrick Diehl

Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-TigerPatrick Diehl
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerPatrick Diehl
 
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsPatrick Diehl
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Patrick Diehl
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondPatrick Diehl
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...Patrick Diehl
 
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuSimulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuPatrick Diehl
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsPatrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Patrick Diehl
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchPatrick Diehl
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Patrick Diehl
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Patrick Diehl
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...Patrick Diehl
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerPatrick Diehl
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Patrick Diehl
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsPatrick Diehl
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersPatrick Diehl
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsPatrick Diehl
 

More from Patrick Diehl (20)

Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V using an Astrophysics Application Octo-Tiger
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-TigerEvaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
Evaluating HPX and Kokkos on RISC-V Using an Astrophysics Application Octo-Tiger
 
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and ToolsD-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
D-HPC Workshop Panel : S4PST: Stewardship of Programming Systems and Tools
 
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
Benchmarking the Parallel 1D Heat Equation Solver in Chapel, Charm++, C++, HP...
 
Subtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff HammondSubtle Asynchrony by Jeff Hammond
Subtle Asynchrony by Jeff Hammond
 
JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...JOSS and FLOSS for science: Examples for promoting open source software and s...
JOSS and FLOSS for science: Examples for promoting open source software and s...
 
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer FugakuSimulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
Simulating Stellar Merger using HPX/Kokkos on A64FX on Supercomputer Fugaku
 
A tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local modelsA tale of two approaches for coupling nonlocal and local models
A tale of two approaches for coupling nonlocal and local models
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...Challenges for coupling approaches for classical linear elasticity and bond-b...
Challenges for coupling approaches for classical linear elasticity and bond-b...
 
Quantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task BenchQuantifying Overheads in Charm++ and HPX using Task Bench
Quantifying Overheads in Charm++ and HPX using Task Bench
 
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...Interactive C++ code development using C++Explorer and GitHub Classroom for e...
Interactive C++ code development using C++Explorer and GitHub Classroom for e...
 
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
Porting our astrophysics application to Arm64FX and adding Arm64FX support us...
 
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
An asynchronous and task-based implementation of peridynamics utilizing HPX—t...
 
Recent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-TigerRecent developments in HPX and Octo-Tiger
Recent developments in HPX and Octo-Tiger
 
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
Quasistatic Fracture using Nonliner-Nonlocal Elastostatics with an Analytic T...
 
A review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics modelsA review of benchmark experiments for the validation of peridynamics models
A review of benchmark experiments for the validation of peridynamics models
 
Deploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi ClustersDeploying a Task-based Runtime System on Raspberry Pi Clusters
Deploying a Task-based Runtime System on Raspberry Pi Clusters
 
On the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic modelsOn the treatment of boundary conditions for bond-based peridynamic models
On the treatment of boundary conditions for bond-based peridynamic models
 

Recently uploaded

LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsCharlene Llagas
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxdharshini369nike
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.PraveenaKalaiselvan1
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsssuserddc89b
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10ROLANARIBATO3
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 

Recently uploaded (20)

LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Heredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of TraitsHeredity: Inheritance and Variation of Traits
Heredity: Inheritance and Variation of Traits
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptx
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort ServiceHot Sexy call girls in  Moti Nagar,🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Moti Nagar,🔝 9953056974 🔝 escort Service
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
BIOETHICS IN RECOMBINANT DNA TECHNOLOGY.
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Volatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -IVolatile Oils Pharmacognosy And Phytochemistry -I
Volatile Oils Pharmacognosy And Phytochemistry -I
 
Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 

Framework for Extensible, Asynchronous Task Scheduling (FEATS) in Fortran