2. OpenACC is a directives-based programming approach to parallel computing designed for performance and portability on CPUs and GPUs for HPC.
main()
{
    <serial code>
    #pragma acc kernels
    {
        <parallel code>
    }
}
Add Simple Compiler Directive
3. SINGLE CODE FOR MULTIPLE PLATFORMS
POWER
Sunway
x86 CPU
x86 Xeon Phi
NVIDIA GPU
PEZY-SC
OpenACC - Performance Portable Programming Model for HPC
[Figure: bar chart of speedup vs. a single Haswell core (axis 0x to 80x). Series: PGI OpenACC, Intel OpenMP, IBM OpenMP. Systems: Dual Haswell, Dual Broadwell, Dual POWER8, 1 Tesla P100, 1 Tesla V100. Bar labels as extracted: 9x, 10x, 11x, 52x; 9x, 10x, 11x, 77x.]
AWE Hydrodynamics CloverLeaf mini-App, bm32 data set
Systems: Haswell: 2x16 core Haswell server, four K80s, CentOS 7.2 (perf-hsw10); Broadwell: 2x20 core Broadwell server, eight P100s (dgx1-prd-01); Minsky: POWER8+NVLINK, four P100s, RHEL 7.3 (gsn1).
Compilers: Intel 17.0, IBM XL 13.1.3, PGI 16.10.
Benchmark: CloverLeaf v1.3 downloaded from http://uk-mac.github.io/CloverLeaf the week of November 7, 2016; CloverLeaf_Serial; CloverLeaf_ref (MPI+OpenMP); CloverLeaf_OpenACC (MPI+OpenACC).
Data compiled by PGI November 2016; Volta data collected June 2017.
5. PGI 17.7 IS NOW AVAILABLE
• Tesla V100 GPU Support
• OpenACC for CUDA Unified Memory
• C++ Enhancements
  • Use C++14 Lambdas with Capture in OpenACC Regions
• Enhanced cuSOLVER Library Interoperability
• OpenMP 4.5 for Multicore CPUs
• PGI Unified Binary for Tesla and Multicore
• LLVM/x86-64 Code Generator (beta feature)
• New Profiling Features for OpenACC and CUDA Unified Memory
https://www.pgicompilers.com/products/new-in-pgi.htm
6. ARTICLES AND BLOG POSTS
Brookhaven Lab Hosts "Brookathon," a Five-Day GPU Hackathon:
https://www.bnl.gov/newsroom/news.php?a=212273
InsideHPC video interview with Sunita Chandrasekaran and Michael Wolfe:
https://insidehpc.com/2017/07/video-openacc-update-isc-2017/
HPCWire article discussing OpenACC and OpenMP:
https://www.hpcwire.com/2017/07/03/optimizing-codes-heterogeneous-hpc-clusters-using-openacc/
7. NEW RESOURCES
Paper: Evaluation of a Directive-Based GPU Programming Approach for High-Order Unstructured Mesh Computational Fluid Dynamics
“While it is in general possible to write an optimized code using OpenCL (or CUDA) that outperforms OpenACC, we find that the directive based approach offered by OpenACC results in a flexible, unified and hence smaller code-base that is easier to maintain, is readily portable and promotes algorithm development.”
Tutorial: Complete Parallel Programming with OpenACC Tutorial Series with Michael Wolfe
8. CALL FOR SUBMISSIONS
EVENT | DUE DATE | LINK
ORNL Hackathon | Aug 31, 2017 | https://www.olcf.ornl.gov/training-event/2017-gpu-hackathons/
WACCPD Workshop, SC17, Denver, USA | Aug 30, 2017 | http://waccpd.org/
9. UPCOMING EVENTS
EVENT & LOCATION | DATE | LINK
NASA GPU Hackathon, NASA Langley, Virginia, USA | Aug 21 - 25, 2017 | https://www.olcf.ornl.gov/training-event/2017-gpu-hackathons
6th NVIDIA GPU Workshop, Latin America, São Paulo, SP, Brazil | Aug 23 - 24, 2017 | http://info.nvidia.com/vi-nvidia-workshop-latin-america-august.html
CSCS Hackathon, Lugano, Switzerland | Sep 4 - 8, 2017 | https://www.olcf.ornl.gov/training-event/2017-gpu-hackathons
ORNL Hackathon, Knoxville, Tennessee, USA | Oct 16 - 20, 2017 | https://www.olcf.ornl.gov/training-event/2017-gpu-hackathons
WACCPD Workshop, SC17, Denver, USA | Nov 13, 2017 | http://waccpd.org/
Scalable Parallel Programming Using OpenACC for Multicore, GPUs, and Manycore, SC17, Denver, USA | Nov 13, 2017 | http://sc17.supercomputing.org/presentation/?id=tut135&sess=sess224