How Parallelware technology eases
HPC software development for
POWER systems
Parallelware Analyzer and GPU programming for big code bases
Manuel Arenaz
manuel.arenaz@appentra.com
OpenPOWER Academic Discussion Group Workshop 2018
Saturday, 10 November 2018 | Dallas, US
https://indico-jsc.fz-juelich.de/event/76/
Index
● Is There a Need for Parallelware Tools on POWER Systems?
● The Parallelware Tools Suite: Design of Software Components
● Parallelware Technology: Roadmap 2018-2020
● Parallelware Trainer: Roadmap 2018-2019
● Parallelware Analyzer: Roadmap 2018-2019
● Conclusions & Future Work
Is there a need for Parallelware tools on Power?
Incredible computational power in one full node of Summit!
Full Node Capabilities
Processor    POWER9                                  V100
Count        2                                       6
FLOPS (SP)   2.161 TFLOPS (2 × 22 × 49.12 GFLOPS)    94.2 TFLOPS (6 × 15.7 TFLOPS)
FLOPS (DP)   1.081 TFLOPS (2 × 22 × 24.56 GFLOPS)    46.8 TFLOPS (6 × 7.8 TFLOPS)
AI FLOPS     -                                       750 TFLOPS (6 × 125 TFLOPS)
Memory       512 GiB DDR4 (16 × 32 GiB)              96 GiB HBM2 (6 × 16 GiB)
Bandwidth    341.33 GB/s (16 × 21.33 GB/s)           900 GB/s per GPU
What are Summit’s parallel programming challenges?
● Parallel programming of many-core processors
● Parallel programming of multiple GPUs (multi-GPU)
● Data movement through a heterogeneous, complex memory hierarchy
● Training of computational researchers and engineers
● Porting of existing codes to (pre-)exascale systems
How can Parallelware tools help to address these challenges?
Parallelware Tools Suite: Software Components
1. Parallelware Trainer (GUI desktop application)
2. Parallelware Analyzer (command-line tool)
Both tools are built on Parallelware Technology (libpw):
● Parallelware front-end: C
● Parallelware middle-end: semantic analysis engine
● Parallelware back-end: OpenMP 4.5 and OpenACC 2.0, for multi-threading and offloading
Parallelware Technology: Roadmap 2018-2020
The H2020 FETHPC projects MAESTRO and EPEEC will enable incremental development following a co-design approach guided by (pre-)exascale applications.
Parallelware Technology (libpw), shared by the GUI desktop tool and the command-line tool:
● Supported today: C; OpenMP 4.5 and OpenACC 2.0; multi-threading and offloading
● Added on the roadmap: C++, Fortran, tasking, OmpSs, FPGAs
http://www.prace-ri.eu/pracesc18-presentations/
An interactive tool that acts as your mentor
“Tell me, I will forget, Show me, I may remember, Involve me, I will understand.”
Parallelware Trainer (v1.0 Sep 2018)
Technical features:
● Identification of parallelization opportunities.
● Assistance with the introduction of correct OpenMP and OpenACC directives.
● Correct data scoping, including private/shared variables.
● Support for the C programming language.
● Use any compiler and any build/compilation tool on Windows, Linux and macOS.
● Develop, test and benchmark all within the same interface.
Benefits:
● Faster, more effective learning.
● Work on realistic codes rather than toy examples, including your own code.
● Reduced learning curve.
● Parallelize code within minutes.
● Immediate identification of where and how to parallelize.
● Support for multithreading and offloading to GPUs.
https://www.appentra.com/products/parallelware-trainer/
Parallelware Trainer (v1.0 Sep 2018)
Interface panels: Project Explorer, Code Editor, Version Manager, Output Consoles
THE HPC EDUCATION & TRAINING PYRAMID
[Figure: pyramid with three levels: passive learning (lectures); interactive learning (exercises, demonstrations); learning by doing with mentors (problem solving). Formats shown: courses, workshops, hackathons. Axes: interactivity, learning effort, scalability. Parallelware Trainer is marked on the figure.]
New knowledge base (glossary of parallel programming)
Parallelware Trainer
A command-line reporting tool to improve the productivity of HPC application developers
Methodological Approach to Parallel Programming
[Figure: milestones MS0 to MS8. Stages shown: preparing for the hackathon; working version; profiled version; code analysis; working OpenMP parallel version; tuned OpenMP parallel version; working OpenACC parallel version; tuned OpenACC parallel version; highly optimized OpenMP+OpenACC parallel version.]
The biggest parallelization barrier is “Code Analysis”: data scoping across procedure boundaries in codes using complex in-memory data layouts.
Parallelware Analyzer (Beta)
● Help to understand where and how to parallelize in real codes.
● Reports to facilitate understanding the code from different perspectives.
● Batch processing of files/directories of big code bases.
Parallelware Analyzer (Beta): report “--datascoping”
Parallelware Analyzer: Roadmap 2019+
www.appentra.com/products/parallelware-analyzer
● Early Access Program (EAP) Q1-Q2 2019
○ Students/Researchers/Developers
○ Academia/Industry
● Create a community to ensure development is aligned with the user community's needs
○ Discussion forums
● Benefits
○ Free access to the tool
○ Early adopter discounts when the product is officially launched
● Official launch of Parallelware Analyzer:
○ Expected to be at SC19
● Sign up now for the EAP starting 1 Jan 2019
Conclusions & Future Work
Conclusions
● Parallelware tools advance the state of the art in tackling parallel programming challenges, and can ease parallel programming on POWER systems too.
○ Parallelware Trainer can help in training for POWER systems, using OpenMP and OpenACC for multicore CPUs and GPUs.
● Parallelware Analyzer (Beta) helps to dive into the complexity of developing HPC codes, covering the computational, control-flow and memory perspectives.
○ Data scoping in big code bases is probably pain point #1.
Coming soon...
● Distribute binary packages of Parallelware tools for POWER systems.
● Certify Parallelware tools as OpenPOWER Ready.
SC18 Emerging Technologies, booth #619
SC18 Startup Pavilion, booth #3869