Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Santa Clara 2018

Post-K and Arm HPC Ecosystem
Yutaka Ishikawa
RIKEN Center for Computational Science
10:00–10:30, 26th of June, 2018
Arm Architecture HPC Workshop by Linaro and HiSilicon, Santa Clara, USA

Project Overview
■ Missions
  • Building the Japanese national flagship supercomputer, Post-K, and
  • Developing a wide range of HPC applications, running on Post-K, in order to solve social and scientific issues in Japan
■ Project organization
  • Post-K computer development
    • RIKEN AICS is in charge of development
    • Fujitsu is the vendor partner
    • International collaborations: DOE, CEA, JLESC (NCSA, ANL, UTK, JSC, BSC, INRIA, RIKEN)
  • Applications
    • The government selected 9 social & scientific priority issues and 4 exploratory issues, and their R&D organizations
■ Current status
  • The first prototype CPU chip has been powered on at Fujitsu
  • Fujitsu is now evaluating the chip
  • The system software stack is being implemented
  • Target applications are being tuned
(Figure: system diagram showing login servers, maintenance servers, portal servers, the I/O network, and the hierarchical storage system. Courtesy of FUJITSU LIMITED)

Target Applications
    Program        Brief description
①  GENESIS        MD for proteins
②  Genomon        Genome processing (genome alignment)
③  GAMERA         Earthquake simulator (FEM in unstructured & structured grid)
④  NICAM+LETKF    Weather prediction system using big data (structured grid stencil & ensemble Kalman filter)
⑤  NTChem         Molecular electronic structure calculation
⑥  FFB            Large eddy simulation (unstructured grid)
⑦  RSDFT          An ab initio program (density functional theory)
⑧  Adventure      Computational mechanics system for large-scale analysis and design (unstructured grid)
⑨  CCS-QCD        Lattice QCD simulation (structured grid Monte Carlo)

CPU Architecture
■ Armv8-A + SVE (Scalable Vector Extension)
■ SIMD length: 512 bit
■ FP64/FP32/FP16
■ INT 1-, 2-, 4-, 8-byte
■ # of cores: 48 + (2/4 for OS)
■ Byte/DP Flop: approx. 0.4
■ Fujitsu's extensions
  • Inter-core barrier
  • Sector cache
  • Hardware prefetch assist
(Figure caption: Two compute nodes are implemented on one board)
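As a rough illustration of the SIMD length and element types listed on this slide, the sketch below computes how many elements of each width fit in one 512-bit vector, and shows how a Byte/DP Flop ratio of about 0.4 relates memory bandwidth to peak double-precision throughput. The function names and the 1 TFLOPS peak used in the usage are hypothetical and chosen only for illustration; they are not figures from the slides.

```python
SIMD_BITS = 512  # SIMD length stated on the slide


def lanes_per_vector(element_bits, simd_bits=SIMD_BITS):
    """Number of elements of the given width that fit in one vector."""
    if element_bits <= 0 or simd_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the SIMD length")
    return simd_bits // element_bits


def bandwidth_for_ratio(peak_dp_flops, bytes_per_flop=0.4):
    """Memory bandwidth (bytes/s) implied by a Byte/DP Flop ratio."""
    return peak_dp_flops * bytes_per_flop


if __name__ == "__main__":
    for name, bits in [("FP64", 64), ("FP32", 32), ("FP16", 16), ("INT8", 8)]:
        print(f"{name}: {lanes_per_vector(bits)} lanes per 512-bit vector")
    # Hypothetical peak of 1 TFLOPS, for illustration only (not a Post-K figure)
    print(f"{bandwidth_for_ratio(1.0e12) / 1e9:.0f} GB/s at 0.4 Byte/Flop")
```

The ratio is the useful quantity here: for any assumed peak, multiplying by 0.4 gives the memory bandwidth needed to sustain that balance.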

An Overview of Post-K Hardware
■ Compute Node, Compute + I/O Node connected by 6D mesh/torus interconnect
■ 3-level hierarchical storage system
  • 1st layer
    • Cache for global file system
    • Temporary file systems
      - Local file system for compute node
      - Shared file system for a job
  • 2nd layer
    • Lustre-based global file system
  • 3rd layer
    • Storage for archive
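To make the "6D mesh/torus" wording above more concrete, here is a minimal sketch of how directly linked nodes could be enumerated in a network where some axes wrap around (torus) and others do not (mesh). The function name, the axis sizes, and the choice of which axes are torus axes are all hypothetical; the slides do not give the actual Post-K topology parameters.

```python
def neighbors(coord, dims, torus_axes):
    """Return the coordinates directly linked to `coord`.

    coord      -- tuple of node coordinates, one per axis
    dims       -- tuple of axis sizes
    torus_axes -- set of axis indices with wraparound links (torus);
                  every other axis is treated as a mesh (no wraparound)
    """
    result = []
    for axis, (c, size) in enumerate(zip(coord, dims)):
        for step in (-1, 1):
            n = c + step
            if axis in torus_axes:
                n %= size  # wrap around on a torus axis
            elif not 0 <= n < size:
                continue  # mesh edge: no link in this direction
            candidate = coord[:axis] + (n,) + coord[axis + 1:]
            # Skip self-links (size-1 axis) and duplicates (size-2 torus axis)
            if candidate != coord and candidate not in result:
                result.append(candidate)
    return result


if __name__ == "__main__":
    # Hypothetical 6D shape and torus axes, for illustration only
    dims = (4, 4, 4, 2, 3, 2)
    torus_axes = {0, 1, 2, 4}
    for node in neighbors((0, 0, 0, 0, 0, 0), dims, torus_axes):
        print(node)
```

Treating mesh and torus axes separately keeps the wraparound rule in one place, so the same function covers a pure mesh, a pure torus, or a mix.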

An Overview of System Software Stack
Ease of use is one of our KPIs (Key Performance Indicators): providing a wide range of applications/tools/libraries/compilers
(Figure: layered diagram of the system software stack. Components shown:)
  • Linux distribution ecosystem
  • Parallel programming environments: XMP, FDPS, …
  • Fortran, C/C++, OpenMP, Java, …
  • Math libraries
  • Tuning and debugging tools
  • Batch job system
  • Hierarchical file system
  • Application-oriented file I/O
  • Communication: MPI
  • Low-level communication
  • File I/O for hierarchical storage: LLIO
  • Parallel file system
  • Process/thread: PiP
  • Multi-kernel system: Linux and light-weight kernel (McKernel)
  • Armv8 + SVE

Balazs Gerofi, Rolf Riesen, Masamichi Takagi, Taisuke Boku, Yutaka Ishikawa, Robert W. Wisniewski, “Performance and Scalability of Lightweight Multi-Kernel based Operating Systems,” IPDPS 2018, 2018.

Post-K Programming Environment
■ Programming Languages and Compilers provided by Fujitsu
  • Fortran 2008 & Fortran 2018 subset
  • C11 & GNU and Clang extensions
  • C++14 & C++17 subset and GNU and Clang extensions
  • OpenMP 4.5 & OpenMP 5.0 subset
  • Java
■ Parallel Programming Language & Domain Specific Library provided by RIKEN
  • XcalableMP
  • FDPS (Framework for Developing Particle Simulator)
■ Process/Thread Library provided by RIKEN
  • PiP (Process in Process)
■ Script Languages provided by Linux distributor
  • E.g., Python+NumPy, SciPy
■ Communication Libraries
  • MPI 3.1 & MPI 4.0 subset
    • Open MPI base (Fujitsu), MPICH (RIKEN)
  • Low-level Communication Libraries
    • uTofu (Fujitsu), LLC (RIKEN)
■ File I/O Libraries provided by RIKEN
  • pnetCDF, DTF, FTAR
■ Math Libraries
  • BLAS, LAPACK, ScaLAPACK, SSL II (Fujitsu)
  • EigenEXA, Batched BLAS (RIKEN)
■ Programming Tools provided by Fujitsu
  • Profiler, Debugger, GUI
GCC and LLVM will also be available

Support of Software Development/Porting
(Figure: timeline from CY2017 to CY2021. Rows and labels shown:)
  • Phases: Design and Implementation; Manufacturing, Installation, and Tuning; Operation
  • Specification: Armv8-A + SVE overview; detailed hardware info.
  • Optimization Guidebook: publishing incrementally
  • RIKEN Performance Evaluation Environment: performance estimation tool using FX100; RIKEN simulator
  • Early Access Program
• CY2018 Q2: the optimization guidebook is published incrementally
• CY2020 Q2: the early access program starts
• CY2021 Q1/Q2: general operation starts
Note: Fujitsu will reveal features of the Post-K CPU at Hot Chips 2018, presenting microarchitecture including core pipeline, cache, memory, NUMA, performance and power management features.
• Takeo Yoshida, “Fujitsu's HPC processor for the Post-K computer,” IEEE Hot Chips: A Symposium on High Performance Chips, San Jose, August 21, 2018.
Contribution to Arm HPC (Armv8-A+SVE) Ecosystem

Fujitsu's OSS Efforts

OSS Survey (9 priority issues developers)
■ Application
  • MODYLAS, USQCD, OpenFOAM
■ Library
  • NumPy, SciPy, pysam, FFTW, LAPACK95, lapack, blas, Metis, ParMetis, HDF5, NetCDF, NetCDF-fortran, PnetCDF, scalasca, SCOTCH, Zoltan, openmpi1.8, openmpi1.10, mpich2-1.4.1, boost, FFTE, PETSc/SLEPc, Elemental, BWA, Star, Blat, TopHat, TopHat2, MapSplice2, MPDyn2, ELPA, Trilinos, Eigen3, mesa, MesaGLUT, libxml2, C-LIME, EigenExa
■ Tool/Visualization Tool
  • git, git-flow, gnuplot, ParaView, VisIt, ImageMagick, svn, Samtools, bedtools, Biobambam, Picard, GMT, GrADS, HDF-EOS, wgrib, GRIB API, Climate Data Operators
■ Build tool
  • cmake, GNU Autotools, automake, autoconf, gcc, gfortran, C++, libtool
■ Shell script / Programming language / Script language
  • python2, python3, perl5, R, Ruby2, zsh, ksh, NCAR Command Language

OSS Survey (K computer users)
■ Application
  • ABINIT-MP, AkaiKKR, bedtools, Biobambam, BWA, CUBE, ERmod, fdps, FFV-C, FrontFlow/Red, FrontISTR, GAMESS, GENESIS, GROMACS, HIVE, LAMMPS, MapSplice2, MODYLAS, NEURON, octa, OpenFOAM, PBVR, Picard, PIMD, Quantum ESPRESSO, rDock, Samtools, SCALE, Star, TopHat, TopHat2, WHEEL, xTAPP
■ Library
  • FFTW, matplotlib (python), beautiful soup (python), metis, ParMETIS, NetCDF4, HDF5, NuSDAS1.3, octa, fdps, Zoltan, cgns, Polylib, libsim
■ Visualization tool
  • gnuplot, PBVR, VTK, OSMesa
■ Tool
  • GNU utils, zlib, anaconda (python), itk, PAPI, PMlib, Szip, zip, TextParser, fpzip
■ Build tool
  • make, autoconf, cmake
■ Shell script / Programming language / Script language
  • bash, curl, python, ruby
■ ISV
  • ABAQUS, Advance, AMBER, Ansys fluent, Gaussian, FLUENT, SCRYU/Tetra, LS-DYNA, VPS solver (PAM-CRASH), Helyx, HEETAH, iconCFD, LaBS, JMAG, MIZUHO, NuFD, VASP, VSOP

Open Source Management Tools
■ EasyBuild
  • Used at CEA
  • RIKEN is now evaluating it. As an example, Caffe, a deep learning tool, has been ported to an Arm machine using EasyBuild
  • Caffe consists of several open-source packages:
    - boost, blas, cmake, gflags, google (glog, googletest, snappy, leveldb, protobuf), lmdb, opencv
■ Spack
  • Used in the ECP project
  • RIKEN has also started evaluating Spack
