Yutaka Ishikawa - Post-K and Arm HPC Ecosystem - Linaro Arm HPC Workshop Santa Clara 2018

Bio: "Yutaka Ishikawa is the leader of the project developing the post-K
supercomputer. From 1987 to 2001, he was a member of AIST (former
Electrotechnical Laboratory), METI. From 1993 to 2001, he was the
chief of the Parallel and Distributed System Software Laboratory at Real
World Computing Partnership. He led development of the cluster system
software called SCore, which was used in several large PC cluster
systems around 2004. From 2002 to 2014, he was a professor at the
University of Tokyo. He led a project to design a commodity-based
supercomputer called the T2K open supercomputer. As a result, three
universities, Tsukuba, Tokyo, and Kyoto, each obtained a supercomputer
based on the specification in 2008. He was also involved with the
design of Oakforest-PACS, the successor of the T2K supercomputer in both
Tsukuba and Tokyo, whose peak performance is 25PF."

Session Title: Post-K and Arm HPC Ecosystem
Session Description:
"Post-K, a flagship supercomputer in Japan, is being developed by RIKEN
and Fujitsu. It will be the first supercomputer with Armv8-A+SVE.
This talk will give an overview of Post-K and how RIKEN and Fujitsu
are currently working on the software stack for the Arm architecture."

  1. Post-K and Arm HPC Ecosystem
     Yutaka Ishikawa, RIKEN Center for Computational Science
     10:00–10:30, 26th of June, 2018
     Arm Architecture HPC Workshop by Linaro and HiSilicon, Santa Clara, USA
  2. Project Overview
     [Figure: system diagram with login servers, maintenance servers, portal servers, I/O network, and hierarchical storage system]
     Missions
     • Building the Japanese national flagship supercomputer, post K, and
     • Developing a wide range of HPC applications, running on post K, in order to solve social and scientific issues in Japan
     Project organization
     • Post K Computer development
       • RIKEN AICS is in charge of development
       • Fujitsu is the vendor partner
       • International collaborations: DOE, CEA, JLESC (NCSA, ANL, UTK, JSC, BSC, INRIA, RIKEN)
     • Applications
       • The government selected 9 social & scientific priority issues and 4 exploratory issues, and their R&D organizations
     Current Status
     • The first prototype CPU chip has been powered on at Fujitsu
     • Fujitsu is now evaluating the chip
     • System software stack is being implemented
     • Target applications are being tuned
     Courtesy of FUJITSU LIMITED
  3. Project Overview: Target Applications
     Program        Brief description
     ① GENESIS      MD for proteins
     ② Genomon      Genome processing (genome alignment)
     ③ GAMERA       Earthquake simulator (FEM in unstructured & structured grid)
     ④ NICAM+LETKF  Weather prediction system using big data (structured grid stencil & ensemble Kalman filter)
     ⑤ NTChem       Molecular electronic structure calculation
     ⑥ FFB          Large Eddy Simulation (unstructured grid)
     ⑦ RSDFT        An ab-initio program (density functional theory)
     ⑧ Adventure    Computational mechanics system for large-scale analysis and design (unstructured grid)
     ⑨ CCS-QCD      Lattice QCD simulation (structured grid Monte Carlo)
  4. Two compute nodes are implemented on one board (Courtesy of FUJITSU LIMITED)
     • CPU Architecture
       • Armv8-A + SVE (Scalable Vector Extension)
       • SIMD length: 512 bit
       • FP64/FP32/FP16
       • INT 1-, 2-, 4-, 8-byte
       • # of cores: 48 + (2/4 for OS)
       • Byte/DP Flop: approx. 0.4
       • Fujitsu's extensions
         • Inter-core barrier
         • Sector cache
         • Hardware prefetch assist
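As a small worked example of the SIMD figures on the CPU slide above: a vector register holds (SIMD length in bits) / (element width in bits) elements. The sketch below applies that division to the 512-bit SIMD length and the data types the slide lists. The names `lanes_per_vector` and `LANES` are illustrative and do not come from the slides.

```python
SIMD_BITS = 512  # SIMD length stated on the slide

# Element widths in bits for the data types listed on the slide.
ELEMENT_BITS = {
    "FP64": 64,
    "FP32": 32,
    "FP16": 16,
    "INT 8-byte": 64,
    "INT 4-byte": 32,
    "INT 2-byte": 16,
    "INT 1-byte": 8,
}


def lanes_per_vector(simd_bits, element_bits):
    """Number of elements of the given width that fit in one vector."""
    if element_bits <= 0 or simd_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the SIMD length")
    return simd_bits // element_bits


LANES = {name: lanes_per_vector(SIMD_BITS, bits)
         for name, bits in ELEMENT_BITS.items()}

for name, lanes in LANES.items():
    print(f"{name}: {lanes} elements per {SIMD_BITS}-bit vector")
```

So one 512-bit vector operation covers 8 double-precision, 16 single-precision, or 32 half-precision elements.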
  5. An Overview of Post-K Hardware
     • Compute Node, Compute + I/O Node connected by 6D mesh/torus interconnect
     • 3-level hierarchical storage system
       • 1st layer
         • Cache for global file system
         • Temporary file systems
           - Local file system for compute node
           - Shared file system for a job
       • 2nd layer
         • Lustre-based global file system
       • 3rd layer
         • Storage for archive
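To illustrate what "mesh/torus" means for node adjacency in the interconnect named above, here is a minimal sketch that lists the direct neighbors of a node in an N-dimensional network where each dimension is either a torus (wraps around) or a mesh (stops at the edge). The function name, the dimension sizes, and the wrap flags are made up for illustration; they are not the Post-K configuration, which the slides do not give.

```python
def neighbors(coord, shape, wrap):
    """Return the coordinates one hop away from `coord`.

    coord: tuple of per-dimension positions
    shape: tuple of per-dimension sizes
    wrap:  tuple of booleans; True = torus dimension (wraparound link),
           False = mesh dimension (no link past the edge)
    """
    result = []
    for axis, (pos, size, is_torus) in enumerate(zip(coord, shape, wrap)):
        for step in (-1, 1):
            new_pos = pos + step
            if is_torus:
                new_pos %= size          # wrap around the ring
            elif not 0 <= new_pos < size:
                continue                 # mesh edge: no neighbor this way
            if new_pos == pos:
                continue                 # dimension of size 1
            candidate = coord[:axis] + (new_pos,) + coord[axis + 1:]
            if candidate not in result:  # size-2 torus: both steps coincide
                result.append(candidate)
    return result


# Hypothetical 6D example: sizes and wrap flags are assumptions.
SHAPE = (4, 4, 4, 2, 3, 2)
WRAP = (True, True, True, False, True, False)
origin_neighbors = neighbors((0, 0, 0, 0, 0, 0), SHAPE, WRAP)
print(len(origin_neighbors), origin_neighbors)
```

In a torus dimension every node has a neighbor in both directions, while in a mesh dimension nodes at the edge have only one.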
  6. An Overview of System Software Stack
     Ease of use is one of our KPIs (Key Performance Indicators)
     Providing a wide range of applications/tools/libraries/compilers
     [Figure: software stack diagram with the following components]
     • Linux distribution ecosystem
     • Parallel programming environments: XMP, FDPS, …
     • Fortran, C/C++, OpenMP, Java, …
     • Math libraries
     • Tuning and debugging tools
     • Batch job system
     • Hierarchical file system
     • Application-oriented file I/O
     • Parallel file system
     • Communication: MPI
     • Low-level communication
     • File I/O for hierarchical storage: LLIO
     • Process/Thread: PiP
     • Multi-kernel system: Linux and light-weight kernel (McKernel)
     • Armv8 + SVE
  7. An Overview of System Software Stack (continued)
     Reference: Balazs Gerofi, Rolf Riesen, Masamichi Takagi, Taisuke Boku, Yutaka Ishikawa, Robert W. Wisniewski, "Performance and Scalability of Lightweight Multi-Kernel based Operating Systems," IPDPS2018, 2018.
  8. Post-K Programming Environment
     • Programming languages and compilers provided by Fujitsu
       • Fortran 2008 & Fortran 2018 subset
       • C11 & GNU and Clang extensions
       • C++14 & C++17 subset and GNU and Clang extensions
       • OpenMP 4.5 & OpenMP 5.0 subset
       • Java
     • Parallel programming language & domain-specific library provided by RIKEN
       • XcalableMP
       • FDPS (Framework for Developing Particle Simulator)
     • Process/thread library provided by RIKEN
       • PiP (Process in Process)
     • Script languages provided by Linux distributor
       • E.g., Python+NumPy, SciPy
     • Communication libraries
       • MPI 3.1 & MPI 4.0 subset
       • Open MPI base (Fujitsu), MPICH (RIKEN)
     • Low-level communication libraries
       • uTofu (Fujitsu), LLC (RIKEN)
     • File I/O libraries provided by RIKEN
       • pnetCDF, DTF, FTAR
     • Math libraries
       • BLAS, LAPACK, ScaLAPACK, SSL II (Fujitsu)
       • EigenEXA, Batched BLAS (RIKEN)
     • Programming tools provided by Fujitsu
       • Profiler, Debugger, GUI
     GCC and LLVM will also be available
  9. Support of Software Development/Porting
     [Figure: timeline, CY2017 to CY2021. Labels: Design and Implementation; Manufacturing, Installation, and Tuning; Operation; Specification; Armv8-A + SVE Overview; Detailed hardware info.; Optimization Guidebook; Publishing Incrementally; RIKEN Performance Evaluation Environment; Performance estimation tool using FX100; RIKEN Simulator; Early Access Program]
     • CY2018 Q2: Optimization guidebook is incrementally published
     • CY2020 Q2: Early access program starts
     • CY2021 Q1/Q2: General operation starts
     Note: Fujitsu will reveal features of the Post-K CPU at Hot Chips 2018, presenting the microarchitecture including core pipeline, cache, memory, NUMA, performance and power management features.
     • Takeo Yoshida, "Fujitsu's HPC processor for the Post-K computer," IEEE Hot Chips: A Symposium on High Performance Chips, San Jose, August 21, 2018.
     Contribution to Arm HPC (Armv8-A+SVE) Ecosystem
  10. Fujitsu's OSS Efforts
  11. OSS Survey (9 priority issues developers)
     • Application
       • MODYLAS, USQCD, OpenFOAM
     • Library
       • NumPy, SciPy, pysam, FFTW, LAPACK95, lapack, blas, Metis, ParMetis, HDF5, NetCDF, NetCDF-fortran, PnetCDF, scalasca, SCOTCH, Zoltan, openmpi1.8, openmpi1.10, mpich2-1.4.1, boost, FFTE, PETSc/SLEPc, Elemental, BWA, Star, Blat, TopHat, TopHat2, MapSplice2, MPDyn2, ELPA, Trilinos, Eigen3, mesa, MesaGLUT, libxml2, C-LIME, EigenExa
     • Tool/Visualization tool
       • git, git-flow, gnuplot, Paraview, VisIt, ImageMagick, svn, Samtools, bedtools, Biobambam, Picard, GMT, GrADS, HDF-EOS, wgrib, GRIB API, Climate Data Operators
     • Build tool
       • cmake, GNU Autotools, automake, autoconf, gcc, gfortran, C++, libtool
     • Shell script / Programming language / Script language
       • python2, python3, perl5, R, Ruby2, zsh, ksh, NCAR Command Language
  12. OSS Survey (K computer users)
     • Application
       • ABINIT-MP, AkaiKKR, bedtools, Biobambam, BWA, CUBE, ERmod, fdps, FFV-C, FrontFlow/Red, FrontISTR, GAMESS, GENESIS, GROMACS, HIVE, LAMMPS, MapSplice2, MODYLAS, NEURON, octa, OpenFOAM, PBVR, Picard, PIMD, Quantum ESPRESSO, rDock, Samtools, SCALE, Star, TopHat, TopHat2, WHEEL, xTAPP
     • Library
       • FFTW, matplotlib (python), beautiful soup (python), metis, ParMETIS, NetCDF4, HDF5, NuSDAS1.3, octa, fdps, Zoltan, cgns, Polylib, libsim
     • Visualization tool
       • gnuplot, PBVR, VTK, OSMesa
     • Tool
       • GNU utils, zlib, anaconda (python), itk, PAPI, PMlib, Szip, zip, TextParser, fpzip
     • Build tool
       • make, autoconf, cmake
     • Shell script / Programming language / Script language
       • bash, curl, python, ruby
     • ISV
       • ABAQUS, Advance, AMBER, Ansys fluent, Gaussian, FLUENT, SCRYU/Tetra, LS-DYNA, VPS solver (PAM-CRASH), Helyx, HEETAH, iconCFD, LaBS, JMAG, MIZUHO, NuFD, VASP, VSOP
  13. Open Source Management Tools
     • EasyBuild
       • Used at CEA
       • RIKEN is now evaluating it. As an example, CAFFE, a deep learning tool, has been ported to an Arm machine using EasyBuild
       • CAFFE consists of several open-source packages:
         - boost, blas, cmake, gflags, google (glog, googletest, snappy, leveldb, protobuf), lmdb, opencv
     • Spack
       • Used in the ECP project
       • RIKEN has also started evaluating Spack
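The CAFFE example above shows why such tools matter: every package an application depends on has to be built before the application itself. The sketch below illustrates only that ordering problem, using Python's standard `graphlib` module (Python 3.9+). It is not how EasyBuild or Spack are implemented or invoked. The graph models a single assumption, that `caffe` depends directly on each package named on the slide; dependencies among those packages are not given in the slides and are left out.

```python
from graphlib import TopologicalSorter

# Packages the slide lists as making up CAFFE.
CAFFE_DEPS = [
    "boost", "blas", "cmake", "gflags", "glog", "googletest",
    "snappy", "leveldb", "protobuf", "lmdb", "opencv",
]

# Assumption for illustration: caffe depends directly on every listed
# package. Real inter-package dependencies are not modeled here.
GRAPH = {"caffe": set(CAFFE_DEPS)}


def build_order(graph):
    """Return an order in which every package follows its dependencies."""
    return list(TopologicalSorter(graph).static_order())


ORDER = build_order(GRAPH)
print(" -> ".join(ORDER))
```

Whatever order the listed packages come out in, `caffe` is always last because it depends on all of them.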
