ARM-based Supercomputer from Fujitsu and RIKEN - "Post-K"

The Next Flagship Supercomputer in Japan
Yutaka Ishikawa, Project Leader
AICS RIKEN
02:15 pm ‐ 02:45 pm, June 21, 2016

Outline of Talk
Introduction of the project
An overview of Japan's next flagship supercomputer, the so-called post K
Introduction of International Collaborations
• The system software stack for post K is being developed through international collaborations
Concluding Remarks
ISC'16, June 21, 2016

Flagship 2020 project
• Developing the next Japanese flagship computer, the so-called “post K”
Vendor partner
• Developing a wide range of application codes to run on “post K” and solve major social and scientific issues
The Japanese government selected 9 social and scientific priority issues and their R&D organizations.
Priority areas: Disaster prevention and global climate; Energy issues; Industrial competitiveness; Basic science; Society with health and longevity

Target Applications: R&D organizations for priority issues such as disaster prevention and global climate, and society with health and longevity
Architectural Parameters
• #SIMD, SIMD length, #core, #NUMA node
• cache (size and bandwidth)
• memory technologies
• specialized hardware
• Interconnect
• I/O network
To build an efficient execution environment in terms of
• Power consumption,
• Productivity, and
• Usability
Application developers are involved in the design
• Mutual understanding of both computer architecture/system software and applications
• Looking at performance predictions
• Finding the best solution under constraints, e.g., power consumption, budget, and space
Prediction Tool
• Prediction of node-level performance: profiling applications, e.g., cache misses and execution unit usage
• Prediction of scalability (latency and bandwidth)
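The slides do not describe how the prediction tool works internally. As a rough illustration of what predicting scalability from latency and bandwidth can look like, here is a minimal sketch of the widely used latency-bandwidth cost model. The function names, the even split of compute across processes, and all numeric values are assumptions made for this example, not details of the project's tool.

```python
def message_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """Latency-bandwidth model: fixed startup cost plus size over bandwidth."""
    return latency_s + nbytes / bandwidth_bytes_per_s


def predicted_time(serial_compute_s, nprocs, nbytes_per_exchange,
                   latency_s, bandwidth_bytes_per_s, exchanges=1):
    """Predicted run time on nprocs processes.

    Assumes (hypothetically) that compute divides evenly across processes
    and that every process performs `exchanges` messages of the same size.
    """
    compute = serial_compute_s / nprocs
    comm = exchanges * message_time(nbytes_per_exchange, latency_s,
                                    bandwidth_bytes_per_s)
    return compute + comm


def parallel_efficiency(serial_compute_s, nprocs, nbytes_per_exchange,
                        latency_s, bandwidth_bytes_per_s, exchanges=1):
    """Serial compute time divided by (nprocs * predicted parallel time)."""
    t_parallel = predicted_time(serial_compute_s, nprocs, nbytes_per_exchange,
                                latency_s, bandwidth_bytes_per_s, exchanges)
    return serial_compute_s / (nprocs * t_parallel)


if __name__ == "__main__":
    # Illustrative values only: 100 s of serial work, 1 MB messages,
    # 1 microsecond latency, 5 GB/s bandwidth, 10 exchanges per run.
    for p in (1, 16, 256, 4096):
        eff = parallel_efficiency(100.0, p, 1_000_000, 1e-6, 5e9, exchanges=10)
        print(f"{p:5d} processes: predicted efficiency {eff:.3f}")
```

In a model like this, efficiency falls as the process count grows because the communication term stays fixed while the per-process compute shrinks, which is the kind of trade-off a co-design study would weigh against interconnect parameters.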

Target Applications: R&D organizations for priority issues such as disaster prevention and global climate, and society with health and longevity
International Collaboration
• DOE‐MEXT
• JLESC
• …
Communities
• HPCI Consortium
• PC Cluster Consortium
• OpenHPC
• …
Domestic Collaboration
• Univ. of Tsukuba
• Univ. of Tokyo
• Kyoto Univ.

An Overview of post K
• Hardware
  • Manycore architecture
  • 6D mesh/torus interconnect
  • 3-level hierarchical storage system
    • Silicon Disk
    • Magnetic Disk
    • Storage for archive
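The slide names a 6D mesh/torus interconnect but gives no dimension sizes or routing details. To make the idea concrete, the following is a minimal sketch of how one-hop neighbors can be computed when some axes wrap around (torus) and others do not (mesh). The function name, the shape, and the choice of which axes wrap are all hypothetical values chosen for illustration, not the post K configuration.

```python
def neighbors(coord, shape, wrap):
    """Return the coordinates one hop away from `coord`.

    coord: tuple of ints, one position per axis
    shape: tuple of ints, the number of nodes along each axis
    wrap:  tuple of bools, True for a torus axis (wraparound links),
           False for a mesh axis (no link past the edge)
    """
    result = []
    for axis, (size, wraps) in enumerate(zip(shape, wrap)):
        for step in (-1, 1):
            pos = coord[axis] + step
            if wraps:
                pos %= size          # torus axis: wrap around the edge
            elif not 0 <= pos < size:
                continue             # mesh axis: no neighbor beyond the edge
            candidate = coord[:axis] + (pos,) + coord[axis + 1:]
            # On very short axes both steps can reach the same node.
            if candidate != coord and candidate not in result:
                result.append(candidate)
    return result


if __name__ == "__main__":
    # Hypothetical 6D shape: four torus axes and two short mesh axes.
    shape = (4, 4, 4, 2, 3, 2)
    wrap = (True, True, True, False, True, False)
    for n in neighbors((0, 0, 0, 0, 0, 0), shape, wrap):
        print(n)
```

With six axes, an interior node has at most twelve one-hop neighbors; mesh edges and very short axes reduce that number.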
Target performance:
• Up to 100 times that of K for capacity computing
• Up to 50 times that of K for capability computing
• Power consumption of 30-40 MW (cf. K computer: 12.7 MW)
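The targets above imply a gain in performance per watt over K, although the slide does not state one. The following sketch works it out from the numbers on the slide, under the assumption (not made in the talk) that the maximum speedup and the stated power envelope apply at the same time. The function name is hypothetical.

```python
K_POWER_MW = 12.7  # K computer power consumption, from the slide


def perf_per_watt_gain(speedup_over_k, post_k_power_mw, k_power_mw=K_POWER_MW):
    """Speedup divided by the power ratio, i.e. the implied efficiency gain."""
    return speedup_over_k / (post_k_power_mw / k_power_mw)


if __name__ == "__main__":
    for label, speedup in (("capacity", 100), ("capability", 50)):
        for power_mw in (30, 40):
            gain = perf_per_watt_gain(speedup, power_mw)
            print(f"{label}: {speedup}x at {power_mw} MW "
                  f"implies about {gain:.1f}x performance per watt")
```

Under that assumption the implied gain is roughly 32x to 42x for capacity computing and roughly 16x to 21x for capability computing.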
[Figure: system configuration showing login servers, maintenance servers, portal servers, the I/O network, and the hierarchical storage system]
• System Software
  • Multi-Kernel: Linux with Light-weight Kernel
  • File I/O middleware for the 3-level hierarchical storage system and applications
  • Application-oriented file I/O middleware
  • MPI+OpenMP programming environment
  • Highly productive programming language and libraries

What we have done
• Software
  • OS functional design
  • Communication functional design
  • File I/O functional design
  • Programming languages
  • Mathematical libraries
• Hardware
  • Instruction set architecture
Continue to design
• Node architecture
• System configuration
• Storage system

Instruction Set Architecture
• ARM V8 HPC Extension
  • Fujitsu is a lead partner in ARM HPC extension development
  • Detailed features will be announced at Hot Chips 28 (2016)
    http://www.hotchips.org/program/
    Mon 8/22, Day 1, 9:45 AM, GPUs & HPCs: “ARMv8‐A Next Generation Vector Architecture for HPC”
• Features inherited from Fujitsu
  • FMA
  • Math acceleration primitives
  • Inter-core barrier
  • Sector cache
  • Hardware prefetch assist

Outline of Talk
Introduction of FLAGSHIP2020 project
An Overview of post K system
Introduction of International Collaborations
Concluding Remarks
More than 10 research topics
Collaboration Categories
◎ Collaborative development of open source software
◎ Evaluation and analysis of benchmarks and technologies
◎ Standardization of mature technologies
◎ Pre-standardization interface coordination
◎ Collection and publication of open data

System Software Collaboration: Example (DOE-MEXT)
In terms of Collaborative development of open source software
• Argonne contribution: CH4 hackathon for LLC
• AICS contribution: a part of CH4 implementation
• Memory management for new memory hierarchy
• MPICH and LLC communication libraries
MPICH Software Structure
CH4: the successor to CH3, the current abstract network device interface
◎ Collaborative development of open source software
◎ Evaluation and analysis of benchmarks and technologies

Northwestern University
• I/O benchmarks and pnetCDF implementations for scientific big data
PI: Takemasa Miyoshi, RIKEN AICS
“Innovating Big Data Assimilation technology for revolutionizing very‐short‐range severe weather prediction”
An innovative 30-second super-rapid update numerical weather prediction system for 30-minute/1-hour severe weather forecasting will be developed, aiding disaster prevention and mitigation, as well as bringing a scientific breakthrough in meteorology.
The results of 100 ensemble simulations are read by data assimilation processes; the total data size is over 1.7 TB.
◎ Collaborative development of open source software

System Software Collaboration: Example
Intel
• Meetings twice a year
• A researcher visits Intel for a few months
Lightweight kernel
McKernel is running on Intel Xeon and Xeon Phi
• Understanding the benefits of a lightweight kernel
• Understanding the differences between McKernel and mOS
• Standardization of an API for lightweight kernels (planned)
◎ Evaluation and analysis of benchmarks and technologies
◎ Pre-standardization interface coordination

• AICS and U Houston, U Tsukuba: Extension of the PGAS (Partitioned Global Address Space) model with language constructs for multitasking (multithreading) for manycore‐based exascale systems (XcalableMP 2.0)
XcalableMP (XMP) is a directive-based language for distributed memory systems
• PGAS language for large-scale distributed memory systems
• HPF‐like concept and OpenMP‐like description with directives
• Two memory models: Global View and Local View
  • Global View: PGAS; a large array is viewed as one array distributed into partial arrays across nodes
  • Local View: MPI‐like; Coarray notation is allowed
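The Global View idea, one logical array split into per-node pieces, can be illustrated independently of XMP syntax. XMP itself is used through directives in its host languages, so the Python sketch below is not XMP code. It only shows the index arithmetic behind a simple block distribution, one common way to partition an array, and all function names are hypothetical.

```python
def chunk_size(array_size, num_nodes):
    """Elements per node for a block distribution (ceiling division)."""
    return -(-array_size // num_nodes)


def block_owner(global_index, array_size, num_nodes):
    """Map a global index to (owning node, local index on that node)."""
    chunk = chunk_size(array_size, num_nodes)
    return global_index // chunk, global_index % chunk


def local_range(node, array_size, num_nodes):
    """Half-open range [start, stop) of global indices held by `node`."""
    chunk = chunk_size(array_size, num_nodes)
    start = min(node * chunk, array_size)
    stop = min(start + chunk, array_size)
    return start, stop


if __name__ == "__main__":
    size, nodes = 10, 4
    for node in range(nodes):
        print(f"node {node} holds global indices {local_range(node, size, nodes)}")
    print("global index 5 lives at", block_owner(5, size, nodes))
```

In the Global View, the programmer indexes the array globally and a mapping like this is handled by the language system; in the Local View, each node works on its own piece explicitly.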
◎ Collaborative development of open source software
◎ Evaluation and analysis of benchmarks and technologies
• ANL and AICS, U. Tsukuba: Runtime design for PGAS communication and multitasking using Argobots light‐weight user‐level threads

Concluding Remarks
• Fujitsu decided that post K's CPU will be based on ARM V8 with the HPC extension
  • Usability will be improved over the K computer by changing the architecture
  • Wider community support
• The system software stack for post K is being designed and implemented by leveraging international collaborations
  • The software stack developed at RIKEN is open source
  • It also runs on Intel Xeon and Xeon Phi
  • RIKEN would like to contribute to OpenHPC
