SlideShare a Scribd company logo
1 of 24
Download to read offline
Supercomputer Fugaku
Toshiyuki Shimizu
FUJITSU LIMITED
Feb. 18th, 2020
Copyright 2020 FUJITSU LIMITED
Outline
◼ Fugaku project overview
◼ Co-design
◼ Approach
◼ Design results
◼ Performance & energy consumption evaluation
◼ Green500
◼ OSS apps
◼ Fugaku priority issues
◼ Summary
Copyright 2020 FUJITSU LIMITED1
Copyright 2020 FUJITSU LIMITED
Supercomputer “Fugaku”, formerly known as Post-K
ApproachFocus
Application performance
Power efficiency
Usability
Co-design w/ application developers and Fujitsu-designed CPU core w/
high memory bandwidth utilizing HBM2
Leading-edge Si-technology, Fujitsu's proven low power & high
performance logic design, and power-controlling knobs
Arm®v8-A ISA with Scalable Vector Extension (“SVE”), and Arm standard
Linux
2
Fugaku project schedule
2011 2012 2013 2014 2015 2016 2017 2018 2020 20212019
Basic
design
Detailed design &
Implementation
Manufacturing,
Installation
and Tuning
General
operation
2022
Copyright 2020 FUJITSU LIMITED
Co-Design w/ apps groups
Architecture &
sizing
Select
apps
Feasibility study
Apps
review
Fugaku development & delivery
3
Fugaku co-design
◼ Co-design goals
◼ Obtain the best performance, 100x apps performance than K computer, within power budget, 30-40MW
• Design applications, compilers, libraries, and hardware
◼ Approach
◼ Estimate perf & power using apps info, performance counts of Fujitsu FX100, and cycle base simulator
• Computation time: brief & precise estimation
• Communication time: bandwidth and latency for communication w/ some attributes for communication patterns
• I/O time:
◼ Then, optimize apps/compilers etc. and resolve bottlenecks
◼ Estimation of performance and power
◼ Precise performance estimation for primary kernels
• Make & run Fugaku objects on the Fugaku cycle base simulator
◼ Brief performance estimation for other sections
• Replace performance counts of FX100 w/ Fugaku params: # of inst. commit/cycle, wait cycles of barrier, inst. fetch, branch, fp exec, data
load/store wait cycles of L1D/L2, etc.
◼ Power estimation
• DGEMM execution toggles on the emulator + estimation of memory and interconnect considering utilization + loss of convertors
Copyright 2020 FUJITSU LIMITED4
Co-design iterations around year 2015
◼ Apply each of application kernels
◼ Define/refine a set of architecture parameters
◼ Implement/tune the kernel under the architecture parameters
◼ Evaluate execution time using the estimation tools
◼ Identify hardware bottlenecks and explore design space
◼ Examples of architecture parameters
◼ Frequency, # of arch regs, SIMD width, cache structure & size, # of cores…
◼ Memory, interconnect parameters
◼ Implementation of instructions: i.e. Combined gather…
Copyright 2020 FUJITSU LIMITED5
A64FX CPU
◼ Arm SVE, high performance and high efficiency
◼ DP performance 2.7+ TFLOPS, >90%@DGEMM, (CPU freq=1.8/2.0/2.2 GHz)
◼ Memory BW 1024 GB/s, >80%@STREAM Triad
12x compute cores
1x assistant core
A64FX
ISA (Base, extension) Armv8.2-A, SVE
Peak DP performance 2.7+ TFLOPS
SIMD width 512-bit
# of cores 48 + 4
Memory capacity 32 GiB (HBM2 x4)
Memory peak bandwidth 1024 GB/s
PCIe Gen3 16 lanes
High speed interconnect TofuD integrated
PCle
Controller
Tofu
Interface
C
C
C
C
N
O
C
HBM2HBM2
HBM2HBM2
CMG CMG
CMG CMG
CMG:Core Memory Group NOC:Network on Chip
Copyright 2020 FUJITSU LIMITED
FUGAKU
operating cond.
6
◼ TSMC 7nm FinFET & CoWoS
◼ Broadcom SerDes, HBM I/O, and SRAMs
◼ 8.786 billion transistors
◼ 594 signal pins
A64FX leading-edge Si-technology
Copyright 2020 FUJITSU LIMITED
Core Core Core Core Core Core Core Core Core Core
Core Core Core Core Core
Core Core Core Core
Core Core Core Core Core Core Core
Core Core Core Core Core Core Core Core Core Core Core Core
Core Core Core Core Core Core Core Core Core Core
Core Core Core Core
L2
Cache
L2
Cache
L2
Cache
L2
Cache
HBM2InterfaceHBM2Interface
HBM2interfaceHBM2Interface
PCIe InterfaceTofuD Interface
RING-Bus
7
Fujitsu-designed CPU core w/ High Memory Bandwidth
◼ A64FX out-of-order controls in
cores, caches, and memories
achieve superior throughput
HBM2 8GiBHBM2 8GiBHBM2 8GiBHBM2 8GiB
CMG
L1D 64KiB, 4way
512-bit wide SIMD
2x FMAs
CoreCore
230+
GB/s
115+
GB/s
12x Compute Cores + 1x Assistant Core
L2 Cache 8MiB, 16way
256
GB/s
115+
GB/s
57+
GB/s
Core Core
x4
BW and calc. perf. A64FX B/F
DP floating perf. (TFlops) 2.7+ -
L1 data cache (TB/s) 11+ 4
L2 cache (TB/s) 3.6+ 1.3
Memory BW (GB/s) 1024 0.37
Copyright 2020 FUJITSU LIMITED8
A64FX optimized load efficiency for apps performance
◼ 128 bytes/cycle sustained bandwidth
even for unaligned SIMD load
◼ “Combined Gather” doubles gather
(indirect) load’s data throughput,
when target elements are within a
“128-byte aligned block” for a pair of
two regs, even & odd
L1D cache
Read port0
Read port1
Read data0
64B/cycle
Read data1
64B/cycle
Mem.
128B
0 1 2 3 4 5 6 7
0 1 3 2 6 7 45
8B
Regs
flow-1
flow-2
flow-4 flow-3
Maximizes BW to 32 bytes/cyc.
Copyright 2020 FUJITSU LIMITED
Suggested through Co-design work w/ app teams
9
A64FX Power Knobs to reduce power consumption
◼ “Power knob” limits units’ activities via user APIs
◼ Performance/Watt can be optimized by utilizing Power knobs
Copyright 2020 FUJITSU LIMITED
L1 inst.
cache
Branch
Predictor
Decode
& Issue
RSE0
RSA
RSE1
RSBR
PGPR
EXA
EXB
EAGA
EXC
EAGB
EXD
PFPR
Fetch
Port
Store
Port
L1 data
cache
HBM2 Controller
Fetch Issue Dispatch Reg-read Execute Cache and Memory
CSE
Commit
PC
Control
Registers
L2 cache
HBM2
Write
Buffer
Tofu controller
FLA
PPR
FLB
PRX
FLA pipeline only
HBM2 B/W adjust
(units of 10%)
Frequency reduction
Decode width: 4 to 2
EXA pipeline only
PCI Controller
52 cores
10
Fugaku system software developed with RIKEN
Fugaku system hardware
Multi-Kernel System: RHEL8 and light-weight kernel (IHK/McKernel)
Fujitsu Technical Computing Suite / RIKEN-developed system software
Fugaku applications
Management software Programming environmentFile system
System management
for high availability &
power saving
operation
Job management for
higher system
utilization & power
efficiency
LLIO
NVM-based file I/O
Accelerator for Fugaku
OpenMP, COARRAY, Math.libs.
Open MPI, MPICH(PiP), DTF
XcalableMP, FDPSFEFS
Lustre-based
distributed file system
Compilers, AI frameworks
Debug & tuning tools, Containers
Fugaku developed w/ RIKEN
Copyright 2020 FUJITSU LIMITED
Tensorflow
Chainer
Pytorch
Docker
KVM
Singularity
FDPS: Framework for Developing Particle Simulator
PiP: Process-in-Process
DTF: Data Transfer Framework
IHK: Interface for Heterogeneous Kernels
11
Outline
◼ Fugaku project overview
◼ Co-design
◼ Approach
◼ Design results
◼ Performance & energy consumption evaluation
◼ Green500
◼ OSS apps
◼ Fugaku priority issues
◼ Summary
Copyright 2020 FUJITSU LIMITED12
Green500, Nov. 2019
◼ A64FX prototype –
◼ Fujitsu A64FX 48C 2GHz
ranked #1on the list
◼ 768x general purpose A64FX
CPU w/o accelerators
• 1.9995 PFLOPS @ HPL, 84.75%
• 16.876 GF/W
• Power quality level 2
Copyright 2020 FUJITSU LIMITED
https://www.top500.org/green500/lists/2019/11/
13
How we achieved
◼ Key for GF/W is {energy efficient HW} x {parallel/exec efficiency}
◼ A64FX is designed for energy efficient
◼ Fujitsu’s proven CPU microarchitecture & 7nm FinFET
◼ SoC design: Tofu interconnect D integrated
◼ CoWoS: 4x HBM2 for main memory integrated
◼ Superior parallel/exec efficiency
◼ Math. libraries are tuned for application efficiency
◼ Comm. libs are also tuned utilizing long experience of Tofu @ K computer
◼ Performance tuning is efficiently done utilizing rich performance
analyzer/monitor
Copyright 2020 FUJITSU LIMITED
+ Concerted efforts
of co-design
14
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
0 5,000,000 10,000,000 15,000,000 20,000,000
SC19 TOP500 calculation efficiency
◼ A64FX superior
efficiency 84.75%
with small Nmax
◼ Results of:
◼ Optimized
communication
and math. libs
◼ Optimization of
overlapped
communication
Copyright 2020 FUJITSU LIMITED
A64FX prototype
Nmax
Challenge
Efficiency
w/ Acc
w/o Acc
15
A64FX CPU performance evaluation for real apps
◼ Open source software, Real
apps on an A64FX @ 2.2GHz
◼ Up to 1.8x faster over the
latest x86 processor (24core,
2.9GHz) x2
◼ High memory BW and long
SIMD length of A64FX work
effectively with these
applications
Copyright 2020 FUJITSU LIMITED
OpenFOAM
FrontlSTR
ABINIT
SALMON
SPECFEM3D
WRF
MPAS
FUJITSU A64FX x86 (24core, 2.9GHz) x2
0.5 1.0 1.5
Relative speed up ratio
16
A64FX CPU power efficiency for real apps
◼ Performance / Energy
consumption on an A64FX @
2.2GHz
◼ Up to 3.7x more efficient over
the latest x86 processor
(24core, 2.9GHz) x2
◼ High efficiency is achieved by
energy-conscious design and
implementation
Copyright 2020 FUJITSU LIMITED
OpenFOAM
FrontlSTR
ABINIT
SALMON
SPECFEM3D
WRF
MPAS
1.0 2.0 3.0 4.0
Relative power efficiency ratio
FUJITSU A64FX x86 (24core, 2.9GHz) x2
17
Fugaku priority issues and performance prediction
◼ 100x app performance and
power budget requirements
are met
◼ Some apps utilize power knob
to reduce power consumption
and achieve high energy
efficiency
◼ Measuring and optimizing
using real HW
Copyright 2020 FUJITSU LIMITED
https://postk-web.r-ccs.riken.jp/perf.html
18
Fugaku project schedule
2011 2012 2013 2014 2015 2016 2017 2018 2020 20212019
Basic
design
Detailed design &
Implementation
Manufacturing,
Installation
and Tuning
General
operation
2022
Copyright 2020 FUJITSU LIMITED
Co-Design w/ apps groups
Architecture &
sizing
Select
apps
Feasibility study
Apps
review
Fugaku development & delivery
Previous estimation w/o HW w/ HW
Apps tuning
19
Current status normalized by the estimation w/o HW
◼ Performance and energy (=power*elapse) of apps
Copyright 2020 FUJITSU LIMITED
A
Estimationw/HW/w/oHW
Fasterthanprev.est.
Lowerenergythanprev.est.
B C D E FDGEMM Stream
Performance
Energy
Note: evaluation underway
20
Current status normalized by the estimation w/o HW
◼ Performance and energy (=power*elapse) of apps
Copyright 2020 FUJITSU LIMITED
A
Estimationw/HW/w/oHW
B C D E FDGEMM Stream
• Performances are match
due to precise simulation
• Lower power 8% in logic &
15% in memory use;
acceptable variations
Computations were
reduced after the
previous estimation
Good estimation and
optimization: errors are
less than 20%
Performance
Energy
Note: evaluation underway
21
Summary
Copyright 2020 FUJITSU LIMITED
https://www.riken.jp/pr/news/2019/20190827_1/
◼ “Fugaku” is co-designed and runs apps at high
performance w/ optimal power consumption
◼ Arm HPC ecosystem and apps portfolio are growing
by efforts of communities and selections of HW
Fujitsu PRIMEHPC
FX1000
◼ Fujitsu Supercomputer
PRIMEHPC FX1000 & FX700
based on Fugaku tech.
◼ Cray CS500 using Fujitsu
A64FX Arm CPU
FX700
22
Copyright 2020 FUJITSU LIMITED23

More Related Content

What's hot

Region推論アルゴリズムの実装と動向
Region推論アルゴリズムの実装と動向Region推論アルゴリズムの実装と動向
Region推論アルゴリズムの実装と動向Takamasa Saichi
 
The Tofu Interconnect D for the Post K Supercomputer
The Tofu Interconnect D for the Post K SupercomputerThe Tofu Interconnect D for the Post K Supercomputer
The Tofu Interconnect D for the Post K Supercomputerinside-BigData.com
 
3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)
3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)
3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)Ryuichi Yasunaga
 
SAT/SMTソルバの仕組み
SAT/SMTソルバの仕組みSAT/SMTソルバの仕組み
SAT/SMTソルバの仕組みMasahiro Sakai
 
IETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U Translation
IETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U TranslationIETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U Translation
IETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U TranslationKentaro Ebisawa
 
第3回 配信講義 計算科学技術特論A (2021)
第3回 配信講義 計算科学技術特論A (2021) 第3回 配信講義 計算科学技術特論A (2021)
第3回 配信講義 計算科学技術特論A (2021) RCCSRENKEI
 
Replacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumReplacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumMichal Rostecki
 
Singularityで分散深層学習
Singularityで分散深層学習Singularityで分散深層学習
Singularityで分散深層学習Hitoshi Sato
 
5G Cellular D2D RDMA Clusters
5G Cellular D2D RDMA Clusters5G Cellular D2D RDMA Clusters
5G Cellular D2D RDMA ClustersYitzhak Bar-Geva
 
OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介VirtualTech Japan Inc.
 
Slurmのジョブスケジューリングと実装
Slurmのジョブスケジューリングと実装Slurmのジョブスケジューリングと実装
Slurmのジョブスケジューリングと実装Ryuichi Sakamoto
 
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Kentaro Ebisawa
 
第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Brendan Gregg
 
第8回 配信講義 計算科学技術特論A(2021)
第8回 配信講義 計算科学技術特論A(2021)第8回 配信講義 計算科学技術特論A(2021)
第8回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...Preferred Networks
 
1076: CUDAデバッグ・プロファイリング入門
1076: CUDAデバッグ・プロファイリング入門1076: CUDAデバッグ・プロファイリング入門
1076: CUDAデバッグ・プロファイリング入門NVIDIA Japan
 

What's hot (20)

Region推論アルゴリズムの実装と動向
Region推論アルゴリズムの実装と動向Region推論アルゴリズムの実装と動向
Region推論アルゴリズムの実装と動向
 
The Tofu Interconnect D for the Post K Supercomputer
The Tofu Interconnect D for the Post K SupercomputerThe Tofu Interconnect D for the Post K Supercomputer
The Tofu Interconnect D for the Post K Supercomputer
 
3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)
3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)
3GPP 5G SA Detailed explanation 5(5G SA Handover Call Flow include 5GC)
 
SAT/SMTソルバの仕組み
SAT/SMTソルバの仕組みSAT/SMTソルバの仕組み
SAT/SMTソルバの仕組み
 
GNU Radio Study for Super beginner
GNU Radio Study for Super beginnerGNU Radio Study for Super beginner
GNU Radio Study for Super beginner
 
IETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U Translation
IETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U TranslationIETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U Translation
IETF 104 Hackathon VPP Prototyping Stateless SRv6/GTP-U Translation
 
第3回 配信講義 計算科学技術特論A (2021)
第3回 配信講義 計算科学技術特論A (2021) 第3回 配信講義 計算科学技術特論A (2021)
第3回 配信講義 計算科学技術特論A (2021)
 
Replacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with CiliumReplacing iptables with eBPF in Kubernetes with Cilium
Replacing iptables with eBPF in Kubernetes with Cilium
 
FreeRTOS Course - Semaphore/Mutex Management
FreeRTOS Course - Semaphore/Mutex ManagementFreeRTOS Course - Semaphore/Mutex Management
FreeRTOS Course - Semaphore/Mutex Management
 
Lispマシン・シミュレータの紹介
Lispマシン・シミュレータの紹介Lispマシン・シミュレータの紹介
Lispマシン・シミュレータの紹介
 
Singularityで分散深層学習
Singularityで分散深層学習Singularityで分散深層学習
Singularityで分散深層学習
 
5G Cellular D2D RDMA Clusters
5G Cellular D2D RDMA Clusters5G Cellular D2D RDMA Clusters
5G Cellular D2D RDMA Clusters
 
OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介OpenStackを使用したGPU仮想化IaaS環境 事例紹介
OpenStackを使用したGPU仮想化IaaS環境 事例紹介
 
Slurmのジョブスケジューリングと実装
Slurmのジョブスケジューリングと実装Slurmのジョブスケジューリングと実装
Slurmのジョブスケジューリングと実装
 
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
Zebra SRv6 CLI on Linux Dataplane (ENOG#49)
 
第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)第13回 配信講義 計算科学技術特論B(2022)
第13回 配信講義 計算科学技術特論B(2022)
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
第8回 配信講義 計算科学技術特論A(2021)
第8回 配信講義 計算科学技術特論A(2021)第8回 配信講義 計算科学技術特論A(2021)
第8回 配信講義 計算科学技術特論A(2021)
 
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
深層学習の新しい応用と、 それを支える計算機の進化 - Preferred Networks CEO 西川徹 (SEMICON Japan 2022 Ke...
 
1076: CUDAデバッグ・プロファイリング入門
1076: CUDAデバッグ・プロファイリング入門1076: CUDAデバッグ・プロファイリング入門
1076: CUDAデバッグ・プロファイリング入門
 

Similar to 08 Supercomputer Fugaku

Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesKTN
 
Top 10 Supercomputers With Descriptive Information & Analysis
Top 10 Supercomputers With Descriptive Information & AnalysisTop 10 Supercomputers With Descriptive Information & Analysis
Top 10 Supercomputers With Descriptive Information & AnalysisNomanSiddiqui41
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemLinaro
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionNVIDIA Taiwan
 
The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem
The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC EcosystemThe First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem
The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC EcosystemShinji Sumimoto
 
Hortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIHortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIDataWorks Summit
 
Fujitsu Presents Post-K CPU Specifications
Fujitsu Presents Post-K CPU SpecificationsFujitsu Presents Post-K CPU Specifications
Fujitsu Presents Post-K CPU Specificationsinside-BigData.com
 
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...KTN
 
Fujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilitiesFujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilitiessolarisyougood
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerRebekah Rodriguez
 
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors Rebekah Rodriguez
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerRebekah Rodriguez
 
Fugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons LearnedFugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons LearnedRCCSRENKEI
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureRebekah Rodriguez
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...Filipe Miranda
 
01 From K to Fugaku
01 From K to Fugaku01 From K to Fugaku
01 From K to FugakuRCCSRENKEI
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataLviv Startup Club
 

Similar to 08 Supercomputer Fugaku (20)

Implementing AI: High Performace Architectures
Implementing AI: High Performace ArchitecturesImplementing AI: High Performace Architectures
Implementing AI: High Performace Architectures
 
Top 10 Supercomputers With Descriptive Information & Analysis
Top 10 Supercomputers With Descriptive Information & AnalysisTop 10 Supercomputers With Descriptive Information & Analysis
Top 10 Supercomputers With Descriptive Information & Analysis
 
Post-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC EcosystemPost-K: Building the Arm HPC Ecosystem
Post-K: Building the Arm HPC Ecosystem
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server Solution
 
The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem
The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC EcosystemThe First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem
The First SVE Enabled Arm Processor: A64FX and Building up Arm HPC Ecosystem
 
NWU and HPC
NWU and HPCNWU and HPC
NWU and HPC
 
Hortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIHortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AI
 
Fujitsu Presents Post-K CPU Specifications
Fujitsu Presents Post-K CPU SpecificationsFujitsu Presents Post-K CPU Specifications
Fujitsu Presents Post-K CPU Specifications
 
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
Implementing AI: High Performance Architectures: Large scale HPC hardware in ...
 
Fujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilitiesFujitsu m10 server features and capabilities
Fujitsu m10 server features and capabilities
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 
IBM PureSystems
IBM PureSystemsIBM PureSystems
IBM PureSystems
 
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
X13 Pre-Release Update featuring 4th Gen Intel® Xeon® Scalable Processors
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 
Fugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons LearnedFugaku, the Successes and the Lessons Learned
Fugaku, the Successes and the Lessons Learned
 
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the FutureSupermicro’s Universal GPU: Modular, Standards Based and Built for the Future
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future
 
POWER9 for AI & HPC
POWER9 for AI & HPCPOWER9 for AI & HPC
POWER9 for AI & HPC
 
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
New Generation of IBM Power Systems Delivering value with Red Hat Enterprise ...
 
01 From K to Fugaku
01 From K to Fugaku01 From K to Fugaku
01 From K to Fugaku
 
HPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big DataHPE Solutions for Challenges in AI and Big Data
HPE Solutions for Challenges in AI and Big Data
 

More from RCCSRENKEI

第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第9回 配信講義 計算科学技術特論B(2022)
 第9回 配信講義 計算科学技術特論B(2022) 第9回 配信講義 計算科学技術特論B(2022)
第9回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...RCCSRENKEI
 
Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...RCCSRENKEI
 
第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)RCCSRENKEI
 
210603 yamamoto
210603 yamamoto210603 yamamoto
210603 yamamotoRCCSRENKEI
 
第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 
第14回 配信講義 計算科学技術特論A(2021)
第14回 配信講義 計算科学技術特論A(2021)第14回 配信講義 計算科学技術特論A(2021)
第14回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 
第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)RCCSRENKEI
 

More from RCCSRENKEI (20)

第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)第15回 配信講義 計算科学技術特論B(2022)
第15回 配信講義 計算科学技術特論B(2022)
 
第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)第14回 配信講義 計算科学技術特論B(2022)
第14回 配信講義 計算科学技術特論B(2022)
 
第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)第12回 配信講義 計算科学技術特論B(2022)
第12回 配信講義 計算科学技術特論B(2022)
 
第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)第11回 配信講義 計算科学技術特論B(2022)
第11回 配信講義 計算科学技術特論B(2022)
 
第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)第10回 配信講義 計算科学技術特論B(2022)
第10回 配信講義 計算科学技術特論B(2022)
 
第9回 配信講義 計算科学技術特論B(2022)
 第9回 配信講義 計算科学技術特論B(2022) 第9回 配信講義 計算科学技術特論B(2022)
第9回 配信講義 計算科学技術特論B(2022)
 
第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)第8回 配信講義 計算科学技術特論B(2022)
第8回 配信講義 計算科学技術特論B(2022)
 
第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)第7回 配信講義 計算科学技術特論B(2022)
第7回 配信講義 計算科学技術特論B(2022)
 
第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)第6回 配信講義 計算科学技術特論B(2022)
第6回 配信講義 計算科学技術特論B(2022)
 
第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)第5回 配信講義 計算科学技術特論B(2022)
第5回 配信講義 計算科学技術特論B(2022)
 
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
Realization of Innovative Light Energy Conversion Materials utilizing the Sup...
 
Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...Current status of the project "Toward a unified view of the universe: from la...
Current status of the project "Toward a unified view of the universe: from la...
 
第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)第4回 配信講義 計算科学技術特論B(2022)
第4回 配信講義 計算科学技術特論B(2022)
 
第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)第3回 配信講義 計算科学技術特論B(2022)
第3回 配信講義 計算科学技術特論B(2022)
 
第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)第2回 配信講義 計算科学技術特論B(2022)
第2回 配信講義 計算科学技術特論B(2022)
 
第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)第1回 配信講義 計算科学技術特論B(2022)
第1回 配信講義 計算科学技術特論B(2022)
 
210603 yamamoto
210603 yamamoto210603 yamamoto
210603 yamamoto
 
第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)第15回 配信講義 計算科学技術特論A(2021)
第15回 配信講義 計算科学技術特論A(2021)
 
第14回 配信講義 計算科学技術特論A(2021)
第14回 配信講義 計算科学技術特論A(2021)第14回 配信講義 計算科学技術特論A(2021)
第14回 配信講義 計算科学技術特論A(2021)
 
第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)第12回 配信講義 計算科学技術特論A(2021)
第12回 配信講義 計算科学技術特論A(2021)
 

Recently uploaded

Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfWadeK3
 

Recently uploaded (20)

Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 

08 Supercomputer Fugaku

  • 1. Supercomputer Fugaku Toshiyuki Shimizu FUJITSU LIMITED Feb. 18th, 2020 Copyright 2020 FUJITSU LIMITED
  • 2. Outline ◼ Fugaku project overview ◼ Co-design ◼ Approach ◼ Design results ◼ Performance & energy consumption evaluation ◼ Green500 ◼ OSS apps ◼ Fugaku priority issues ◼ Summary Copyright 2020 FUJITSU LIMITED1
  • 3. Copyright 2020 FUJITSU LIMITED Supercomputer “Fugaku”, formerly known as Post-K ApproachFocus Application performance Power efficiency Usability Co-design w/ application developers and Fujitsu-designed CPU core w/ high memory bandwidth utilizing HBM2 Leading-edge Si-technology, Fujitsu's proven low power & high performance logic design, and power-controlling knobs Arm®v8-A ISA with Scalable Vector Extension (“SVE”), and Arm standard Linux 2
  • 4. Fugaku project schedule 2011 2012 2013 2014 2015 2016 2017 2018 2020 20212019 Basic design Detailed design & Implementation Manufacturing, Installation and Tuning General operation 2022 Copyright 2020 FUJITSU LIMITED Co-Design w/ apps groups Architecture & sizing Select apps Feasibility study Apps review Fugaku development & delivery 3
  • 5. Fugaku co-design ◼ Co-design goals ◼ Obtain the best performance, 100x apps performance than K computer, within power budget, 30-40MW • Design applications, compilers, libraries, and hardware ◼ Approach ◼ Estimate perf & power using apps info, performance counts of Fujitsu FX100, and cycle base simulator • Computation time: brief & precise estimation • Communication time: bandwidth and latency for communication w/ some attributes for communication patterns • I/O time: ◼ Then, optimize apps/compilers etc. and resolve bottlenecks ◼ Estimation of performance and power ◼ Precise performance estimation for primary kernels • Make & run Fugaku objects on the Fugaku cycle base simulator ◼ Brief performance estimation for other sections • Replace performance counts of FX100 w/ Fugaku params: # of inst. commit/cycle, wait cycles of barrier, inst. fetch, branch, fp exec, data load/store wait cycles of L1D/L2, etc. ◼ Power estimation • DGEMM execution toggles on the emulator + estimation of memory and interconnect considering utilization + loss of convertors Copyright 2020 FUJITSU LIMITED4
  • 6. Co-design iterations around year 2015 ◼ Apply each of application kernels ◼ Define/refine a set of architecture parameters ◼ Implement/tune the kernel under the architecture parameters ◼ Evaluate execution time using the estimation tools ◼ Identify hardware bottlenecks and explore design space ◼ Examples of architecture parameters ◼ Frequency, # of arch regs, SIMD width, cache structure & size, # of cores… ◼ Memory, interconnect parameters ◼ Implementation of instructions: i.e. Combined gather… Copyright 2020 FUJITSU LIMITED5
  • 7. A64FX CPU ◼ Arm SVE, high performance and high efficiency ◼ DP performance 2.7+ TFLOPS, >90%@DGEMM, (CPU freq=1.8/2.0/2.2 GHz) ◼ Memory BW 1024 GB/s, >80%@STREAM Triad 12x compute cores 1x assistant core A64FX ISA (Base, extension) Armv8.2-A, SVE Peak DP performance 2.7+ TFLOPS SIMD width 512-bit # of cores 48 + 4 Memory capacity 32 GiB (HBM2 x4) Memory peak bandwidth 1024 GB/s PCIe Gen3 16 lanes High speed interconnect TofuD integrated PCle Controller Tofu Interface C C C C N O C HBM2HBM2 HBM2HBM2 CMG CMG CMG CMG CMG:Core Memory Group NOC:Network on Chip Copyright 2020 FUJITSU LIMITED FUGAKU operating cond. 6
  • 8. ◼ TSMC 7nm FinFET & CoWoS ◼ Broadcom SerDes, HBM I/O, and SRAMs ◼ 8.786 billion transistors ◼ 594 signal pins A64FX leading-edge Si-technology Copyright 2020 FUJITSU LIMITED Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core Core L2 Cache L2 Cache L2 Cache L2 Cache HBM2InterfaceHBM2Interface HBM2interfaceHBM2Interface PCIe InterfaceTofuD Interface RING-Bus 7
  • 9. Fujitsu-designed CPU core w/ High Memory Bandwidth ◼ A64FX out-of-order controls in cores, caches, and memories achieve superior throughput HBM2 8GiBHBM2 8GiBHBM2 8GiBHBM2 8GiB CMG L1D 64KiB, 4way 512-bit wide SIMD 2x FMAs CoreCore 230+ GB/s 115+ GB/s 12x Compute Cores + 1x Assistant Core L2 Cache 8MiB, 16way 256 GB/s 115+ GB/s 57+ GB/s Core Core x4 BW and calc. perf. A64FX B/F DP floating perf. (TFlops) 2.7+ - L1 data cache (TB/s) 11+ 4 L2 cache (TB/s) 3.6+ 1.3 Memory BW (GB/s) 1024 0.37 Copyright 2020 FUJITSU LIMITED8
  • 10. A64FX optimized load efficiency for apps performance ◼ 128 bytes/cycle sustained bandwidth even for unaligned SIMD load ◼ “Combined Gather” doubles gather (indirect) load’s data throughput, when target elements are within a “128-byte aligned block” for a pair of two regs, even & odd L1D cache Read port0 Read port1 Read data0 64B/cycle Read data1 64B/cycle Mem. 128B 0 1 2 3 4 5 6 7 0 1 3 2 6 7 45 8B Regs flow-1 flow-2 flow-4 flow-3 Maximizes BW to 32 bytes/cyc. Copyright 2020 FUJITSU LIMITED Suggested through Co-design work w/ app teams 9
  • 11. A64FX Power Knobs to reduce power consumption ◼ “Power knob” limits units’ activities via user APIs ◼ Performance/Watt can be optimized by utilizing Power knobs Copyright 2020 FUJITSU LIMITED L1 inst. cache Branch Predictor Decode & Issue RSE0 RSA RSE1 RSBR PGPR EXA EXB EAGA EXC EAGB EXD PFPR Fetch Port Store Port L1 data cache HBM2 Controller Fetch Issue Dispatch Reg-read Execute Cache and Memory CSE Commit PC Control Registers L2 cache HBM2 Write Buffer Tofu controller FLA PPR FLB PRX FLA pipeline only HBM2 B/W adjust (units of 10%) Frequency reduction Decode width: 4 to 2 EXA pipeline only PCI Controller 52 cores 10
  • 12. Fugaku system software developed with RIKEN Fugaku system hardware Multi-Kernel System: RHEL8 and light-weight kernel (IHK/McKernel) Fujitsu Technical Computing Suite / RIKEN-developed system software Fugaku applications Management software Programming environmentFile system System management for high availability & power saving operation Job management for higher system utilization & power efficiency LLIO NVM-based file I/O Accelerator for Fugaku OpenMP, COARRAY, Math.libs. Open MPI, MPICH(PiP), DTF XcalableMP, FDPSFEFS Lustre-based distributed file system Compilers, AI frameworks Debug & tuning tools, Containers Fugaku developed w/ RIKEN Copyright 2020 FUJITSU LIMITED Tensorflow Chainer Pytorch Docker KVM Singularity FDPS: Framework for Developing Particle Simulator PiP: Process-in-Process DTF: Data Transfer Framework IHK: Interface for Heterogeneous Kernels 11
  • 13. Outline ◼ Fugaku project overview ◼ Co-design ◼ Approach ◼ Design results ◼ Performance & energy consumption evaluation ◼ Green500 ◼ OSS apps ◼ Fugaku priority issues ◼ Summary Copyright 2020 FUJITSU LIMITED12
  • 14. Green500, Nov. 2019 ◼ A64FX prototype – ◼ Fujitsu A64FX 48C 2GHz ranked #1on the list ◼ 768x general purpose A64FX CPU w/o accelerators • 1.9995 PFLOPS @ HPL, 84.75% • 16.876 GF/W • Power quality level 2 Copyright 2020 FUJITSU LIMITED https://www.top500.org/green500/lists/2019/11/ 13
  • 15. How we achieved ◼ Key for GF/W is {energy efficient HW} x {parallel/exec efficiency} ◼ A64FX is designed for energy efficient ◼ Fujitsu’s proven CPU microarchitecture & 7nm FinFET ◼ SoC design: Tofu interconnect D integrated ◼ CoWoS: 4x HBM2 for main memory integrated ◼ Superior parallel/exec efficiency ◼ Math. libraries are tuned for application efficiency ◼ Comm. libs are also tuned utilizing long experience of Tofu @ K computer ◼ Performance tuning is efficiently done utilizing rich performance analyzer/monitor Copyright 2020 FUJITSU LIMITED + Concerted efforts of co-design 14
  • 16. 0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100% 0 5,000,000 10,000,000 15,000,000 20,000,000 SC19 TOP500 calculation efficiency ◼ A64FX superior efficiency 84.75% with small Nmax ◼ Results of: ◼ Optimized communication and math. libs ◼ Optimization of overlapped communication Copyright 2020 FUJITSU LIMITED A64FX prototype Nmax Challenge Efficiency w/ Acc w/o Acc 15
  • 17. A64FX CPU performance evaluation for real apps ◼ Open source software, Real apps on an A64FX @ 2.2GHz ◼ Up to 1.8x faster over the latest x86 processor (24core, 2.9GHz) x2 ◼ High memory BW and long SIMD length of A64FX work effectively with these applications Copyright 2020 FUJITSU LIMITED OpenFOAM FrontlSTR ABINIT SALMON SPECFEM3D WRF MPAS FUJITSU A64FX x86 (24core, 2.9GHz) x2 0.5 1.0 1.5 Relative speed up ratio 16
  • 18. A64FX CPU power efficiency for real apps ◼ Performance / Energy consumption on an A64FX @ 2.2GHz ◼ Up to 3.7x more efficient over the latest x86 processor (24core, 2.9GHz) x2 ◼ High efficiency is achieved by energy-conscious design and implementation Copyright 2020 FUJITSU LIMITED OpenFOAM FrontlSTR ABINIT SALMON SPECFEM3D WRF MPAS 1.0 2.0 3.0 4.0 Relative power efficiency ratio FUJITSU A64FX x86 (24core, 2.9GHz) x2 17
  • 19. Fugaku priority issues and performance prediction ◼ 100x app performance and power budget requirements are met ◼ Some apps utilize power knob to reduce power consumption and achieve high energy efficiency ◼ Measuring and optimizing using real HW Copyright 2020 FUJITSU LIMITED https://postk-web.r-ccs.riken.jp/perf.html 18
  • 20. Fugaku project schedule 2011 2012 2013 2014 2015 2016 2017 2018 2020 20212019 Basic design Detailed design & Implementation Manufacturing, Installation and Tuning General operation 2022 Copyright 2020 FUJITSU LIMITED Co-Design w/ apps groups Architecture & sizing Select apps Feasibility study Apps review Fugaku development & delivery Previous estimation w/o HW w/ HW Apps tuning 19
  • 21. Current status normalized by the estimation w/o HW ◼ Performance and energy (=power*elapse) of apps Copyright 2020 FUJITSU LIMITED A Estimationw/HW/w/oHW Fasterthanprev.est. Lowerenergythanprev.est. B C D E FDGEMM Stream Performance Energy Note: evaluation underway 20
  • 22. Current status normalized by the estimation w/o HW ◼ Performance and energy (=power*elapse) of apps Copyright 2020 FUJITSU LIMITED A Estimationw/HW/w/oHW B C D E FDGEMM Stream • Performances are match due to precise simulation • Lower power 8% in logic & 15% in memory use; acceptable variations Computations were reduced after the previous estimation Good estimation and optimization: errors are less than 20% Performance Energy Note: evaluation underway 21
  • 23. Summary Copyright 2020 FUJITSU LIMITED https://www.riken.jp/pr/news/2019/20190827_1/ ◼ “Fugaku” is co-designed and runs apps at high performance w/ optimal power consumption ◼ Arm HPC ecosystem and apps portfolio are growing by efforts of communities and selections of HW Fujitsu PRIMEHPC FX1000 ◼ Fujitsu Supercomputer PRIMEHPC FX1000 & FX700 based on Fugaku tech. ◼ Cray CS500 using Fujitsu A64FX Arm CPU FX700 22