In this deck from the HPC User Forum at Argonne, Thomas Schulthess presents: An Update on CSCS.
"CSCS develops and operates cutting-edge high-performance computing systems as an essential service facility for Swiss researchers. These computing systems are used by scientists for a diverse range of purposes – from high-resolution simulations to the analysis of complex data. Simulation is considered the third pillar of science, alongside theory and experimentation, and scientists from an increasing number of disciplines are using high-performance computing simulation for their research. For example, supercomputers are used to model new materials with hitherto unknown properties. Climate modelling and weather forecasting would be impossible without them. In social science, simulations can help prevent mass panic by predicting people’s behavior. In medicine, computer simulations aid diagnostics help improve treatment methods. Moreover, they can facilitate risk assessments for natural hazards such as earthquakes and the tsunamis they might trigger.
CSCS has a strong track record in supporting the processing, analysis and storage of scientific data, and is investing heavily in new tools and computing systems to support data science applications. For more than a decade, CSCS has been involved in the analysis of the many petabytes of data produced by scientific instruments such as the Large Hadron Collider (LHC) at CERN. Supporting scientists in extracting knowledge from structured and unstructured data is a key priority for CSCS."
Learn more: https://www.cscs.ch/
and
http://hpcuserforum.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
3. T. Schulthess 3
Centro Svizzero di Calcolo Scientifico (CSCS)
The Swiss National Supercomputing Center
• Established in 1991 as a unit of ETH Zurich
• Located in Lugano, Ticino
• 115 highly qualified staff (25 female) from 26 nations
• Flexible infrastructure: 12 MW of power and cooling, upgradable to 25 MW
• Develops & operates the key supercomputing capabilities required to solve important problems to science and/or society
• A research infrastructure with ~2000 users, 200 projects
• Leads the national strategy for High-Performance Computing and Networking (HPCN) that was passed by Swiss Parliament in 2009
• Has a dedicated User Laboratory for supercomputing since 2011
(i.e. research infrastructure funded by the ETH Domain on a programmatic basis)
• Since 2017, hosts a tier-0 system of the “Partnership for Advanced Computing in Europe” (PRACE)
6. T. Schulthess
“Piz Daint” 2017 fact sheet
6
http://www.cscs.ch/publications/fact_sheets/index.html
~5,000 NVIDIA P100 GPU-accelerated nodes
~1,400 dual-socket multi-core CPU nodes
Institutions using Piz Daint (in 2019)
•User Lab (including PRACE Tier-0 allocations)
•University of Zurich, USI, PSI, EMPA
•NCCR MARVEL and HBP (EPFL)
•CHIPP (Swiss LHC Grid since Aug. 2017)
•Others (exploratory)
7. T. Schulthess 7
CSCS vision for next generation systems
•Performance goal: develop a general-purpose system (for all domains) with enough performance to run “exascale weather and climate simulations” by 2022; specifically,
•run a global model with 1 km horizontal resolution at a throughput of one simulated year per day on a system with a similar footprint to Piz Daint
•Functional goal: converged Cloud and HPC services in one infrastructure
•Support most native Cloud services on the supercomputer replacing Piz Daint in 2022
•In particular, focus on software-defined infrastructure (networking, storage and compute) and service orientation
Pursue clear and ambitious goals for successor of Piz Daint
8. T. Schulthess 8
“Today’s Outlook: GPU-accelerated Weather Forecasting”, John Russell, September 15, 2015
“Piz Kesch”
Since April 2016, the Swiss version* of the COSMO model has been running operationally on GPUs
(*) The Swiss version of the COSMO model runs at 1 km horizontal resolution over the Alpine region and was (in 2016) ~10x more efficient than the state of the art
9. T. Schulthess
MeteoSwiss’ performance ambitions in 2013
9
[Bar chart (scale 1 to 40): requirements from MeteoSwiss under a constant budget for investments and operations – grid 2.2 km → 1.1 km, ensemble with multiple forecasts, and data assimilation – with factors of 10x, 6x and 24x marked]
We need a 40x improvement between 2012 and 2015 at constant cost
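As a rough plausibility check (my arithmetic, not from the slide): halving the horizontal grid spacing alone roughly quadruples the number of grid columns (2 × 2) and, via the CFL condition, roughly halves the usable time step, i.e. 2 × 2 × 2 = 8x, so the 2.2 km → 1.1 km refinement by itself already costs close to an order of magnitude of the required 40x.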
10. T. Schulthess
COSMO: old and new (refactored) code
10
[Diagram: structure of the current vs. refactored COSMO code]
Old code (current): main (Fortran), physics (Fortran) and dynamics (Fortran), built directly on MPI and the system.
New code (refactored): main (Fortran); physics (Fortran) with OpenMP / OpenACC; dynamics rewritten in C++ on top of a stencil library and a generic communication library (boundary conditions & halo exchange); MPI “or whatever” and the system underneath.
Shared infrastructure with x86*, CUDA, ROCm and Phi* backends (* two different OpenMP backends)
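The point of the refactoring is that the dynamics is written once against a stencil library and then compiled to the different backends (x86, CUDA, ...) shown above. Purely as an illustration of the kind of kernel such a library abstracts (this is not the actual CSCS stencil-library code; names and structure are mine), a naive CPU version of a horizontal Laplacian stencil in C++ could look like this:

    // Illustrative sketch only: a naive horizontal Laplacian stencil on a
    // 2D field stored in a flat, row-major array. A stencil library such as
    // the one used for the refactored COSMO dynamics generates equivalent
    // loops (or CUDA kernels) from a higher-level description of the stencil.
    #include <cstddef>
    #include <vector>

    // out(i,j) = in(i+1,j) + in(i-1,j) + in(i,j+1) + in(i,j-1) - 4*in(i,j)
    // applied to the interior points of an ni x nj field.
    void horizontal_laplacian(const std::vector<double>& in,
                              std::vector<double>& out,
                              std::size_t ni, std::size_t nj)
    {
        auto idx = [nj](std::size_t i, std::size_t j) { return i * nj + j; };
        for (std::size_t i = 1; i + 1 < ni; ++i) {
            for (std::size_t j = 1; j + 1 < nj; ++j) {
                out[idx(i, j)] = in[idx(i + 1, j)] + in[idx(i - 1, j)]
                               + in[idx(i, j + 1)] + in[idx(i, j - 1)]
                               - 4.0 * in[idx(i, j)];
            }
        }
    }

Expressing such operators through a library rather than hand-coded loops is what allows the same model source to target CPUs and GPUs, which is the essence of the shared infrastructure in the diagram.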
11. T. Schulthess
Where the factor 40 improvement came from
11
[Same bar chart (scale 1 to 40): requirements from MeteoSwiss – grid 2.2 km → 1.1 km, ensemble with multiple forecasts, data assimilation (24x, 10x, 6x) – under a constant budget for investments and operations, now annotated with where the improvement came from:]
1.7x from software refactoring (old vs. new implementation on x86)
2.8x from mathematical improvements (resource utilisation, precision)
2.8x from Moore’s Law & architectural improvements on x86
2.3x from the change in architecture (CPU → GPU)
1.3x from additional processors
Investment in software allowed mathematical improvements and change in architecture
There is no silver bullet!
Bonus: reduction in power!
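For reference, these five factors multiply out to roughly the required improvement: 1.7 × 2.8 × 2.8 × 2.3 × 1.3 ≈ 40x.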
12. T. Schulthess
Leadership in weather and climate
12
European model may be the best – but far away from sufficient accuracy and reliability!
Peter Bauer, ECMWF
13. T. Schulthess
Resolving convective clouds (convergence?)
13
Structural convergence: statistics of the cloud ensemble, e.g., spacing and size of convective clouds
Bulk convergence: area-averaged bulk effects upon the ambient flow, e.g., heating and moistening of the cloud layer
Source: Christoph Schär, ETH Zurich
14. T. Schulthess
Structural and bulk convergence
14
Source: Christoph Schär, ETH Zurich
[Figure panels – statistics of cloud area: no structural convergence; statistics of up- & downdrafts: bulk statistics of updrafts converge; factor 4 (Panosetti et al. 2018)]
15. T. Schulthess 15
Source: Christoph Schär, ETH Zurich, & Nils Wedi, ECMWF
Can the delivery of a 1km-scale capability be pulled in by a decade?
16. T. Schulthess
Our “exascale” goal for 2022
16
Horizontal resolution: 1 km (globally quasi-uniform)
Vertical resolution: 180 levels (surface to ~100 km)
Time resolution: less than 1 minute
Coupled: land surface / ocean / ocean waves / sea ice
Atmosphere: non-hydrostatic
Precision: single (32-bit) or mixed precision
Compute rate: 1 SYPD (simulated years per wall-clock day)
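As a back-of-the-envelope check (my arithmetic, using the 1-minute upper bound on the time step from the list above), the 1 SYPD target implies the following stepping rate:

    // Sketch: what 1 SYPD (one simulated year per wall-clock day) implies
    // for a model with a time step of at most one minute. Numbers are mine,
    // derived only from the targets listed above.
    #include <cstdio>

    int main() {
        const double sim_seconds_per_year = 365.0 * 24 * 3600; // simulated time in one year
        const double dt_model            = 60.0;               // model time step [s] (upper bound)
        const double wallclock_day       = 24.0 * 3600;        // wall-clock budget for 1 SYPD

        const double steps_per_year = sim_seconds_per_year / dt_model;  // ~525,600 steps
        const double steps_per_sec  = steps_per_year / wallclock_day;   // ~6.1 steps/s

        std::printf("time steps per simulated year:        %.0f\n", steps_per_year);
        std::printf("required steps per wall-clock second: %.1f\n", steps_per_sec);
        return 0;
    }

In other words, every global 1 km time step (dynamics, physics, coupling and communication) would have to complete in roughly 165 ms of wall-clock time or less.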
17. T. Schulthess
Running COSMO 5.0 & IFS (“the European Model”) at global scale on Piz Daint
17
Scaling to full system size: ~5,300 GPU-accelerated nodes available
Running a near-global (±80º, covering 97% of the Earth’s surface) COSMO 5.0 simulation & IFS
> either on the host processors: Intel Xeon E5-2690 v3 (Haswell, 12 cores)
> or on the GPU accelerators: PCIe version of the NVIDIA P100 (Pascal) GPU
19. T. Schulthess
Memory use efficiency
19
Two ratios characterise how well the memory subsystem is used:
necessary data transfers / actual data transfers = 0.88
achieved BW / max achievable BW (STREAM) = 0.76
combined: 0.88 × 0.76 ≈ 0.67
The achieved bandwidth is roughly 2x lower than peak BW (0.55 with regard to peak BW)
Fuhrer et al., Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2017-230, published 2018
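A minimal sketch of how the two ratios combine into the single memory-use-efficiency number quoted above (the multiplicative combination and the variable names are my reading of the slide; see Fuhrer et al. 2018 for the precise definitions):

    // Illustrative only: memory-use efficiency as the product of a data-transfer
    // ratio and a bandwidth ratio, using the values quoted on the slide.
    #include <cstdio>

    int main() {
        const double transfer_efficiency  = 0.88; // necessary / actual data transfers
        const double bandwidth_efficiency = 0.76; // achieved BW / max achievable BW (STREAM)

        const double memory_use_efficiency = transfer_efficiency * bandwidth_efficiency;
        std::printf("memory use efficiency: %.2f\n", memory_use_efficiency); // ~0.67
        return 0;
    }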
20. T. Schulthess
Can the 100x shortfall of a grid-based implementation like COSMO-global be overcome?
20
1. Icosahedral/octahedral grid (ICON/IFS) vs. lat-long/Cartesian grid (COSMO): 2x fewer grid columns, time step of 10 ms instead of 5 ms → 4x
2. Improving BW efficiency: improve BW efficiency and peak BW by 2x (results on Volta show this is realistic) → 2x
3. Strong scaling: 4x possible in COSMO, but we reduced available parallelism by a factor 1.33 → 3x
4. Remaining reduction in shortfall: numerical algorithms (larger time steps), further improved processors / memory → 4x
But we don’t want to increase the footprint of the 2022 system succeeding “Piz Daint”
100x
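For reference, the four contributions combine multiplicatively: 4 × 2 × 3 × 4 = 96x, i.e. roughly the 100x shortfall is recovered without increasing the footprint of the system.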
21. T. Schulthess
What about ensembles and throughput for climate?
(Remaining goals beyond 2022)
21
1. Improve the throughput to 5 SYPD
2. Reduce the footprint of a single simulation by up to a factor of 10-50
[Recalling the memory-use-efficiency ratios: necessary / actual data transfers and achieved / max achievable BW]
Change the architecture from control-flow to data-flow centric (reduce necessary data transfers)
We may have to change the footprint of machines to hyper scale!
22. T. Schulthess 22
LUMI Consortium
•Large consortium with strong national HPC centres and competence provides a unique opportunity for
•knowledge transfer;
•synergies in operations; and
•regionally adaptable user support for extreme-scale systems
•National & EU investments (2020-2026)
Finland 50 M€
Belgium 15.5 M€
Czech Republic 5 M€
Denmark 6 M€
Estonia 2 M€
Norway 4 M€
Poland 5 M€
Sweden 7 M€
Switzerland 10 M€
EU 104 M€
Plus additional investments in applications development
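For reference, the national and EU contributions listed above sum to 208.5 M€ (50 + 15.5 + 5 + 6 + 2 + 4 + 5 + 7 + 10 + 104).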
23. T. Schulthess
Kajaani Data Center (LUMI)
23
100% hydroelectric energy up to 200 MW
2200 m2 floor space, expandable up to 4600 m2
Waste heat reuse: effective energy price 35 €/MWh, negative CO2 footprint: 13,500 tons reduced every year
One power grid outage in 36 years
100% free cooling @ PUE 1.03
Extreme connectivity:
Kajaani DC is a direct part of the Nordic backbone; 4x100 Gbit/s in place; can be easily scaled up to multi-terabit level
Zero network downtime since the establishment of the DC in 2012
24. T. Schulthess
Collaborators on Exascale (climate)
24
Tim Palmer (U. of Oxford)
Christoph Schär (ETH Zurich)
Oliver Fuhrer (MeteoSwiss)
Peter Bauer (ECMWF)
Bjorn Stevens (MPI-M)
Torsten Hoefler (ETH Zurich)
Nils Wedi (ECMWF)