April 20 2023
DRAC: Designing RISC-V-based Accelerators for Next Generation Computers
Miquel Moretó
UPC and BSC
Universidad Complutense de Madrid (UCM), Madrid (Spain)
Barcelona Supercomputing Center
Centro Nacional de Supercomputación
BSC-CNS objectives
Supercomputing services
to Spanish and EU researchers
R&D in Computer, Life, Earth and
Engineering Sciences
PhD programme, technology
transfer, public engagement
BSC-CNS is a consortium that includes:
Spanish Government 60%
Catalan Government 30%
Univ. Politècnica de Catalunya (UPC) 10%
MareNostrum 1
2004 – 42,3 Tflops
1st Europe / 4th World
New technologies
MareNostrum 2
2006 – 94,2 Tflops
1st Europe / 5th World
New technologies
MareNostrum 3
2012 – 1,1 Pflops
12th Europe / 36th World
MareNostrum 4
2017 – 11,1 Pflops
2nd Europe / 13th World
New technologies
Access: prace-ri.eu/hpc_acces Access: bsc.es/res-intranet
General Purpose Cluster: 11.15 Pflops
MN4 CTE-Power: 1.57 Pflops
MN4 CTE-ARM: 0.65 Pflops
MN4 CTE-AMD: 0.52 Pflops
MareNostrum 4
Total peak performance: 13,9 Pflops
GPP - General Purpose
Intel Sapphire Rapids
Peak performance: 45,4 Pflops
Sustained HPL: 35,4 Pflops
April 2023
ACC – Accelerated
Intel Sapphire Rapids
NVIDIA Hopper
Peak performance: 260 Pflops
Sustained HPL: 163 Pflops
June 2023
NGT GPP - Next Generation
NVIDIA Grace
Peak performance: 2,82 Pflops
Sustained HPL: 2 Pflops
June 2023
NGT ACC - Next Generation
Intel Emerald Rapids
Intel Rialto Bridge
Peak performance: 6 Pflops
Sustained HPL: 4,24 Pflops
December 2023
InfiniBand NDR 200
Fat Tree
Spectrum Scale File System
248 PB HDD
2,81 PB NVMe
402 PB tape
January 2023
MareNostrum5
BSC Staff Evolution
[Bar chart: staff per year, 2005 to 2024; values rise from 65 (2005) to about 1,060 (2024 forecast)]
BSC Staff evolution 2005 - 2022, plus forecast 2023 & 2024
Data at 30th June 2022
(Including collaborators)
Collaborations with Global IT Industry
June 11, 2020
Mission of BSC Scientific Departments
Computer Sciences: To influence the way machines are built, programmed and used: programming models, performance tools, Big Data, Artificial Intelligence, computer architecture, energy efficiency
Earth Sciences: To develop and implement global and regional state-of-the-art models for short-term air quality forecast and long-term climate applications
Life Sciences: To understand living organisms by means of theoretical and computational methods (molecular modeling, genomics, proteomics)
CASE: To develop scientific and engineering software to efficiently exploit super-computing capabilities (biomedical, geophysics, atmospheric, energy, social and economic simulations)
Designing RISC-V-based
Accelerators
for next generation Computers
The DRAC project with -file number 001-P-001723- has been 50% co-financed with € 2,000,000.00 by the European Union Regional Development Fund
within the framework of the ERDF Operational Program of Catalonia 2014-2020, with the support of Generalitat of Catalonia. Copyright 2020 © All Rights
Reserved.
The RISC-V Revolution!
Today's Technology Trends
Massive penetration of Open
Source Software
• IoT (Arduino),
• Mobile (Android),
• Enterprise (Linux),
• HPC (Linux,
OpenMP, etc.)
New Open Source Hardware
Momentum from IoT and the
Edge to HPC
• RISC-V
• OpenPOWER
Moore's Law + Power = Specialization (HW/SW Co-Design)
• More cost effective
• More performant
• Less Power
HPC Today
• Europe has led the way in defining a
common open HPC software ecosystem
• Linux is the de facto standard OS despite
proprietary alternatives
• Software landscape from Cloud to IoT
already enjoys the benefit of open source
• Open source provides:
• A common platform, specification and
interface
• Accelerates building new functionality by
leveraging existing components
• Lowers the entry barrier for others to
contribute new components
• Crowd-sources solutions for small and larger
problems
• What about hardware and, in particular, the CPU?
Stack layers (OPEN unless marked):
Applications: GROMACS, NAMD, WRF, VASP, etc.
Libraries/Platforms: OpenMP
Schedulers: COMPS
Compiler/Toolchain: LLVM
OS: Linux
HW Systems: OCP
CPUs/GPUs/ASICs: CLOSED
HPC Tomorrow
• Europe can lead the way to a
completely open SW/HW stack for the
world
• RISC-V provides the open source
hardware alternative to dominating
proprietary non-EU solutions
• Europe can achieve complete
technology independence with these
foundational building blocks
• Currently at the same early stage in HW
as we were with SW when Linux was
adopted many years ago
• RISC-V can unify, focus, and build a new microelectronics industry in Europe.
Stack layers (all OPEN):
Applications: GROMACS, NAMD, WRF, VASP, etc.
Libraries/Platforms: OpenMP
Schedulers: COMPS
Compiler/Toolchain: LLVM
OS: Linux
HW Systems: OCP
CPUs/GPUs/ASICs: RISC-V, OpenPower, MIPS
The European Processor Initiative (EPI)
The European Processor Initiative (EPI) under the SGA1 of the
Framework Partnership Agreement (FPA: 800928), to design and
implement a roadmap for a new family of low-power European
processors for extreme scale computing, high-performance Big-Data and
a range of emerging applications.
• History: Remember MontBlanc? BSC leads the RISC-V HPC accelerator
development
• Consortium (SGA1):
28 partners from 10 European countries; coordinator: Bull SAS (France)
• Budget: €80M (100% funded)
• Duration: 36 months (01/12/2018-31/12/2021)
• 5 Streams (4 Technical and 1 Management/Exploitation/C&D)
The European Processor Initiative
Key contact point
Jesús Labarta jesus.labarta@bsc.es
EPI MAIN OBJECTIVE
To develop European microprocessor and accelerator technology
• Strengthen competitiveness of EU industry and science
Rhea: Arm-based general purpose CPU (SiPearl)
EPAC: RISC-V based accelerators (BSC, SemiDynamics, EXTOLL, FORTH, ETHZ, UniBo, UniZG, Chalmers, CEA, E4, Menta, ZPT, …)
OVERALL TECHNOLOGY ROADMAP
2015 – 2018: H2020 projects (HPC ecosystem exploration)
2019 – 2021: EPI SGA1 (EPAC test chip 1.0)
2022 – 2024: EPI SGA2 (Rhea1 Go-To-Market, EPAC Demonstrator)
2025 – …: Next project(s), EU Chips Act related; Exascale Systems
Pilots: one based on Rhea I, one based on EPAC
Processor streams: Rhea (Arm-based general purpose CPU) and EPAC (RISC-V based accelerators)
Centers of Excellence in HPC Applications
EPAC: A RISC-V ACCELERATOR
[Block diagram of the EPAC accelerator. Labels: Network on Chip (NoC), L2 HN, C, V, STX, VRP, AXI Lite, Peripherals, Bridge, Host CPU]
RISC-V core and VPU
RISC-V core: Avispado
• 2-way in-order core
• Full HW-support for unaligned accesses
• Cache: L1I$ =16KB, L1D$ = 32KB
VPU
• Long vectors: 256 DP elements
– #Functional Units (FUs) << Vector Length (VL)
– 1 vector instruction can take several cycles
• 8 Lanes per core
– FMA/lane: 2 DP Flop/cycle
• 40 physical registers, some out of order
F. Minervini, O. Palomar. RISC-V Summit 2021
“Vitruvius: An Area-Efficient Decoupled Vector
Accelerator for High Performance Computing”
Architecture     Vector register size (double-precision elements)
Intel AVX512     8
Arm Neon         2
A64FX            8
NEC Aurora SX    256
EPAC VPU         256
Key contact point: Adrián Cristal adrian.cristal@bsc.es
Key contact point: Roger Espasa (SemiDynamics)
EPAC
EPAC programming environment
• Offload of processes from host to accelerator
• Interoperability MPI + OpenMP
– Accelerator can run MPI and OpenMP processes
• Task-based models
– Taskify MPI calls
– Single mechanism
› Concurrency
› Locality and data management
• Long vectors (256 elements, 8 lanes per core)
– Decouple front-end from back-end
– Convey access pattern semantics to the architecture
– Vector length agnostic (VLA) programming and architecture
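Vector length agnostic code can be illustrated with a strip-mined loop that asks the hardware how many elements it may process on each iteration. The sketch below is a plain-Python model of that idea, not RVV code. The `vsetvl` function is a simplified stand-in for the RISC-V `vsetvl` instruction family, and `daxpy_vla` and its parameters are hypothetical names.

```python
def vsetvl(remaining: int, vlmax: int) -> int:
    """Simplified model: grant at most vlmax elements per iteration."""
    return min(remaining, vlmax)


def daxpy_vla(a, x, y, vlmax):
    """Compute a*x + y in strips; the loop never hard-codes the vector length.

    Assumes x and y have the same length.
    """
    out = list(y)
    n = len(x)
    i = 0
    while i < n:
        vl = vsetvl(n - i, vlmax)
        # One "vector instruction" operating on vl elements.
        out[i:i + vl] = [a * xi + yi for xi, yi in zip(x[i:i + vl], out[i:i + vl])]
        i += vl
    return out


x = list(range(10))
y = [1] * 10
# The same code gives the same result on "machines" with different vector lengths.
print(daxpy_vla(2, x, y, vlmax=2) == daxpy_vla(2, x, y, vlmax=256))
```

The point of the design is that the binary does not need recompiling when the hardware vector length changes, for example between an 8-element and a 256-element implementation.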
Applications
Libraries (FFTW, SpMV, ...)
Scheduler (Slurm)
Compiler (LLVM)
OS (Linux)
Hardware (Host + Accelerator)
Programming Model (OpenMP, MPI)
[Block diagram labels: Memory, Host CPU, Bridge, L2, Vector Core, STX]
How to use the V-extensions?
• Assembler: always a valid option but not the most pleasant
• C/C++ builtins (intrinsics)
– Low-level mapping to instructions
– Allows embedding it into an existing C/C++ codebase
– Allows relatively quick experimentation
• #pragma omp simd (aka “Semi automated vectorization”)
– Relies on vectorization capabilities of the compiler
› Usually works but gets complicated if the code calls functions
– Also usable in Fortran
• Autovectorization: let compiler work for you ;-)
Interested in compiling your code in RVV?
Roger Ferrer roger.ferrer@bsc.es
RISC-V platforms (SDV)
• Hardware/Software infrastructure for
Continuous Integration and RTL check
– HiFive commercial hardware (scalar)
– EPAC RTL (1core) on FPGA
• Platform to:
– Demonstrate a full HPC software stack
› Linux, compiler, libraries, job scheduler, MPI
– Test latest RTL with complex codes
› Advanced performance analysis tools
› Accurate timing available
[Diagram labels: commercial and FPGA-based RISC-V; commercial (only scalar); EPAC FPGA-based implementation (vector support); full SDV]
Interested in testing your code on EPAC?
Filippo Mantovani filippo.mantovani@bsc.es
• Chip fabrication Q2 2021
• Global Foundries 22nm (GF22FDX)
• Final Top level chip floorplan
• Total area: 5943 x 4593 um (27.297 mm2)
[Floorplan labels: four Avispado + VPU + L2 HN tiles, two STX, VRP, two SerDes]
EPAC 1.0 Test Chip (Tapeout Q2 2021)
EPAC 1.5 Test Chip (Tapeout Q4 2022)
24/10/2022 ACAT 2022, Suarez
[Block diagram labels: STX, XP, HN + L2$, AVS + VPU, VRP, FPGA Bridge, I/O, micro-tile]
DRAC Overview
• CIC-IPN Lagarto I design in Mexico (2012-2017)
• 5-stage single issue in-order pipeline, MIPS-based microcontroller
• FPGA implementation capable of booting Linux
• BSC and CIC-IPN Lagarto Initiative (2018 onwards)
• MIPS to RISC-V
• FPGA to ASIC
• European Processor Initiative (EPI) (Dec 2018 - Dec 2023)
• Flagship project (80M€ Phase 1; 70M€ Phase 2; 26 partners)
• BSC leads the European RISC-V Vector Accelerator (EPAC)
• July 2018: Come to my office...
• RIS3CAT “Emerging Technologies” call in Nov 2018
It all started in Mexico...
• DRAC: Designing RISC-V-based Accelerators
for next generation Computers
• Consortium: BSC (coord.), UPC, UAB, UB, URV
• Dates: June 2019 – June 2023
• Budget: 4M€ (50% co-funded by Generalitat)
• Alignment with the European Processor Initiative (EPI) project:
• Focus on RISC-V-based accelerator developed in Barcelona
• Promote RISC-V in the CS degrees in Catalan universities
• Build IC design teams capable of taping out DRAC technology: RTL design,
verification and physical design
DRAC Project
1 Design of an out-of-order general purpose RISC-V processor
2 Design of accelerators and required hardware support to have secure processors
that incorporate post-quantum cryptographic schemes and virtualization techniques
3 Design of accelerators for genomics data analytics
4 Design of efficient and low power processors for autonomous navigation applications
5 Building a local ecosystem for custom hardware design and fabrication
6 Transfer DRAC technology to local and international companies
7 Transfer DRAC technology to the university system using educational kits based on
DRAC designs
DRAC General Objectives
Lagarto RISC-V Tapeouts
Timeline, 2019 to 2023:
Lagarto (May’19)
- Lagarto Hun 5-stage in-order
- 150MHz (external)
- TSMC 65nm
- 2.5mm2
DVINO (Apr’21)
- Lagarto Hun in-order
- VPU
- PLL 600 MHz
- SDRAM mem cont
- HyperRAM
- VGA
- ADC
- TSMC 65nm
- 8mm2
Sargantana (Feb’22)
- Sargantana 7-stage in-order
- PLL 1.2 GHz
- Custom extensions
- SDRAM
- Prototype analog IPs: SerDes 8GHz
- GF 22nm
- Area: 2.9mm2
Kameleon (Dec’22)
- Lagarto Ka 11-stage ooo
- PLL 1.2 GHz
- Automotive Accel
- Crypto Accel
- Genomic Accel
- PICOS Accel
- SerDes 8GHz
- GF 22nm
- Area: 9mm2
Lagarto: First RISC-V Tapeout (May 2019)
• Target design:
• Lagarto Hun in-order scalar core, 5 stages, single issue, RV64IMA
• 16KB L1 caches, 64KB L2 cache, TLB
• Memory controller on the FPGA side via packetizer
• Debug ring via JTAG
• Target technology: TSMC 65nm, area fits in 2.5mm2
• Fabrication and bringup
• Submitted in May 2019
• Samples received in Sep 2019
• Bringup with custom PCB in Oct 2019
• Linux boot in Dec 2019
From Architecture to an Actual System
Stages: RTL design and verification; IC design; PCB design and assembly; System characterization
IC design: Technology information; IC design flow; Physical verification; Prepare for manufacturing
DVINO: 2nd Lagarto Tapeout (Apr 2021)
• DRAC Vector IN-Order (DVINO) processor details:
• Lagarto Hun scalar pipeline, 5-stage, in-order, RV64IMA
• Hydra 2-lane (VPU-DRAC-1.0), 4096-bit vector length
• Internal PLL. DVINO can run at 600, 400, 300 and 200MHz
• In-house L1 instruction cache and PMU
• L1 data and L2 caches from lowRISC 0.2
• Multiple controllers: JTAG, UART, SPI, VGA, SDRAM and HyperRAM
• In-house JTAG-based debug-ring
• Technology node: TSMC 65nm (Europractice)
• Area: 8.6mm2
Sargantana: 3rd Lagarto Tapeout (Feb 2022)
• Sargantana in-order processor details:
• Lagarto Hun pipeline, 7-stage, in-order, RV64IMAFD (RV64G)
• Support for floating point operations (single and double precision)
• Integer SIMD VPU, 128-bit vector length, custom instructions
• Internal PLL. Sargantana can run above 1.1GHz
• In-house L1 instruction cache and PMU
• L1 data and L2 caches from lowRISC 0.2
• Multiple controllers: JTAG, UART, SPI, SerDes, and HyperRAM
• In-house JTAG-based debug-ring
• Technology node: GF 22nm (Europractice)
• Area: 2.9mm2
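To make the "Integer SIMD VPU, 128-bit vector length" bullet concrete, here is a small sketch of what a packed integer add over a 128-bit register looks like. The slides do not specify Sargantana's custom instructions, so the function name, the wrap-around behaviour and the lane widths are all assumptions chosen for illustration.

```python
def packed_add(a: int, b: int, lane_bits: int = 8, width_bits: int = 128) -> int:
    """Add two width_bits-wide registers lane by lane.

    Assumption: each lane wraps around independently (no carry into the
    neighbouring lane, no saturation).
    """
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, width_bits, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        result |= (lane_sum & mask) << shift
    return result


# With 8-bit lanes a 128-bit register holds 16 elements; with 16-bit lanes, 8.
print(128 // 8, 128 // 16)
# The low byte overflows and wraps; the carry does not leak into the next lane.
print(hex(packed_add(0x01FF, 0x0101, lane_bits=8)))
```

Narrower lanes give more elements per instruction, which is why integer workloads that tolerate 8-bit or 16-bit data benefit most from this kind of unit.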
Kameleon: 4th Lagarto Tapeout (Dec 2022)
[Block diagram labels: Lagarto Ka and Sargantana cores; two L1 I / L1 D cache pairs; arbiter to select core; L2 cache; memory controller; clock; AXI; accelerators Sauria, Genomics (WFA), PQC and PICOS]
Lagarto Ka Out-of-Order Core
Current Features
• 2-way 64-bit out-of-order architecture
• RV64IMA ISA
• 11-stage pipeline implementation
• Parameterized branch predictor:
› BTB 16-128 entries
› BHT 16-128 entries
› RAS 2-8 entries
• ROB 128-entries
• Low-power Integer queue (out-of-order issue)
• In-order Load/Store Queue
• Hit-under-miss support
• Configurable L1 caches
› 16 KiB L1 I-cache (Typical)
› 32 KiB L1 D-cache (Typical)
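The parameterized BHT in the feature list can be pictured as a table of small counters indexed by the branch address. The sketch below uses classic 2-bit saturating counters. The slides only give the entry range (16 to 128), so the counter width, the initial state, the indexing function and the class name are all assumptions, not a description of Lagarto Ka's actual predictor.

```python
class BranchHistoryTable:
    """Textbook 2-bit saturating-counter BHT (illustrative, not Lagarto Ka's RTL)."""

    def __init__(self, entries: int = 128):
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.mask = entries - 1
        self.counters = [1] * entries  # assumption: start weakly not-taken

    def _index(self, pc: int) -> int:
        # Assumption: 4-byte instructions, so the two low PC bits carry no information.
        return (pc >> 2) & self.mask

    def predict(self, pc: int) -> bool:
        """Predict taken when the counter is in one of the two upper states."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        """Move the counter one step towards the observed outcome, saturating at 0 and 3."""
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


bht = BranchHistoryTable(entries=16)
pc = 0x80000010
bht.update(pc, taken=True)
print(bht.predict(pc))
```

Making `entries` a constructor parameter mirrors the idea of a parameterized predictor: a smaller table costs less area but suffers more aliasing between branches.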
SAURIA Physical Design
Power breakdown (total power = 83.2 mW): 55.744 mW (67%); 10.816 mW (13%); 9.984 mW (12%); 6.656 mW (8%)
Legend: Systolic Array, SRAMs, Feeders, Others
Specs:
8x16 Array: 128 Processing Elements (PE)
Approximate logic in PE multipliers & adders
1.00 x 0.95 mm = 0.95 mm2
128 GFLOP/s @ 500MHz
1.56 TFLOP/s/W
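The quoted peak throughput follows directly from the array size and clock. The short calculation below is a sketch that assumes each processing element completes one multiply-accumulate per cycle, counted as 2 FLOP. That per-PE rate is an assumption consistent with the 128 GFLOP/s figure, not something stated on the slide.

```python
ARRAY_ROWS = 8
ARRAY_COLS = 16
FREQ_HZ = 500_000_000       # 500 MHz, from the specs above
FLOP_PER_PE_PER_CYCLE = 2   # assumption: one multiply-accumulate per cycle


def peak_gflops(rows: int = ARRAY_ROWS, cols: int = ARRAY_COLS,
                freq_hz: int = FREQ_HZ,
                flop_per_pe: int = FLOP_PER_PE_PER_CYCLE) -> float:
    """Peak throughput of a systolic array in GFLOP/s."""
    return rows * cols * flop_per_pe * freq_hz / 1e9


print(peak_gflops())
```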
Genome Alignment Acceleration (WFA)
• Pairwise alignment/mapping DNA/RNA sequences with the
novel Wavefront Alignment (WFA) algorithm
• Target: provide specific hardware support for the most time-
consuming operations of the algorithm.
• ISA extensions:
- [vmax_vv] Vectorial maximum.
- [vmax3inc_vv] Vectorial “3-way” maximum fused with
increment.
- [vcnt] Scalar “count consecutive matches”
- [vcnt_vv] Vectorial “count consecutive matches”.
• Use narrow integers (16-bit or 8-bit)
• Monolithic accelerator integrated with Lagarto SoC
– Single aligner (due to area limitations) capable of 64 ops/cycle
– Current Place and Route (PnR) results: 1.1GHz (typical corner)
– Area: 1.6mm2 in Global Foundries 22nm
– Performance speedups: 515x with 10K reads, 10% error
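The two operations targeted by the ISA extensions, counting consecutive matches and a 3-way maximum with increment, are the inner steps of the wavefront algorithm. The sketch below is a scalar Python version of WFA for plain edit distance (unit cost for mismatch, insertion and deletion). The slides do not state which penalty model the accelerator uses, so the unit-cost choice and the function names are assumptions; the hardware applies these steps to many diagonals at once.

```python
def extend(pattern: str, text: str, k: int, h: int) -> int:
    """Count consecutive matches along diagonal k, starting at text offset h.

    Loosely corresponds to the "count consecutive matches" operation.
    """
    v = h - k  # position in the pattern
    while v < len(pattern) and h < len(text) and pattern[v] == text[h]:
        v += 1
        h += 1
    return h


def wfa_edit_distance(pattern: str, text: str) -> int:
    """Edit distance via wavefronts: furthest text offset reached per diagonal."""
    m, n = len(pattern), len(text)
    k_final = n - m
    wavefront = {0: extend(pattern, text, 0, 0)}
    score = 0
    while wavefront.get(k_final, -1) < n:
        nxt = {}
        for k in range(min(wavefront) - 1, max(wavefront) + 2):
            candidates = []
            if k - 1 in wavefront:
                candidates.append(wavefront[k - 1] + 1)  # insertion
            if k in wavefront:
                candidates.append(wavefront[k] + 1)      # mismatch
            if k + 1 in wavefront:
                candidates.append(wavefront[k + 1])      # deletion
            # Keep only offsets that stay inside the alignment matrix.
            valid = [h for h in candidates if 0 <= h <= n and 0 <= h - k <= m]
            if valid:
                # 3-way maximum with increment, then extend along the diagonal.
                nxt[k] = extend(pattern, text, k, max(valid))
        wavefront = nxt
        score += 1
    return score


print(wfa_edit_distance("kitten", "sitting"))
```

Each wavefront only stores one offset per diagonal, so the work grows with the alignment score rather than with the full matrix, which is why similar sequences align quickly.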
• Main Goal: RISC-V acceleration of different PQC schemes
• Classic McEliece (CME) KEM acceleration:
– HW/SW co-design implementation of CME KEM @ Zynq Ultrascale [FPL’21]
– Monolithic CME accelerator based on HLS
– Integration of the CME accelerator in Lagarto SoC via AXI interface
• Accelerate other KEM and digital signature (DS) schemes:
– NTRU KEM and Crystals-Kyber KEM / Crystals-Dilithium DS
– Both algorithms rely on very different operations and will require different acceleration techniques.
[FPL’21] V. Kostalampros et al., HLS-Based HW/SW Co-Design of the Post-Quantum Classic McEliece Cryptosystem. FPL 2021.
PQC Acceleration
Building a RISC-V Ecosystem in Barcelona
Lagarto Team (Sept 2019)
DRAC KoM (Feb 2020)
DRAC F2F (Jul 2022)
Building a RISC-V Ecosystem in Barcelona
DRAC Final Workshop (December 2022)
Lagarto Roadmap
First Steps Towards a Lagarto Multicore
[Block diagram: four tiles, each labeled DVINO, L1 I, L1 D, L2 cache and NoC, plus a memory controller (Mem Cntr)]
• Multicore design based on:
• DVINO processor
• RISC-V ISA support: I, M, A,
F, D, C, V
• 4-lane VPU
• OpenPiton 2-level cache
hierarchy (priv. L1, shared L2)
• Current status
• Linux boot (openSBI)
• RTL simulation of parallel
applications (up to 64 cores)
• Multiple memory controllers
• FPGA-ready
• Integrating Ka+VPU
Towards a RISC-V Heterogeneous Manycore
[Diagram: tiled manycore. Tile types: General Purpose Processor Tile, HPC Accelerator Tile, Automotive/ML Accelerator Tile (Ka core + SAURIA), Genomics Accelerator Tile (Sargantana + WFA). Tiles combine Ka or Sargantana cores with IL1/DL1 caches and an L2; one tile adds a VPU to a Ka core.]
Other Domain-Specific Accelerator Tiles (sparse, security, safety, etc.)
Chips Act, PERTE Chip, Intel and more!
PERTE Chip. Microelectronics & Semiconductors
Recovery, Transformation and Resilience Plan
3. EXECUTION: ACTIONS & BUDGET (amounts in M€; "." is the thousands separator)

COMPONENT I. BOLSTERING SCIENTIFIC CAPACITY: 1.165
  ACTION 1: Development of R&D&i on cutting-edge and alternative architecture microprocessors: 475
  ACTION 2: Development of R&D&i on integrated photonics: 150
  ACTION 3: Development of R&D&i on quantum chip development: 40
  ACTION 4: Budget line for the IPCEI on Microelectronics and Communication Technologies: 500
COMPONENT II. DESIGN STRATEGY: 1.330
  ACTION 5: Creation of cutting-edge alternative architecture microprocessor fabless companies: 950
  ACTION 6: Creation of pilot lines: 300
  ACTION 7: Creation of a network for education, training and skills-building in relation to semiconductors: 80
COMPONENT III. CONSTRUCTION OF FABRICATION PLANTS IN SPAIN: 9.350
  ACTION 8: Creating fabrication capacity at sizes below 5 nm: 7.250
  ACTION 9: Creating fabrication capacity at sizes above 5 nm: 2.100
COMPONENT IV. STIMULATING THE ICT MANUFACTURING INDUSTRY IN SPAIN: 400
  ACTION 10: ICT manufacturing industry incentive scheme: 200
  ACTION 11: Creation of a chips fund: 200
GOVERNANCE: 5
  Special Commissioner for the Microelectronic and Semiconductors Project: 5
TOTAL PUBLIC INVESTMENT: 12.250
Intel Labs Barcelona are back!
● New joint Intel – BSC Laboratory to design HPC processors based on RISC-V technology
● Funding: 400M$ in the next 10 years. Headcount: ~200 (estimated)
Other companies will also come to Spain!
Research & Product
Developing European Hardware/Software
Technology
Full Stack Open Source HPC Ecosystem
Intel and BSC: Continuing to collaborate into
the Zettascale era
European & Global collaboration
Build Full System based on RISC-V: MN6 and
many others
Path to Zettascale
If your experience and/or motivation include any of the
disciplines and skills below, check out our QR:
Hardware (Processor architecture, micro-architecture, accelerators,
memory hierarchy, memory controllers, HBM, DRAM, non-volatile
memory, RTL design, VHDL, verilog, SystemC, System verilog, Synopsys,
Cadence, Mentor Graphics, synthesis, place and route, timing closure,
packaging, PCB design, verification, validation, CI, post-silicon debug,
DFT, gate-level simulation,…)
Software (programming models, MPI, compilers (LLVM), SYCL, OneAPI,
Tensorflow, PyTorch, Apache Spark, CI/CD, operating systems, managed
runtimes, OpenMP, task-based programming models, containers,
security, fault-tolerance, virtualization, C/C++, Tcl, Python, Perl/Csh,… )
Research Engineer/Researcher
R&D for Zettascale and beyond: Applications to Accelerators
Reference: 176_23_CS_Z_R0-4-RE1-4
BSC is building a New Lab!
100+ Job Opportunities
Openchip Confidential © 2023
Openchip
A fabless semiconductor company building RISC-V
based High Precision, High Performance
Accelerators targeting HPC and adjacent Enterprise
AI/ML/DL real world workloads with dense and
sparse access patterns.
• Open source hardware design opportunities and challenges!!
○ Many open source RTL designs available, including cores, SoCs and accelerators
○ Design toolflow partially open (simulators, testing, FPGA emulation), but still some pieces
are missing (verification, Place&Route, bringup)
○ Technology-related IP is completely closed
○ SW ecosystem still under development
○ Ideal for teaching, research and startups!!
○ Potential to become the European domestic solution
• We need your help! Join us to contribute to the RISC-V open movement:
○ Contribute to open source community with in-house designs and tools
○ Contribute to European RISC-V ecosystem and projects
○ Master/PhD student and research engineer positions are available
Conclusions
• Join us in Barcelona between June 5 and 9 2023!!!
RISC-V Summit Europe 2023
Early registration deadline: Apr 30 2023
Designing RISC-V-based Accelerators for next generation Computers
Torre Girona c/Jordi Girona, 31 – Edificio Nexus II c/Jordi Girona, 29 – 08034 Barcelona (España) – Tel. (+34) 93 413 77 16 – info@bsc.es
Thank you!

More Related Content

What's hot

CCNA 200-301 VOLUME 2.pdf
CCNA 200-301 VOLUME 2.pdfCCNA 200-301 VOLUME 2.pdf
CCNA 200-301 VOLUME 2.pdfbekhti
 
RISC-V-Introduction-_-Aug-2021.pptx
RISC-V-Introduction-_-Aug-2021.pptxRISC-V-Introduction-_-Aug-2021.pptx
RISC-V-Introduction-_-Aug-2021.pptxssuser300b04
 
組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステム組み込みから HPC まで ARM コアで実現するエコシステム
組み込みから HPC まで ARM コアで実現するエコシステムShinnosuke Furuya
 
5G: a revolution or an evolution for IoT by Merouane DEBBAH, Huawei
5G: a revolution or an evolution for IoT by Merouane DEBBAH, Huawei5G: a revolution or an evolution for IoT by Merouane DEBBAH, Huawei
5G: a revolution or an evolution for IoT by Merouane DEBBAH, HuaweiEuroIoTa
 
Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudRebekah Rodriguez
 
Dell Technologies - Company and Portfolio Introduction in 20 Minutes
Dell Technologies - Company and Portfolio Introduction in 20 MinutesDell Technologies - Company and Portfolio Introduction in 20 Minutes
Dell Technologies - Company and Portfolio Introduction in 20 MinutesDell Technologies
 
CCNA Exploration Companion Guide (v4.0).pdf
CCNA Exploration Companion Guide (v4.0).pdfCCNA Exploration Companion Guide (v4.0).pdf
CCNA Exploration Companion Guide (v4.0).pdfsdafdafs
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning Dr. Swaminathan Kathirvel
 
BeagleBone black
BeagleBone blackBeagleBone black
BeagleBone blackRaja Vedula
 
IT Consulting and Integration Services brochure
IT Consulting and Integration Services brochureIT Consulting and Integration Services brochure
IT Consulting and Integration Services brochureSchneider Electric
 
5G Technology Strategy: Next-Generation Mobile Networking
5G Technology Strategy: Next-Generation Mobile Networking5G Technology Strategy: Next-Generation Mobile Networking
5G Technology Strategy: Next-Generation Mobile Networkingvenkada ramanujam
 
DataCenter:: Infrastructure Presentation
DataCenter:: Infrastructure PresentationDataCenter:: Infrastructure Presentation
DataCenter:: Infrastructure PresentationMuhammad Asad Rashid
 
What are the types of data centers
What are the types of data centersWhat are the types of data centers
What are the types of data centersLivin Jose
 
WiFi 6 - Usher in the Era of Next-Generation Connectivity
WiFi 6 - Usher in the Era of Next-Generation ConnectivityWiFi 6 - Usher in the Era of Next-Generation Connectivity
WiFi 6 - Usher in the Era of Next-Generation ConnectivityHughes Systique Corporation
 

DRAC: Designing RISC-V-based Accelerators for next generation Computers

  • 1. April 20, 2023. DRAC: Designing RISC-V-based Accelerators for Next Generation Computers. Miquel Moretó, UPC and BSC. Universidad Complutense de Madrid (UCM), Madrid (Spain)
  • 2. Barcelona Supercomputing Center, Centro Nacional de Supercomputación (BSC-CNS). BSC-CNS objectives: supercomputing services to Spanish and EU researchers; R&D in Computer, Life, Earth and Engineering Sciences; PhD programme, technology transfer, public engagement. BSC-CNS is a consortium that includes: Spanish Government (60%), Catalan Government (30%), Univ. Politècnica de Catalunya (UPC) (10%).
  • 3. MareNostrum 1 (2004): 42.3 Tflops, 1st Europe / 4th World, new technologies. MareNostrum 2 (2006): 94.2 Tflops, 1st Europe / 5th World, new technologies. MareNostrum 3 (2012): 1.1 Pflops, 12th Europe / 36th World. MareNostrum 4 (2017): 11.1 Pflops, 2nd Europe / 13th World, new technologies. Access: prace-ri.eu/hpc_acces; bsc.es/res-intranet. MareNostrum 4 breakdown: General Purpose Cluster 11.15 Pflops; MN4 CTE-Power 1.57 Pflops; MN4 CTE-ARM 0.65 Pflops; MN4 CTE-AMD 0.52 Pflops. MareNostrum 4 total peak performance: 13.9 Pflops.
  • 4. MareNostrum5. GPP (General Purpose): Intel Sapphire Rapids; peak performance 45.4 Pflops; sustained HPL 35.4 Pflops; April 2023. ACC (Accelerated): Intel Sapphire Rapids and NVIDIA Hopper; peak performance 260 Pflops; sustained HPL 163 Pflops; June 2023. NGT GPP (Next Generation): NVIDIA Grace; peak performance 2.82 Pflops; sustained HPL 2 Pflops; June 2023. NGT ACC (Next Generation): Intel Emerald Rapids and Intel Rialto Bridge; peak performance 6 Pflops; sustained HPL 4.24 Pflops; December 2023. InfiniBand NDR 200, Fat Tree. Spectrum Scale File System: 248 PB HDD, 2.81 PB NVMe, 402 PB tape. January 2023.
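As a rough illustration of the MareNostrum5 figures on slide 4, the sketch below computes the HPL efficiency (sustained HPL divided by peak performance) of each partition. Only the Pflops values come from the slide; the dictionary layout and the function names are assumptions made for this example.

```python
# HPL efficiency = sustained HPL / peak performance.
# The (peak, sustained) pairs in Pflops are the ones quoted on the slide;
# everything else here is a hypothetical helper for illustration.

PARTITIONS_PFLOPS = {
    "GPP": (45.4, 35.4),
    "ACC": (260.0, 163.0),
    "NGT GPP": (2.82, 2.0),
    "NGT ACC": (6.0, 4.24),
}


def hpl_efficiency(peak: float, sustained: float) -> float:
    """Return the fraction of peak performance sustained on HPL."""
    if peak <= 0:
        raise ValueError("peak performance must be positive")
    return sustained / peak


def efficiencies(partitions=PARTITIONS_PFLOPS) -> dict:
    """Map each partition name to its HPL efficiency."""
    return {name: hpl_efficiency(peak, sustained)
            for name, (peak, sustained) in partitions.items()}


if __name__ == "__main__":
    for name, eff in efficiencies().items():
        print(f"{name}: {eff:.1%} of peak sustained on HPL")
```

With the slide's numbers, every partition sustains between roughly 60% and 80% of its peak on HPL, with the accelerated partition at the low end of that range.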
  • 5. BSC Staff evolution 2005 - 2022, plus forecast 2023 & 2024 (data at 30th June 2022, including collaborators). 2005: 65; 2006: 114; 2007: 164; 2008: 229; 2009: 279; 2010: 302; 2011: 310; 2012: 321; 2013: 358; 2014: 433; 2015: 447; 2016: 475; 2017: 529; 2018: 613; 2019: 668; 2020: 737; 2021: 782; 2022: 787; 2023: 982 (forecast); 2024: 1060.56 (forecast).
  • 6. 6
  • 7. Collaborations with Global IT Industry June 11, 2020
  • 8. Mission of BSC Scientific Departments. Computer Sciences: to influence the way machines are built, programmed and used (programming models, performance tools, Big Data, Artificial Intelligence, computer architecture, energy efficiency). Earth Sciences: to develop and implement global and regional state-of-the-art models for short-term air quality forecast and long-term climate applications. Life Sciences: to understand living organisms by means of theoretical and computational methods (molecular modeling, genomics, proteomics). CASE: to develop scientific and engineering software to efficiently exploit supercomputing capabilities (biomedical, geophysics, atmospheric, energy, social and economic simulations).
  • 9. The RISC-V Revolution! Designing RISC-V-based Accelerators for next generation Computers. The DRAC project (file number 001-P-001723) has been 50% co-financed with €2,000,000.00 by the European Union Regional Development Fund within the framework of the ERDF Operational Program of Catalonia 2014-2020, with the support of the Generalitat of Catalonia. Copyright 2020 © All Rights Reserved.
  • 10. Today's Technology Trends. Massive penetration of Open Source Software: IoT (Arduino), Mobile (Android), Enterprise (Linux), HPC (Linux, OpenMP, etc.). New Open Source Hardware momentum from IoT and the Edge to HPC: RISC-V, OpenPOWER. Moore's Law + Power = Specialization (HW/SW Co-Design): more cost effective, more performant, less power.
  • 11. HPC Today • Europe has led the way in defining a common open HPC software ecosystem • Linux is the de facto standard OS despite proprietary alternatives • Software landscape from Cloud to IoT already enjoys the benefit of open source • Open source provides: a common platform, specification and interface; accelerates building new functionality by leveraging existing components; lowers the entry barrier for others to contribute new components; crowd-sources solutions for small and larger problems • What about hardware and, in particular, the CPU? [Figure: stack layers CPUs/GPUs/ASICs, HW Systems, OS, Compiler/Toolchain, Schedulers, Libraries/Platforms, Applications, marked OPEN or CLOSED; labelled examples: Linux, LLVM, COMPS, OpenMP, GROMACS, NAMD, WRF, VASP, etc., OCP.]
  • 12. HPC Tomorrow • Europe can lead the way to a completely open SW/HW stack for the world • RISC-V provides the open source hardware alternative to the dominant proprietary non-EU solutions • Europe can achieve complete technology independence with these foundational building blocks • Hardware is currently at the same early stage that software was at when Linux was adopted many years ago • RISC-V can unify, focus, and build a new microelectronics industry in Europe. [Figure: stack layers CPUs/GPUs/ASICs, HW Systems, OS, Compiler/Toolchain, Schedulers, Libraries/Platforms, Applications, all marked OPEN; labelled examples: Linux, LLVM, COMPS, OpenMP, GROMACS, NAMD, WRF, VASP, etc., OCP, and RISC-V, OpenPower, MIPS.]
  • 13. The European Processor Initiative (EPI)
  • 14. The European Processor Initiative (EPI): a project under SGA1 of the Framework Partnership Agreement (FPA: 800928) to design and implement a roadmap for a new family of low-power European processors for extreme-scale computing, high-performance Big Data and a range of emerging applications. • History: Remember MontBlanc? BSC leads the RISC-V HPC accelerator development • Consortium (SGA1): 28 partners from 10 European countries • Coordinator: Bull SAS (France) • Budget: €80M (100% funded) • Duration: 36 months (01/12/2018-31/12/2021) • 5 Streams (4 Technical and 1 Management/Exploitation/C&D) • Key contact point: Jesús Labarta, jesus.labarta@bsc.es
  • 15. EPI main objective: to develop European microprocessor and accelerator technology and strengthen the competitiveness of EU industry and science • Rhea: Arm-based general purpose CPU (SiPearl) • EPAC: RISC-V based accelerators (BSC, SemiDynamics, EXTOLL, FORTH, ETHZ, UniBo, UniZG, Chalmers, CEA, E4, Menta, ZPT, …)
  • 16. Overall technology roadmap • 2015 – 2018: H2020 projects, HPC ecosystem exploration • 2019 – 2021: EPI SGA1, EPAC test chip 1.0 • 2022 – 2024: EPI SGA2, Rhea1 go-to-market, EPAC demonstrator • 2025 – …: next project(s), EU Chips Act related • Other roadmap elements shown: pilot based on Rhea I, pilot based on EPAC, Exascale Systems, Centers of Excellence in HPC Applications • Rhea: Arm-based general purpose CPU • EPAC: RISC-V based accelerators
  • 17. EPAC: a RISC-V accelerator • [Block diagram of the EPAC accelerator: Network on Chip (NoC) connecting C/V tiles, STX, VRP and L2 HN slices; bridges to the host CPU; peripherals attached via AXI Lite]
  • 18. RISC-V core and VPU • RISC-V core: Avispado • 2-way in-order core • Full HW support for unaligned accesses • Cache: L1I$ = 16KB, L1D$ = 32KB • VPU • Long vectors: 256 DP elements – #Functional Units (FUs) << Vector Length (VL) – 1 vector instruction can take several cycles • 8 lanes per core – FMA/lane: 2 DP Flop/cycle • 40 physical registers, some out of order • Reference: F. Minervini, O. Palomar, “Vitruvius: An Area-Efficient Decoupled Vector Accelerator for High Performance Computing”, RISC-V Summit 2021 • Vector register size by architecture (in double-precision elements): Intel AVX512: 8; Arm Neon: 2; A64FX: 8; NEC Aurora SX: 256; EPAC VPU: 256 • Key contact points: Adrián Cristal (adrian.cristal@bsc.es), Roger Espasa (SemiDynamics)
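The VPU figures on this slide can be related with a little arithmetic. The sketch below is only a back-of-the-envelope model: it takes the 256 DP elements, 8 lanes and 2 DP Flop per FMA from the slide, and assumes (this is not stated in the deck) that each lane retires one element per cycle. The function names are illustrative.

```python
# Figures from the slide; the "one element per lane per cycle" model is an assumption.
VECTOR_LENGTH_DP = 256  # DP elements per vector register
LANES = 8               # lanes per core
FLOP_PER_FMA = 2        # one fused multiply-add counts as 2 DP Flop


def cycles_per_vector_instruction(vl: int, lanes: int) -> int:
    """Cycles needed to push vl elements through the lanes (ceiling division)."""
    return -(-vl // lanes)


def peak_flop_per_cycle(lanes: int, flop_per_lane: int = FLOP_PER_FMA) -> int:
    """Peak DP Flop per cycle when every lane issues one FMA per cycle."""
    return lanes * flop_per_lane


cycles = cycles_per_vector_instruction(VECTOR_LENGTH_DP, LANES)
peak = peak_flop_per_cycle(LANES)
print(f"one full-length vector FMA occupies the lanes for {cycles} cycles")
print(f"peak throughput: {peak} DP Flop/cycle per core")
```

Under that assumption a single full-length vector instruction keeps the lanes busy for tens of cycles, which is one way to read "#FUs << VL" and "1 vector instruction can take several cycles".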
  • 19. EPAC programming environment • Offload of processes from host to accelerator • Interoperability MPI + OpenMP – Accelerator can run MPI and OpenMP processes • Task-based models – Taskify MPI calls – Single mechanism for concurrency, locality and data management • Long vectors (256 elements, 8 lanes per core) – Decouple front-end from back-end – Convey access pattern semantics to the architecture – Vector length agnostic (VLA) programming and architecture • [Figure: software stack: Applications; Libraries (FFTW, SpMV, ...); Scheduler (Slurm); Compiler (LLVM); OS (Linux); Hardware (Host + Accelerator); Programming Model (OpenMP, MPI). Hardware blocks: Memory, Host CPU, Bridge, L2, Vector Core, STX]
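Vector length agnostic (VLA) programming means the loop asks the hardware how many elements it may process per step instead of hard-coding a width. The following is a plain-Python emulation of that strip-mining pattern, not EPAC or RVV code: `set_vector_length` is a hypothetical stand-in for a "set vector length" request, simplified to always grant `min(remaining, max_vl)`.

```python
def set_vector_length(remaining: int, max_vl: int) -> int:
    """Hypothetical stand-in for a 'set vector length' request (simplified)."""
    return min(remaining, max_vl)


def daxpy_vla(a, x, y, max_vl):
    """Compute a*x + y in strips whose length is granted by the 'hardware'.

    Returns the result and the number of strips (vector iterations) used.
    """
    n = len(x)
    out = list(y)
    i = 0
    strips = 0
    while i < n:
        vl = set_vector_length(n - i, max_vl)
        # One strip: conceptually a single vector load, FMA and store of vl elements.
        out[i:i + vl] = [a * xi + yi for xi, yi in zip(x[i:i + vl], out[i:i + vl])]
        i += vl
        strips += 1
    return out, strips


n = 1000
x = [float(i) for i in range(n)]
y = [1.0] * n

# Same source, two hypothetical machines: 256-element and 8-element vectors.
long_result, long_strips = daxpy_vla(2.0, x, y, max_vl=256)
short_result, short_strips = daxpy_vla(2.0, x, y, max_vl=8)
print(long_strips, short_strips, long_result == short_result)
```

The loop body never mentions a concrete width, so the same code runs unchanged on either machine; only the number of strips differs.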
  • 20. How to use the V-extensions? • Assembler: always a valid option but not the most pleasant • C/C++ builtins (intrinsics) – Low-level mapping to instructions – Allows embedding it into an existing C/C++ codebase – Allows relatively quick experimentation • #pragma omp simd (aka “semi-automated vectorization”) – Relies on the vectorization capabilities of the compiler; usually works but gets complicated if the code calls functions – Also usable in Fortran • Autovectorization: let the compiler work for you ;-) • Interested in compiling your code for RVV? Contact Roger Ferrer, roger.ferrer@bsc.es
  • 21. RISC-V platforms (SDV) • Hardware/software infrastructure for Continuous Integration and RTL check – HiFive commercial hardware (scalar) – EPAC RTL (1 core) on FPGA • Platform to: – Demonstrate a full HPC software stack: Linux, compiler, libraries, job scheduler, MPI – Test latest RTL with complex codes: advanced performance analysis tools, accurate timing available • Full SDV, commercial and FPGA-based RISC-V: commercial (only scalar); EPAC FPGA-based implementation (vector support) • Interested in testing your code on EPAC? Contact Filippo Mantovani, filippo.mantovani@bsc.es
  • 22. EPAC 1.0 Test Chip (Tapeout Q2 2021) • Chip fabrication Q2 2021 • Global Foundries 22nm (GF22FDX) • Final top-level chip floorplan • Total area: 5943 × 4593 µm (27.297 mm²) • Floorplan blocks: 4 × (Avispado, VPU, L2, HN), 2 × STX, VRP, 2 × serdes
  • 23. EPAC 1.5 Test Chip (Tapeout Q4 2022) • 24/10/2022, ACAT 2022, Suarez • Floorplan blocks: 2 × STX, 6 × XP, 4 × (HN, L2$), 2 × (AVS, VPU), FPGA Bridge, VRP, I/O micro-tile
  • 24. DRAC Overview
  • 25. It all started in Mexico... • CIC-IPN Lagarto I design in Mexico (2012-2017) • 5-stage single issue in-order pipeline, MIPS-based microcontroller • FPGA implementation capable of booting Linux • BSC and CIC-IPN Lagarto Initiative (2018 onwards) • MIPS to RISC-V • FPGA to ASIC • European Processor Initiative (EPI) (Dec 2018 - Dec 2023) • Flagship project (80M€ Phase 1; 70M€ Phase 2; 26 partners) • BSC leads the European RISC-V Vector Accelerator (EPAC) • July 2018: Come to my office... • RIS3CAT “Emerging Technologies” call in Nov 2018
  • 26. DRAC Project • DRAC: Designing RISC-V-based Accelerators for next generation Computers • Consortium: BSC (coord.), UPC, UAB, UB, URV • Dates: June 2019 – June 2023 • Budget: 4M€ (50% co-funded by Generalitat) • Alignment with the European Processor Initiative (EPI) project: • Focus on RISC-V-based accelerator developed in Barcelona • Promote RISC-V in the CS degrees in Catalan universities • Build IC design teams capable of taping out DRAC technology: RTL design, verification and physical design
  • 27. DRAC General Objectives • 1. Design of an out-of-order general purpose RISC-V processor • 2. Design of accelerators and required hardware support to have secure processors that incorporate post-quantum cryptographic schemes and virtualization techniques • 3. Design of accelerators for genomics data analytics • 4. Design of efficient and low power processors for autonomous navigation applications • 5. Building a local ecosystem for custom hardware design and fabrication • 6. Transfer DRAC technology to local and international companies • 7. Transfer DRAC technology to the university system using educational kits based on DRAC designs
  • 28. Lagarto RISC-V Tapeouts (2019 – 2023) • Lagarto (May’19): Lagarto Hun 5-stage in-order; 150MHz (external); TSMC 65nm; 2.5mm2 • DVINO (Apr’21): Lagarto Hun in-order; VPU; PLL 600 MHz; SDRAM mem cont; HyperRAM; VGA; ADC; TSMC 65nm; 8mm2 • Sargantana (Feb’22): Sargantana 7-stage in-order; PLL 1.2 GHz; custom extensions; SDRAM; prototype analog IPs: SerDes 8GHz; GF 22nm; area: 2.9mm2 • Kameleon (Dec’22): Lagarto Ka 11-stage ooo; PLL 1.2 GHz; Automotive Accel; Crypto Accel; Genomic Accel; PICOS Accel; SerDes 8GHz; GF 22nm; area: 9mm2
  • 29. Lagarto: First RISC-V Tapeout (May 2019) • Target design: • Lagarto Hun in-order scalar core, 5 stages, single issue, RV64IMA • 16KB L1 caches, 64KB L2 cache, TLB • Memory controller on the FPGA side via packetizer • Debug ring via JTAG • Target technology: TSMC 65nm, area fits in 2.5mm2 • Fabrication and bringup • Submitted in May 2019 • Samples received in Sep 2019 • Bringup with custom PCB in Oct 2019 • Linux boot in Dec 2019
  • 30. From Architecture to an Actual System • RTL design and verification • IC design • PCB design and assembly • System characterization • Technology information • IC design flow • Physical verification • Prepare for manufacturing
  • 31. DVINO: 2nd Lagarto Tapeout (Apr 2021) • DRAC Vector IN-Order (DVINO) processor details: • Lagarto Hun scalar pipeline, 5-stage, in-order, RV64IMA • Hydra 2-lane (VPU-DRAC-1.0), 4096-bit vector length • Internal PLL. DVINO can run at 600, 400, 300 and 200MHz • In-house L1 instruction cache and PMU • L1 data and L2 caches from lowRISC 0.2 • Multiple controllers: JTAG, UART, SPI, VGA, SDRAM and HyperRAM • In-house JTAG-based debug-ring • Technology node: TSMC 65nm (Europractice) • Area: 8.6mm2
  • 32. Sargantana: 3rd Lagarto Tapeout (Feb 2022) • Sargantana in-order processor details: • Lagarto Hun pipeline, 7-stage, in-order, RV64IMAFD (RV64G) • Support for floating point operations (single and double precision) • Integer SIMD VPU, 128-bit vector length, custom instructions • Internal PLL. Sargantana can run above 1.1GHz • In-house L1 instruction cache and PMU • L1 data and L2 caches from lowRISC 0.2 • Multiple controllers: JTAG, UART, SPI, SerDes, and HyperRAM • In-house JTAG-based debug-ring • Technology node: GF 22nm (Europractice) • Area: 2.9mm2
  • 33. Kameleon: 4th Lagarto Tapeout (Dec 2022) • [Block diagram: Lagarto Ka and Sargantana cores, each with L1 I and L1 D caches; arbiter to select core; L2 cache; memory controller; clock; AXI; accelerators: Genomics (WFA), Sauria, PQC, PICOS]
  • 34. Lagarto Ka Out-of-Order Core: Current Features • 2-way 64-bit out-of-order architecture • RV64IMA ISA • 11-stage pipeline implementation • Parameterized branch predictor: – BTB 16-128 entries – BHT 16-128 entries – RAS 2-8 entries • 128-entry ROB • Low-power integer queue (out-of-order issue) • In-order Load/Store Queue • Hit-under-miss support • Configurable L1 caches – 16 KiB L1 I-cache (typical) – 32 KiB L1 D-cache (typical)
  • 35. SAURIA Physical Design • Specs: 8x16 array: 128 Processing Elements (PE); approximate logic in PE multipliers & adders; 1.00 x 0.95 mm = 0.95 mm2; 128 GFLOP/s @ 500MHz; 1.56 TFLOP/s/W • Total power = 83.2 mW • Power breakdown chart: values 55,744 (67%), 10,816 (13%), 9,984 (12%), 6,656 (8%); legend: Systolic Array, SRAMs, Feeders, Others
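The headline SAURIA numbers can be cross-checked with a few lines of arithmetic. This sketch rests on two assumptions that the slide does not state: each PE performs one multiply-accumulate (2 FLOP) per cycle, and the four chart values are in microwatts. Variable names are illustrative.

```python
ROWS, COLS = 8, 16          # systolic array dimensions from the slide
FREQ_HZ = 500e6             # 500 MHz from the slide
FLOP_PER_PE_PER_CYCLE = 2   # assumption: one multiply-accumulate per PE per cycle
CHART_VALUES_UW = [55_744, 10_816, 9_984, 6_656]  # assumption: microwatts

pes = ROWS * COLS
gflops = pes * FLOP_PER_PE_PER_CYCLE * FREQ_HZ / 1e9
total_mw = sum(CHART_VALUES_UW) / 1000
shares = [round(100 * v / sum(CHART_VALUES_UW)) for v in CHART_VALUES_UW]
tflops_per_watt = (gflops / 1e3) / (total_mw / 1e3)

print(f"{pes} PEs, {gflops:.0f} GFLOP/s at {FREQ_HZ / 1e6:.0f} MHz")
print(f"total power {total_mw:.1f} mW, shares {shares} %")
print(f"efficiency {tflops_per_watt:.2f} TFLOP/s/W")
```

Under these assumptions the PE count, throughput, total power and percentage split all match the slide. Dividing the throughput by the total power gives a value close to, though not identical with, the efficiency quoted on the slide; the slides do not explain the small gap.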
  • 36. Genome Alignment Acceleration (WFA) • Pairwise alignment/mapping of DNA/RNA sequences with the novel Wavefront Alignment (WFA) algorithm • Target: provide specific hardware support for the most time-consuming operations of the algorithm • ISA extensions: – [vmax_vv] Vectorial maximum – [vmax3inc_vv] Vectorial “3-way” maximum fused with increment – [vcnt] Scalar “count consecutive matches” – [vcnt_vv] Vectorial “count consecutive matches” • Use narrow integers (16-bit or 8-bit) • Monolithic accelerator integrated with Lagarto SoC – Single aligner (due to area limitations) capable of 64 ops/cycle – Current Place and Route (PnR) results: 1.1GHz (typical corner) – Area: 1.6mm2 in Global Foundries 22nm – Performance speedups: 515x with 10K reads, 10% error
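The slide names the accelerated operations but not their exact semantics. As a rough illustration, the sketch below implements the edit-distance variant of wavefront alignment in plain Python. Its two hot spots are a three-way maximum with increment when computing the next wavefront (presumably the role of `vmax3inc_vv`) and a count of consecutive matching characters along a diagonal (presumably the role of `vcnt`). The function names and the mapping to the instructions are assumptions for illustration, not the DRAC implementation.

```python
def extend(pattern: str, text: str, k: int, h: int) -> int:
    """Count consecutive matches along diagonal k starting at text offset h.

    Returns the text offset reached after the run of matches.
    """
    v = h - k  # position in the pattern on diagonal k
    while v < len(pattern) and h < len(text) and pattern[v] == text[h]:
        v += 1
        h += 1
    return h


def wfa_edit_distance(pattern: str, text: str) -> int:
    """Edit distance via wavefronts: wf[k] is the furthest text offset on diagonal k."""
    n, m = len(pattern), len(text)
    k_end = m - n  # diagonal that contains the end cell (n, m)
    wf = {0: extend(pattern, text, 0, 0)}
    score = 0
    while wf.get(k_end) != m:
        score += 1
        nxt = {}
        for k in range(min(wf) - 1, max(wf) + 2):
            cands = []
            if k - 1 in wf:
                cands.append(wf[k - 1] + 1)  # insertion: advance in text only
            if k in wf:
                cands.append(wf[k] + 1)      # mismatch: advance in both
            if k + 1 in wf:
                cands.append(wf[k + 1])      # deletion: advance in pattern only
            # Keep only offsets that stay inside both sequences.
            cands = [h for h in cands if h <= m and 0 <= h - k <= n]
            if cands:
                # Three-way max (with increment), then extend along the diagonal.
                nxt[k] = extend(pattern, text, k, max(cands))
        wf = nxt
    return score


print(wfa_edit_distance("GATTACA", "GATCACA"))
```

Each wavefront step touches one value per diagonal, which is why narrow integers and vector-wide max/count operations are a natural fit for hardware support.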
  • 37. PQC Acceleration • Main goal: RISC-V acceleration of different PQC schemes • Classic McEliece (CME) KEM acceleration: • HW/SW co-design implementation of CME KEM @ Zynq Ultrascale [FPL’21] • Monolithic CME accelerator based on HLS • Integration of the CME accelerator in Lagarto SoC via AXI interface • Accelerate other KEM and digital signature (DS) schemes: • NTRU KEM and Crystals-Kyber KEM / Crystals-Dilithium DS • Both algorithms rely on very different operations and will require different acceleration techniques • [FPL’21] V. Kostalampros et al., HLS-Based HW/SW Co-Design of the Post-Quantum Classic McEliece Cryptosystem. FPL 2021.
  • 38. Building a RISC-V Ecosystem in Barcelona Lagarto Team (Sept 2019) DRAC KoM (Feb 2020) DRAC F2F (Jul 2022)
  • 39. Building a RISC-V Ecosystem in Barcelona DRAC Final Workshop (December 2022)
  • 40. Lagarto Roadmap
  • 41. First Steps Towards a Lagarto Multicore • [Diagram: four tiles, each with a DVINO core, L1 I, L1 D, L2 cache and NoC, plus a memory controller] • Multicore design based on: • DVINO processor • RISC-V ISA support: I, M, A, F, D, C, V • 4-lane VPU • OpenPiton 2-level cache hierarchy (priv. L1, shared L2) • Current status • Linux boot (OpenSBI) • RTL simulation of parallel applications (up to 64 cores) • Multiple memory controllers • FPGA-ready • Integrating Ka+VPU
  • 42. Towards a RISC-V Heterogeneous Manycore • Tile types: General Purpose Processor Tile; HPC Accelerator Tile; Automotive, ML Accelerator Tile (Ka Core + SAURIA); Genomics Accelerator Tile (Sargantana + WFA); other domain-specific accelerator tiles (sparse, security, safety, etc.) • Tile building blocks: Ka Core, Sargantana, VPU, DL1, IL1, L2
  • 43. Chips Act, PERTE Chip, Intel and more!
  • 44.
  • 45. PERTE Chip. Microelectronics & Semiconductors. Recovery, Transformation and Resilience Plan • 3. Execution: Actions & Budget (I), in M€ • Component I. Bolstering scientific capacity: 1,165 • Action 1: Development of R&D&i on cutting-edge and alternative architecture microprocessors: 475 • Action 2: Development of R&D&i on integrated photonics: 150 • Action 3: Development of R&D&i on quantum chip development: 40 • Action 4: Budget line for the IPCEI on Microelectronics and Communication Technologies: 500 • Component II. Design strategy: 1,330 • Action 5: Creation of cutting-edge alternative architecture microprocessor fabless companies: 950 • Action 6: Creation of pilot lines: 300 • Action 7: Creation of a network for education, training and skills-building in relation to semiconductors: 80
  • 46. PERTE Chip. Microelectronics & Semiconductors. Recovery, Transformation and Resilience Plan • 3. Execution: Actions & Budget (II), in M€ • Component III. Construction of fabrication plants in Spain: 9,350 • Action 8: Creating fabrication capacity at sizes below 5 nm: 7,250 • Action 9: Creating fabrication capacity at sizes above 5 nm: 2,100 • Component IV. Stimulating the ICT manufacturing industry in Spain: 400 • Action 10: ICT manufacturing industry incentive scheme: 200 • Action 11: Creation of a chips fund: 200 • Governance: 5 • Special Commissioner for the Microelectronic and Semiconductors Project: 5 • Total public investment: 12,250
  • 47. Intel Labs Barcelona are back! ● New joint Intel – BSC Laboratory to design HPC processors based on RISC-V technology ● Funding: 400M$ in the next 10 years. Headcount: ~200 (estimated) Other companies will also come to Spain!
  • 48. Intel and BSC: Continuing to collaborate into the Zettascale era • Research & Product • Developing European Hardware/Software Technology • Full Stack Open Source HPC Ecosystem • European & Global collaboration • Build Full System based on RISC-V: MN6 and many others • Path to Zettascale
  • 49. BSC is building a New Lab! 100+ Job Opportunities • Research Engineer/Researcher, R&D for Zettascale and beyond: Applications to Accelerators (Reference: 176_23_CS_Z_R0-4-RE1-4) • If your experience and/or motivation include any of the disciplines and skills below, check out our QR: • Hardware (processor architecture, micro-architecture, accelerators, memory hierarchy, memory controllers, HBM, DRAM, non-volatile memory, RTL design, VHDL, Verilog, SystemC, SystemVerilog, Synopsys, Cadence, Mentor Graphics, synthesis, place and route, timing closure, packaging, PCB design, verification, validation, CI, post-silicon debug, DFT, gate-level simulation, …) • Software (programming models, MPI, compilers (LLVM), SYCL, oneAPI, TensorFlow, PyTorch, Apache Spark, CI/CD, operating systems, managed runtimes, OpenMP, task-based programming models, containers, security, fault-tolerance, virtualization, C/C++, Tcl, Python, Perl/Csh, …)
  • 50. Openchip: a fabless semiconductor company building RISC-V based high-precision, high-performance accelerators targeting HPC and adjacent enterprise AI/ML/DL real-world workloads with dense and sparse access patterns. © 2023 Openchip
  • 51. Conclusions • Open source hardware design opportunities and challenges! ○ Many open source RTL designs available, including cores, SoCs and accelerators ○ Design toolflow partially open (simulators, testing, FPGA emulation), but some pieces are still missing (verification, Place&Route, bringup) ○ Technology-related IP is completely closed ○ SW ecosystem still under development ○ Ideal for teaching, research and startups! ○ Potential to become the European domestic solution • We need your help! Join us to contribute to the RISC-V open movement: ○ Contribute to the open source community with in-house designs and tools ○ Contribute to the European RISC-V ecosystem and projects ○ Master/PhD student and research engineer positions are available
  • 52. RISC-V Summit Europe 2023 • Join us in Barcelona between June 5 and 9, 2023! • Early registration deadline: Apr 30, 2023
  • 53. Thank you! Torre Girona c/Jordi Girona, 31 – Edificio Nexus II c/Jordi Girona, 29 – 08034 Barcelona (España) – Tel. (+34) 93 413 77 16 – info@bsc.es