The irresistible and necessary touch between supercomputers and embedded systems

Mauro Olivieri
Associate Professor, Sapienza University of Rome
Visiting researcher, Barcelona Supercomputing Center

Computer system EVOLUTION... at a glance
• High Performance Computers
• Personal Computers & Servers
• Embedded Computers

Diversification of markets and technical solutions
• High Performance Computers & Data Centers
• Personal Computers & Servers
• High end Embedded Systems
• Tablets & Smart Phones
• IoT devices

Diversification of microprocessors
• High Performance Computers & Data Centers (pictured example: POWER9)
• Personal Computers & Servers
• High end Embedded Systems
• Tablets & Smart Phones
• IoT devices

Today’s supercomputers
• Processor CORE: clocked blocks labeled DF, W, FU, FU and M in the figure
• Multi-core CPU (chip): cores with L2, L3 cache
• Multi-chip node (board): CPUs, HBM, accelerators (GPU), NIC
• Rack (connects many nodes)
• System (connects many racks)

What’s happened in supercomputing
• Mainframe Era (circa 1953 - circa 1972)
  • Memory capacity is the main limit
  • All fundamental computer architecture techniques are invented
• First rise of HW acceleration: vector computers (1974-1993)
  • Processing speed on matrix algebra is the main limit
  • SIMD processing, domain specific architectures
• Rise of massive homogeneous parallelism (1994-2007)
  • Memory bandwidth is the main limit (memory wall)
  • Moore's law boosts clock speed and scale of integration in HIGH VOLUME PRODUCTION processors (killer processors)
  • Parallel architectures with commodity CPUs overtake vector processors
• The renaissance of acceleration units (2008 - …?)
  • Power consumption is the main limit
  • Hardware specialization allows better power efficiency (FLOPS/W)
  • GPUs are the first example because they come from a HIGH VOLUME MARKET
  • Need for specialized yet widely re-usable power efficient accelerator chips

Today's Embedded Systems for IoT
• Sense (100 µW ÷ 2 mW): MEMS Microphone, ULP Imager, EMG/ECG/EIT
• Analyze: µController with IOs, local memory and accelerators
  • Was 1 ÷ 25 MOPS in 1 ÷ 10 mW (e.g. Cortex-M)
  • Now is >1000 MOPS in 1 ÷ 10 mW
• Transmit (idle: ~1 µW, active: ~10 mW)
  • Long range, low BW
  • Short range, BW
  • Low rate (periodic) data
  • SW update, commands
Courtesy of Luca Benini, ETH Zurich
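
The throughput and power ranges on this slide can be turned into an energy-efficiency figure. The sketch below is only an illustration: the function name is hypothetical, and pairing the ends of each range (lowest throughput with highest power, and vice versa) is an assumption, since the slide gives only ranges.

```python
# Energy efficiency implied by the slide's throughput and power ranges.
# Note: MOPS per mW is numerically equal to GOPS per W.

def gops_per_watt(mops: float, milliwatts: float) -> float:
    """Efficiency in GOPS/W for a throughput in MOPS and a power in mW."""
    return mops / milliwatts

# "Was 1 to 25 MOPS in 1 to 10 mW": pair the range ends (assumption).
old_worst = gops_per_watt(1, 10)    # lowest throughput at highest power
old_best = gops_per_watt(25, 1)     # highest throughput at lowest power

# "Now is >1000 MOPS in 1 to 10 mW": 1000 MOPS is a lower bound.
new_worst = gops_per_watt(1000, 10)
new_best = gops_per_watt(1000, 1)

print(f"before: {old_worst:g} to {old_best:g} GOPS/W")
print(f"now:    at least {new_worst:g} to {new_best:g} GOPS/W")
```

Under these assumptions, even the least favorable reading of the newer figure is several times more efficient than the most favorable reading of the older one.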

What’s happened in Embedded Systems for IoT
• Wireless Sensor Network era (2000 – 2010)
  • Limited or absent Internet connection, limited and local processing
• First generation Internet of Things (2010 – 2018)
  • Internet connection; processing delegated to the cloud
• Artificial Intelligence on Internet of Things (2018 - …?)
  • Need very high computing power for AI applications (VGG16 convolutional NN requires >100 billion operations per inference)
  • Need to favor local processing (edge computing) over processing off-load (cloud computing) to reduce communication overhead
  • Need very high power efficiency for local processing
  • Hardware acceleration and parallel computation
  • Need a supercomputer on the sensor node
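
A back-of-the-envelope sketch of why the sensor node needs this much computing power: it divides the VGG16 operation count quoted above by the microcontroller throughputs quoted on the IoT slide (25 MOPS and 1000 MOPS). It assumes the quoted MOPS are sustained, that every operation costs the same, and that memory traffic is free; the function names and the 10 mW power budget pairing are illustrative assumptions.

```python
# Time and energy for one VGG16 inference on a sensor-node processor.
# All inputs are the slide's rounded figures; results are rough lower bounds.

VGG16_OPS = 100e9  # ">100 billion operations per inference" (lower bound)

def inference_seconds(ops: float, mops: float) -> float:
    """Seconds per inference at a sustained throughput given in MOPS."""
    return ops / (mops * 1e6)

def inference_joules(ops: float, mops: float, milliwatts: float) -> float:
    """Energy per inference, assuming constant power during the computation."""
    return inference_seconds(ops, mops) * milliwatts / 1000.0

old_time = inference_seconds(VGG16_OPS, 25)        # older microcontroller
new_time = inference_seconds(VGG16_OPS, 1000)      # current accelerated node
new_energy = inference_joules(VGG16_OPS, 1000, 10)  # at a 10 mW budget

print(f"25 MOPS:   {old_time:.0f} s per inference")
print(f"1000 MOPS: {new_time:.0f} s per inference, {new_energy:.1f} J")
```

Even at 1000 MOPS a single inference takes on the order of minutes under these assumptions, which is the gap that hardware acceleration and parallel computation are meant to close.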

Summary of the trends...
• HPC is a strategic goal pursued at worldwide political level
  • The EPI project alone totals 120 million euros of funding
• HPC systems (supercomputers) necessarily target not only higher speed, but also higher and higher power efficiency
• HPC history shows that supercomputers need high volume market devices to survive economically while using top-level technology
• Embedded systems (IoT, automotive) demand not only power efficiency, but also higher and higher computing speed
• Embedded systems have the market volume to justify mass production of computing devices
• Key enabling technologies in the near future:
  • 7-5-3 nm FinFET processes
  • 3D stacked memories (HBM), stacked memories-CPU (heat!)
  • Chiplets
  • Optical links

Summary of the trends...
• High Performance Computers & Data Centers
• Personal Computers & Servers
• High end Embedded Systems
• Tablets & Smart Phones
• IoT devices
Embedded computing and HPC touch

What about computing platforms?
• Embedded Domain: App1, App2, App3 on Cortex M5 … Cortex A57
• HPC + PC Domain: App1, App2, App3 on Xeon Phi … Core i7 … Atom

A bet on the future...
• Embedded Domain: App1, App2, App3 on Core A … Core B
• HPC + PC Domain: App1, App2, App3 on Core C … Core D … Core E

Thank you
mauro.olivieri@uniroma1.it
“Personal views” disclaimer
The views expressed in this presentation belong solely to the presenter and do not necessarily reflect the views and plans of the BSC or of any other organization.
