Qualcomm Datacenter Technologies, Inc.
Emerging Computing Trends in the Datacenter
Dileep Bhandarkar, Ph. D.
Vice President, Technology
HiPEAC18 Keynote – 23 January 2018, Manchester, United Kingdom
Outline
• Historical Perspective on 40 Years of Moore’s Law
– Single Core Era enabled by Dennard Scaling
• Post Dennard Scaling Drives Multi-Core Era
• The Shift to Energy Efficient Multi-Core Designs for
the Cloud
• Heterogeneous Computing Era with Application
Specific Accelerators
The First 50 Years
after
Shockley’s Transistor Invention
1958: Jack Kilby’s
Integrated Circuit
My 40+ Year Journey From Mainframes to Smartphones https://www.youtube.com/watch?v=7ptXpNFY3XM
Bob Noyce’s
Integrated Circuit
From 2300 to >1Billion Transistors
Moore’s Law video at http://www.cs.ucr.edu/~gupta/hpca9/HPCA-PDFs/Moores_Law_Video_HPCA9.wmv
Dennard Scaling
Device or Circuit Parameter            Scaling Factor
Device dimension (tox, L, W)           1/K
Doping concentration (Na)              K
Voltage (V)                            1/K
Current (I)                            1/K
Capacitance (εA/t)                     1/K
Delay time per circuit (VC/I)          1/K
Power dissipation per circuit (VI)     1/K^2
Power density (VI/A)                   1
The benefits of scaling: as transistors get smaller, they can switch faster and use less power.
Each new generation of process technology was expected to reduce minimum feature size by
approximately 0.7x (K ~1.4). A 0.7x reduction in linear feature size provided roughly a 2x
increase in transistor density.
Dennard scaling broke down around 2004 with unscaled interconnect delays and our inability
to scale the voltage and current due to reliability concerns.
But increasing transistor density (Moore’s Law) has continued to enable multicore designs.
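The scaling factors on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and key names are hypothetical, and it assumes ideal Dennard scaling in which area shrinks as 1/K^2 because both linear dimensions shrink by 1/K.

```python
# Illustrative Dennard scaling arithmetic; all names here are hypothetical.

def dennard_factors(k: float) -> dict[str, float]:
    """Return the ideal scaling factor of each quantity for a given K."""
    voltage = 1 / k
    current = 1 / k
    capacitance = 1 / k
    area = 1 / k**2  # both linear dimensions shrink by 1/K
    power = voltage * current  # VI
    return {
        "dimension": 1 / k,
        "voltage": voltage,
        "current": current,
        "capacitance": capacitance,
        "delay": voltage * capacitance / current,  # VC/I
        "power_per_circuit": power,
        "power_density": power / area,  # VI/A
        "transistor_density": 1 / area,
    }

k = 1 / 0.7  # a 0.7x linear shrink per process generation
factors = dennard_factors(k)
for name, value in factors.items():
    print(f"{name:20s} {value:.2f}")
```

With K = 1/0.7, transistor density comes out at about 2x while power density stays at 1, which is why each generation could add transistors without raising power per unit area as long as voltage and current kept scaling.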
THE MULTICORE ERA
SINGLE THREAD PERFORMANCE IMPROVEMENT SLOWING DOWN
PERFORMANCE DRIVEN BY HIGHER CORE COUNT
Post Dennard Scaling
• Transistor count increasing
• Slower improvement
• No improvement
• Power going up with performance
• Core count increasing to drive performance
Now Performance Improvement Comes from Higher Core Count at Similar Frequency
with Each New Process Node
The last 5 Generations of ~135W Xeon Processors
Slow Improvement in IPC but per thread performance constrained by power
Performance data from www.spec.org
8 cores (Mar 2012), 10 cores (Sep 2013), 12 cores (Sep 2014), 14 cores (Apr 2016), 18 cores (Jul 2017)
No Improvement in Perf/Watt per Core
even with higher power
Performance data from www.spec.org
Era of Energy Efficient Cores
© 2017 Arm Limited
Looking ahead from edge to cloud
The future requires a new approach to CPU design
• Safe and autonomous
• Hyper-efficient
• Secure private compute
• Cortex beyond mobile
• Mixed reality
Presented by Peter Greenhalgh at Hot Chips 2017
[Chart: % of total datacenter server revenue (0% to 100%), 2013 to 2020, Cloud vs. Traditional Enterprise IT]
Server Industry is shifting to the Cloud
Disruptions Come from Below!
Mainframes
Minicomputers
RISC Systems
Desktop PCs
Notebooks
Smart Phones
[Chart axes: Volume, Performance]
Bell’s Law: hardware technology, networks, and interfaces allow new, smaller, more
specialized computing devices to be introduced to serve a computing need.
Mobile Technology Disrupting the Cloud Datacenter
A new world in datacenter: Manufacturing process
Qualcomm Datacenter Technologies: uniquely positioned to leverage mobile growth and drive datacenter process leadership
• Then: fab process tech driven by PC
• Now: fab process tech driven by mobile phones
[Timeline figure, 2008 to 2018. Mobile driven: 65nm, 45nm, 28nm, 20nm, 14nm, 10nm (1st in the industry). PC driven: 45nm, 32nm, 22nm, 14nm, 10nm. Smartphone units: 1.5B. PC units: 256M.]
Qualcomm Centriq™ 2400
Throughput performance
Thread Density
Quality of Service
Energy Efficiency
What Cloud means for
Processor Architecture
Key metrics
• Perf / thread
• Perf / Watt
• Perf / mm²
The future requires a new approach to CPU design
Datacenter Energy Efficiency Considerations
Source: https://eta.lbl.gov/publications/united-states-data-center-energy, http://perspectives.mvdirona.com/
• US datacenters consumed about 70 billion
kilowatt-hours of electricity in 2014
• Datacenters can cost between $10M and $20M
per megawatt
• Unused datacenter capacity can be expensive
• 1 W of server power can cost about $1 per year in
energy costs at 10 cents per kWh
• Server power related costs can be 30 to 40%
of overall datacenter operating costs
• Servers need to be designed for efficient
average power consumption instead of just
maximizing peak output efficiency
Better Hyper-efficient Designs Needed to Improve Server Energy Efficiency
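The "$1 per watt per year" rule of thumb above follows from simple arithmetic. The sketch below is a rough illustration, assuming continuous draw for 8760 hours and the 10 cents per kWh rate quoted on the slide; the function name and the 100 W example are hypothetical, and cooling and power distribution overheads are not included.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years

def annual_energy_cost(watts: float, dollars_per_kwh: float = 0.10) -> float:
    """Energy cost in dollars of drawing `watts` continuously for one year."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * dollars_per_kwh

one_watt = annual_energy_cost(1.0)
print(f"1 W for a year: ${one_watt:.2f}")

# Hypothetical server averaging 100 W, for scale only.
print(f"100 W for a year: ${annual_energy_cost(100.0):.2f}")
```

One watt works out to a little under a dollar a year, so every watt trimmed from average server power is worth roughly a dollar per server per year in energy alone.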
[Block diagram: 24 Falkor duplexes, 12 L3 blocks, and 6 DDR memory controllers (MC) on a coherent segmented ring interconnect, with PCIe and 8-Serdes blocks and peripheral blocks SATACTL, HDMA, EMAC, OCMEM, QGIC, USB, PW, QFPROM, IMC, MPM/CC]
• 48 custom Armv8 cores at 2.6 GHz peak frequency
• Large 60 MB L3 cache
• 6 DDR4 memory channels at 2667 MT/s
• High bandwidth coherent ring
• Low average power under typical load
• Ultra low idle power
• Cache Quality of Service
• Inline memory bandwidth compression
• Security rooted in hardware
• Leading performance and energy efficiency
Qualcomm Centriq 2400: Built for The Cloud
Details at https://www.qualcomm.com/products/qualcomm-centriq-2400-processor
Qualcomm Centriq 2400 Drives Perf/W and Perf/Thread Leadership
Relative metrics (normalized to QDF 2460 = 1)
Metric            QDF 2460   PLATINUM 8180   GOLD 6138   PLATINUM 8160   PLATINUM 8170
Power             1          1.71            1.04        1.25            1.38
SPECintrate2006   1          1.18            0.77        0.93            0.99
Perf/Watt         1          0.69            0.74        0.75            0.72
Perf/Core         1          2.02            1.84        1.86            1.70
Perf/Thread       1          1.01            0.92        0.93            0.85
Perf/$            1          0.24            0.59        0.40            0.27

SKU             Cores   TDP     SIR2006   Price
QDF 2460        48      120 W   657       $1,995
GOLD 6138       20      125 W   504       $2,612
PLATINUM 8160   24      150 W   612       $4,702
PLATINUM 8170   26      165 W   653       $7,405
PLATINUM 8180   28      205 W   775       $10,009

Chart annotations: IsoPower, IsoPerf, Top Bin E7 Price, Top Bin E5 Price
Performance based on internal tests for SPECintrate2006 (SIR) estimates using gcc O2
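The relative Power, SPECintrate2006, Perf/Watt, and Perf/$ figures on this slide can be reproduced from the per-SKU TDP, SIR2006 estimate, and price. The sketch below is illustrative: the pairing of each SKU name with its TDP, score, and price is an assumption inferred from the chart ratios, and the function and variable names are hypothetical.

```python
# (TDP in watts, SIR2006 estimate, price in USD) per SKU, as read from the slide.
# The SKU-to-spec pairing is an assumption inferred from the chart ratios.
SKUS = {
    "QDF 2460": (120, 657, 1995),
    "Platinum 8180": (205, 775, 10009),
    "Gold 6138": (125, 504, 2612),
    "Platinum 8160": (150, 612, 4702),
    "Platinum 8170": (165, 653, 7405),
}

def relative_metrics(name: str, baseline: str = "QDF 2460") -> dict[str, float]:
    """Return each metric for `name` divided by the same metric for `baseline`."""
    tdp, sir, price = SKUS[name]
    base_tdp, base_sir, base_price = SKUS[baseline]
    return {
        "power": tdp / base_tdp,
        "performance": sir / base_sir,
        "perf_per_watt": (sir / tdp) / (base_sir / base_tdp),
        "perf_per_dollar": (sir / price) / (base_sir / base_price),
    }

results = {name: relative_metrics(name) for name in SKUS}
for name, metrics in results.items():
    row = "  ".join(f"{key}={value:.2f}" for key, value in metrics.items())
    print(f"{name:14s} {row}")
```

Note that Perf/Watt here uses TDP as the power figure, as the slide does; measured average power would give different ratios.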
Qualcomm Centriq 2460 Lowers Average and Idle Power
to Improve Cloud Server Density in Datacenters
[Chart: average power in Watts (axis 0 to 120) for each SPECint®_rate2006 subtest: 400.perlbench, 401.bzip2, 403.gcc, 429.mcf, 445.gobmk, 456.hmmer, 458.sjeng, 462.libquantum, 464.h264ref, 471.omnetpp, 473.astar, 483.xalancbmk. Annotations: 120W TDP, Median = 65W, 8W idle power.]
Heterogeneous Computing Era
• Energy efficiency must be an implicit design target
• Desktop PC CPU cores are too power hungry and not energy efficient
• Wimpy cores are not good enough for servers
• Servers can be designed by scaling up energy efficient mobile core design philosophy
• Many workloads run best on different kinds of specialized processing engines
• Each processing engine has its own strengths
Lessons from Mobile Computing
• Order of Magnitude higher computational efficiency than general
purpose processors
• Can accept inefficient implementation to reduce time to market
• Many potential applications
– Machine Learning
– Encryption
– Data Compression
– Video processing
• Need reasonable volume for business case
• Algorithms need to be stable
• Can they be programmable? Where do FPGAs fit?
The Age of Application Specific Accelerators
Before the emergence of DNNs
– Algorithms and rule based systems were laboriously hand-coded
But by 2012, the ingredients for change were available
Sufficiently powerful GPUs
Readily available large data sets on the internet
The Emergence of Deep Neural Networks
Deep Neural Networks are becoming Pervasive
– The turning point: ImageNet Competition 2012
– “ImageNet Classification with Deep Convolutional Neural Networks”, Neural Information Processing Systems Conference (NIPS 2012)
– Deep Neural Net enabled a performance breakthrough
– Now DNNs are simpler to develop and deploy, ushering in radical change in many fields and entire industries
Deep Learning is Growing Exponentially
Source: Google
Devices, machines, and things are becoming more intelligent
• Perception: Hear, see, monitor, observe
• Reasoning: Learn, infer context, anticipate
• Action: Act intuitively, interact naturally, protect privacy
Offering new capabilities to enrich our lives
• Server/Cloud: Training, Execution/Inference
• Devices: Execution/Inference
AI is Increasingly Everywhere
Source: Microsoft, Hot Chips 2017
Deploying DNNs at Datacenter Scale
Training tends toward concentrated, centralized computation
Inference tends toward wide distribution
[Figure labels: GPUs, Large DPU, CPUs, Small DPU]
CPUs are not powerful enough for training, but have free cycles available for
inference – opportunity for add-in accelerator cards
– Instruction Set enhancements can improve performance
GPUs have too much “extra baggage” that adds cost and power for features not
needed for AI – opportunity for domain specific accelerators
FPGAs offer more flexibility, but are difficult to program and expensive
ASICs are energy and product cost efficient, but less flexible
Deep neural networks are making significant strides in many areas
– speech, vision, language, search, robotics, medical imaging & treatment, drug discovery …
We have an opportunity to dramatically reshape our computing devices to
better serve this emerging and growing market
Expect to see lots of innovation and excitement in the years to come
Thoughts on Future Silicon for Deep Learning
• Single thread general purpose performance improvement is slowing down
• Energy efficiency is extremely important in datacenters
• ARM architecture enables energy efficient designs with good performance
• Typical-use efficiency is becoming more important than peak output efficiency
in enterprise data centers
• Idle mode power will become more important for servers
• Smart power management can dynamically optimize server operation to
improve efficiency in normal use
• There is plenty of opportunity for innovation on new application specific
architectures targeted for specific workloads
Concluding Remarks
For more information, visit us at:
www.qualcomm.com & www.qualcomm.com/blog
Nothing in these materials is an offer to sell any of the components or devices referenced herein.
©2018 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.
Qualcomm is a trademark of Qualcomm Incorporated, registered in the United States and other countries. Qualcomm Centriq and Falkor are
trademarks of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective
owners.
References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or
business units within the Qualcomm corporate structure, as applicable.
Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio.
Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of
Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its
semiconductor business, QCT.
Thank you

More Related Content

What's hot

"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta..."The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...Edge AI and Vision Alliance
 
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
Boosting Hadoop Performance with  Emulex OneConnect® 10Gb Ethernet Adapters Boosting Hadoop Performance with  Emulex OneConnect® 10Gb Ethernet Adapters
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters Emulex Corporation
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerRebekah Rodriguez
 
OpenPOWER Foundation Overview
OpenPOWER Foundation OverviewOpenPOWER Foundation Overview
OpenPOWER Foundation OverviewNVIDIA Taiwan
 
Accelerating algorithmic and hardware advancements for power efficient on-dev...
Accelerating algorithmic and hardware advancements for power efficient on-dev...Accelerating algorithmic and hardware advancements for power efficient on-dev...
Accelerating algorithmic and hardware advancements for power efficient on-dev...Qualcomm Research
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIBM Switzerland
 
Why SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networkingWhy SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networkingON.Lab
 
Performance beyond moore's law
Performance beyond moore's lawPerformance beyond moore's law
Performance beyond moore's lawAnand Haridass
 
IBM Power 7
IBM Power 7IBM Power 7
IBM Power 7None
 
“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...
“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...
“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...Edge AI and Vision Alliance
 
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from MovidiusEdge AI and Vision Alliance
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...AMD Developer Central
 
Programmable I/O Controllers as Data Center Sensor Networks
Programmable I/O Controllers as Data Center Sensor NetworksProgrammable I/O Controllers as Data Center Sensor Networks
Programmable I/O Controllers as Data Center Sensor NetworksEmulex Corporation
 
Dell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation WebinarDell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation WebinarBill Wong
 
InfiniBand Strengthens Leadership as the Interconnect Of Choice
InfiniBand Strengthens Leadership as the Interconnect Of ChoiceInfiniBand Strengthens Leadership as the Interconnect Of Choice
InfiniBand Strengthens Leadership as the Interconnect Of ChoiceMellanox Technologies
 
[Case study] Dakota Electric Association: Solutions to streamline GIS, design...
[Case study] Dakota Electric Association: Solutions to streamline GIS, design...[Case study] Dakota Electric Association: Solutions to streamline GIS, design...
[Case study] Dakota Electric Association: Solutions to streamline GIS, design...Schneider Electric
 
Hp moonshot update moabcon 2013
Hp moonshot update moabcon 2013Hp moonshot update moabcon 2013
Hp moonshot update moabcon 2013inside-BigData.com
 
IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...
IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...
IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...AMD Developer Central
 
Ahead of the NFV Curve with Truly Scale-out Network Function Cloudification
Ahead of the NFV Curve with Truly Scale-out Network Function CloudificationAhead of the NFV Curve with Truly Scale-out Network Function Cloudification
Ahead of the NFV Curve with Truly Scale-out Network Function CloudificationMellanox Technologies
 

What's hot (20)

"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta..."The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
"The Xilinx AI Engine: High Performance with Future-proof Architecture Adapta...
 
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
Boosting Hadoop Performance with  Emulex OneConnect® 10Gb Ethernet Adapters Boosting Hadoop Performance with  Emulex OneConnect® 10Gb Ethernet Adapters
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
 
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU ServerModular by Design: Supermicro’s New Standards-Based Universal GPU Server
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server
 
Green Networking With Blade
Green Networking With BladeGreen Networking With Blade
Green Networking With Blade
 
OpenPOWER Foundation Overview
OpenPOWER Foundation OverviewOpenPOWER Foundation Overview
OpenPOWER Foundation Overview
 
Accelerating algorithmic and hardware advancements for power efficient on-dev...
Accelerating algorithmic and hardware advancements for power efficient on-dev...Accelerating algorithmic and hardware advancements for power efficient on-dev...
Accelerating algorithmic and hardware advancements for power efficient on-dev...
 
Ibm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bkIbm symp14 referentin_barbara koch_power_8 launch bk
Ibm symp14 referentin_barbara koch_power_8 launch bk
 
Why SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networkingWhy SDN and ON.Lab are hot topics in networking
Why SDN and ON.Lab are hot topics in networking
 
Performance beyond moore's law
Performance beyond moore's lawPerformance beyond moore's law
Performance beyond moore's law
 
IBM Power 7
IBM Power 7IBM Power 7
IBM Power 7
 
“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...
“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...
“Market Analysis on SoCs for Imaging, Vision and Deep Learning in Automotive ...
 
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
"Dataflow: Where Power Budgets Are Won and Lost," a Presentation from Movidius
 
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
Keynote (Tony King-Smith) - Silicon? Check. HSA? Check. All done? Wrong! - by...
 
Programmable I/O Controllers as Data Center Sensor Networks
Programmable I/O Controllers as Data Center Sensor NetworksProgrammable I/O Controllers as Data Center Sensor Networks
Programmable I/O Controllers as Data Center Sensor Networks
 
Dell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation WebinarDell NVIDIA AI Powered Transformation Webinar
Dell NVIDIA AI Powered Transformation Webinar
 
InfiniBand Strengthens Leadership as the Interconnect Of Choice
InfiniBand Strengthens Leadership as the Interconnect Of ChoiceInfiniBand Strengthens Leadership as the Interconnect Of Choice
InfiniBand Strengthens Leadership as the Interconnect Of Choice
 
[Case study] Dakota Electric Association: Solutions to streamline GIS, design...
[Case study] Dakota Electric Association: Solutions to streamline GIS, design...[Case study] Dakota Electric Association: Solutions to streamline GIS, design...
[Case study] Dakota Electric Association: Solutions to streamline GIS, design...
 
Hp moonshot update moabcon 2013
Hp moonshot update moabcon 2013Hp moonshot update moabcon 2013
Hp moonshot update moabcon 2013
 
IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...
IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...
IS-4011, Accelerating Analytics on HADOOP using OpenCL, by Zubin Dowlaty and ...
 
Ahead of the NFV Curve with Truly Scale-out Network Function Cloudification
Ahead of the NFV Curve with Truly Scale-out Network Function CloudificationAhead of the NFV Curve with Truly Scale-out Network Function Cloudification
Ahead of the NFV Curve with Truly Scale-out Network Function Cloudification
 

Similar to Hipeac 2018 keynote Talk

Consumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a BoxConsumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a BoxRebekah Rodriguez
 
MIG 5th Data Centre Summit 2016 PTS Presentation v1
MIG 5th Data Centre Summit 2016 PTS Presentation v1MIG 5th Data Centre Summit 2016 PTS Presentation v1
MIG 5th Data Centre Summit 2016 PTS Presentation v1blewington
 
The past and the next 20 years? Scalable computing as a key evolution
The past and the next 20 years? Scalable computing as a key evolutionThe past and the next 20 years? Scalable computing as a key evolution
The past and the next 20 years? Scalable computing as a key evolutionDesign And Reuse
 
GTC15-Manoj-Roge-OpenPOWER
GTC15-Manoj-Roge-OpenPOWERGTC15-Manoj-Roge-OpenPOWER
GTC15-Manoj-Roge-OpenPOWERAchronix
 
HKG15-The Machine: A new kind of computer- Keynote by Dejan Milojicic
HKG15-The Machine: A new kind of computer- Keynote by Dejan MilojicicHKG15-The Machine: A new kind of computer- Keynote by Dejan Milojicic
HKG15-The Machine: A new kind of computer- Keynote by Dejan MilojicicLinaro
 
UK Data Centre Capabilty Presentation Rev.A
UK Data Centre Capabilty Presentation Rev.AUK Data Centre Capabilty Presentation Rev.A
UK Data Centre Capabilty Presentation Rev.AGary Marshall
 
Exploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design spaceExploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design spacejsvetter
 
IBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERIBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERinside-BigData.com
 
Power Quality in Internet Data Centers
Power Quality in Internet Data CentersPower Quality in Internet Data Centers
Power Quality in Internet Data CentersLeonardo ENERGY
 
Delivering Carrier Grade OCP for Virtualized Data Centers
Delivering Carrier Grade OCP for Virtualized Data CentersDelivering Carrier Grade OCP for Virtualized Data Centers
Delivering Carrier Grade OCP for Virtualized Data CentersRadisys Corporation
 
Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudRebekah Rodriguez
 
Accelerating HPC with Ethernet
Accelerating HPC with EthernetAccelerating HPC with Ethernet
Accelerating HPC with Ethernetinside-BigData.com
 
Strategies to architecting ultra-efficient data centers
Strategies to architecting ultra-efficient data centersStrategies to architecting ultra-efficient data centers
Strategies to architecting ultra-efficient data centersInfinera
 
Accelerating Cloud Services - Intel
Accelerating Cloud Services - IntelAccelerating Cloud Services - Intel
Accelerating Cloud Services - IntelAmazon Web Services
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableRebekah Rodriguez
 
Give Your Organization Better, Faster Insights & Answers with High Performanc...
Give Your Organization Better, Faster Insights & Answers with High Performanc...Give Your Organization Better, Faster Insights & Answers with High Performanc...
Give Your Organization Better, Faster Insights & Answers with High Performanc...Dell World
 

Similar to Hipeac 2018 keynote Talk (20)

China AI Summit talk 2017
China AI Summit talk 2017China AI Summit talk 2017
China AI Summit talk 2017
 
Consumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a BoxConsumption Based On-Demand Private Cloud in a Box
Consumption Based On-Demand Private Cloud in a Box
 
Cloud Networking Trends
Cloud Networking TrendsCloud Networking Trends
Cloud Networking Trends
 
MIG 5th Data Centre Summit 2016 PTS Presentation v1
MIG 5th Data Centre Summit 2016 PTS Presentation v1MIG 5th Data Centre Summit 2016 PTS Presentation v1
MIG 5th Data Centre Summit 2016 PTS Presentation v1
 
The past and the next 20 years? Scalable computing as a key evolution
The past and the next 20 years? Scalable computing as a key evolutionThe past and the next 20 years? Scalable computing as a key evolution
The past and the next 20 years? Scalable computing as a key evolution
 
GTC15-Manoj-Roge-OpenPOWER
GTC15-Manoj-Roge-OpenPOWERGTC15-Manoj-Roge-OpenPOWER
GTC15-Manoj-Roge-OpenPOWER
 
Disruptive Technologies
Disruptive TechnologiesDisruptive Technologies
Disruptive Technologies
 
HKG15-The Machine: A new kind of computer- Keynote by Dejan Milojicic
HKG15-The Machine: A new kind of computer- Keynote by Dejan MilojicicHKG15-The Machine: A new kind of computer- Keynote by Dejan Milojicic
HKG15-The Machine: A new kind of computer- Keynote by Dejan Milojicic
 
Rendering in the Cloud
Rendering in the CloudRendering in the Cloud
Rendering in the Cloud
 
UK Data Centre Capabilty Presentation Rev.A
UK Data Centre Capabilty Presentation Rev.AUK Data Centre Capabilty Presentation Rev.A
UK Data Centre Capabilty Presentation Rev.A
 
Exploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design spaceExploring emerging technologies in the HPC co-design space
Exploring emerging technologies in the HPC co-design space
 
IBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWERIBM Data Centric Systems & OpenPOWER
IBM Data Centric Systems & OpenPOWER
 
Power Quality in Internet Data Centers
Power Quality in Internet Data CentersPower Quality in Internet Data Centers
Power Quality in Internet Data Centers
 
Delivering Carrier Grade OCP for Virtualized Data Centers
Delivering Carrier Grade OCP for Virtualized Data CentersDelivering Carrier Grade OCP for Virtualized Data Centers
Delivering Carrier Grade OCP for Virtualized Data Centers
 
Accelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to CloudAccelerating Innovation from Edge to Cloud
Accelerating Innovation from Edge to Cloud
 
Accelerating HPC with Ethernet
Accelerating HPC with EthernetAccelerating HPC with Ethernet
Accelerating HPC with Ethernet
 
Strategies to architecting ultra-efficient data centers
Strategies to architecting ultra-efficient data centersStrategies to architecting ultra-efficient data centers
Strategies to architecting ultra-efficient data centers
 
Accelerating Cloud Services - Intel
Accelerating Cloud Services - IntelAccelerating Cloud Services - Intel
Accelerating Cloud Services - Intel
 
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super AffordableSupermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable
 
Give Your Organization Better, Faster Insights & Answers with High Performanc...
Give Your Organization Better, Faster Insights & Answers with High Performanc...Give Your Organization Better, Faster Insights & Answers with High Performanc...
Give Your Organization Better, Faster Insights & Answers with High Performanc...
 

More from Dileep Bhandarkar

Open Compute Summit Keynote 17 June 2011
Open Compute Summit Keynote 17 June 2011Open Compute Summit Keynote 17 June 2011
Open Compute Summit Keynote 17 June 2011Dileep Bhandarkar
 
Datacenter Dynamics Chicago 30 sept 2010
Datacenter Dynamics Chicago 30 sept 2010Datacenter Dynamics Chicago 30 sept 2010
Datacenter Dynamics Chicago 30 sept 2010Dileep Bhandarkar
 
Energy Efficiency Considerations in Large Datacenters
Energy Efficiency Considerations in Large DatacentersEnergy Efficiency Considerations in Large Datacenters
Energy Efficiency Considerations in Large DatacentersDileep Bhandarkar
 
Data center-server-cooling-power-management-paper
Data center-server-cooling-power-management-paperData center-server-cooling-power-management-paper
Data center-server-cooling-power-management-paperDileep Bhandarkar
 
Moscow conference keynote in 2012
Moscow conference keynote in 2012Moscow conference keynote in 2012
Moscow conference keynote in 2012Dileep Bhandarkar
 
New Delhi Cloud Summit 05 26-11
New Delhi Cloud Summit 05 26-11New Delhi Cloud Summit 05 26-11
New Delhi Cloud Summit 05 26-11Dileep Bhandarkar
 
Performance Characterization of the Pentium Pro Processor
Performance Characterization of the Pentium Pro ProcessorPerformance Characterization of the Pentium Pro Processor
Performance Characterization of the Pentium Pro ProcessorDileep Bhandarkar
 
Innovation lecture for hong kong
Innovation lecture for hong kongInnovation lecture for hong kong
Innovation lecture for hong kongDileep Bhandarkar
 
Performance from Architecture: Comparing a RISC and a CISC with Similar Hardw...
Performance from Architecture: Comparing a RISC and a CISC with Similar Hardw...Performance from Architecture: Comparing a RISC and a CISC with Similar Hardw...
Performance from Architecture: Comparing a RISC and a CISC with Similar Hardw...Dileep Bhandarkar
 
Qualcomm centriq 2400 hot chips final submission corrected
Qualcomm centriq 2400 hot chips final submission correctedQualcomm centriq 2400 hot chips final submission corrected
Qualcomm centriq 2400 hot chips final submission correctedDileep Bhandarkar
 
Innovation lecture for shanghai final
Innovation lecture for shanghai finalInnovation lecture for shanghai final
Innovation lecture for shanghai finalDileep Bhandarkar
 
Dileep Random Access Talk at salishan 2016
Dileep Random Access Talk at salishan 2016Dileep Random Access Talk at salishan 2016
Dileep Random Access Talk at salishan 2016Dileep Bhandarkar
 
Server design summit keynote handout
Server design summit keynote handoutServer design summit keynote handout
Server design summit keynote handoutDileep Bhandarkar
 
My Feb 2003 HPCA9 Keynote Slides - Billion Transistor Processor Chips
My Feb 2003  HPCA9 Keynote Slides - Billion Transistor Processor ChipsMy Feb 2003  HPCA9 Keynote Slides - Billion Transistor Processor Chips
My Feb 2003 HPCA9 Keynote Slides - Billion Transistor Processor ChipsDileep Bhandarkar
 

Hipeac 2018 keynote Talk

  • 1. Qualcomm Datacenter Technologies, Inc. Emerging Computing Trends in the Datacenter Dileep Bhandarkar, Ph. D. Vice President, Technology HiPEAC18 Keynote – 23 January 2018, Manchester, United Kingdom
  • 2. Outline • Historical Perspective on 40 Years of Moore’s Law – Single Core Era enabled by Dennard Scaling • Post Dennard Scaling Drives Multi-Core Era • The Shift to Energy Efficient Multi-Core Designs for the Cloud • Heterogeneous Computing Era with Application Specific Accelerators
  • 3. The First 50 Years after Shockley’s Transistor Invention
  • 4. 1958: Jack Kilby’s Integrated Circuit My 40+ Year Journey From Mainframes to Smartphones https://www.youtube.com/watch?v=7ptXpNFY3XM Bob Noyce’s Integrated Circuit
  • 5. From 2300 to >1Billion Transistors Moore’s Law video at http://www.cs.ucr.edu/~gupta/hpca9/HPCA-PDFs/Moores_Law_Video_HPCA9.wmv
  • 6. Dennard Scaling
      Device or Circuit Parameter          Scaling Factor
      Device dimension tox, L, W           1/K
      Doping concentration Na              K
      Voltage V                            1/K
      Current I                            1/K
      Capacitance εA/t                     1/K
      Delay time per circuit VC/I          1/K
      Power dissipation per circuit VI     1/K^2
      Power density VI/A                   1
      The benefits of scaling: as transistors get smaller, they can switch faster and use less power. Each new generation of process technology was expected to reduce minimum feature size by approximately 0.7x (K ~1.4). A 0.7x reduction in linear feature size provided roughly a 2x increase in transistor density. Dennard scaling broke down around 2004 with unscaled interconnect delays and our inability to scale the voltage and current due to reliability concerns. But increasing transistor density (Moore’s Law) has continued to enable multicore designs.
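The scaling factors on the Dennard Scaling slide follow from three primitive assumptions: linear dimensions, voltage, and current all shrink by 1/K. The sketch below is a minimal illustration of that arithmetic, not part of the talk; the function name `dennard_scaling` and the dictionary keys are hypothetical, and the permittivity is treated as constant.

```python
def dennard_scaling(k: float) -> dict[str, float]:
    """Ideal Dennard scaling factors for a shrink factor k (illustrative only)."""
    dimension = 1 / k            # tox, L, W all scale by 1/K
    voltage = 1 / k
    current = 1 / k
    area = dimension ** 2        # A ~ L * W
    capacitance = area / dimension               # epsilon * A / t, permittivity held constant
    delay = voltage * capacitance / current      # V * C / I
    power = voltage * current                    # V * I
    return {
        "device dimension (tox, L, W)": dimension,
        "voltage (V)": voltage,
        "current (I)": current,
        "capacitance (epsilon*A/t)": capacitance,
        "delay per circuit (VC/I)": delay,
        "power per circuit (VI)": power,
        "power density (VI/A)": power / area,
        "transistor density": 1 / area,
    }


# One process generation: a 0.7x linear shrink, i.e. K ~ 1.4.
factors = dennard_scaling(1 / 0.7)
for name, factor in factors.items():
    print(f"{name:32s} {factor:5.2f}x")
```

With a 0.7x shrink the transistor density factor is 1/0.49, or roughly 2x, while power density stays at 1, which is the combination that made the single-core era possible.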
  • 7. THE MULTICORE ERA SINGLE THREAD PERFORMANCE IMPROVEMENT SLOWING DOWN PERFORMANCE DRIVEN BY HIGHER CORE COUNT Post Dennard Scaling
  • 8. Transistor Count Increasing Slower Improvement No Improvement Power Going Up With Performance Core count increasing to drive Performance Now Performance Improvement Comes from Higher Core Count at Similar Frequency with Each New Process Node
  • 9. The last 5 Generations of ~135W Xeon Processors: Slow Improvement in IPC but per thread performance constrained by power • 8 cores (Mar 2012), 10 cores (Sep 2013), 12 cores (Sep 2014), 14 cores (Apr 2016), 18 cores (Jul 2017) • Performance data from www.spec.org
  • 10. No Improvement in Perf/Watt per Core even with higher power Performance data from www.spec.org
  • 11. Era of Energy Efficient Cores
  • 12. Looking ahead from edge to cloud • The future requires a new approach to CPU design • Safe and autonomous • Hyper-efficient • Secure private compute • Cortex beyond mobile • Mixed reality • Presented by Peter Greenhalgh at Hot Chips 2017 (© 2017 Arm Limited)
  • 13. Server Industry is shifting to the Cloud [Chart: % of total datacenter server revenue (0% to 100%), Cloud vs. Traditional Enterprise IT, 2013 to 2020]
  • 14. Disruptions Come from Below! [Chart: Volume vs. Performance for Mainframes, Minicomputers, RISC Systems, Desktop PCs, Notebooks, Smart Phones] Bell’s Law: hardware technology, networks, and interfaces allow new, smaller, more specialized computing devices to be introduced to serve a computing need.
  • 15. Mobile Technology Disrupting the Cloud Datacenter • A new world in datacenter: Manufacturing process • Qualcomm Datacenter Technologies: Uniquely positioned to leverage mobile growth and drive datacenter process leadership • Then: Fab process tech driven by PC • Now: Fab process tech driven by mobile phones [Chart, 2008 to 2018: Smartphone units (1.5B units) vs. PC units (256M units); mobile-driven nodes 65nm, 45nm, 28nm, 20nm, 14nm, 10nm (1st in the industry); PC-driven nodes 45nm, 32nm, 22nm, 14nm, 10nm]
  • 16. What Cloud means for Processor Architecture • Qualcomm Centriq™ 2400: Throughput performance, Thread Density, Quality of Service, Energy Efficiency • Key metrics: Perf / thread, Perf / Watt, Perf / mm² • The future requires a new approach to CPU design
  • 17. Datacenter Energy Efficiency Considerations Source: https://eta.lbl.gov/publications/united-states-data-center-energy, http://perspectives.mvdirona.com/ • US datacenters consumed about 70 billion kilowatt-hours of electricity in 2014 • Datacenters can cost between $10M and $20M per megawatt • Unused datacenter capacity can be expensive • 1W of server power can cost $1 per year in energy costs at 10 cents per kWh • Server power related costs can be 30 to 40% of overall datacenter operating costs • Servers need to be designed for efficient average power consumption instead of just maximizing peak output efficiency • Better Hyper-efficient Designs Needed to Improve Server Energy Efficiency
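The "1W costs about $1 per year" rule of thumb on the energy efficiency slide can be checked with a short calculation. This is a minimal sketch assuming a server that draws constant power for all 8,760 hours of a non-leap year; the function name `annual_energy_cost` is hypothetical, and the 120 W and 65 W inputs are used purely as illustrative power levels.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_cost(watts: float, dollars_per_kwh: float = 0.10) -> float:
    """Yearly electricity cost in dollars for a constant power draw."""
    kwh_per_year = watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * dollars_per_kwh


# 1 W around the clock at 10 cents per kWh.
print(f"1 W:   ${annual_energy_cost(1):.2f} per year")

# Illustrative comparison of two average power levels.
for watts in (120, 65):
    print(f"{watts} W: ${annual_energy_cost(watts):.2f} per year")
```

One watt works out to 8.76 kWh per year, or a little under $0.88, which the slide rounds to $1. The figure covers electricity alone and does not include cooling or power-distribution overhead.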
  • 18. Qualcomm Centriq 2400: Built for The Cloud [Block diagram: Falkor duplexes and L3 cache slices on a coherent segmented ring interconnect, with DDR memory controllers (MC), PCIe SerDes lanes, SATA controller, HDMA, EMAC, OCMEM, QGIC, USB, and other I/O and control blocks] • 48 custom Armv8 cores at 2.6 GHz peak frequency • Large 60 MB L3 cache • 6 DDR4 memory channels at 2667 MT/s • High bandwidth coherent ring • Low average power under typical load • Ultra low idle power • Cache Quality of Service • Inline memory bandwidth compression • Security rooted in hardware • Leading performance and energy efficiency • Details at https://www.qualcomm.com/products/qualcomm-centriq-2400-processor
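The memory configuration listed for Centriq 2400 (6 DDR4 channels at 2667 MT/s) implies a theoretical peak bandwidth that is easy to estimate. The sketch below assumes the standard 64-bit (8-byte) DDR4 data bus per channel, which the slide does not state explicitly; the function name `peak_bandwidth_gb_s` is hypothetical, and sustained bandwidth in practice is lower than this peak.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: float,
                        bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    bytes_per_transfer defaults to 8, assuming a 64-bit DDR4 data bus.
    """
    bytes_per_second = channels * mega_transfers_per_s * 1e6 * bytes_per_transfer
    return bytes_per_second / 1e9


# 6 channels of DDR4 at 2667 MT/s, as listed on the slide.
total = peak_bandwidth_gb_s(6, 2667)
print(f"Peak bandwidth:          {total:.1f} GB/s")
print(f"Per core (48 cores):     {total / 48:.2f} GB/s")
```

The result is roughly 128 GB/s across the socket, or a little under 2.7 GB/s per core if divided evenly across 48 cores.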
  • 19. Qualcomm Centriq 2400 Drives Perf/W and Perf/Thread Leadership
      Metrics normalized to QDF 2460 = 1:
      Metric             QDF 2460   Platinum 8180   Gold 6138   Platinum 8160   Platinum 8170
      Power              1          1.71            1.04        1.25            1.38
      SPECintrate2006    1          1.18            0.77        0.93            0.99
      Perf/Watt          1          0.69            0.74        0.75            0.72
      Perf/Core          1          2.02            1.84        1.86            1.70
      Perf/Thread        1          1.01            0.92        0.93            0.85
      Perf/$             1          0.24            0.59        0.40            0.27
      SKU details:
      SKU              Cores   TDP     SIR2006   Price     Note
      QDF 2460         48      120 W   657       $1,995
      Gold 6138        20      125 W   504       $2,612    IsoPower
      Platinum 8170    26      165 W   653       $7,405    IsoPerf
      Platinum 8180    28      205 W   775       $10,009   Top Bin E7 Price
      Platinum 8160    24      150 W   612       $4,702    Top Bin E5 Price
      Performance based on internal tests for SPECintrate2006 (SIR) estimates using gcc -O2
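The normalized figures on the Perf/W and Perf/Thread comparison slide can be re-derived from the per-SKU cores, TDP, SIR2006 score, and price quoted on the same slide. The sketch below is an illustrative recomputation, not the author's method. It assumes one thread per core for the QDF 2460 and two threads per core for the Xeon parts, which the slide does not state but which is consistent with its Perf/Thread row; the `Sku` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sku:
    name: str
    cores: int
    threads_per_core: int  # assumption: not stated on the slide
    tdp_watts: float
    sir2006: float         # SPECintrate2006 estimate quoted on the slide
    price_usd: float


SKUS = [
    Sku("QDF 2460", 48, 1, 120, 657, 1995),
    Sku("Platinum 8180", 28, 2, 205, 775, 10009),
    Sku("Gold 6138", 20, 2, 125, 504, 2612),
    Sku("Platinum 8160", 24, 2, 150, 612, 4702),
    Sku("Platinum 8170", 26, 2, 165, 653, 7405),
]


def metrics(s: Sku) -> dict[str, float]:
    """Absolute value of each metric for one SKU."""
    return {
        "Power": s.tdp_watts,
        "SPECintrate2006": s.sir2006,
        "Perf/Watt": s.sir2006 / s.tdp_watts,
        "Perf/Core": s.sir2006 / s.cores,
        "Perf/Thread": s.sir2006 / (s.cores * s.threads_per_core),
        "Perf/$": s.sir2006 / s.price_usd,
    }


def normalized(skus: list[Sku], baseline: Sku) -> dict[str, dict[str, float]]:
    """Each SKU's metrics divided by the baseline SKU's metrics."""
    base = metrics(baseline)
    return {s.name: {k: v / base[k] for k, v in metrics(s).items()} for s in skus}


table = normalized(SKUS, baseline=SKUS[0])
for name, row in table.items():
    print(f"{name:14s}", "  ".join(f"{k}={v:.2f}" for k, v in row.items()))
```

Under these assumptions the recomputed ratios agree with the slide to two decimal places, with one exception: for the Platinum 8170, the listed 26 cores give a Perf/Core of about 1.83 and a Perf/Thread of about 0.92, whereas the slide shows 1.70 and 0.85. The slide's values would correspond to a 28-core count, though the deck does not explain the difference.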
  • 20. Qualcomm Centriq 2460 Lowers Average and Idle Power to Improve Cloud Server Density in Datacenters [Chart: average power (watts, 0 to 120) for each SPECint®_rate2006 subtest: 400.perlbench, 401.bzip2, 403.gcc, 429.mcf, 445.gobmk, 456.hmmer, 458.sjeng, 462.libquantum, 464.h264ref, 471.omnetpp, 473.astar, 483.xalancbmk] • 120W TDP • Median = 65W • 8W idle power
  • 22. • Energy efficiency must be an implicit design target • Desktop PC CPU cores are too power hungry and not energy efficient • Wimpy cores are not good enough for servers • Servers can be designed by scaling up energy efficient mobile core design philosophy • Many workloads run best on different kinds of specialized processing engines • Each processing engine has its own strengths Lessons from Mobile Computing
  • 23. • Order of Magnitude higher computational efficiency than general purpose processors • Can accept inefficient implementation to reduce time to market • Many potential applications – Machine Learning – Encryption – Data Compression – Video processing • Need reasonable volume for business case • Algorithms need to be stable • Can they be programmable? Where do FPGAs fit? The Age of Application Specific Accelerators
  • 24. The Emergence of Deep Neural Networks • Before the emergence of DNNs – Algorithms and rule-based systems were laboriously hand-coded • But by 2012, the ingredients for change were available – Sufficiently powerful GPUs – Readily available large data sets on the internet • Deep Neural Networks are becoming pervasive – The turning point: ImageNet Competition 2012 – “ImageNet Classification with Deep Convolutional Neural Networks”, Neural Information Processing Systems Conference (NIPS 2012) – Deep Neural Net enabled a performance breakthrough – Now: DNNs are simpler to develop and deploy, ushering in radical change in many fields and entire industries
  • 25. Deep Learning is Growing Exponentially Source: Google
  • 26. Devices, machines, and things are becoming more intelligent
  • 27. Offering new capabilities to enrich our lives • Perception: Hear, see, monitor, observe • Reasoning: Learn, infer context, anticipate • Action: Act intuitively, interact naturally, protect privacy
  • 30. Deploying DNNs at Datacenter Scale • Training tends toward concentrated, centralized computation • Inference tends toward wide distribution [Diagram labels: GPUs, Large DPU, CPUs, Small DPU]
  • 31. Thoughts on Future Silicon for Deep Learning • CPUs are not powerful enough for training, but have free cycles available for inference: an opportunity for add-in accelerator cards – Instruction set enhancements can improve performance • GPUs have too much “extra baggage” that adds cost and power for features not needed for AI: an opportunity for domain-specific accelerators • FPGAs offer more flexibility, but are difficult to program and expensive • ASICs are energy and product cost efficient, but less flexible • Deep neural networks are making significant strides in many areas – speech, vision, language, search, robotics, medical imaging & treatment, drug discovery … • We have an opportunity to dramatically reshape our computing devices to better serve this emerging and growing market • Expect to see lots of innovation and excitement in the years to come
  • 32. • Single thread general purpose performance improvement is slowing down • Energy efficiency is extremely important in datacenters • ARM architecture enables energy efficient designs with good performance • Typical-use efficiency is becoming more important than peak output efficiency in enterprise data centers • Idle mode power will become more important for servers • Smart power management can dynamically optimize server operation to improve efficiency in normal use • There is plenty of opportunity for innovation on new application specific architectures targeted for specific workloads Concluding Remarks
  • 33. Follow us on: For more information, visit us at: www.qualcomm.com & www.qualcomm.com/blog Nothing in these materials is an offer to sell any of the components or devices referenced herein. ©2018 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved. Qualcomm is a trademark of Qualcomm Incorporated, registered in the United States and other countries, Qualcomm Centriq and Falkor are trademarks of Qualcomm Incorporated. Other products and brand names may be trademarks or registered trademarks of their respective owners. References in this presentation to “Qualcomm” may mean Qualcomm Incorporated, Qualcomm Technologies, Inc., and/or other subsidiaries or business units within the Qualcomm corporate structure, as applicable. Qualcomm Incorporated includes Qualcomm’s licensing business, QTL, and the vast majority of its patent portfolio. Qualcomm Technologies, Inc., a wholly-owned subsidiary of Qualcomm Incorporated, operates, along with its subsidiaries, substantially all of Qualcomm’s engineering, research and development functions, and substantially all of its product and services businesses, including its semiconductor business, QCT. Thank you