OpenPOWER and POWER9
Steve Fields
IBM Fellow
Chief Engineer of Power Systems
August 13, 2018
© 2016 IBM Corporation
Fundamental forces are accelerating change in our industry
Full system stack innovation required
• Moore’s Law: IT innovation can no longer come from just the processor
• IT consumption models are expanding: Cognitive, Custom Hyperscale Data Centers, Hybrid Cloud, Open Solutions
[Figure: Price/Performance (lower is better), 2000 to 2020, moving from "Technology and Processors" to "Full Stack Acceleration" across Firmware / OS, Accelerators, Software, Storage, and Network]
Not only is Moore’s Law “coming to an end in practical terms, in that chip speeds can be expected to stall, but it is actually likely to roll back in terms of performance …” – William Holt, Intel Executive Vice President and General Manager
System Architecture Must Keep Up
[Figure: Memory and I/O bandwidth per core, 2012 to 2018]
1. Bandwidth demand of applications is increasing
2. Inefficiencies of traditional I/O subsystem become more pronounced at high throughput
3. Storage Class Memory will be a major disruptor
Must think about CPU & System Architecture Differently
• More bandwidth to memory, network, storage
• New interfaces to remove software overhead and allow I/O and accelerator devices to integrate natively within applications
• Flexibility to exploit diversity of memory technologies
Proposed Ecosystem Enablement
System Operating Environment Software Stack
A modern development environment is emerging based on tools and services
[Figure: Power Open Source Software Stack Components. Layers: Cloud Software, Standard Operating Environment (System Mgmt), Operating System / KVM, Firmware, Hardware. Annotations: Existing Open Source Software Communities, New OSS Community, OpenPOWER Firmware, OpenPOWER Technology]
Multiple Options to Design with POWER Technology Within OpenPOWER
• Standard POWER Products – 2014: POWER8 hardware, CAPI over PCIe (CAPP, PCIe)
• “Custom POWER SoC” – Future: Customizable; Framework to Integrate System IP on Chip
• Industry IP License Model
40,000 packages now available
Future Evolution of System Architecture
Yesterday’s Plumbing, Tomorrow’s Differentiation
[Figure: Cores & Caches blocks on a 768 GB/s System Bus, with interfaces labeled PCI Gen X, OpenCAPI Northbound, OpenCAPI & PCI, JEDEC Buffer, and OpenCAPI / NVLink. CPU/Accelerator Bandwidth is marked as the system bottleneck, with scale labels 1x, 2x, 5x, and 7-10x]
IBM Systems | 6
POWER8
POWER8 Processor
[Figure: POWER8 block diagram. Labels: Memory Interface Control, DMI, Memory, CAPI/PCI, IBM & Partner Devices]
Cores
• 12 cores / 8 threads per core
• TDP: 130W and 190W
• 64K data cache, 32K instruction cache
Accelerators
• Crypto & memory expansion
• Transactional Memory
Caches
• 512 KB SRAM L2 / core
• 96 MB eDRAM shared L3
Memory Subsystem
• Memory buffers with 128MB Cache
• ~70ns latency to memory
Bus Interfaces
• Durable Memory attach Interface (DMI)
• Integrated PCIe Gen3
• SMP Interconnect for up to 4 sockets
Coherent Accelerator Processor Interface (CAPI)
Virtual Addressing
• Accelerator can work with same memory addresses that the processors use
• Pointers de-referenced same as the host application
• Removes OS & device driver overhead
Hardware Managed Cache Coherence
• Enables the accelerator to participate in “Locks” as a normal thread
• Lowers latency over I/O communication model
6 Hardware Partners developing with CAPI
Over 20 CAPI Solutions
• All listed here http://ibm.biz/powercapi
Examples of Available CAPI Solutions
• IBM Data Engine for NoSQL
• DRC Graphfind analytics
• Erasure Code Acceleration for Hadoop
22nm SOI, eDRAM, 15 ML, 650 mm²
Storage Class Memories (SCM)
• First Functioning Demo of SCM in an Enterprise system
• 15x better than SSD
• Natively non-volatile ST-MRAM DIMMs (Everspin)
• Avoids NVDIMM complications (DRAM to FLASH and back)
• No Supercaps with umbilicals required
Traditional I/O Subsystem
Think 2018 / DOC ID / March 21, 2018 / © 2018 IBM Corporation
CPU and Memory overheads increase with device throughput
Accelerator only useful for large operations with very high value
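The point about accelerators only paying off for large operations can be sketched with a simple break-even model. This is an illustration only: the fixed per-operation setup cost and both throughput values are hypothetical numbers, not measurements from this deck, and the function names are invented for the example.

```python
# Illustrative model only: every number used here is a hypothetical assumption.


def cpu_time(size_bytes, cpu_bytes_per_s):
    """Time to process the data on the CPU alone."""
    return size_bytes / cpu_bytes_per_s


def offload_time(size_bytes, setup_s, accel_bytes_per_s):
    """Time to offload: fixed software setup cost plus accelerator processing."""
    return setup_s + size_bytes / accel_bytes_per_s


def break_even_size(setup_s, cpu_bytes_per_s, accel_bytes_per_s):
    """Operation size (bytes) above which offloading beats the CPU."""
    if accel_bytes_per_s <= cpu_bytes_per_s:
        return float("inf")  # offloading never wins in this model
    return setup_s / (1.0 / cpu_bytes_per_s - 1.0 / accel_bytes_per_s)


if __name__ == "__main__":
    setup = 20e-6   # assumed 20 microseconds of driver/OS overhead per call
    cpu = 1e9       # assumed 1 GB/s processing rate on the CPU
    accel = 10e9    # assumed 10 GB/s processing rate on the accelerator
    size = break_even_size(setup, cpu, accel)
    print(f"break-even operation size: {size / 1024:.1f} KiB")
```

In this model, reducing the fixed setup term moves the break-even point toward smaller operations, so more of the workload becomes worth offloading.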
Coherent Attach Model
POWER8 CAPI (coherent accelerator/processor interface)
CPU and memory overhead is eliminated
CPU and Accelerator cooperate natively to execute the application
Programmer treats accelerator much like another CPU thread
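As a loose software analogy for this programming model, the sketch below uses two ordinary Python threads that share one data structure and one lock. It is not CAPI code and uses no CAPI API. The second thread is only a stand-in for an accelerator, and all names are hypothetical.

```python
import threading


def run_shared_counter(n_increments=10_000):
    """Two workers update the same in-memory state under the same lock."""
    state = {"count": 0}          # shared data, reached through the same reference
    lock = threading.Lock()       # both workers take part in the same lock

    def worker():
        for _ in range(n_increments):
            with lock:
                state["count"] += 1

    host = threading.Thread(target=worker, name="host-cpu")
    accel = threading.Thread(target=worker, name="accelerator-stand-in")
    host.start()
    accel.start()
    host.join()
    accel.join()
    return state["count"]


if __name__ == "__main__":
    print(run_shared_counter())
```

Neither worker copies data into a separate buffer or goes through a driver call. Both simply follow the same reference and use the same lock, which is the property the slide attributes to a coherently attached accelerator.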
Introducing 822LC Power System for HPC
First Custom-Built GPU Accelerator Server with NVLink
2.5x Faster CPU-GPU Data Communication via NVLink
[Figure: POWER8 NVLink Server (two P8 CPUs and four GPUs, NVLink at 80 GB/s) beside x86 Servers with PCIe (two x86 CPUs and four GPUs, PCIe at 32 GB/s). GPU shown: NVIDIA P100 Pascal GPU]
No NVLink between CPU & GPU for x86 Servers: PCIe Bottleneck
• Custom-built GPU Accelerator Server
• High-Speed NVLink Connections between CPUs & GPUs and among GPUs
• Features novel NVIDIA P100 Pascal GPU accelerator
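The 2.5x figure is consistent with the two link bandwidths quoted on this slide (80 GB/s for NVLink, 32 GB/s for PCIe). A small sketch of that arithmetic follows. It assumes the quoted numbers are peak rates, and the 16 GB transfer size is a hypothetical example.

```python
NVLINK_GB_PER_S = 80.0  # CPU-GPU NVLink bandwidth quoted on the slide
PCIE_GB_PER_S = 32.0    # CPU-GPU PCIe bandwidth quoted on the slide


def transfer_seconds(gigabytes, link_gb_per_s):
    """Ideal transfer time at the quoted link rate, ignoring protocol overhead."""
    return gigabytes / link_gb_per_s


def speedup(fast_gb_per_s, slow_gb_per_s):
    """Ratio of the two link bandwidths."""
    return fast_gb_per_s / slow_gb_per_s


if __name__ == "__main__":
    data_gb = 16.0  # hypothetical working set moved between CPU and GPU
    print("NVLink:", transfer_seconds(data_gb, NVLINK_GB_PER_S), "s")
    print("PCIe:  ", transfer_seconds(data_gb, PCIE_GB_PER_S), "s")
    print("ratio: ", speedup(NVLINK_GB_PER_S, PCIE_GB_PER_S))
```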
POWER9
© 2014 IBM Corporation
POWER9 Processor – Common Features
New Core Microarchitecture
• Stronger thread performance
• Efficient agile pipeline
• POWER ISA v3.0
Enhanced Cache Hierarchy
• 120MB NUCA L3 architecture
• 12 x 20-way associative regions
• Improved LRU heuristics
• Fed by 7 TB/s On Chip Bandwidth
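The cache figures above imply a per-region size and set count, which the sketch below works out. Two assumptions are not stated on the slide: cache lines are 128 bytes, and "MB" is read as MiB. The function name is hypothetical.

```python
def l3_region_geometry(total_mib=120, regions=12, ways=20, line_bytes=128):
    """Return (bytes per region, sets per region) for an evenly split L3.

    line_bytes=128 and MiB units are assumptions for this example.
    """
    region_bytes = total_mib * 1024 * 1024 // regions
    sets = region_bytes // (ways * line_bytes)
    return region_bytes, sets


if __name__ == "__main__":
    region_bytes, sets = l3_region_geometry()
    print(f"{region_bytes // (1024 * 1024)} MiB per region, {sets} sets per region")
```

Under these assumptions each of the 12 regions holds 10 MiB, and 20 ways of 128-byte lines give a power-of-two number of sets.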
Cloud + Virtualization Innovation
• Quality of service assists
• New interrupt architecture
• Workload optimized frequency
• Hardware enforced trusted execution
Leadership Hardware Acceleration Platform
• Enhanced on-die acceleration
• NVLINK 2.0: High bandwidth, coherent GPU (25G Link)
• CAPI 2.0: Coherent accelerator and storage attach (PCIe G4)
• OpenCAPI: Improved latency and bandwidth (25G Link)
State of the Art IO Subsystem
• 48 PCIeG4 lanes – 192GB/s
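The 192 GB/s figure can be reproduced from the PCIe Gen4 signaling rate of 16 GT/s per lane. The sketch below rests on an assumption about how the slide counts: that 192 GB/s is the nominal total over both directions, ignoring the 128b/130b line encoding. With encoding included, the usable figure is slightly lower.

```python
def pcie_gen4_gb_per_s(lanes, encoded=True, directions=2):
    """Aggregate PCIe Gen4 bandwidth in GB/s.

    encoded=True applies the 128b/130b line-encoding efficiency.
    directions=2 counts transmit and receive together (an assumption
    about how the slide's total is defined).
    """
    gigatransfers_per_lane = 16.0          # PCIe Gen4 rate per lane
    efficiency = 128 / 130 if encoded else 1.0
    return lanes * gigatransfers_per_lane * efficiency / 8 * directions


if __name__ == "__main__":
    print("nominal:", pcie_gen4_gb_per_s(48, encoded=False), "GB/s")
    print("with 128b/130b encoding:", round(pcie_gen4_gb_per_s(48), 1), "GB/s")
```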
High Bandwidth Signaling Technology
• 16 Gb/s interface – local SMP
• 25 Gb/s IBM interface – Accelerator, remote SMP
[Figure: POWER9 chip floorplan. Labels: Core (24 shown) with L2, L3 Region (12 shown), Memory Interface, Memory Signaling, PCIe, PCIe Signaling, SMP Interconnect & Off Chip Accelerator Enablement, SMP Signaling, SMP/Accelerator Signaling, Accel Link, On Chip Accel]
14nm finFET Semiconductor Process
- Improved device performance & reduced energy
- 17 layer metal stack, eDRAM
- 8.0 Billion Transistors
POWER9 – Dual Memory Subsystems
Scale Out: Direct Attach Memory (8 Direct DDR4 Ports)
• Up to 140 GB/s of sustained bandwidth
• Low latency access
• Commodity packaging form factor
• Adaptive 64B / 128B reads
Scale Up: Buffered Memory (8 Buffered Channels)
• Up to 230 GB/s of sustained bandwidth
• Extreme capacity – up to 8TB / socket
• Superior RAS w/ chip kill & lane sparing
• Durable from POWER8 systems
• Agnostic interface for alternate memory innovations
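To put the direct-attach figure (8 DDR4 ports, up to 140 GB/s sustained) in context, the sketch below compares it with a theoretical peak. The DIMM speed is an assumption, since the slide does not state one: DDR4-2666 is used purely as an example. The 8-byte width is the standard 64-bit DDR4 data bus.

```python
def ddr4_peak_gb_per_s(ports, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s for a set of DDR4 ports."""
    return ports * mega_transfers_per_s * bus_bytes / 1000.0


if __name__ == "__main__":
    assumed_speed = 2666        # MT/s, assumed DIMM speed (not from the slide)
    sustained = 140.0           # GB/s, from the slide
    peak = ddr4_peak_gb_per_s(8, assumed_speed)
    print(f"peak {peak:.1f} GB/s, sustained/peak = {sustained / peak:.0%}")
```

With a different DIMM speed the ratio changes accordingly, so the result is only as good as the assumed speed.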
© 2018 IBM Corporation
OpenCAPI – An Open heterogeneous architecture standard
OpenCAPI 3.0
OpenCAPI 3.1
OpenCAPI specifications are downloadable from the website at www.opencapi.org
- Register
- Download
NVLink Evolution in POWER HPC
2016: POWER8, NVIDIA P100 GPU with NVLink 1.0
2017-2018: POWER9, NVIDIA Volta GPU with NVLink 2.0
[Figure: CPU-GPU topologies for each generation, showing System Memory, Graphics Memory, and NVLink bandwidth labels of 50+50 GB/s and 75+75 GB/s]
US DOE CORAL Program
IBM, Mellanox, and NVIDIA awarded $325M U.S. Department of Energy supercomputer bids
Two supercomputers for Oak Ridge and Lawrence Livermore Labs in 2017.
5X – 10X Higher Application Performance versus Current Systems
>100 PF, 2 GB/core main memory, local NVRAM, Mellanox EDR 100Gb/s InfiniBand, IBM POWER CPUs, NVIDIA Tesla GPUs
Current DOE Leadership Computers
• Sequoia (LLNL), 2012 - 2017
• Mira (ANL), 2012 - 2017
• Titan (ORNL), 2012 - 2017
© 2015 OpenPOWER Foundation
PowerAI
Open Source Frameworks: Supported Distribution
Developer Ease-of-Use Tools
Faster Training Times via HW & SW Performance Optimizations
Integrated & Supported AI Platform
Higher Productivity for Data Scientists
Enable non-Data Scientists to use AI
GPU-Accelerated Power Servers
Storage

 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 

Recently uploaded (20)

Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 

Power overview 2018 08-13b

  • 1. OpenPOWER and POWER9. Steve Fields, IBM Fellow, Chief Engineer of Power Systems. August 13, 2018
  • 2. Fundamental forces are accelerating change in our industry. Full system stack innovation required. Moore’s Law: IT innovation can no longer come from just the processor. IT consumption models are expanding: Cognitive; Custom Hyperscale Data Centers; Hybrid Cloud; Open Solutions. [Chart: Price/Performance (lower is better), 2000 to 2020; Technology and Processors versus Full Stack Acceleration (Firmware / OS, Accelerators, Software, Storage, Network).] Not only is Moore’s Law “coming to an end in practical term, in that chip speeds can be expected to stall, but it is actually likely to roll back in terms of performance …” (William Holt, Intel Executive Vice President and General Manager)
  • 3. System Architecture Must Keep Up. [Chart: Memory and I/O Bandwidth per Core, 2012 to 2018.] 1. Bandwidth demand of applications is increasing. 2. Inefficiencies of the traditional I/O subsystem become more pronounced at high throughput. 3. Storage Class Memory will be a major disruptor. Must think about CPU & System Architecture differently: more bandwidth to memory, network, storage; new interfaces to remove software overhead and allow I/O and accelerator devices to integrate natively within applications; flexibility to exploit diversity of memory technologies.
  • 4. Proposed Ecosystem Enablement. A modern development environment is emerging based on tools and services. Multiple Options to Design with POWER Technology Within OpenPOWER. 40,000 packages now available. [Diagram labels: System Operating Environment Software Stack; Cloud Software; Operating System / KVM; Standard Operating Environment (System Mgmt); Software; Power Open Source Software Stack Components; Existing Open Source Software Communities; Firmware; Hardware; New OSS Community; OpenPOWER Technology; OpenPOWER Firmware; CAPP; PCIe; POWER8; CAPI over PCIe; Standard POWER Products – 2014; “Custom POWER SoC” – Future; Customizable; Framework to Integrate System IP on Chip; Industry IP License Model.]
  • 5. Future Evolution of System Architecture: Yesterday’s Plumbing, Tomorrow’s Differentiation. [Diagram labels: PCI Gen X; OpenCAPI; Northbound OpenCAPI & PCI; JEDEC; Buffer; OpenCAPI / NVLink; CPU/Accelerator Bandwidth; System Bus 768 GB/s; Cores & Caches; System bottleneck; scale 1x, 2x, 5x, 7-10x.]
  • 6. POWER8
  • 7. POWER8 Processor (22nm SOI, eDRAM, 15 ML, 650mm2). Cores: 12 cores / 8 threads per core; TDP: 130W and 190W; 64K data cache, 32K instruction cache. Accelerators: Crypto & memory expansion; Transactional Memory. Caches: 512 KB SRAM L2 / core; 96 MB eDRAM shared L3. Memory Subsystem: memory buffers with 128MB cache; ~70ns latency to memory. Bus Interfaces: Durable Memory attach Interface (DMI); integrated PCIe Gen3; SMP Interconnect for up to 4 sockets. Coherent Accelerator Processor Interface (CAPI). Virtual Addressing: accelerator can work with the same memory addresses that the processors use; pointers de-referenced same as the host application; removes OS & device driver overhead. Hardware Managed Cache Coherence: enables the accelerator to participate in “Locks” as a normal thread; lowers latency over IO communication model. 6 hardware partners developing with CAPI. Over 20 CAPI solutions, all listed at http://ibm.biz/powercapi. Examples of available CAPI solutions: IBM Data Engine for NoSQL; DRC Graphfind analytics; Erasure Code Acceleration for Hadoop. Server Class Memories (SCM): first functioning demo of SCM in an enterprise system; 15x better than SSD; natively non-volatile ST-MRAM DIMMs (Everspin); avoids NVDIMM complications (DRAM to FLASH and back); no supercaps with umbilicals required. [Diagram labels: Memory Interface Control; Memory; IBM & Partner Devices; CAPI/PCI; DMI; SMP.]
  • 8. Traditional I/O Subsystem. CPU and memory overheads increase with device throughput. Accelerator only useful for large operations with very high value.
  • 9. Coherent Attach Model: POWER8 CAPI (Coherent Accelerator Processor Interface). CPU and memory overhead is eliminated. CPU and accelerator cooperate natively to execute the application. Programmer treats accelerator much like another CPU thread.
  • 10. Introducing 822LC Power System for HPC: First Custom-Built GPU Accelerator Server with NVLink. 2.5x faster CPU-GPU data communication via NVLink. POWER8 NVLink Server: NVLink 80 GB/s between P8 and GPU. x86 Servers with PCIe: PCIe 32 GB/s; no NVLink between CPU & GPU for x86 servers (PCIe bottleneck). Custom-built GPU accelerator server; high-speed NVLink connections between CPUs & GPUs and among GPUs; features novel NVIDIA P100 Pascal GPU accelerator.
  • 11. POWER9
  • 12. POWER9 Processor – Common Features. New Core Microarchitecture: stronger thread performance; efficient agile pipeline; POWER ISA v3.0. Enhanced Cache Hierarchy: 120MB NUCA L3 architecture; 12 x 20-way associative regions; improved LRU heuristics; fed by 7 TB/s on-chip bandwidth. Cloud + Virtualization Innovation: quality of service assists; new interrupt architecture; workload optimized frequency; hardware enforced trusted execution. Leadership Hardware Acceleration Platform: enhanced on-die acceleration; NVLINK 2.0: high bandwidth, coherent GPU (25G Link); CAPI 2.0: coherent accelerator and storage attach (PCIe G4); OpenCAPI: improved latency and bandwidth (25G Link). State of the Art IO Subsystem: 48 PCIe G4 lanes (192GB/s). High Bandwidth Signaling Technology: 16 Gb/s interface (local SMP); 25 Gb/s IBM interface (accelerator, remote SMP). 14nm finFET Semiconductor Process: improved device performance & reduced energy; 17 layer metal stack, eDRAM; 8.0 billion transistors. [Die diagram labels: Core; L2; L3 Region; Memory Interface; Accel Link; SMP; PCIe; On Chip Accel; SMP Interconnect & Off Chip Accelerator Enablement; Memory Signaling; SMP/Accelerator Signaling; SMP Signaling; PCIe Signaling.]
  • 13. POWER9 – Dual Memory Subsystems. Scale Out, Direct Attach Memory (8 direct DDR4 ports): up to 140 GB/s of sustained bandwidth; low latency access; commodity packaging form factor; adaptive 64B / 128B reads. Scale Up, Buffered Memory (8 buffered channels): up to 230GB/s of sustained bandwidth; extreme capacity, up to 8TB / socket; superior RAS w/ chip kill & lane sparing; durable from POWER8 systems; agnostic interface for alternate memory innovations.
  • 15. OpenCAPI – An Open heterogeneous architecture standard. OpenCAPI 3.0; OpenCAPI 3.1. OpenCAPI specifications are downloadable from the website at www.opencapi.org: Register, Download.
  • 16. NVLink Evolution in POWER HPC. 2016: POWER8 with NVIDIA P100 GPU, NVLink 1.0. 2017-2018: POWER9 with NVIDIA Volta GPU, NVLink 2.0. [Diagram labels: Graphics Memory; System Memory; link bandwidths 50+50 GB/s and 75+75 GB/s.]
  • 17. US DOE CORAL Program. IBM, Mellanox, and NVIDIA awarded $325M U.S. Department of Energy’s Super Computer bids. Two supercomputers for Oak Ridge and Lawrence Livermore Labs in 2017. Current DOE Leadership Computers: Sequoia (LLNL) 2012 - 2017; Mira (ANL) 2012 - 2017; Titan (ORNL) 2012 - 2017. 5X – 10X higher application performance versus current systems. >100 PF, 2 GB/core main memory, local NVRAM, Mellanox EDR 100Gb/s InfiniBand, IBM POWER CPUs, NVIDIA Tesla GPUs.
  • 18. PowerAI. Open Source Frameworks: Supported Distribution. Developer Ease-of-Use Tools. Faster Training Times via HW & SW Performance Optimizations. Integrated & Supported AI Platform. Higher Productivity for Data Scientists. Enable non-Data Scientists to use AI. GPU-Accelerated Power Servers; Storage.
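
The bandwidth figures quoted on slides 10, 12 and 13 can be cross-checked with simple arithmetic. The sketch below is illustrative only and not part of the deck. Values marked "slide" are taken from the transcript. The 16 GT/s per-lane rate and 128b/130b encoding come from the PCIe Gen4 specification, and the reading of 192GB/s as a both-directions total is an assumption. The per-channel memory numbers are plain averages over 8 channels, not figures IBM quotes.

```python
# Cross-check of bandwidth figures from slides 10, 12 and 13.
# "slide" = value quoted in the deck; "assumption" = supplied here for illustration.

# Slide 10: CPU-GPU link bandwidth, NVLink versus PCIe.
NVLINK_GB_S = 80.0   # slide
PCIE_GB_S = 32.0     # slide
nvlink_speedup = NVLINK_GB_S / PCIE_GB_S

# Slide 12: 48 PCIe Gen4 lanes quoted as 192GB/s.
PCIE_G4_LANES = 48        # slide
PCIE_G4_GT_PER_S = 16.0   # assumption: PCIe Gen4 raw signaling rate per lane
DIRECTIONS = 2            # assumption: the slide total counts both directions
pcie_raw_gb_s = PCIE_G4_LANES * PCIE_G4_GT_PER_S / 8 * DIRECTIONS
# PCIe Gen4 uses 128b/130b line encoding, so usable bandwidth is slightly lower.
pcie_encoded_gb_s = pcie_raw_gb_s * 128 / 130

# Slide 13: sustained memory bandwidth averaged over 8 ports/channels.
MEMORY_CHANNELS = 8            # slide
DIRECT_ATTACH_GB_S = 140.0     # slide, scale-out direct attach
BUFFERED_GB_S = 230.0          # slide, scale-up buffered
direct_per_port = DIRECT_ATTACH_GB_S / MEMORY_CHANNELS
buffered_per_channel = BUFFERED_GB_S / MEMORY_CHANNELS

print(f"NVLink vs PCIe speedup: {nvlink_speedup:.1f}x")
print(f"PCIe Gen4 x48 raw, both directions: {pcie_raw_gb_s:.0f} GB/s")
print(f"PCIe Gen4 x48 after 128b/130b encoding: {pcie_encoded_gb_s:.1f} GB/s")
print(f"Direct attach, average per DDR4 port: {direct_per_port:.2f} GB/s")
print(f"Buffered, average per channel: {buffered_per_channel:.2f} GB/s")
```

Under these assumptions the 2.5x claim on slide 10 is the ratio 80/32, and the 192GB/s on slide 12 matches 48 lanes at 2 GB/s raw per lane per direction, counted in both directions.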