HBM2/2E Memory IP Interface Selection and Implementation
Frank Ferro
Joseph Rodriguez
Fall 2020
2
Exponential Data Growth Mandates Increased Bandwidth
Source: Adapted from Jeff Dean, “Recent Advances in Artificial Intelligence and the
Implications for Computer System Design,” HotChips 29 Keynote, August 2017
[Figure: Accuracy versus Scale (Data Size, Model Size). Curves: Neural Networks, Other Approaches. Annotations: More Compute; 1980s to 1990s; Now]
• Exponential data growth is driving the need for new architectures
• Advances in computing have pushed the bottleneck to memory
• Faster compute and large training sets needed for AI applications
Annual Size of the Global Datasphere
Source: Adapted from Data Age 2025, sponsored by Seagate
with data from IDC Global DataSphere, Nov 2018
[Figure: bar chart, x-axis 2010 to 2025, y-axis in Zettabytes (ticks 20 to 180), labeled value 175 ZB]
3
Two Important Memories for AI/ML
HBM2E: Extremely High Bandwidth and High Capacity
• AI Training
• HPC
• Network (NIC)
GDDR6: High Bandwidth, High Reliability and Low Latency
• AI Inference
• Graphics
• Automotive (ADAS)
4
HBM2E 4G Announcement
Rambus HBM2E Interface Operating at 4 Gbps
5
Choosing the Correct Memory: Comparison Data
Parameter                   LPDDR4x           LPDDR5           DDR4              GDDR6               HBM2E
Bandwidth (Gbps)            Low-Medium (136)  Medium (204)     Medium (200)      High (512)          Highest (3686)
Data Rate (Gbps per pin)    4.266             6.4              3.2               16                  3.6
Interface width (bits)      32                32               64                32                  1024
Board Area / System Design  Large / Medium    Medium / Medium  Large / Easy      Medium / Medium     Small / Complex
Efficiency (mW/Gbps)        High (3)          High (3)         Moderate (10)     Moderate (10)       Highest (2)
Cost ($)                    Medium            Medium           Low               Medium              High
Reliability/Yield           Good              Good             Good              Good                Moderate
Applications                Mobile, AI        Mobile, AI       Compute, Network  AI, Graphics, Auto  AI, HPC, Network
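The bandwidth row of the comparison appears to follow from the other two numeric rows: per-pin data rate multiplied by interface width. The sketch below reproduces that arithmetic. It also gives a rough interface-power estimate, assuming the efficiency row (mW/Gbps) can simply be multiplied by peak bandwidth. The dictionary layout and function names are illustrative assumptions, not part of the original deck.

```python
# Illustrative only: figures are copied from the comparison table;
# the dictionary layout and function names are assumptions for this sketch.
MEMORIES = {
    # name: (data rate in Gbps per pin, interface width in bits, efficiency in mW/Gbps)
    "LPDDR4x": (4.266, 32, 3),
    "LPDDR5": (6.4, 32, 3),
    "DDR4": (3.2, 64, 10),
    "GDDR6": (16.0, 32, 10),
    "HBM2E": (3.6, 1024, 2),
}


def peak_bandwidth_gbps(rate_gbps_per_pin, width_bits):
    """Peak interface bandwidth: per-pin data rate times number of data pins."""
    return rate_gbps_per_pin * width_bits


def interface_power_watts(bandwidth_gbps, mw_per_gbps):
    """Rough interface power at peak bandwidth, converted from mW to W."""
    return bandwidth_gbps * mw_per_gbps / 1000.0


for name, (rate, width, eff) in MEMORIES.items():
    bw = peak_bandwidth_gbps(rate, width)
    print(f"{name:8s} {bw:8.1f} Gbps  ~{interface_power_watts(bw, eff):.2f} W")
```

The table entries look rounded: for DDR4 the product is 204.8 Gbps against the listed 200. The power figures are only as good as the single-digit efficiency values they are derived from.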
6
Rambus HBM2E Solution Summary
HBM2E Memory Interface Subsystem
Advantages:
• Production experience
• Hardened timing-closed PHY
• System design support: interposer and package
• Lab Station development environment: bring-up support
HBM2E Interface Summary
• Verified HBM2E PHY and controller
• 461 GB/s bandwidth (@ 3.6 Gbps)
• Speed bins: 2, 2.4, 2.8, 3.2, 3.6, 4.0 Gbps
• DRAMs: Stack height of 2, 4, 8, 12
• Channels: 8x 128 bits
• ASIC Interface: DFI style
• Lane repair
• IEEE 1500 test support
• PHY independent mode
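The 461 GB/s figure can be checked from the channel organisation in the summary: 8 channels of 128 bits give 1024 data pins, and dividing the aggregate bit rate by 8 converts it to bytes. A minimal sketch of that calculation for each listed speed bin follows; the function and constant names are assumptions for illustration.

```python
# Illustrative sketch: channel count, channel width, and speed bins are taken
# from the interface summary; function and variable names are assumptions.
CHANNELS = 8
BITS_PER_CHANNEL = 128
SPEED_BINS_GBPS = [2.0, 2.4, 2.8, 3.2, 3.6, 4.0]


def stack_bandwidth_gbytes(rate_gbps_per_pin, channels=CHANNELS, bits=BITS_PER_CHANNEL):
    """Peak bandwidth of one HBM stack in GB/s (8 bits per byte)."""
    return rate_gbps_per_pin * channels * bits / 8


for rate in SPEED_BINS_GBPS:
    total = stack_bandwidth_gbytes(rate)
    print(f"{rate:.1f} Gbps/pin -> {total:.1f} GB/s per stack, "
          f"{total / CHANNELS:.1f} GB/s per channel")
```

At 3.6 Gbps per pin this gives 460.8 GB/s, which the slide rounds to 461 GB/s. These are peak numbers and ignore refresh, bus turnaround, and other efficiency losses.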
7
Interposer Reference Design
• Interposer design is a critical component of a 2.5D system: Rambus provides reference designs
• Supports all foundry/OSAT 2.5D manufacturing processes
• Rambus works with customers to support their interposer/package design for the highest data rates
  • Channel simulations
  • Layout reviews and feedback
  • Channel parameter optimization:
    • Channel length, width, line spacing and pitch, number of routing/ground layers
[Figure: 2.5D system diagram. Labels: HBM DRAM Stack, Processor, HBM Interface, Interposer, Substrate, PCB, 1024]
8
Rambus HBM2/2E IP - Market Leadership
~50 design wins
Process node  14LPP     12LP      12LP+     11LP      7nm
Product       HBM2      HBM2E     HBM2E     HBM2E     HBM2E
Speed         2.0 Gbps  3.2 Gbps  3.2 Gbps  3.2 Gbps  4.0 Gbps
[Figure labels: 3.6 Gbps, 4.0 Gbps]
9
Rambus HBM2/2E Memory Interface Solution: Controller
• Complete, configurable solution
• Handles all design, test and bring-up challenges
• Fully validated
[Block diagram: Rambus Integrated Memory Subsystem. Labels: Customer SoC/ASIC; AXI Interface 1; AXI Interface 2; Multi-Port Front-End; RMW; ECC; HBM2E Memory Controller Core; Memory Test; Mem Test Analyzer; Memory Controller; HBM2E PHY; 8x 128-bit Channels; HBM2E DRAM. Legend: Optional blocks]
10
HBM2/2E Controller Core
• Supports HBM2 / HBM2E (in the lists below, values before the slash are for HBM2, values after it for HBM2E)
  • 4, 8 / 12-high stacks
  • 4, 8 / 6, 8, 12, 16, 24 Gbit density per channel
  • 2, 2.4 / 2.8, 3.2, 3.6, 4.0 Gbit/s/pin
• Modular, highly configurable solution
  • Delivered configured to customer requirements to minimize size, power, latency
  • Memory parameters are run-time programmable
• High performance
  • High bus efficiency across a wide variety of configurations (AXI, native interface) and traffic scenarios (random and sequential accesses, short and long bursts, etc.)
• Reliability, Availability, Serviceability (RAS) support including ECC, ECC scrubbing and data path parity protection
• Full-featured Memory Test support
  • Algorithmic, Arbitrary, Microcode Programmable address and data pattern options
• Delivered fully integrated and verified with the target HBM2/2E PHY
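The per-channel densities listed for the controller translate into stack capacity once multiplied by the 8 channels per stack given earlier in the deck. The sketch below works through that arithmetic; the names are illustrative assumptions, and the calculation covers raw data capacity only.

```python
# Illustrative sketch: the per-channel densities and the 8-channel organisation
# come from the slides; names and structure are assumptions.
CHANNELS_PER_STACK = 8
DENSITIES_GBIT_PER_CHANNEL = [4, 6, 8, 12, 16, 24]


def stack_capacity_gbytes(density_gbit_per_channel, channels=CHANNELS_PER_STACK):
    """Capacity of one stack in GB: channels x Gbit per channel / 8 bits per byte."""
    return density_gbit_per_channel * channels / 8


for density in DENSITIES_GBIT_PER_CHANNEL:
    print(f"{density:2d} Gbit/channel -> {stack_capacity_gbytes(density):.0f} GB per stack")
```

Because a stack has 8 channels and a byte has 8 bits, the capacity in GB per stack comes out numerically equal to the density in Gbit per channel, so 24 Gbit per channel corresponds to a 24 GB stack.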
11
HBM2/2E Controller Core Validation
• Simulation-Based Verification
  • UVM Memory Testbench
  • Over 100 Test Sequences
  • Vendor (Samsung, SK hynix) and Avery Design Systems memory models
• Hardware-Based Validation
  • Perform testing across a wide range of motherboard and plug-in FPGA-based boards
  • Utilize Rambus GUI, Command Line App to drive tests
• Silicon-proven
  • Memory system testchips
  • Controller Core + PHY
  • Deployed in multiple customer designs
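To give a feel for what an algorithmic address and data pattern test does, here is a small software model of two widely used patterns: address-as-data and walking ones. This is a generic sketch, not the Rambus Memory Test block or its command-line app. `SimulatedMemory` and both test functions are hypothetical names invented for the example.

```python
# Illustrative sketch only: a software model of two common algorithmic memory
# test patterns. It is not the Rambus Memory Test block or its command-line
# app; SimulatedMemory and both test functions are hypothetical names.

class SimulatedMemory:
    """Word-addressable memory model with an optional stuck-at-0 data bit."""

    def __init__(self, words, width_bits=32, stuck_low_bit=None):
        self.width_bits = width_bits
        self.width_mask = (1 << width_bits) - 1
        self.stuck_low_bit = stuck_low_bit
        self.cells = [0] * words

    def write(self, addr, value):
        value &= self.width_mask
        if self.stuck_low_bit is not None:
            value &= ~(1 << self.stuck_low_bit)  # model a faulty data lane
        self.cells[addr] = value

    def read(self, addr):
        return self.cells[addr]


def address_as_data_test(mem):
    """Write each address into its own location, then read everything back.

    Returns the list of failing addresses.
    """
    for addr in range(len(mem.cells)):
        mem.write(addr, addr)
    return [a for a in range(len(mem.cells)) if mem.read(a) != (a & mem.width_mask)]


def walking_ones_test(mem, addr=0):
    """Walk a single 1 across the data bus at one address.

    Returns the list of failing bit positions.
    """
    failing = []
    for bit in range(mem.width_bits):
        pattern = 1 << bit
        mem.write(addr, pattern)
        if mem.read(addr) != pattern:
            failing.append(bit)
    return failing


good = SimulatedMemory(words=1024)
bad = SimulatedMemory(words=1024, stuck_low_bit=5)
print(address_as_data_test(good), walking_ones_test(good))
print(walking_ones_test(bad))
```

In this model the walking-ones pass pinpoints the faulty data bit directly, while the address-as-data pass flags every address whose value has that bit set. Real hardware test engines run comparable patterns at speed against the DRAM rather than a Python list.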
12
Why Choose Rambus HBM2/2E
• #1 in market share: ~50 customer designs
  • First-time silicon success (no re-spins)
  • Multiple tier 1 networking and AI/ML customers in production
  • Very mature solution used in a wide range of applications
• Performance
  • Maximum throughput in terms of both bus efficiency and data rate
  • First to achieve 4.0 Gbps for an HBM2E memory interface
• Integrated and verified PHY and Controller solution
  • PHY and Controller validated in both hardware and software
  • PHY is a complete hardened macro including PHY, I/O, and decap
  • Provides interposer and package reference design, which reduces customer effort and design risk
• Strong customer support
  • Works closely with customers in all project phases (design, tapeout, bring-up)
  • Lab Station development environment accelerates bring-up
[Figures: HBM2E Hardware Development Board; Memory WR/RD scope shot]
Thank you

The Ultimate Guide to HBM2E Implementation & Selection - Frank Ferro - Rambus Design Summit 2020

  • 1. HBM2/2E Memory IP Interface Selection and Implementation Frank Ferro Joseph Rodriguez Fall 2020
  • 2. 2 Exponential Data Growth Mandates Increased Bandwidth Source: Adapted from Jeff Dean, “Recent Advances in Artificial Intelligence and the Implications for Computer System Design,” HotChips 29 Keynote, August 2017 More Compute Neural Networks Other Approaches Accuracy Scale (Data Size, Model Size) 1980s – 1990s Now • Exponential data growth is driving the need for new architectures • Advances in computing have pushed bottleneck to memory • Faster compute and large training sets needed for AI applications Annual Size of the Global Datasphere Source: Adapted from Data Age 2025, sponsored by Seagate with data from IDC Global DataSphere, Nov 2018 2020 20252010 2015 20 40 60 80 100 120 140 160 180 175 ZB Zettabytes
  • 3. 3 Two Important Memories for AI/ML Extremely High Bandwidth and High Capacity HBM2E High Bandwidth, High Reliability and Low Latency GDDR6 • AI Training • HPC • Network (NIC) • AI Inference • Graphics • Automotive (ADAS)
  • 4. 4 HBM2E 4G Announcement Rambus HBM2E Interface Operating at 4 Gbps
  • 5. 5 Choosing the Correct Memory: Comparison Data Parameter LPDDR4x LPDDR5 DDR4 GDDR6 HBM2E Bandwidth (Gbps) Low-Medium (136) Medium (204) Medium (200) High (512) Highest (3686) Data Rate (Gbps) 4.266 6.4 3.2 16 3.6 Interface width (bits) 32 32 64 32 1024 Board Area / System Design Large / Medium Medium/ Medium Large / Easy Medium / Medium Small / Complex Efficiency (mW/Gbps) High (3) High (3) Moderate (10) Moderate (10) Highest (2) Cost ($) Medium Medium Low Medium High Reliability/Yield Good Good Good Good Moderate Applications Mobile, AI Mobile, AI Compute, Network AI, Graphics, Auto AI, HPC, Network
  • 6. 6 Rambus HBM2E Solution Summary HBM2E Memory Interface Subsystem Advantages: • Production experience • Hardened timing-closed PHY • System design support: interposer and package • Lab Station development environment: bring-up support HBM2E Interface Summary • Verified HBM2E PHY and controller • 461 GB/s bandwidth (@ 3.6 Gbps) • Speed bins: 2, 2.4, 2.8, 3.2, 3.6,4.0Gbps • DRAMs: Stack height of 2, 4, 8, 12 • Channels: 8x 128 bits • ASIC Interface: DFI style • Lane repair • IEEE 1500 test support • PHY independent mode
  • 7. 7 Interposer Reference Design • Interposer design is a critical component of a 2.5D system: Rambus provides reference designs • Supports all foundry/OSAT 2.5D manufacturing processes • Rambus works with customers to support their interposer/package design for the highest data rates • Channel simulations • Layout reviews and feedback • Channel parameter optimization: channel length, width, line spacing and pitch, number of routing/ground layers • [Figure: 2.5D system diagram; labels: HBM DRAM stack, processor, HBM interface, interposer, substrate, PCB, 1024-bit interface]
  • 8. 8 Rambus HBM2/2E IP - Market Leadership • ~50 design wins
      Process | 14LPP | 12LP | 12LP+ | 11LP | 7nm
      Product | HBM2 | HBM2E | HBM2E | HBM2E | HBM2E
      Speed | 2.0 Gbps | 3.2 Gbps | 3.2 Gbps | 3.2 Gbps | 4.0 Gbps
      [Figure labels: 3.6 Gbps, 4.0 Gbps]
  • 9. 9 Rambus HBM2/2E Memory Interface Solution: Controller • Complete, configurable solution • Handles all design, test and bring-up challenges • Fully validated • [Figure: block diagram of the Rambus integrated memory subsystem; labels: customer SoC/ASIC, AXI interface 1, AXI interface 2, multi-port front-end, RMW, ECC, HBM2E memory controller core, memory test, memory test analyzer, HBM2E PHY, 8x 128-bit channels, HBM2E DRAM; some blocks are marked optional]
  • 10. 10 HBM2/2E Controller Core • Supports HBM2 / HBM2E • 4, 8 / 12-high stacks • 4, 8 / 6, 8, 12, 16, 24 Gbit density per channel • 2, 2.4 / 2.8, 3.2, 3.6, 4.0 Gbit/s/pin • Modular, highly configurable solution • Delivered configured to customer requirements to minimize size, power, latency • Memory parameters are run-time programmable • High performance • High bus efficiency across a wide variety of configurations (AXI, native interface) and traffic scenarios (random and sequential accesses; short and long bursts, etc.) • Reliability, Availability, Serviceability (RAS) support including ECC, ECC scrubbing and data path parity protection • Full-featured Memory Test support • Algorithmic, Arbitrary, Microcode Programmable address and data pattern options • Delivered fully integrated and verified with the target HBM2/2E PHY
  • 11. 11 HBM2/2E Controller Core Validation • Simulation-Based Verification • UVM Memory Testbench • Over 100 Test Sequences • Vendor (Samsung, SK hynix) and Avery Design Systems memory models • Hardware-Based Validation • Perform testing across a wide range of motherboard and plug-in FPGA-based boards • Utilize Rambus GUI, Command Line App to drive tests • Silicon-proven • Memory system testchips • Controller Core + PHY • Deployed in multiple customer designs
  • 12. 12 Why Choose Rambus HBM2/2E • #1 in market share: ~50 customer designs • First-time silicon success (no re-spins) • Multiple tier 1 networking and AI/ML customers in production • Very mature solution used in a wide range of applications • Performance • Maximum throughput from both bus efficiency and data rate • First to achieve 4.0 Gbps for an HBM2E memory interface • Integrated and verified PHY and Controller solution • PHY and Controller validated in both hardware and software • PHY is a complete hardened macro including PHY, I/O and decap • Provides interposer and package reference design, which reduces customer effort and design risk • Strong customer support • Works closely with customers in all project phases (design, tapeout, bring-up) • Lab Station development environment accelerates bring-up • [Figures: HBM2E hardware development board; memory WR/RD scope shot]
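
The bandwidth figures on slides 5 and 6 appear to follow from one piece of arithmetic: per-pin data rate multiplied by interface width. Below is a minimal Python sketch of that calculation, using the data rates and widths quoted on the slides. The function and variable names are illustrative assumptions and are not part of any Rambus tool or API.

```python
# Peak bandwidth = per-pin data rate (Gbps) x interface width (bits).
# Data rates and widths are copied from the comparison slide (slide 5);
# all names in this sketch are hypothetical and used only for illustration.

MEMORIES = {
    # name: (data rate per pin in Gbps, interface width in bits)
    "LPDDR4x": (4.266, 32),
    "LPDDR5": (6.4, 32),
    "DDR4": (3.2, 64),
    "GDDR6": (16.0, 32),
    "HBM2E": (3.6, 1024),
}

# Slide 6: the HBM2E interface is 8 channels of 128 bits each.
HBM2E_WIDTH_BITS = 8 * 128

# Slide 6: supported speed bins in Gbps per pin.
HBM2E_SPEED_BINS_GBPS = (2.0, 2.4, 2.8, 3.2, 3.6, 4.0)


def peak_bandwidth_gbps(rate_gbps: float, width_bits: int) -> float:
    """Peak interface bandwidth in gigabits per second."""
    return rate_gbps * width_bits


def peak_bandwidth_gbytes(rate_gbps: float, width_bits: int) -> float:
    """Peak interface bandwidth in gigabytes per second (8 bits per byte)."""
    return peak_bandwidth_gbps(rate_gbps, width_bits) / 8


if __name__ == "__main__":
    # Reproduce the "Bandwidth (Gbps)" row of the comparison table.
    for name, (rate, width) in MEMORIES.items():
        print(f"{name:8s} {peak_bandwidth_gbps(rate, width):8.1f} Gbps")

    # Peak HBM2E bandwidth for each speed bin.
    for rate in HBM2E_SPEED_BINS_GBPS:
        gbytes = peak_bandwidth_gbytes(rate, HBM2E_WIDTH_BITS)
        print(f"HBM2E @ {rate} Gbps/pin: {gbytes:.1f} GB/s")
```

At 3.6 Gbps across 8 x 128 bits this gives 3686.4 Gbps, or 460.8 GB/s, which the slides round to 3686 Gbps and 461 GB/s; the 4.0 Gbps bin gives 512 GB/s. The same formula gives 204.8 Gbps for DDR4, which the comparison table shows as 200.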

Editor's Notes

  1. Why AI, Why Now? Many of the techniques in use today were developed in the 1980s and 1990s, during the last big wave of interest in neural networks. But they never took off back then, and conventional algorithms were used instead. Why is that? There are two main reasons: compute wasn’t fast enough, and memory performance and capacity weren’t good enough, so conventional approaches performed better. Fast forward a few decades to today: Moore’s Law has given us 5 orders of magnitude better compute and 3-4 orders of magnitude of improvement in memory performance and capacity, and neural networks can now outperform conventional algorithms. Another big reason for the proliferation of neural networks is the large and growing amount of digital data available to train them and improve their performance. The world’s digital data is growing exponentially, at a rate faster than technology growth rates for processing, memory, and networks. There is an interesting dependence developing: neural networks are increasingly needed to make sense of the growing amount of digital data, and this data in turn is needed to train and improve the performance of neural networks. We believe this trend will continue in the future. The industry has recognized the importance of neural networks by honoring three distinguished researchers, Yann LeCun, Geoffrey Hinton, and Yoshua Bengio, with the 2018 ACM Turing Award (announced in 2019), often referred to as the Nobel Prize of computing. Neural networks are still in their infancy, and there is much more untapped potential, but much more performance is needed. The irony is that just when we need more performance, the tools we’ve relied on for the past several decades, Moore’s Law and Dennard scaling, are either slowing or have ended. The critical question is how we, as an industry, move forward.