Designing Memory Controller for DDR5 and HBM2.0

Deepak Shankar
Founder
Mirabilis Design Inc.
Email: dshankar@mirabilisdesign.com

Before we get started...
All attendees are muted and will stay muted
Use the chat or the “Raise Hand” feature to bring questions to our attention

Agenda
Introduction to DDR5 and HBM2.0
Role and Importance of the Memory Controller
Parts of a Memory Controller
Metrics to judge the Quality of Service
Introduction to Architecture Exploration
Role of Architecture Exploration in designing a Memory Controller
Parameters to describe the Memory Controller
Architecture model of a Memory Controller
Experiments
Other Controller Designs
Q&A

Introduction to HBM2.0 and DDR5
High Bandwidth Memory (HBM)
◦ High-speed computer memory interface for 3D-stacked SDRAM
◦ Used in performance-sensitive consumer applications, graphics accelerators, network devices and supercomputers
◦ HBM2 has up to eight dies per stack and doubles pin transfer rates to up to 2 GT/s
◦ 1024-bit wide access with memory bandwidth per package of 256 GB/s
Double Data Rate 5 Synchronous Dynamic Random-Access Memory (DDR5 SDRAM)
◦ 4800 to 6400 million transfers per second (PC5-38400 to PC5-51200)
◦ Minimum burst length is 16, with the option of "burst chop" after 8 transfers
◦ 8 bank groups, with 4 banks per group
◦ Two independent channels per DIMM.
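
The bandwidth figures above follow from bus width times transfer rate. A minimal sketch of that arithmetic is shown below; the function name is hypothetical, and the 64-bit DDR5 data width (two 32-bit channels per DIMM, excluding ECC) is an assumption that is not stated on the slide.

```python
def peak_bandwidth_mb_per_s(bus_width_bits: int, mega_transfers_per_s: int) -> int:
    """Peak bandwidth in MB/s = bytes per transfer x million transfers per second."""
    return bus_width_bits // 8 * mega_transfers_per_s


# HBM2: 1024-bit interface at 2 GT/s (2000 MT/s) per pin.
hbm2_stack = peak_bandwidth_mb_per_s(1024, 2000)

# DDR5: assumed 64 data bits per DIMM (2 channels x 32 bits, ECC excluded).
ddr5_low = peak_bandwidth_mb_per_s(64, 4800)
ddr5_high = peak_bandwidth_mb_per_s(64, 6400)

print(f"HBM2 stack : {hbm2_stack / 1000:.0f} GB/s")
print(f"DDR5-4800  : {ddr5_low} MB/s")
print(f"DDR5-6400  : {ddr5_high} MB/s")
```

Under these assumptions the DDR5 results line up with the PC5-38400 and PC5-51200 module names, and the HBM2 result matches the 256 GB/s per package quoted above.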

Role of Memory Controller
Manages the flow of data going to and from the computer's main memory
Shared memory is a key component and a major performance bottleneck in multi-core processors
Location of the memory controllers on the interconnect has a major impact on throughput
Memory controller decides which request gets access to the memory, for what duration and in what order
Bandwidth impact on an in-order shared bus connecting the CPUs and memory controller (Article)
◦ An intelligent read-to-write switching memory controller provides the same benefit as doubling the interleaved memory ranks
◦ Delayed write scheduling gives lower read latency across a range of throughputs
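
To illustrate why read-to-write switching matters, the sketch below counts bus direction changes for an in-order issue sequence and for a simple delayed-write policy. It is a toy model under stated assumptions, not the algorithm from the cited article: the function names and the buffer depth are hypothetical, and address hazards (a read to an address with a pending write) are ignored.

```python
from collections import deque


def count_turnarounds(order):
    """Number of read<->write direction switches in an issue order."""
    return sum(1 for prev, cur in zip(order, order[1:]) if prev != cur)


def delayed_write_order(requests, write_buffer_depth):
    """Issue reads immediately; hold writes until the buffer fills, then drain it."""
    issued = []
    pending_writes = deque()
    for req in requests:
        if req == "W":
            pending_writes.append(req)
            if len(pending_writes) == write_buffer_depth:
                issued.extend(pending_writes)
                pending_writes.clear()
        else:
            issued.append(req)
    issued.extend(pending_writes)  # drain whatever is left at the end
    return issued


requests = list("RWRWRWRW")           # alternating reads and writes
in_order = count_turnarounds(requests)
batched = count_turnarounds(delayed_write_order(requests, write_buffer_depth=4))
print(f"in-order turnarounds: {in_order}, delayed-write turnarounds: {batched}")
```

Each turnaround costs idle bus cycles on a real DRAM interface, so grouping writes reduces the number of penalties and lets reads go out earlier.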

Parts of a Memory Controller
Address decoder
Buffer and buffer management
Scheduling algorithm to select the next request
Read and Write channels
Interfaces to processor and DRAM
Signal handling and triggering the refresh
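
As an illustration of the address decoder, the sketch below splits a flat address into row, bank group, bank and column fields. The bit layout and field widths are hypothetical; the 3 bank-group bits and 2 bank bits are chosen only to match the 8 bank groups with 4 banks per group mentioned earlier, and a real controller would make the mapping configurable.

```python
from typing import NamedTuple


class DecodedAddress(NamedTuple):
    row: int
    bank_group: int
    bank: int
    column: int


# Hypothetical field widths, listed from least significant bits upward.
COLUMN_BITS = 10
BANK_BITS = 2         # 4 banks per group
BANK_GROUP_BITS = 3   # 8 bank groups


def decode(addr: int) -> DecodedAddress:
    """Peel fields off the address starting from the low-order bits."""
    column = addr & ((1 << COLUMN_BITS) - 1)
    addr >>= COLUMN_BITS
    bank = addr & ((1 << BANK_BITS) - 1)
    addr >>= BANK_BITS
    bank_group = addr & ((1 << BANK_GROUP_BITS) - 1)
    addr >>= BANK_GROUP_BITS
    return DecodedAddress(row=addr, bank_group=bank_group, bank=bank, column=column)


example = decode((5 << 15) | (6 << 12) | (3 << 10) | 0x155)
print(example)
```

Placing the bank and bank-group bits just above the column bits spreads consecutive accesses across banks, which is one common design choice among several.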

Memory Controller Quality of Service
Latency vs Bandwidth
Bytes per Watt
Buffer occupancy
Algorithm efficiency
Maximum bandwidth for target application
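
A minimal sketch of how these metrics could be derived from a transaction trace follows. The trace format, function name, sample trace and average power value are all assumptions made for illustration; they are not output from any tool described in this deck.

```python
def qos_summary(transactions, average_power_w):
    """transactions: list of (arrival_ns, completion_ns, size_bytes) tuples."""
    latencies = [done - arrived for arrived, done, _ in transactions]
    first_arrival = min(arrived for arrived, _, _ in transactions)
    last_completion = max(done for _, done, _ in transactions)
    window_ns = last_completion - first_arrival
    total_bytes = sum(size for _, _, size in transactions)
    bandwidth_gb_s = total_bytes / window_ns          # bytes per ns equals GB/s
    return {
        "avg_latency_ns": sum(latencies) / len(latencies),
        "bandwidth_gb_s": bandwidth_gb_s,
        # Time-averaged number of requests held in the controller.
        "avg_occupancy": sum(latencies) / window_ns,
        "gb_per_s_per_watt": bandwidth_gb_s / average_power_w,
    }


# Hypothetical trace: four 64-byte requests.
trace = [(0, 40, 64), (10, 60, 64), (20, 80, 64), (30, 100, 64)]
summary = qos_summary(trace, average_power_w=2.0)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```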

Graphical and textual statistics
Statistics and Plots for Accurate Analysis

Introduction to Architecture Exploration
Architecture Exploration
◦ Optimize and validate the system specification
◦ Specification: Processor speed, topology and arbitration
◦ Requirements: Timing, energy, cost, weight and efficiency
Performance Analysis
◦ Buffer size, utilization, throughput and response time
Power Measurement
◦ Peak and average power, energy and power/task
Functional Correctness
◦ Arbitration, software task scheduling and task graph
Failure Analysis
◦ Hardware, software, network, data, power and logic
Making Better Quality Products

Analysis using Architecture Exploration
Buffer management
Power optimization
Core and processor selection and sizing
Response times for various data sizes and rates
Firmware algorithm selection
Algorithm, arbitration and scheduling design
Credit policy and impact of the flash memory selection on throughput
Memory management
Software Task graph

Performance Evaluation of System
Which Libraries?
1. Configured with parameters and data table settings only
• Traffic
• Expression
• MasterDevice
• Bus Arbiter/Bus
• DMA
• RAM
• Processor
• PCIe
• AMBA AXI
• Power Management
2. Require script code
• GPU Warp/PE
[Figure: system block diagram with NXP i.MX6 / NVIDIA Drive PX, Xilinx FPGA Kintex 8, Discrete DMA, ARM A53, GPU, Display Ctrl, SRAM3, DRAM3 and Video IN blocks, plus a Parameters panel]

Role of Architecture Exploration in Memory Controller Design
Two types
◦ Stochastic
◦ Cycle-accurate
Modeling
◦ Incorporates the interface fabric, workloads and the traffic model
◦ Define memory controller algorithm as a delay, order buffer or detailed algorithm
◦ Connect the memory controller into a SoC or embedded system
Simulation
◦ Different scheduling algorithms
◦ Separate or single channel for Read and Write
◦ Buffer size
◦ Clock Speed
◦ Connected DRAM
◦ Number of Masters or cores
Analysis
◦ Generated reports to evaluate the Quality of Service
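
The simulation step can be pictured as a parameter sweep over a simple model. The sketch below treats the controller as a fixed delay behind a bounded FIFO, in the spirit of the stochastic option above, and sweeps the buffer size. It is an illustrative toy, not VisualSim code; the function name, the bursty traffic pattern and the 10 ns service delay are assumptions.

```python
from collections import deque


def simulate(arrivals_ns, service_ns, buffer_size):
    """Single-server FIFO with a bounded buffer (including the request in service).

    Returns (served, rejected, max_latency_ns).
    """
    in_flight = deque()   # completion times of requests still in the controller
    free_at = 0           # time at which the controller can start a new request
    served = rejected = max_latency = 0
    for t in arrivals_ns:
        while in_flight and in_flight[0] <= t:
            in_flight.popleft()           # retire requests finished by time t
        if len(in_flight) >= buffer_size:
            rejected += 1                 # buffer full, request is turned away
            continue
        start = max(t, free_at)
        free_at = start + service_ns
        in_flight.append(free_at)
        served += 1
        max_latency = max(max_latency, free_at - t)
    return served, rejected, max_latency


# Hypothetical bursty traffic: two bursts of four back-to-back requests.
arrivals = [0, 1, 2, 3, 100, 101, 102, 103]
results = {size: simulate(arrivals, service_ns=10, buffer_size=size) for size in (1, 2, 4)}
for size, (served, rejected, worst) in results.items():
    print(f"buffer={size}: served={served} rejected={rejected} max_latency={worst} ns")
```

Even this small sweep shows the trade-off the reports are meant to expose: a larger buffer accepts more requests but raises the worst-case latency.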

Parameters to Define Memory Controller
Stochastic model
◦ Delay for the controller
◦ Scheduling algorithm with buffer
◦ Memory Width
◦ Buffer length
Cycle-accurate
◦ Address breakdown by bits
◦ Fragmentation of large request
◦ Clock speed, bus width and memory width
◦ Buffer length
◦ Burst length
◦ Timing
◦ Refresh-related attributes
◦ Detailed scheduler design based on address and buffer settings
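
Two of the cycle-accurate parameters above can be sketched in a few lines: fragmenting a large request into burst-sized pieces, and deriving access latency from timing parameters. The 32-bit channel width, the burst length of 16 and the timing values are hypothetical placeholders, and the three-case latency model (row hit, closed row, row conflict) is a common simplification rather than a full DRAM timing model.

```python
def fragment(address, size_bytes, bus_width_bits=32, burst_length=16):
    """Split a request into (address, bytes) pieces that stay within one burst."""
    burst_bytes = bus_width_bits // 8 * burst_length
    pieces = []
    end = address + size_bytes
    while address < end:
        next_boundary = (address // burst_bytes + 1) * burst_bytes
        chunk_end = min(next_boundary, end)
        pieces.append((address, chunk_end - address))
        address = chunk_end
    return pieces


def access_latency_cycles(row_state, t_cl, t_rcd, t_rp):
    """Cycles until first data for a read, by state of the target bank's row."""
    if row_state == "hit":        # row already open
        return t_cl
    if row_state == "closed":     # bank idle, row must be activated
        return t_rcd + t_cl
    if row_state == "conflict":   # another row open, precharge first
        return t_rp + t_rcd + t_cl
    raise ValueError(f"unknown row state: {row_state}")


aligned = fragment(0, 256)        # four 64-byte bursts
unaligned = fragment(48, 100)     # partial first and last bursts
timing = dict(t_cl=40, t_rcd=40, t_rp=40)   # hypothetical values, in clock cycles
print(aligned)
print(unaligned)
print({s: access_latency_cycles(s, **timing) for s in ("hit", "closed", "conflict")})
```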

Architecture Model of SoC
[Figure: architecture model of the SoC, with callouts for Master, Fabric, Exploration Parameters, Memory Controller, DRAM Definition and Reports]

Parameters of the Memory Controller
Stochastic
Cycle-Accurate

Power Attributes of the Memory Controller
Stochastic
Cycle-Accurate

Experiment with a Traffic Model
9/11/2020 MIRABILIS DESIGN INC. 20
[Figure: block diagram with CPU, GPU, Display Ctrl, CAN, Packet and Ethernet blocks connected through an AMBA AXI Bus to DRAM and Display IO]

Experiment with Detailed Models of Processor, GPU and Interfaces
[Figure: block diagram with CPU, GPU, Display Ctrl, PCIe, Video Camera, SRAM and Packet blocks connected through an AMBA AXI Bus to DRAM and Display IO]
System Overview
◦ Camera: 30 fps, VGA equivalent
◦ CPU: ARM Cortex-A53, 1.2 GHz
◦ GPU: 64 cores (8 warps × 8 PEs), 32 threads, 1 GHz
◦ Display Ctrl: display buffer of 293,888 bytes
◦ SRAM: SDR, 64 MB, 1.0 GHz
◦ DRAM: DDR3, 64 MB, 2.4 GHz

Debugging Memory Controller Design
Review the latency, buffer usage and throughput
Compare the memory throughput with the Fabric
Modify attributes of the traffic, Fabric and Memory

Explore Controller for Flash and SSD

About Mirabilis Design
Founded in 2003 and based in Sunnyvale, CA, USA.
Development and support centers in the US, India, China, Korea and the Czech Republic
Focused on system architecture exploration of electronics, semiconductors and software
40+ customers worldwide in Semiconductors, Aerospace, Computing and Automotive
VisualSim: modeling and simulation software
Largest source of system modeling IP with embedded timing and power
Hundreds of man-years of experience in system design and exploration of digital electronics
Select the “Right” configuration to match customer request

Introduction to VisualSim Architect
◦ Architect processors, hardware systems, software and network
◦ Map algorithms on integrated and distributed systems
◦ Compute resource requirements for application task graphs
◦ Test compliance to standards and generation of diagnostics
Timing and Throughput
Power measurement, management and Battery
Entire EE to Semiconductor
Functional and Safety Analysis
Libraries: Hardware, Software and Network
Graphical Modeling
Functional, timing and power analysis to existing Model-based System Design

Largest Systems-Level Model Library
Largest library of traffic, resources, hardware, software and analysis
Traffic
• Distribution
• Sequence
• Trace file
• Instruction profile
Reports
• Timing and Buffer
• Throughput/Util
• Avg/peak power
• Statistics
Power
• State power table
• Power management
• Energy harvesters
• Battery
• RegEx operators
SoC Buses
• AMBA and Corelink
• AHB, APB, AXI, ACE, CHI, CMN600
• Network-on-Chip
• TileLink
System Bus
• PCI/PCI-X/PCIe
• Rapid IO
• AFDX
• OpenVPX
• VME
• SPI 3.0
• 1553B
Processors
• GPU, DSP, µP and µC
• RISC-V
• Nvidia Drive PX
• PowerPC
• x86: Intel and AMD
• DSP: TI and ADI
• MIPS, Tensilica, SH
ARM
• M-, R-, 7TDMI
• A8, A53, A55, A72, A76, A77
Custom Creator
• Script language
• 600 RegEx functions
• Task graph
• Tracer
• C/C++/Java
• Python
Support
• Listener and Trace
• Debuggers
• Assertions
Stochastic
• FIFO/LIFO Queue
• Time Queue
• Quantity Queue
• System Resource
• Schedulers
• Cyber Security
RTOS
• Template
• ARINC 653
• AUTOSAR
Memory
• Memory Controller
• DDR DRAM 2, 3, 4, 5
• LPDDR 2, 3, 4
• HBM, HMC
• SDR, QDR, RDRAM
Storage
• Flash & NVMe
• Storage Array
• Disk and SATA
• Fibre Channel
• FireWire
Networking
• Ethernet & GigE
• Audio-Video Bridging
• 802.11 and Bluetooth
• 5G
• Spacewire
• CAN-FD
• TTEthernet
• FlexRay
• TSN & IEEE802.1Q
FPGA
• Xilinx: Zynq, Virtex, Kintex
• Intel: Stratix, Arria
• Microsemi: SmartFusion
• Programmable logic template
• Interface traffic generator
Software
• GEM5
• Software code integration
• Instruction trace
• Statistical software model
• Task graph
Interfaces
• Virtual Channel
• DMA
• Crossbar
• Serial Switch
• Bridge
RTL-like
• Clock, Wire-Delay
• Registers, Latches
• Flip-flop
• ALU and FSM
• Mux, DeMux
• Lookup table
