2. Contents
Introduction
HBM2 and HBM2E
HBM Features
Global and each Channel signals
HBM Operating Modes
HBM Channel Addressing
Mode Register set
Write and Read Operation
ROW Commands
COLUMN Commands
References
3. Introduction
• HBM stands for High Bandwidth Memory. It is a type of memory interface
used in 3D-stacked DRAM (dynamic random-access memory) in GPUs, as
well as in the server, machine-learning DSP, high-performance computing,
networking and client spaces.
• HBM uses less power and delivers higher bandwidth than DDR4 or
GDDR5 memory, while using smaller chips.
• HBM technology works by vertically stacking memory chips on top of one
another. The memory chips are connected by through-silicon vias
(TSVs) and microbumps.
• High bandwidth is essential for the complex AI/ML algorithms that must
rapidly execute massive calculations, for example to make safe real-time
decisions on the road.
5. HBM2 and HBM2E
• An HBM memory stack consists of five chips: four storage dies above a single logic
die that controls them. It reaches speeds of up to 128 GBps.
• HBM2 debuted in 2016, and in December 2018 JEDEC updated the HBM2
standard.
• The HBM2 standard called for up to 8 dies in a stack (as with HBM) with an
overall bandwidth of 256 GBps.
• The HBM2E standard allows up to 12 dies per stack for a maximum capacity of 24 GB.
The standard also pegs memory bandwidth at 307 GBps, delivered across a
1,024-bit memory interface divided into 8 independent channels on each stack.
• Outstanding bandwidth, capacity and latency in a power-efficient, compact footprint
make HBM2E memory a superior choice for AI training hardware.
• According to an Ars Technica report, HBM3 is expected to support capacities of up to
64 GB and speeds of up to 512 GBps.
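The per-stack bandwidth figures above follow from the 1,024-bit interface width multiplied by the per-pin data rate. The sketch below shows that arithmetic. The per-pin rates (1.0, 2.0 and 2.4 Gbps) are assumptions chosen because they reproduce the quoted numbers; they are not stated in the slides, and the function name is illustrative.

```python
BUS_WIDTH_BITS = 1024  # interface width per stack, as quoted above

# Assumed per-pin data rates in Gbit/s (not given in the slides).
ASSUMED_PIN_RATE_GBPS = {"HBM": 1.0, "HBM2": 2.0, "HBM2E": 2.4}


def peak_bandwidth_gbytes(bus_width_bits, pin_rate_gbps):
    """Peak bandwidth in GB/s: bits per transfer times transfers per second, over 8."""
    return bus_width_bits * pin_rate_gbps / 8


for generation, rate in ASSUMED_PIN_RATE_GBPS.items():
    bandwidth = peak_bandwidth_gbytes(BUS_WIDTH_BITS, rate)
    print(f"{generation}: {bandwidth:.1f} GB/s")
```

Under these assumptions the HBM2E result is 307.2 GB/s, which the slides round to 307 GBps.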
6. HBM Features
• 2n prefetch architecture with 256 bits per memory read and write access
• BL = 2 and 4
• 128 DQ width + optional ECC pin support per channel
• Legacy Mode and Pseudo Channel Mode operation (64 DQ width for Pseudo
Channel Mode)
• Differential clock inputs (CK_t/CK_c)
• DDR commands entered on each positive CK_t, CK_c edge. Row Activate
commands require two cycles. All other commands are single-cycle commands.
• Semi-independent row and column command interfaces, allowing
Activates/Precharges to be issued in parallel with Reads/Writes
• Data referenced to strobes RDQS_t/RDQS_c and WDQS_t/WDQS_c; one strobe pair
per DWORD
• Up to 8 channels per stack
• 8 or 16 banks per channel; varies by device density/channel
• Bank grouping supported
• 2K or 4K bytes per page; varies by device density/channel
• DBIac support configurable via MRS
• Data mask for masking WRITE data per byte
• Self-refresh modes
• I/O voltage 1.2 V
• DRAM core voltage 1.2 V, independent of I/O voltage
• Channel density of 1 Gb to 32 Gb
• Unterminated data/address/cmd/clk interfaces
• Temperature sensor with 3-bit encoded range output
8. Global and each Channel signals
• Each channel provides access to an independent set of DRAM banks.
• Channels are independently clocked, and need not be synchronous.
• Each channel consists of an independent command and data
interface. The RESET, IEEE1500 test port and power supply signals are
common to all channels.
• No channel may access the memory storage of a different channel.
• Each channel interface provides an independent interface to a
number of DRAM banks of a defined page size.
10. HBM Operating Modes
• HBM DRAM defines two modes of operation, depending on channel
density: 1) Legacy Mode and 2) Pseudo Channel Mode.
• Legacy Mode provides a 256-bit prefetch per memory Read and Write
access. Address bit BA4 is a “Don’t Care” in this mode.
• Pseudo Channel Mode divides a channel into two individual sub-channels
of 64-bit I/O each, providing a 128-bit prefetch per memory Read and
Write access for each pseudo channel.
• Both pseudo channels operate semi-independently.
• Both pseudo channels also share the channel’s mode registers. All I/O
signals of DWORD0 and DWORD1 are associated with pseudo channel 0,
and all I/O signals of DWORD2 and DWORD3 with pseudo channel 1.
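The relationship between DQ width, the 2n prefetch and the DWORD-to-pseudo-channel mapping can be written down as a small sketch. The class and function names here are hypothetical and only restate the numbers given in the slides; nothing is taken from the JEDEC register definitions.

```python
from dataclasses import dataclass

PREFETCH_N = 2         # 2n prefetch architecture
DWORDS_PER_CHANNEL = 4  # DWORD0..DWORD3, 32 DQ bits each


@dataclass(frozen=True)
class ModeGeometry:
    name: str
    dq_width: int   # DQ bits seen by one (pseudo) channel
    dwords: tuple   # DWORD indices that carry its I/O signals

    @property
    def prefetch_bits(self):
        # Bits moved per Read or Write access: DQ width times prefetch depth.
        return self.dq_width * PREFETCH_N


LEGACY = ModeGeometry("Legacy Mode", 128, (0, 1, 2, 3))
PSEUDO_CHANNEL_0 = ModeGeometry("Pseudo channel 0", 64, (0, 1))
PSEUDO_CHANNEL_1 = ModeGeometry("Pseudo channel 1", 64, (2, 3))


def pseudo_channel_of(dword):
    """Return the pseudo channel (0 or 1) that owns a DWORD in Pseudo Channel Mode."""
    if dword not in range(DWORDS_PER_CHANNEL):
        raise ValueError("DWORD index must be 0..3")
    return dword // 2


for mode in (LEGACY, PSEUDO_CHANNEL_0, PSEUDO_CHANNEL_1):
    print(f"{mode.name}: {mode.dq_width} DQ, {mode.prefetch_bits}-bit prefetch")
```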
13. Mode Registers
• The bank group feature is configurable via MRS (Mode Register Set).
• All mode registers are programmed via the Mode Register Set (MRS)
command and retain the stored information until they are
reprogrammed, the chip is reset, or the device loses power.
• Mode registers must be loaded when all banks are idle and no bursts are
in progress.
15. Operation
• Clocking overview:
• The HBM device captures data on the row and column buses using the differential
clock CK_t/CK_c.
• The HBM device has uni-directional differential Write strobes (WDQS_t/WDQS_c)
and Read strobes (RDQS_t/RDQS_c) per 32 DQ bits (DWORD).
• HBM Write Data Mask (DM) and Data Bus Inversion (DBIac) Function:
• The HBM device supports the Data Mask (DM) function for Write operations and the
Data Bus Inversion (DBIac) function for Write and Read operations.
• The DBI pin is a bi-directional DDR pin and is sampled along with the DQ signals for
Read and Write operations.
• The DM pin is a bi-directional DDR pin and is sampled along with the DQ signals for Read
or Write operations; however, the data mask function itself is input only and is used
only for Write operations.
16. Write & Read Operation
Write Operation:
• The HBM device inverts Write data received on the DQ inputs when DBI is
sampled HIGH, and leaves the Write data non-inverted when DBI is
sampled LOW. Note that the DM input is not affected by the DBIac function.
Read Operation:
• The HBM device counts the number of DQ signals that are transitioning from
their previous state. Note that the DM output is not affected by the DBIac
function.
• The HBM device inverts Read data and sets DBI HIGH when the number
of transitioning data bits within a byte is greater than 4, or when the
number of transitioning data bits within a byte equals 4 and DBI was
HIGH; otherwise the HBM device does not invert the Read data and sets
DBI LOW.
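The Read-side DBIac rule and the Write-side inversion can be illustrated for a single byte lane as follows. This is a minimal sketch, not the device logic: it assumes that "previous state" means the value last driven on the eight DQ wires of that byte, and the function names are made up for illustration.

```python
def dbi_ac_encode(data, prev_wire, prev_dbi):
    """Apply the DBIac rule to one byte.

    data      -- the 8-bit value to send
    prev_wire -- the 8-bit value previously driven on the DQ wires (assumption)
    prev_dbi  -- the previous DBI level (0 = LOW, 1 = HIGH)
    Returns (wire_value, dbi).
    """
    # Count the DQ bits that would toggle if the data were sent unchanged.
    transitions = bin((data ^ prev_wire) & 0xFF).count("1")
    invert = transitions > 4 or (transitions == 4 and prev_dbi == 1)
    wire = (~data & 0xFF) if invert else (data & 0xFF)
    return wire, 1 if invert else 0


def dbi_ac_decode(wire, dbi):
    """Receiver side: undo the inversion when DBI is sampled HIGH."""
    return (~wire & 0xFF) if dbi else (wire & 0xFF)


# All eight bits would toggle, so the byte is inverted and DBI goes HIGH.
print(dbi_ac_encode(0xFF, prev_wire=0x00, prev_dbi=0))
# Exactly four bits toggle: the outcome depends on the previous DBI level.
print(dbi_ac_encode(0x0F, prev_wire=0x00, prev_dbi=0))
print(dbi_ac_encode(0x0F, prev_wire=0x00, prev_dbi=1))
```

The tie-break at exactly four transitions keeps DBI at its previous level, so the DBI pin itself does not add a transition in that case.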
19. ROW Commands
• Row No Operation Command (RNOP)
• Bank and Row ACTIVATE Command (ACT)
• Precharge Command (PRE/PREALL)
• AUTO PRECHARGE
• REFRESH Command (REF)
• SINGLE BANK REFRESH Command (REFSB)
• Row No Operation Command (RNOP):
• It is used to instruct the HBM device to perform a NOP as a row command; this
prevents unwanted row commands from being registered during idle or wait
states.
• Bank and Row ACTIVATE Command (ACT):
• Before a READ or WRITE command can be issued to a bank, a row in that bank
must be opened. This is accomplished via the ACTIVATE command, which selects
both the bank and the row to be activated. Once a row is open, a READ or WRITE
command can be issued to that row.
• Precharge Command (PRE/PREALL):
• The PRECHARGE command is used to deactivate the open row in a particular
bank (PRE) or the open rows in all banks (PREALL). The bank(s) will be in the idle
state and available for a subsequent row access a specified time tRP after the
PRECHARGE command is issued.
• AUTO PRECHARGE:
• Auto Precharge is a feature that performs the same individual-bank precharge
function described under the Precharge command, but without requiring an explicit
PRECHARGE command.
• REFRESH Command (REF):
• The REFRESH command is used during normal operation of the HBM device. The
command is received on the row command inputs R[5:0] and requires a CNOP
command on the column command inputs C[7:0].
• Parity is evaluated with the REFRESH command when the parity calculation is
enabled in the Mode Register.
• SINGLE BANK REFRESH Command (REFSB):
• The SINGLE BANK REFRESH command provides an alternative solution for the
refresh of the HBM device. The command initiates a refresh cycle on a single
bank, while accesses to other banks, including writes and reads, are not affected.
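The row-command rules above (a row must be opened with ACT before a READ or WRITE, and a precharged bank is usable again only after tRP) can be sketched as a toy bank-state model. This is an illustration under stated assumptions: time is counted in abstract cycles, the tRP value is hypothetical, auto precharge is applied at the cycle of the access, and all other timing parameters and refresh are ignored. The class and method names are not from any HBM specification.

```python
class Channel:
    """Toy model of the open-row state of each bank in one channel."""

    def __init__(self, num_banks=8, t_rp=3):
        self.t_rp = t_rp                    # precharge time in cycles (hypothetical)
        self.open_row = [None] * num_banks  # None means the bank has no open row
        self.idle_at = [0] * num_banks      # first cycle at which the bank is idle

    def activate(self, now, bank, row):
        if self.open_row[bank] is not None:
            raise RuntimeError("bank already has an open row; precharge it first")
        if now < self.idle_at[bank]:
            raise RuntimeError("tRP has not elapsed since the last precharge")
        self.open_row[bank] = row

    def access(self, now, bank, row, auto_precharge=False):
        """Model a READ or WRITE; with auto_precharge it acts like RDA/WRA."""
        if self.open_row[bank] is None or self.open_row[bank] != row:
            raise RuntimeError("READ/WRITE needs the target row to be open")
        if auto_precharge:
            self.precharge(now, bank)

    def precharge(self, now, bank):
        if self.open_row[bank] is not None:
            self.open_row[bank] = None
            self.idle_at[bank] = now + self.t_rp

    def precharge_all(self, now):
        for bank in range(len(self.open_row)):
            self.precharge(now, bank)


ch = Channel(num_banks=2, t_rp=3)
ch.activate(now=0, bank=0, row=5)   # open row 5 in bank 0
ch.access(now=1, bank=0, row=5)     # READ or WRITE to the open row
ch.precharge(now=2, bank=0)         # bank 0 is idle again at cycle 5
ch.activate(now=5, bank=0, row=9)   # allowed: tRP has elapsed
```

Keeping the state per bank mirrors the point that activates and precharges on one bank do not disturb the open rows of other banks.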
23. COLUMN Commands
• Column No Operation Command (CNOP):
• It is used to instruct the HBM device to perform a NOP as a column command; this
prevents unwanted column commands from being registered during idle or wait states.
• Read Command (RD, RDA):
• A Read burst is initiated with a READ command. The bank and column addresses are
provided with the READ command, and auto precharge is either enabled or disabled for
that access.
• Parity is evaluated with the READ command when the parity calculation is enabled in the
Mode Register.
• Write Command (WR, WRA):
• A Write burst is initiated with a WRITE command. The bank and column addresses are
provided with the WRITE command, and auto precharge is either enabled or disabled for
that access.
• Parity is evaluated with the WRITE command when the parity calculation is enabled in
MR0.
• Mode Register Set (MRS):
• The MODE REGISTER SET (MRS) command is used to load the Mode Registers of
the HBM device. The command is received on the column command inputs C[7:0]
and requires an RNOP command on the row command inputs R[5:0].
• Power-Mode Commands:
• Power-Down is entered when CKE is registered LOW along with RNOP and CNOP
commands.
• CKE must not go LOW when read or write operations are in progress.
• CKE can go LOW while any other operations such as row activation, precharge,
auto precharge, or refresh are in progress, but the power-down specification will
not apply until such operations are complete.
• Self-Refresh (SRE, SRX):
• Self-refresh can be used to retain data in the HBM device, even if the rest of the
system is powered down. When in the self-refresh mode.
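Because the row and column command interfaces are only semi-independent, some commands constrain what may be issued on the other interface in the same cycle. The sketch below encodes only the pairing rules stated in these slides (REF needs CNOP, MRS needs RNOP, power-down entry needs CKE LOW with RNOP and CNOP). It is a simplified illustration with hypothetical function names; the actual specification defines further restrictions that are not modeled here.

```python
# Row commands that require a specific column command in the same cycle.
ROW_NEEDS_COLUMN = {"REF": "CNOP"}
# Column commands that require a specific row command in the same cycle.
COLUMN_NEEDS_ROW = {"MRS": "RNOP"}


def pair_is_legal(row_cmd, col_cmd):
    """Check a row/column command pair against the rules quoted in the slides."""
    required_col = ROW_NEEDS_COLUMN.get(row_cmd)
    if required_col is not None and col_cmd != required_col:
        return False
    required_row = COLUMN_NEEDS_ROW.get(col_cmd)
    if required_row is not None and row_cmd != required_row:
        return False
    return True


def is_power_down_entry(row_cmd, col_cmd, cke_low):
    """Power-down is entered when CKE is LOW together with RNOP and CNOP."""
    return cke_low and row_cmd == "RNOP" and col_cmd == "CNOP"


print(pair_is_legal("ACT", "RD"))    # activate in parallel with a read
print(pair_is_legal("REF", "RD"))    # REF requires CNOP on the column inputs
print(pair_is_legal("RNOP", "MRS"))  # MRS requires RNOP on the row inputs
```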