DDR4 introduces new features such as bank grouping, cyclic redundancy check, and multi-purpose registers to improve performance and reliability over DDR3. It supports higher densities up to 16Gb and speeds up to 3.2Gbps while lowering the voltage. Training flows are also enhanced in DDR4 with the addition of internal voltage reference training, preamble training, and more options for data pattern testing using multi-purpose registers during initialization.
The U-Boot is an "Universal Bootloader" ("Das U-Boot") is a monitor program that is under GPL. This production quality boot-loader is used as default boot loader by several board vendors. It is easily portable and easy to port and to debug by supporting PPC, ARM, MIPS, x86,m68k, NIOS, Microblaze architectures. Here is a presentation that introduces U-Boot.
IDT DDR4 RCD register and DB data buffer enable RDIMM and LRDIMM to faster speeds and deeper memories. This video helps you understand the DDR4 feature enhancements of IDT's DDR4 RCD and DB compared to earlier DDR3 technology. An introduction into some available LeCroy testing and debug tools completes the video. Presented by Douglas Malech, Product Marketing Manager at IDT and Mike Micheletti, Product Manager at Teledyne LeCroy. To learn more about IDT's leading portfolio of memory interface products, visit www.idt.com/go/MIP.
Highlighted notes while studying Concurrent Data Structures:
DDR4 SDRAM
Source: Wikipedia
Double Data Rate 4 Synchronous Dynamic Random-Access Memory, officially abbreviated as DDR4 SDRAM, is a type of synchronous dynamic random-access memory with a high bandwidth ("double data rate") interface.
Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.
The U-Boot is an "Universal Bootloader" ("Das U-Boot") is a monitor program that is under GPL. This production quality boot-loader is used as default boot loader by several board vendors. It is easily portable and easy to port and to debug by supporting PPC, ARM, MIPS, x86,m68k, NIOS, Microblaze architectures. Here is a presentation that introduces U-Boot.
IDT DDR4 RCD register and DB data buffer enable RDIMM and LRDIMM to faster speeds and deeper memories. This video helps you understand the DDR4 feature enhancements of IDT's DDR4 RCD and DB compared to earlier DDR3 technology. An introduction into some available LeCroy testing and debug tools completes the video. Presented by Douglas Malech, Product Marketing Manager at IDT and Mike Micheletti, Product Manager at Teledyne LeCroy. To learn more about IDT's leading portfolio of memory interface products, visit www.idt.com/go/MIP.
Highlighted notes while studying Concurrent Data Structures:
DDR4 SDRAM
Source: Wikipedia
Double Data Rate 4 Synchronous Dynamic Random-Access Memory, officially abbreviated as DDR4 SDRAM, is a type of synchronous dynamic random-access memory with a high bandwidth ("double data rate") interface.
Wikipedia is a free online encyclopedia, created and edited by volunteers around the world and hosted by the Wikimedia Foundation.
AMBA AHB is a bus interface suitable for high-performance synthesizable designs. It defines the interface between components, such as masters, interconnects, and slaves.
U-Boot, a boot loader for Embedded boards based on PowerPC, ARM, MIPS and several other processors, which can be installed in a boot ROM and used to initialize and test the hardware or to download and run application
code.
This presentation is about U-Boot: the most popular open source, primary boot loader used in embedded devices, as well as it's mechanisms and features.
The respective talk was held by Sam Protsenko (Software Engineer, Consultant, GlobalLogic) at GlobalLogic Mykolaiv Embedded TechTalk #1 on May 25, 2018.
HBM stands for high bandwidth memory and is a type of memory interface used in 3D-stacked DRAM (dynamic random access memory) in GPUs, as well as the server, machine-learning DSP , high-performance computing and networking and client space.
LCU14 302- How to port OP-TEE to another platformLinaro
LCU14 302- How to port OP-TEE to another platform
---------------------------------------------------
Speaker: Joakim Bech, Jens Wiklander and Pascal Brand
Date: September 17, 2014
---------------------------------------------------
★ Session Summary ★
OP-TEE (Open Portable Trusted Execution Environment) is the source code for the TEE in Linux using the ARM Trustzone technology. This component meets the Global Platform TEE System Architecture specification. Most of the code is generic. However, a number of platform specific characteristics are addressed, such as memory layout or board specific hardware IP. In this session, you can learn the steps to follow to port OP-TEE for your armv7 platform, as the ones that have been defined when porting OP-TEE to A80 (SWG-77). OP-TEE to the Allwinner A80 platform
---------------------------------------------------
★ Resources ★
Zerista: http://lcu14.zerista.com/event/member/137748
Google Event: https://plus.google.com/u/0/events/cnd044lmnid6jcoj1a9svlhmkj0
Video: https://www.youtube.com/watch?v=QgaGJow7hws&list=UUIVqQKxCyQLJS6xvSmfndLA
Etherpad: http://pad.linaro.org/p/lcu14-302
---------------------------------------------------
★ Event Details ★
Linaro Connect USA - #LCU14
September 15-19th, 2014
Hyatt Regency San Francisco Airport
---------------------------------------------------
http://www.linaro.org
http://connect.linaro.org
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
This talk introduces JTAG debugging capabilities, both for debugging hardware and software. Marek first explains what the JTAG stands for and explains the operation of the JTAG state machine. This is followed by an introduction to free software JTAG tools, OpenOCD and urJTAG. Marek shortly explains how to debug software using those tools and how that ties into the JTAG state machine. However, JTAG was designed for testing hardware. Marek explains what boundary scan testing (BST) is, what are BSDL files and their format, and practically demonstrates how to blink an LED using BST and only free software tools.
Marek Vasut
The Cisco 892F ISRs have an SFP port that supports auto-media-detection, auto-failover, and remote fault indication (RFI), as described in the IEEE 802.3ah specification.
AMBA AHB is a bus interface suitable for high-performance synthesizable designs. It defines the interface between components, such as masters, interconnects, and slaves.
U-Boot, a boot loader for Embedded boards based on PowerPC, ARM, MIPS and several other processors, which can be installed in a boot ROM and used to initialize and test the hardware or to download and run application
code.
This presentation is about U-Boot: the most popular open source, primary boot loader used in embedded devices, as well as it's mechanisms and features.
The respective talk was held by Sam Protsenko (Software Engineer, Consultant, GlobalLogic) at GlobalLogic Mykolaiv Embedded TechTalk #1 on May 25, 2018.
HBM stands for high bandwidth memory and is a type of memory interface used in 3D-stacked DRAM (dynamic random access memory) in GPUs, as well as the server, machine-learning DSP , high-performance computing and networking and client space.
LCU14 302- How to port OP-TEE to another platformLinaro
LCU14 302- How to port OP-TEE to another platform
---------------------------------------------------
Speaker: Joakim Bech, Jens Wiklander and Pascal Brand
Date: September 17, 2014
---------------------------------------------------
★ Session Summary ★
OP-TEE (Open Portable Trusted Execution Environment) is the source code for the TEE in Linux using the ARM Trustzone technology. This component meets the Global Platform TEE System Architecture specification. Most of the code is generic. However, a number of platform specific characteristics are addressed, such as memory layout or board specific hardware IP. In this session, you can learn the steps to follow to port OP-TEE for your armv7 platform, as the ones that have been defined when porting OP-TEE to A80 (SWG-77). OP-TEE to the Allwinner A80 platform
---------------------------------------------------
★ Resources ★
Zerista: http://lcu14.zerista.com/event/member/137748
Google Event: https://plus.google.com/u/0/events/cnd044lmnid6jcoj1a9svlhmkj0
Video: https://www.youtube.com/watch?v=QgaGJow7hws&list=UUIVqQKxCyQLJS6xvSmfndLA
Etherpad: http://pad.linaro.org/p/lcu14-302
---------------------------------------------------
★ Event Details ★
Linaro Connect USA - #LCU14
September 15-19th, 2014
Hyatt Regency San Francisco Airport
---------------------------------------------------
http://www.linaro.org
http://connect.linaro.org
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
This talk introduces JTAG debugging capabilities, both for debugging hardware and software. Marek first explains what the JTAG stands for and explains the operation of the JTAG state machine. This is followed by an introduction to free software JTAG tools, OpenOCD and urJTAG. Marek shortly explains how to debug software using those tools and how that ties into the JTAG state machine. However, JTAG was designed for testing hardware. Marek explains what boundary scan testing (BST) is, what are BSDL files and their format, and practically demonstrates how to blink an LED using BST and only free software tools.
Marek Vasut
The Cisco 892F ISRs have an SFP port that supports auto-media-detection, auto-failover, and remote fault indication (RFI), as described in the IEEE 802.3ah specification.
IMAX3: Amazing Dataflow-Centric CGRA and its Applications
I present this slide to all hungry engineers who are tired of CPU, GPU, FPGA, tensor core, AI core, who want some challenging one with no black box inside, and who want to improve by themselves.
Charles Zhang from Phytium previewed the 64 Core Phytium chip at the 2015 Hot Chips conference. This week, the ARM chip was unveiled with a prototype server at Hot Chips 2016.
1. DDR4 Mini Workshop
DDR4 Mini Workshop
DDR4 Mini Workshop
DDR4 Mini Workshop
Server Memory Forum 2011
Server Memory Forum 2011
2. Disclaimer
Disclaimer
E th h M j it f DDR4 h b d fi d/Fi d th
Even though Majority of DDR4 spec has been defined/Fixed, there
might be chance of update before publicationt
Rev A is expected to be in publice 1H’12 and the contents of this
material might be changed in Final version
g g
3. DDR4 Outlook
DDR4 Outlook
DDR4 adopts evolutionary path with High BW & reliability scheme
DDR4 adopts evolutionary path with High BW & reliability scheme
Spec items DDR3 DDR4
D it / S d
512Mbp~8Gb 2Gb~16Gb
Density / Speed
512Mbp 8Gb
1.6~2.1Gbps
2Gb 16Gb
1.6~3.2Gbps
Voltage
(VDD/VDDQ/VPP)
1.5V/1.5V/NA
(1.35V/1.35V/NA)
1.2V/1.2V/2.5V
Vref External Vref (VDD/2) Internal Vref (need training)
Interface
Vref External Vref (VDD/2) Internal Vref (need training)
Data IO CTT (34ohm) POD (34ohm)
CMD/ADDR IO CTT CTT
Strobe Bi-dir / diff Bi-dir / diff
Strobe Bi dir / diff Bi dir / diff
Core
architect
# of banks 8Banks 16Banks (4BG)
Page size(X4/8/16) 1KB / 1KB / 2KB 512B / 1KB / 2KB
architect
# prefetch 8bits 8bits
Added function RESET/ZQ/Dynamic ODT + CRC/DBI/Multi preamble ..
Package type/balls
(X4 8/X16)
78 / 96 BGA 78 / 96 BGA
Physical
(X4,8/X16)
78 / 96 BGA 78 / 96 BGA
DIMM type R,LR,U,SoDIMM + ECC SoDIMM
DIMM pins 240 (R,LR,U) / 204 (So) 284 (R,LR,U) / 256 (So)
4. Bank Group alleviates the needs to increase core frequency from
Bank Group
Bank Group
p q y
previous gen. products to DDR4
DDR DDR2 DDR3 DDR4
BL 2 4 8 8 with BG
Bank 0 Bank 4
Group 0 Group 1
Bank 0 Bank 0
Group 0
BL 2 4 8 8 with BG
Core freq. 200MHz 200MHz 200MHz 200MHz
Bank 0
Bank 1
Bank 2
Bank 4
Bank 5
Bank 6
Bank 0
Bank 1
Bank 2
Bank 0
Bank 1
Bank 2
Bank 3
Sense Amp
Bank 7
Sense Amp
32/64bit
32/64bit
32/64bit
32/64bit
Bank 3
Sense Amp
Bank 3
Sense Amp
64bit
64bit
64bit
64bit
Bank 8
Bank 9
Sense Amp
Bank 12
Bank 13
Sense Amp
Bank 4
Bank 5
Sense Amp
Bank 4
Bank 5
Sense Amp
Bank 10
Bank 11
Bank 14
Bank 15
Group 2 Group 3
Bank 6
Bank 7
Bank 6
Bank 7
Group 1
<X4/X8 16 B k ith 4 B k G > <X16 8 B k ith 2 B k G >
<X4/X8: 16 Banks with 4 Bank Group> <X16: 8 Banks with 2 Bank Group>
5. Bank Group defines tCCD / tRRD parameters differently between
Bank Group
Bank Group
p / p y
same bank group and different bank group
Data rate
1600 1866 2133 2400
(Mbps)
1600 1866 2133 2400
tCCD_L
(Diff BG)
5nCK 5nCK 6nCK 6nCK
(Diff BG)
tCCD_S
(Same BG)
4nCK 4nCK 4nCK 4nCK
Column
CMD A0 B0 B1
tCCD_S tCCD_L
A0 & B0 are in different BG, B0 & B1 are in same BG
6. Fine Granularity
Fine Granularity Refresh
Refresh
As density increases, tRFC gets longer and limits performance
y , g g p
Added new function supporting X2/X4 more frequent refresh with
shorter tRFC
Fine granularity refresh shows better performance
Fine granularity refresh shows better performance
Case: 2Gb based 2Rank system Source : IBM JEDEC material
7. DDR4 supports WRITE CRC to assure better reliability in system
CRC
CRC(Cyclic Redundancy
(Cyclic Redundancy Check)
Check)
DDR4 supports WRITE CRC to assure better reliability in system
Data bits are followed by CRC bits
<CRC data bit mapping for x8 device> <CRC data bit mapping for x4 device>
CRC polynomial is the ATM-8 HEC (X^8+X^2+X^1+1) and 8 CRC
bits are generated from 72 data+DBI bits
Same polynomial as GDDR5
In BC4 case, chopped 4UI will be treated as “1”
With CRC enabled, using burst ordering with A0:A1 is limited (Fixed to 0:0)
- 7
9. DBI
DBI & DM
& DM
DBI : With DBI Enabled via MRS, DDR4 converts Data byte lane
DQ[0 7] UI DBI Aft C ti D t # f “0” i l di DBI
DBI : With DBI Enabled via MRS, DDR4 converts Data byte lane
per UI. Therefore, it increases # of “1” which doesn’t have DC
current path in VDDQ Termination
DQ[0:7] per UI DBI After Converting Data # of “0” including DBI
0000_0000 0 1111_1111 1
0000_0001 0 1111_1110 2
0000 0011 0 1111 1100 3
0000_0011 0 1111_1100 3
0000_0111 0 1111_1000 4
0000_1111 1 0000_1111 4
0001 1111 1 0001 1111 3
0001_1111 1 0001_1111 3
0011_1111 1 0011_1111 2
0111_1111 1 0111_1111 1
1111 1111 1 1111 1111 0
_ _ 0
• DBI can be enabled separated for each READ/WRITE with MRS option
• If DM is enabled, then WRITE DBI is not supported
• For Read, CRC & DBI can be supported at the same time
, pp
10. The following is example of training @initialization based upon
High Level Training/Calibration Flow
High Level Training/Calibration Flow
The following is example of training @initialization based upon
supported function in JEDEC spec
Actual training/calibration sequence would be different depending on controller
E t d DDR4 Fl
Current Typical DDR3 System
Example
Expected DDR4 Flow
Power Up
Power Up
Power Up
MRS Setting
MRS Setting
ZQ Calibration
ZQ Calibration
ZQ Calibration
Leveling
(Read Write)
Leveling
(Read Write)
( )
IO Training
More MPR Pattern (832)
IO Training
Vref Training
Preamble Training
DQ training w/ MPR
g
11. Calibration/Training :
Calibration/Training : Vref
Vref &
& DQ
DQ
DDR4 changed data line
termination from CTT to POD to
save IO Power
save IO Power
One downside of POD in multiple DIMM
configuration, Vref level in each rank
might be different
[source: DDR4 TG]
g
Therefore, DDR4 has internal
VrefDQ which requires training
VrefDQ which requires training
from Host which requires at
initialization
f S /
Vref Step increase/decrease will be
controlled via MRS Opcode
1 Vref Step size is 0.65% of VDDQ
1 St lti l t i /d
1 Step or multiple step increase/decrease
is allowed for efficient calibration time
12. Calibration/Training : Preamble Training,
Calibration/Training : Preamble Training, MPR
MPR
DDR4 supports READ Preamble Training via MRS to have better
Host enables Rx @ 1st edge : the edge will be different
DDR4 supports READ Preamble Training via MRS to have better
fine alignment of HOST Rx enable time
With this mode, Host can detect when Host Rx should be enabled to get READ
Data from DRAM Host enables Rx @ 1 edge : the edge will be different
depending on board configuration
DDR4 assigned more MPR for DQ training and also it’s Re-
writable
DDR3 MPR is Read only
DDR3 DDR4 Note
# of MPR 8Bits (1Register) 32bits (4Register)
Predefined Pattern 01010101
MPR0: 01010101
MPR1: 00110011
MPR2: 00001111
MPR3: 00000000
13. Calibration/Training : Preamble Training,
Calibration/Training : Preamble Training, MPR
MPR
DDR4 Supports 3 way of DQ Link Training with 4 MPR as follows
1) Serial Readout : Predefined pattern or
Re-writed pattern is returned to Host Serial
2) Parallel Readout : Predefined pattern or Re-
writed pattern is returned to Host Parallel manner
DDR4 Supports 3 way of DQ Link Training with 4 MPR as follows
p
manner
3) Staggered Readout : Predefined pattern or Re-writed pattern is returned to Host Staggered manner
as following
Stagger UI0-7 UI8-15 UI16-23 UI24-31 UI32-39 UI40-47 UI48-55 UI56-63
DQ0 MPR0 MPR1 MPR2 MPR3 MPR0 MPR1 MPR2 MPR3
DQ1 MPR1 MPR2 MPR3 MPR0 MPR1 MPR2 MPR3 MPR0
DQ2 MPR2 MPR3 MPR0 MPR1 MPR2 MPR3 MPR0 MPR1
DQ3 MPR3 MPR0 MPR1 MPR2 MPR3 MPR0 MPR1 MPR2
DQ4 MPR0 MPR1 MPR2 MPR3 MPR0 MPR1 MPR2 MPR3
DQ5 MPR1 MPR2 MPR3 MPR0 MPR1 MPR2 MPR3 MPR0
DQ6 MPR2 MPR3 MPR0 MPR1 MPR2 MPR3 MPR0 MPR1
DQ7 MPR3 MPR0 MPR1 MPR2 MPR3 MPR0 MPR1 MPR2
14. Calibration/Training : Preamble Training,
Calibration/Training : Preamble Training, MPR
MPR
MPR can be re-writed via Address Line
Precharge All
Memory Core
Precharge All
MRS to set path to MPR
MPR0/MPR1
MRS
Write Command
-BA1/BA0 indicte the MPR Location
MPR0/MPR1
/MPR2/MPR3
MRS -BA1/BA0 indicte the MPR Location
-- A[7:0] = Data for MPR
DQs [ ]
A[7:0]
Read Command
- MR3 (A12,A11) to select Readout mode
- Serial / Parallel / Staggered
- Serial / Parallel / Staggered